Sorry, I should have clarified that this is exactly what I meant by "empty": the only event left in the event queue is the one gem5 schedules at the end of time, so the simulation quits. What I am unsure of is why this deadlock occurs with the %strand_sts_reg sequence of instructions.
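
For reference, the effect of the rd/wr pair on the LSB can be sketched in a few lines of Python. This is a toy model, not gem5 code: the function names, the 64-bit mask, and the initial register value are assumptions made purely for illustration. It only contrasts the XOR behaviour described in the quoted message (which matches how the SPARC V9 manual defines wr, as rs1 XOR operand) with the OR patch; it says nothing about why the toggling leads to the deadlock.

```python
# Toy model only; not gem5 code. All names here are hypothetical.
MASK64 = (1 << 64) - 1  # assumed register width for the sketch


def wr_xor(g1, imm):
    """Value written by `wr %g1, imm, %strand_sts_reg` with XOR semantics."""
    return (g1 ^ imm) & MASK64


def wr_or(g1, imm):
    """Value written if the XOR is replaced by OR (the temporary patch)."""
    return (g1 | imm) & MASK64


def run_sequence(initial, write, times):
    """Repeat the rd/wr pair and record the LSB after each write."""
    reg = initial
    lsbs = []
    for _ in range(times):
        g1 = reg              # rd %strand_sts_reg, %g1
        reg = write(g1, 0x1)  # wr %g1, 0x1, %strand_sts_reg
        lsbs.append(reg & 1)
    return lsbs


# Initial value 0x1 is an assumption; the real register contents are unknown.
xor_lsbs = run_sequence(0x1, wr_xor, 4)
or_lsbs = run_sequence(0x1, wr_or, 4)
print("XOR:", xor_lsbs)
print("OR: ", or_lsbs)
```

With XOR the LSB flips on every pass through the sequence, whereas with OR it stays set once written, which is the only behavioural difference the patch introduces in this model.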

On Fri, Mar 16, 2018 at 3:35 AM, Gabe Black <[email protected]> wrote:

> The event queue should never be empty, since gem5 itself schedules an event
> at the end of time (or at least the maximum value of a Tick) which ends the
> simulation. If that event executes, that means your simulation has
> deadlocked and stopped generating events, potentially because all the
> threads of execution and asynchronous sources of events (simulated timers,
> etc.) have gone to sleep and stopped doing things, potentially because
> they're all waiting for something else to wake them back up and put them to
> work.
>
> Gabe
>
> On Thu, Mar 15, 2018 at 9:18 PM, Khalique Ahmed <[email protected]>
> wrote:
>
> > Hi,
> >
> > I had mentioned in a previous thread that when running SPARC fs mode,
> > there is a random chance that the simulator may abruptly exit because the
> > event queue is not being filled properly. After performing some fault
> > injection experiments (related to my research, but an unintended side
> > effect), I was able to create deterministic scenarios in which this bug
> > occurs. I gathered an execution trace, and I believe I found the culprit:
> >
> > 2363127130000: system.cpu 0x40ac18 : rd %strand_sts_reg, %g1 :
> > 2363127130500: system.cpu 0x40ac1c : wr %g1, 0x1, %strand_sts_reg :
> >
> > These were the last instructions before exiting in all of the execution
> > traces where the bug was present. If I interpret this sequence of
> > instructions correctly, the value of %strand_sts_reg is first loaded
> > into %g1, then using %g1 and an immediate (here the LSB, hence 0x1), a
> > specific field of %strand_sts_reg can be set. Currently the
> > implementation XORs %g1 and the immediate, so each time this sequence of
> > instructions is executed, the LSB in %strand_sts_reg alternates.
> >
> > For whatever reason, the alternating of the LSB is causing this empty
> > event queue bug to occur. As a quick test, I pulled the value of the LSB
> > to 1 by replacing XOR with OR. Once this change was made, I reran fs
> > mode, and I currently do not see any immediate side effects, and the
> > empty event queue bug has not appeared, even after letting the
> > simulation run for 1.8E12 ticks. I haven't looked too closely into all
> > of the details of the Strand Status Register, so I can't say for sure
> > what the implications are for my temporary patch.
> >
> > --
> > Thanks,
> > Khalique Ahmed
> > _______________________________________________
> > gem5-dev mailing list
> > [email protected]
> > http://m5sim.org/mailman/listinfo/gem5-dev

--
Thanks,
Khalique Ahmed
