Hi,

I had mentioned in a previous thread that when running SPARC fs mode, there
is a random chance that the simulator may abruptly exit because the event
queue is not being filled properly. After performing some fault injection
experiments (related to my research, but an unintended side effect), I was
able to create deterministic scenarios in which this bug occurs. I gathered
an execution trace, and I believe I found the culprit:

2363127130000: system.cpu 0x40ac18    : rd    %strand_sts_reg, %g1 :
2363127130500: system.cpu 0x40ac1c    : wr    %g1, 0x1, %strand_sts_reg :

These were the last instructions executed before the exit in every
execution trace where the bug was present. If I interpret this sequence
of instructions correctly, the value of %strand_sts_reg is first read
into %g1; then %g1 is combined with an immediate (here 0x1, which selects
the LSB) and written back to set a specific field of %strand_sts_reg.
Currently the implementation XORs %g1 and the immediate, so each time this
sequence of instructions is executed, the LSB in %strand_sts_reg alternates.
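
To make the difference concrete, here is a minimal sketch (plain Python,
not gem5 code) of the rd/wr pair executed repeatedly under an XOR write and
under an OR write. The function names, the initial register value of 0, and
the iteration count are assumptions chosen for illustration only.

```python
def wr_xor(g1, imm):
    # Write-back value when the operands are XORed.
    return g1 ^ imm


def wr_or(g1, imm):
    # Write-back value when the operands are ORed instead.
    return g1 | imm


def run_sequence(write, initial=0x0, imm=0x1, iterations=4):
    """Repeat the rd/wr pair and record the LSB after each write."""
    sts = initial  # hypothetical starting value of %strand_sts_reg
    history = []
    for _ in range(iterations):
        g1 = sts               # rd %strand_sts_reg, %g1
        sts = write(g1, imm)   # wr %g1, 0x1, %strand_sts_reg
        history.append(sts & 0x1)
    return history


print("xor:", run_sequence(wr_xor))
print("or: ", run_sequence(wr_or))
```

With XOR the recorded LSB flips on every pass through the pair; with OR it
stays at 1 after the first write. Note that the SPARC V9 manual defines the
WR instruction as writing rs1 XOR the second operand, so the XOR itself may
well be the architecturally specified behavior.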

For whatever reason, this alternation of the LSB is causing the empty event
queue bug to occur. As a quick test, I forced the LSB to 1 by replacing the
XOR with an OR. With this change in place, I reran fs mode; so far I do not
see any immediate side effects, and the empty event queue bug has not
appeared, even after letting the simulation run for 1.8E12 ticks. I haven't
looked closely at the details of the Strand Status Register, so I can't say
for sure what the implications of my temporary patch are.


-- 
Thanks,
Khalique Ahmed