Hi,

I mentioned in a previous thread that when running SPARC in fs mode, there is a random chance that the simulator abruptly exits because the event queue is not being filled properly. While performing some fault injection experiments (part of my research; hitting this bug was an unintended side effect), I was able to create deterministic scenarios in which the bug occurs. I gathered an execution trace, and I believe I found the culprit:

2363127130000: system.cpu 0x40ac18 : rd %strand_sts_reg, %g1 :
2363127130500: system.cpu 0x40ac1c : wr %g1, 0x1, %strand_sts_reg :

These were the last instructions executed before the exit in every execution trace where the bug was present. If I interpret this sequence correctly, the value of %strand_sts_reg is first read into %g1; then %g1 and an immediate (here 0x1, i.e. the LSB) are combined to set a specific field of %strand_sts_reg. The current implementation XORs %g1 with the immediate, so each time this sequence is executed, the LSB of %strand_sts_reg alternates. For whatever reason, this alternating LSB causes the empty event queue bug.

As a quick test, I forced the LSB to 1 by replacing the XOR with an OR. With this change in place, I reran fs mode. So far I see no immediate side effects, and the empty event queue bug has not appeared, even after letting the simulation run for 1.8E12 ticks.

I haven't looked closely at the details of the Strand Status Register, so I can't say for sure what the implications of my temporary patch are.

-- 
Thanks,
Khalique Ahmed
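
To make the XOR versus OR difference concrete, here is a minimal sketch in Python. It is not gem5 code: the function names, the starting register value of 0x1, and the iteration count are all assumptions made for illustration. It only models the rd/wr pair from the trace and records the LSB after each write. Note that SPARC V9 specifies the WR instructions as an XOR of the two operands, so the OR variant below illustrates the temporary patch only, not the architected behaviour.

```python
# Hypothetical model of the rd/wr sequence from the trace; not gem5 code.
LSB_MASK = 0x1  # the immediate used by "wr %g1, 0x1, %strand_sts_reg"


def wr_xor(rs1, imm):
    """Write value as currently implemented: rs1 XOR immediate."""
    return rs1 ^ imm


def wr_or(rs1, imm):
    """Write value with the temporary patch: rs1 OR immediate."""
    return rs1 | imm


def run_sequence(initial, write_op, iterations):
    """Repeat the rd/wr pair and record the LSB after each write."""
    reg = initial  # assumed starting value of %strand_sts_reg
    history = []
    for _ in range(iterations):
        g1 = reg                       # rd %strand_sts_reg, %g1
        reg = write_op(g1, LSB_MASK)   # wr %g1, 0x1, %strand_sts_reg
        history.append(reg & LSB_MASK)
    return history


xor_history = run_sequence(0x1, wr_xor, 4)
or_history = run_sequence(0x1, wr_or, 4)
print("XOR:", xor_history)  # LSB toggles on every pass
print("OR: ", or_history)   # LSB stays set
```

With XOR the LSB flips on every pass through the sequence, while with OR it stays at 1 once set, which matches the behaviour described above.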
