Ok, here is another shot at explaining what might be going wrong. I
think this is what happens:
1. A sysret instruction is in execution. It commits some of its microops,
including the one that writes the CS selector.
2. There is a pending interrupt, but it is not being handled since the
cpu->instList() is non-empty.
3. One of the microops is marked isSquashAfter. So after this microop
commits, all the microops after it are squashed. This empties the
cpu->instList().
4. The interrupt is handled, and a trap is scheduled.
5. The trap squashes sysret's microops (which were added again by fetch)
and changes the PC to Microcode_ROM:<something>.
6. It seems this microcode saves the CS that was written by the sysret,
but the instruction pointer saved is the pointer to the sysret
instruction since this instruction is yet to complete.
7. When the iret instruction tries to restore the previously executing
context (which is the sysret instruction), it restores the CS that the
sysret instruction wrote. This CS has a user mode privilege level, but rip
points to the sysret instruction, which should be on a kernel mode page
since we are returning from a system call.
8. When the TLB lookup is performed for fetching the instruction pointed
to by the rip, the TLB generates a page fault, because a user mode
process is trying to access a kernel mode page. This instruction reaches
the commit stage of the processor, at which point (I think) the operating
system realizes that something has gone wrong and it outputs the
segfault message to the terminal.
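
To make steps 6 to 8 concrete, here is a toy model. It is not gem5 code:
Context, takeInterrupt, fetchFaults and the kKernelBase boundary are all
names and values I made up for illustration. It only shows that a saved
frame which mixes the already committed user mode CS with the rip of the
still incomplete sysret will fault on the next fetch.

```cpp
#include <cassert>
#include <cstdint>

// Toy model, not gem5 code. All names and addresses here are hypothetical.
struct Context {
    int cpl;            // privilege level implied by CS (0 = kernel, 3 = user)
    std::uint64_t rip;  // instruction pointer
};

// Assumption for this sketch: addresses at or above this boundary are
// supervisor-only pages.
constexpr std::uint64_t kKernelBase = 0xffff800000000000ULL;

// A fetch faults when user mode code touches a supervisor-only page.
bool fetchFaults(const Context &c) {
    return c.cpl == 3 && c.rip >= kKernelBase;
}

// What interrupt entry saves in this model: the privilege level of the CS
// that has already committed, plus the rip of the instruction that has not
// yet completed.
Context takeInterrupt(int committedCpl, std::uint64_t ripOfIncompleteInst) {
    assert(committedCpl == 0 || committedCpl == 3);
    return Context{committedCpl, ripOfIncompleteInst};
}
```

With a sysret located at some kernel address, takeInterrupt(3, sysretRip)
gives a context for which fetchFaults() is true, while takeInterrupt(0,
sysretRip) (the interrupt arriving before the CS write commits) does not
fault.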
I think the fix remains the same: isSquashAfter should not squash all the
younger microops, but only those that do not belong to this particular
instruction. Or else, fetch should simply not fetch any microop once it
sees a microop that is marked isSquashAfter.
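
A rough sketch of the first option, to show the difference between the two
squash policies. Again this is a toy model and not the O3 code: MicroOp,
Policy, commitHead and canTakeInterrupt are hypothetical names, and a
plain std::deque stands in for cpu->instList().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Toy model, not gem5 code. All names here are hypothetical.
struct MicroOp {
    std::uint64_t macroId;  // macro instruction this microop belongs to
    int index;              // position within that macro instruction
    bool squashAfter;       // stands in for the isSquashAfter flag
};

enum class Policy { SquashAllYounger, KeepSameMacroop };

// Commit the oldest microop in instList. If it is marked squashAfter,
// squash younger entries according to policy. Returns how many microops
// were squashed.
std::size_t commitHead(std::deque<MicroOp> &instList, Policy policy) {
    assert(!instList.empty());
    const MicroOp head = instList.front();
    instList.pop_front();
    if (!head.squashAfter)
        return 0;

    std::size_t squashed = 0;
    std::deque<MicroOp> kept;
    for (const MicroOp &op : instList) {
        const bool sameMacro = (op.macroId == head.macroId);
        if (policy == Policy::KeepSameMacroop && sameMacro)
            kept.push_back(op);  // rest of the same instruction survives
        else
            ++squashed;          // everything else is thrown away
    }
    instList.swap(kept);
    return squashed;
}

// In this model an interrupt is only taken when the list is empty (step 2).
bool canTakeInterrupt(const std::deque<MicroOp> &instList) {
    return instList.empty();
}
```

With SquashAllYounger the list empties in the middle of the instruction
and the interrupt gets in. With KeepSameMacroop the remaining microops of
the same instruction stay in the list, so the interrupt has to wait until
the instruction has completed.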
Any opinions on this?
--
Nilay