Re: [gem5-dev] O3 FS and Ruby

Nilay Thu, 29 Dec 2011 20:59:36 -0800

On Thu, December 29, 2011 9:06 pm, Korey Sewell wrote:
> Hi Nilay,
> are you saying that the sysret instruction is in the middle of some
> microops, which include a CS instruction? Then, in the middle of a microop
> sequence, an interrupt happens. When it comes back from the interrupt, it
> then resumes the microcode, but since the CS microop is marked user mode it
> faults when looking up a kernel page.

I think your understanding is pretty close to mine.

>
> A few questions:
> 1. If no interrupt occurred in the middle of the sysret, would the CS
> microop cause a fault?

I don't expect so. It only touches some of the registers and does not
even perform any loads or stores.

> 2. In the segfault message, can you match the address of the "rip" pointer
> with whatever instructions set the rip pointer in the M5 trace?

As per the trace, the rip in the message is the address of the sysret
instruction.

> 3. I'm also unsure how squashing solves this (still trying to grasp the
> whole problem). It seems like the CS instruction needs to be marked kernel
> if it was microcode from an originally kernel instruction.

My understanding is that the problem is with the way we implement the flag
isSquashAfter. I think the idea behind this flag is that certain
instructions write some control registers, which can change the way
instructions are fetched. For example, the CS register is overwritten by
the sysret instruction. So the next instruction (the one after sysret) should
be fetched from a user page, instead of from the kernel page that
contains sysret. But since the fetch module starts fetching the next
instruction almost immediately, it would almost certainly fetch the next
instruction on the kernel page after the sysret instruction. Hence the
need for the flag isSquashAfter. If a microop with this flag is committed,
then all the microops following that microop are squashed.

This should work fine, but the problem is that this also empties the
instList maintained by the CPU, which allows the pending interrupt to be
handled by the interrupt handler function in commit_impl.hh, which in turn
schedules the trap event. If the flag isSquashAfter squashed only the
microops that belong to the next instruction and not this instruction, then,
I think, the instList would not be empty in the middle of the instruction
and the interrupt would be handled only at the end of the instruction.
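
To make the difference concrete, here is a toy C++ sketch of the two squash
policies. It is not gem5 code: MicroOp, SquashPolicy, commitHead and
makeSysretExample are hypothetical names, and which sysret microop carries
isSquashAfter is an assumption made only for illustration. commitHead
returns true when the list is left empty before the committing instruction's
last microop has committed, which is the window in which the pending
interrupt would get handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for an in-flight microop; not gem5's DynInst.
struct MicroOp {
    std::uint64_t macroId;  // macro instruction this microop belongs to
    bool squashAfter;       // models the isSquashAfter flag
    bool lastMicroop;       // true for the final microop of its instruction
};

enum class SquashPolicy {
    AllYounger,      // squash everything younger (behaviour described above)
    OtherInstsOnly   // proposed: keep this instruction's remaining microops
};

// Commit the oldest microop and apply the squash policy.
// Returns true if the list is now empty although the committing
// instruction has not finished, i.e. an interrupt could be taken
// in the middle of the instruction.
bool commitHead(std::deque<MicroOp>& instList, SquashPolicy policy)
{
    assert(!instList.empty());
    const MicroOp head = instList.front();
    instList.pop_front();

    if (head.squashAfter) {
        if (policy == SquashPolicy::AllYounger) {
            instList.clear();
        } else {
            // Drop only microops that belong to a different instruction.
            instList.erase(
                std::remove_if(instList.begin(), instList.end(),
                               [&head](const MicroOp& m) {
                                   return m.macroId != head.macroId;
                               }),
                instList.end());
        }
    }
    return instList.empty() && !head.lastMicroop;
}

// Assumed example: sysret (macroId 0) has three microops, the first one
// marked squashAfter, followed by one microop of the next instruction
// (macroId 1) that fetch picked up from the kernel page.
std::deque<MicroOp> makeSysretExample()
{
    return {
        {0, true,  false},
        {0, false, false},
        {0, false, true},
        {1, false, true},
    };
}
```

On this example list, committing the first microop under AllYounger leaves
the list empty in the middle of sysret, while OtherInstsOnly leaves the two
remaining sysret microops in place, so the list is not empty.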

>
> I'm admittedly a little behind in this convo, but would like to help if I
> can. Please clarify any misunderstandings if you can. If I can grasp the
> problem then I can better help/suggest how the squashing mechanisms can be
> updated to support this.
>
> On Thu, Dec 29, 2011 at 1:49 PM, Nilay Vaish <[email protected]> wrote:
>
>> Ok, here is another shot at explaining what might be going wrong. I
>> think this is what happens:
>>
>> 1. A sysret instruction is in execution. It commits some of its microops,
>> including the one that writes the CS selector.
>>
>> 2. There is a pending interrupt, but it is not being handled since the
>> cpu->instList() is non-empty.
>>
>> 3. One of the microops is marked isSquashAfter. So after this microop
>> commits, all the microops after it are squashed. This empties the
>> cpu->instList().
>>
>> 4. The interrupt is handled, and a trap is scheduled.
>>
>> 5. The trap squashes sysret's microops (which were added again by fetch)
>> and changes the PC to Microcode_ROM:<something>.
>>
>> 6. It seems this microcode saves the CS that was written by the sysret,
>> but the instruction pointer saved is the pointer to the sysret instruction
>> since this instruction is yet to complete.
>>
>> 7. When the iret instruction tries to restore the context to the
>> previously executing context (which is the sysret instruction), it restores
>> the CS as the one that the sysret instruction wrote. This CS has a user
>> mode privilege level. But rip points to the sysret instruction, which
>> should be on a kernel mode page since we are returning from a system call.
>>
>> 8. When the TLB lookup is performed for fetching the instruction pointed
>> to by the rip, the TLB generates a page fault, in which a user mode process
>> is trying to access a kernel mode page. This instruction reaches the commit
>> stage of the processor, at which point (I think) the operating system
>> realizes that something has gone wrong and it outputs the segfault
>> message to the terminal.
>>
>>
>> I think the fix remains the same: isSquashAfter should not squash all the
>> microops, but only those that do not belong to this particular instruction.
>> Or else, fetch should simply not fetch any microop once it sees a microop
>> that is marked isSquashAfter.
>>
>> Any opinions on this?
>>
>>
>> --
>> Nilay
>> _______________________________________________
>> gem5-dev mailing list
>> [email protected]
>> http://m5sim.org/mailman/listinfo/gem5-dev
>>
>
>
>
> --
> - Korey
>

--
Nilay
