Re: [gem5-dev] Review Request: cpu: some changes to switching

Andreas Sandberg Wed, 26 Jun 2013 02:25:54 -0700

On 06/22/2013 05:33 AM, Nilay Vaish wrote:

On Wed, 19 Jun 2013, Andreas Sandberg wrote:
Hi Nilay,
Sorry for not replying earlier, I've been pretty busy fixing up bugs in the x86 frontend to make SPEC CPU2006 actually work on x86. About 20% of the benchmarks in my setup execute unsupported instructions and don't verify correctly (some even terminate long before they are supposed to).
My take on the state of things at the moment is that there is probably a bug if the second CPU is still executing in microcode, and no CPU in a drained system should ever be executing microcode. The patch as it is now breaks all the assumptions we make about a drained system, so I'm still opposed to committing it.
I did some investigations of my own and it seems like the second CPU is held at pc 0xfffffff0.0 (i.e., not executing microcode early in the boot). Could you try to find out why the second CPU in your system is stuck in microcode? I suspect that the fact that your system is stuck in microcode is actually a bug.
We should probably clean up the whole CPU state mess once and for all at some point. I'm still not exactly sure, despite my implementing my own CPU model, what is expected to happen in a CPU when the different methods controlling state transitions are called. Not to mention where the methods are called from and why. We really need to document that.
So here is what happens. cpu0 sends an INIT interrupt to cpu1. The code for the interrupt appears in the file src/arch/x86/faults.cc, where the processor state is initialized and the micropc is set to X86ISAInst::RomLabels::extern_label_initIntHalt. This label appears in src/arch/x86/isa/insts/romutil.py. The following five instructions are executed:
def rom
{
    extern initIntHalt:
    rflags t1
    limm t2, "~IFBit"
    and t1, t1, t2
    wrflags t1, t0
    halt
    eret
};
The halt instruction suspends the thread context, and micro pc remains something close to the address of the label.
Since you observed a different code path, my guess is that both of us are using different versions of the Linux kernel, or you are using a completely different OS. In either case, I am now even more convinced that a cpu in Idle state means that there is no PC/uPC attached to it. So, you should not be bothered about what its value is.

I wasn't actually waiting for the system to boot completely, so I probably never reached the point where the init interrupt was sent.

I'm pretty sure you are mistaken about the lack of a valid PC/uPC. The Intel manual states: "If an interrupt (including NMI) is used to resume execution after a HLT instruction, the saved instruction pointer (CS:EIP) points to the instruction following the HLT instruction."

So clearly, the PC is valid and will be used in all but special cases. I'm pretty sure I've seen cases (for example idle loops) where the CPU executes WFI instructions on ARM (and presumably HLT instructions on x86) where the interrupt handler returns to the instruction after the WFI/HLT. The case where the PC/uPC doesn't matter is just a special case.
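
To make the HLT/WFI point concrete, here is a minimal toy sketch in Python. It is not gem5 code; the ToyCPU class, its fields and the instruction names are all made up for illustration. It only shows that a halted CPU still carries a meaningful PC, which is what the interrupt handler returns to:

```python
# Toy model only: none of these names exist in gem5.
class ToyCPU:
    def __init__(self, program):
        self.program = program   # list of instruction mnemonics
        self.pc = 0              # index of the next instruction to execute
        self.halted = False
        self.saved_pc = None     # return address saved on interrupt entry
        self.trace = []          # everything that executed, in order

    def step(self):
        # A halted CPU does nothing, but its PC stays valid.
        if self.halted or self.pc >= len(self.program):
            return
        inst = self.program[self.pc]
        self.trace.append(inst)
        self.pc += 1             # PC now points past the instruction
        if inst == "hlt":
            self.halted = True

    def interrupt(self):
        # Wake up, save the return address, run the handler, return to it.
        self.halted = False
        self.saved_pc = self.pc
        self.trace.append("handler")
        self.pc = self.saved_pc


def run_idle_loop_demo():
    cpu = ToyCPU(["nop", "hlt", "after_hlt"])
    for _ in range(5):           # extra steps are no-ops once halted
        cpu.step()
    halted_pc = cpu.pc
    cpu.interrupt()
    cpu.step()
    return halted_pc, cpu.saved_pc, cpu.trace
```

In this sketch the PC while halted and the saved return address are the same value, namely the instruction following the hlt.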

I think the cleanest way of solving this would be to leave draining as it is and introduce a "return and halt" microop that we use instead of "halt; eret" in the init microcode. This would solve the deadlock problem since the CPU wouldn't be executing microcode (the halt_eret instruction would commit, but suspend execution before executing the first instruction following the microcode sequence).
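
A rough sketch of the idea, as a toy Python model rather than gem5 code. The function names and the drain rule are assumptions made for illustration: the model simply treats "still inside the microcode sequence" as "cannot drain", which is a simplification of the assumption discussed in this thread, and halt_eret is the proposed microop, which does not exist yet:

```python
# Toy model only: run_init and can_drain are hypothetical helpers.
INIT_ROM_CURRENT = ["rflags", "limm", "and", "wrflags", "halt", "eret"]
INIT_ROM_FUSED = ["rflags", "limm", "and", "wrflags", "halt_eret"]


def run_init(rom):
    """Run the init microcode until the thread suspends.

    Returns (suspended, in_microcode).
    """
    in_microcode = True
    suspended = False
    for op in rom:
        if op == "halt":
            # Suspends immediately; the following eret never executes,
            # so the thread is left in the middle of the sequence.
            suspended = True
            break
        if op == "halt_eret":
            # Proposed microop: leave microcode and suspend in one step.
            in_microcode = False
            suspended = True
            break
        if op == "eret":
            in_microcode = False
    return suspended, in_microcode


def can_drain(rom):
    # Assumed rule: a CPU may only be considered drained when it is
    # not in the middle of a microcode sequence.
    _suspended, in_microcode = run_init(rom)
    return not in_microcode
```

With the current "halt; eret" sequence the toy thread is suspended while still in microcode, so can_drain(INIT_ROM_CURRENT) is False; with the fused microop it is suspended outside microcode and can_drain(INIT_ROM_FUSED) is True.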


//Andreas

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
