Thanks for the more detailed explanation... that helped a lot. Sounds to me like you're on the right track.
Steve

On Sun, Nov 13, 2011 at 8:20 PM, Gabe Black <[email protected]> wrote:

> No, we're not trying to undo anything. An example might help. Let's look
> at a dramatically simplified version of iret, the instruction that
> returns from an interrupt handler. The microops might do the following.
>
> 1. Restore prior privilege level.
> 2. If we were in kernel level, skip to 4.
> 3. Restore user level stack.
> 4. End.
>
> O3 fetches the bytes that go with iret, decodes that to a macroop, and
> starts picking microops out of it. Microop 1 is executed and drops to
> user level. Now microop 2 is executed, and O3 misspeculates that the
> branch is taken (for example). The mispredict is detected, and later
> microops in flight are squashed. O3 then attempts to restart where it
> should have gone, microop 3.
>
> Now, O3 looks at the PC involved and starts fetching the bytes which
> become the macroop which the microops are pulled from. Because microop 1
> successfully completed, the CPU is now at user level, but because the
> iret is on a kernel page, it can't be accessed. The kernel gets a page
> fault.
>
> As I mentioned before, my partially implemented fix is to not only pass
> back the PC, but to also pass back the macroop fetch should use instead
> of making it refetch memory. The problem is that it's partially
> implemented, and the way squashes work in O3 makes it really tricky to
> implement it properly, or to tell whether or not it's implemented properly.
>
> Gabe
>
> On 11/13/11 19:21, Steve Reinhardt wrote:
> > I'd like to understand the issue a little better before commenting on a
> > solution.
> >
> > Gabe, when you say "instruction" in your original description, do you mean
> > micro-op?
> >
> > It seems to me that the fundamental problem is that we're trying to undo
> > the effects of a non-speculative micro-op, correct?
> > So the solution you're
> > pursuing is that branch mispredictions only roll back to the offending
> > micro-op, and don't force the entire macro-op containing that micro-op to
> > re-execute?
> >
> > Is this predicted control flow entirely internal to the macro-op? Or is
> > this an RFI where we are integrating the control transfer and the privilege
> > change? If it is the latter, why does the RFI need to get squashed at all?
> >
> > Steve
> >
> > On Sun, Nov 13, 2011 at 4:34 PM, Gabe Black <[email protected]> wrote:
> >
> >> Yes, this is an existing bug and the branch predictor just pokes things
> >> in the right way to expose it. The macroop isn't passed back in this
> >> particular case, and with the code the way it is, it's difficult to even
> >> tell that that's the case, let alone how to fix it. Cleaning things up
> >> won't fix the problem itself, but it will make fixing the actual problem
> >> tractable.
> >>
> >> Gabe
> >>
> >> On 11/13/11 16:16, Ali Saidi wrote:
> >>> I think this bug is just latently in the code right now and the branch
> >>> predictor change runs into it (this patch causes that branch to be
> >>> mispredicted). In any case I think the issue exists today and it's just
> >>> luck that it works currently.
> >>>
> >>> Looking at your list I imagine you should be able to recover most things
> >>> from the dyninst, however I don't know if that is actually the case.
> >>> Excepted that the squashing mechanisms should be cleaned up, I'm not sure
> >>> how that is actually going to solve the problem. Don't we currently send
> >>> back the instruction? With the current instructions can't you figure out
> >>> the macro-op it belongs to?
> >>>
> >>> Ali
> >>>
> >>> On Nov 13, 2011, at 5:40 PM, Gabe Black wrote:
> >>>
> >>>> Hey folks. Ali has had a change out for a while ("Fix several Branch
> >>>> Predictor issues") which improves branch predictor performance
> >>>> substantially but breaks X86_FS on O3.
> >>>> It turns out the problem is that
> >>>> an instruction is started which returns from kernel to user level and is
> >>>> microcoded. The instruction is fetched from the kernel's address space
> >>>> successfully and starts to execute, along the way dropping down to user
> >>>> mode. Some microops later, there's some microop control flow which O3
> >>>> mispredicts. When it squashes the mispredict and tries to restart, it
> >>>> first tries to refetch the instruction involved. Since it's now at user
> >>>> level and the instruction is on a kernel level only page, there's a page
> >>>> fault and things go downhill from there.
> >>>>
> >>>> I partially implemented a solution to this before where O3 reinstates
> >>>> the macroop it had been using when it restarts fetch. The problem here
> >>>> is that the path this kind of squash takes doesn't pass back the right
> >>>> information, and my attempts to fix that have been unsuccessful. The
> >>>> code that handles squashing in O3 is too complex, there's too much going
> >>>> in all directions, it's not always very clear what effect a change will
> >>>> have in unrelated situations, or which callsites are involved in a
> >>>> particular type of fault.
> >>>>
> >>>> To me, it seems like the first step in fixing this problem is to clean
> >>>> up how squashes are handled in O3 so that they can be made to
> >>>> consistently handle squashes in non-restartable macroops.
> >>>>
> >>>> Without having really dug into the specifics, I think we only need two
> >>>> pieces of information when squashing, a pointer to the guilty
> >>>> instruction and whether execution should start at or after it. It would
> >>>> start at it if the instruction needed to be reexecuted due to a memory
> >>>> dependence violation, for instance, and would start after it for faults,
> >>>> interrupts, or branch mispredicts.
> >>>> Any other information that's needed
> >>>> like sequence numbers or actual control flow targets can be retrieved
> >>>> from the instructions where needed without having to split everything
> >>>> out and pass them around individually.
> >>>>
> >>>> Is there any obvious problem with doing things this way? I don't think
> >>>> I'll personally have a lot of time to dedicate to this at the very least
> >>>> in the short term, but I wanted to get the conversation going so we know
> >>>> what to do when somebody has a chance to do it.
> >>>>
> >>>> Gabe
> >>>> _______________________________________________
> >>>> gem5-dev mailing list
> >>>> [email protected]
> >>>> http://m5sim.org/mailman/listinfo/gem5-dev
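
For readers of the archive, the two-field squash description proposed in the original message (a pointer to the guilty instruction plus "start at or after it") can be sketched roughly as below. This is only an illustrative model, not gem5 code: the names DynInst, SquashInfo, Restart, squash, and next_micro_pc are all hypothetical, and the "after" case only models a mispredicted branch inside a macroop, not faults or interrupts.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Restart(Enum):
    AT = "at"        # re-execute the guilty instruction (e.g. memory dependence violation)
    AFTER = "after"  # resume past it (here: a mispredicted microop branch)


@dataclass
class DynInst:
    """Hypothetical stand-in for an in-flight microop."""
    seq_num: int
    pc: int
    micro_pc: int
    macroop: Optional[str] = None        # macroop the microop was pulled from
    next_micro_pc: Optional[int] = None  # resolved microop target, once known


@dataclass
class SquashInfo:
    """The only two things passed back on a squash in this sketch."""
    inst: DynInst
    restart: Restart


def squash(in_flight: List[DynInst],
           info: SquashInfo) -> Tuple[List[DynInst], Tuple[int, Optional[int], Optional[str]]]:
    """Return the surviving instructions and a (pc, micro_pc, macroop) restart point.

    Sequence numbers, targets, and the macroop are all read from the guilty
    instruction instead of being passed around individually. Handing the
    macroop back lets fetch continue without re-reading instruction memory.
    """
    guilty = info.inst
    if info.restart is Restart.AT:
        # Guilty instruction is squashed too and restarts at its own micro PC.
        survivors = [i for i in in_flight if i.seq_num < guilty.seq_num]
        resume = (guilty.pc, guilty.micro_pc, guilty.macroop)
    else:
        # Guilty instruction stays; only younger ones are squashed.
        survivors = [i for i in in_flight if i.seq_num <= guilty.seq_num]
        resume = (guilty.pc, guilty.next_micro_pc, guilty.macroop)
    return survivors, resume
```

In the simplified iret example from earlier in the thread, squashing AFTER microop 2 would keep microops 1 and 2, drop the wrongly fetched microop 4, and resume at microop 3 with the same macroop in hand, so fetch would not need to touch the kernel page again.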
