Re: [m5-dev] tlb misses in x86

Gabe Black Sat, 17 Jan 2009 20:00:05 -0800

Sorry I didn't reply sooner. Yes, that would work up to a point, and it
is basically how x86 currently works: a fault is generated, it trickles
to the end of the pipe, and when that instruction commits, the fault
fixes things up with the page table walker and the instruction replays.
The problem arises when the page table walk turns up a not-present page,
a protection violation, etc. The new fault needs to slip into the system
someplace at that point. Either the TLB miss could effectively turn into
that fault by invoking the new fault within its invoke method, or the
instruction could hang around in place waiting for the translation and
carry the actual fault to commit instead of a TLB miss fault.
Alternatively, the TLB could handle the miss by recording that the page
isn't present and raise the fault the second time the instruction comes
by, but that seems unrealistic, since the instruction would have to make
multiple trips through the pipeline when the information would be
available sooner. It would also make the x86 TLB more complex and a bit
slower, since it would need to check for those conditions, though the
difference in performance would likely be minor.
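
To make the first two alternatives concrete, here is a rough toy sketch
in Python. It is only an illustration under stated assumptions: the
names Tlb, TlbMiss, PageFault, walk, translate and translate_waiting are
all made up for this example and are not M5's actual classes or
interfaces, and the "page table" is just a dictionary.

```python
# Toy model only: every class and method name here is hypothetical and
# is not taken from M5's real TLB or fault code.
PAGE_SHIFT = 12


class PageFault:
    """The 'actual' fault: the walk found a not-present page."""

    def __init__(self, vaddr):
        self.vaddr = vaddr

    def invoke(self, tlb, log):
        log.append(f"page fault at {self.vaddr:#x}")


class TlbMiss:
    """Option 1: the miss fault walks the page table inside invoke()
    and, if the walk fails, invokes the new fault itself."""

    def __init__(self, vaddr):
        self.vaddr = vaddr

    def invoke(self, tlb, log):
        log.append(f"tlb miss at {self.vaddr:#x}")
        fault = tlb.walk(self.vaddr)
        if fault is not None:
            fault.invoke(tlb, log)  # chained without the CPU seeing it


class Tlb:
    def __init__(self, page_table):
        self.page_table = page_table  # vpn -> ppn; missing key = not present
        self.entries = {}             # cached translations

    def walk(self, vaddr):
        """Fill the TLB from the page table; return a fault or None."""
        vpn = vaddr >> PAGE_SHIFT
        if vpn not in self.page_table:
            return PageFault(vaddr)
        self.entries[vpn] = self.page_table[vpn]
        return None

    def _paddr(self, vaddr):
        # Combine the cached physical page number with the page offset.
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        return (self.entries[vaddr >> PAGE_SHIFT] << PAGE_SHIFT) | offset

    def translate(self, vaddr):
        """Return (paddr, fault); a miss is reported as a TlbMiss fault."""
        if (vaddr >> PAGE_SHIFT) not in self.entries:
            return None, TlbMiss(vaddr)
        return self._paddr(vaddr), None

    def translate_waiting(self, vaddr):
        """Option 2: the instruction waits for the walk, so it carries
        the actual fault (if any) rather than a TLB miss fault."""
        if (vaddr >> PAGE_SHIFT) not in self.entries:
            fault = self.walk(vaddr)
            if fault is not None:
                return None, fault
        return self._paddr(vaddr), None
```

With translate(), the caller first sees a TlbMiss, and any page fault
only shows up from inside TlbMiss.invoke(). With translate_waiting(),
the caller gets the PageFault directly. The sketch says nothing about
timing or pipeline cost, which is where the two really differ.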


Gabe

Ali Saidi wrote:
> Yeah, it seems like it could start translating, get to commit, and
> cause a fault that refetches or replays that instruction...
>
> Ali
>
> On Jan 16, 2009, at 11:08 PM, Korey Sewell wrote:
>
>   
>> Isn't "before it commits" the same as "before/while it's executing"?
>>
>> On Sat, Jan 17, 2009 at 1:35 AM, Gabe Black <[email protected]>  
>> wrote:
>> I don't think that'll work because the fix up needs to happen
>> before/while the instruction executes, not on the side before it  
>> commits.
>>
>> Korey Sewell wrote:
>>     
>>> OK, I'm not sure why you can't just let the instruction go on as
>>> regular, have some object that does your special x86 stuff, and then
>>> when that finishes, signal something to the CPU to acknowledge the
>>> fix-up. Basically, I'm not sure why you can't just manipulate the
>>> signals that are sent between stages, and maybe also the condition
>>> for when an instruction commits, so that this handles the x86
>>> special case.
>>>
>>> However, I haven't thought about it for very long, so it may be more
>>> complex than I know.
>>>
>>> On Fri, Jan 16, 2009 at 2:51 PM, <[email protected]
>>> <mailto:[email protected]>> wrote:
>>>
>>>     Sure. Also, the current process is not inaccurate, or at least
>>>     is mostly accurate if you want to be picky, for all the ISAs
>>>     except x86.
>>>
>>>     Currently, translation works like this, as I'm sure you know:
>>>     1. Instruction generates request.
>>>     2. CPU asks TLB to translate request, possibly generating a fault.
>>>     3. If there's a fault, the request is dropped and the fault is
>>>        handled.
>>>     4. If not, the translated request is sent to the memory system.
>>>     5. Get coffee while request is handled.
>>>     6. The request comes back and the instruction can be finished.
>>>
>>>     With the current (non-x86) ISAs, a TLB miss generates an
>>>     architected fault which gets handled in the normal way in step 3,
>>>     and normal execution fixes things up. In x86, though, a TLB miss
>>>     triggers a hardware mechanism which fixes things up, and the
>>>     current instruction continues as if nothing happened. In the case
>>>     of a TLB miss, x86 would realistically do something more like:
>>>     1. Instruction generates request.
>>>     2. CPU asks TLB to translate request, possibly generating a fault.
>>>     2.5 Get coffee while page table walk happens.
>>>     3. If there's a fault, the request is dropped and the fault is
>>>        handled.
>>>     4. If not, the translated request is sent to the memory system.
>>>     5. Get coffee while request is handled.
>>>     6. The request comes back and the instruction can be finished.
>>>
>>>     What I've been doing to fake this is having the TLB miss fault
>>>     itself fix up the TLB when it's invoked. This sort of works,
>>>     except if the walk itself turns up a not-present page or
>>>     encounters some other problem. Then you've already started
>>>     handling one fault, so there's nothing to do with the new one.
>>>
>>>     The two options I mentioned before were to either:
>>>     1. Invoke the new fault from the invoke method of the TLB miss.
>>>     2. Change the CPU models so that translation can put off
>>>        finishing.
>>>
>>>     Gabe
>>>
>>>     Quoting Korey Sewell <[email protected] <mailto:[email protected]>>:
>>>
>>>     > Gabe,
>>>     > Can you step-by-step explain what's inaccurate about the
>>>     > current TLB process?
>>>     >
>>>     > On Wed, Jan 14, 2009 at 6:31 PM, <[email protected]
>>>     <mailto:[email protected]>> wrote:
>>>     >
>>>     > > Has anyone had a chance to give this some thought? Could
>>>     > > Kevin/Korey comment on how hard they think it would be and/or
>>>     > > how much overhead there would be to make translation be
>>>     > > deferrable in O3?
>>>     > >
>>>     > > Gabe
>>>     > >
>>>     > > Quoting [email protected] <mailto:[email protected]>:
>>>     > >
>>>     > > > I've been putting off starting a discussion about this
>>>     > > > since I know some people are otherwise occupied, but it
>>>     > > > would be useful for it to at least be in the back of
>>>     > > > someone's mind. I haven't spent a huge amount of time
>>>     > > > thinking about this recently, but I see two possible ways
>>>     > > > to handle it.
>>>     > > >
>>>     > > > 1. Translation is reworked so that it can be delayed like
>>>     > > > memory transactions. In atomic mode it could be blocking
>>>     > > > and immediate, and in timing mode the CPU would get a
>>>     > > > callback. This isn't ideal because it would require changes
>>>     > > > to the CPU models which would potentially cause performance
>>>     > > > overhead for the other ISAs, potentially break ARM (more?),
>>>     > > > and would be painful to add to O3 in the long term. It's
>>>     > > > the most realistic, though, in terms of mimicking actual
>>>     > > > CPUs.
>>>     > > >
>>>     > > > 2. Make the TLB miss fault invoke whichever other faults
>>>     > > > may come up inside its own invoke method. This would be
>>>     > > > comparatively easy, but would be inaccurate as far as
>>>     > > > performance. It also goes behind the CPU's back as far as
>>>     > > > who is in control of faults/exceptions, etc., and could
>>>     > > > cause problems with generic statistics, for instance. I
>>>     > > > don't know if such statistics exist.
>>>     > > >
>>>     > > > Gabe
>>>     > > >
>>>     > > > _______________________________________________
>>>     > > > m5-dev mailing list
>>>     > > > [email protected] <mailto:[email protected]>
>>>     > > > http://m5sim.org/mailman/listinfo/m5-dev
>>>     > > >
>>>     > --
>>>     > ----------
>>>     > Korey L Sewell
>>>     > Graduate Student - PhD Candidate
>>>     > Computer Science & Engineering
>>>     > University of Michigan

