Hey Gabe,

My suggestion would be to let the CPU defer an instruction's translation until the translation has either completed or generated a different fault.  It would be a bit of a pain to add, but since the infrastructure for delaying instructions on cache misses is already there, it should be feasible.

Regarding the other downsides, assuming the TLBs are working properly, TLB misses shouldn't happen too often.  For TLB hits, ideally there would be a relatively fast response path that keeps the overhead low.  What issues do you foresee with ARM and this TLB configuration?
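
To make the hit/miss split concrete, here is a rough sketch in Python rather than the simulator's C++. All of the names (`EventQueue`, `translate_timing`, `WALK_LATENCY`) and the 50-tick walk cost are made up for illustration and are not M5's actual interface. A hit calls back immediately without scheduling anything; a miss calls back only once the walk has completed or produced a different fault.

```python
import heapq


class EventQueue:
    """Tiny stand-in for a simulator event queue (hypothetical)."""

    def __init__(self):
        self.now = 0
        self._events = []
        self._seq = 0  # tie-breaker so callbacks are never compared

    def schedule(self, delay, callback):
        heapq.heappush(self._events, (self.now + delay, self._seq, callback))
        self._seq += 1

    def run(self):
        while self._events:
            self.now, _, callback = heapq.heappop(self._events)
            callback()


WALK_LATENCY = 50  # assumed cost of a page table walk, in ticks


def translate_timing(eq, tlb, page_table, vpn, done):
    """Call done(ppn, fault) now on a hit, or after the walk on a miss."""
    if vpn in tlb:
        done(tlb[vpn], None)  # fast path: no event is scheduled
        return

    def finish_walk():
        if vpn in page_table:
            tlb[vpn] = page_table[vpn]  # refill, then complete normally
            done(tlb[vpn], None)
        else:
            done(None, "page fault")  # a different fault than the miss

    eq.schedule(WALK_LATENCY, finish_walk)


eq = EventQueue()
tlb, page_table = {1: 7}, {2: 9}
results = []
for vpn in (1, 2, 3):
    translate_timing(
        eq, tlb, page_table, vpn,
        lambda ppn, fault, vpn=vpn: results.append((eq.now, vpn, ppn, fault)),
    )
eq.run()
```

In this toy version the only per-hit overhead is one extra function call, which is the kind of fast response path I have in mind.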

Kevin

[email protected] wrote:
Sure. Also the current process is not inaccurate, or at least mostly accurate if
you want to be picky, for all the ISAs except x86.

Currently, translation works like this as I'm sure you know:
1. Instruction generates request.
2. CPU asks TLB to translate request possibly generating a fault.
3. If there's a fault, the request is dropped and the fault is handled.
4. If not, the translated request is sent to the memory system.
5. Get coffee while request is handled.
6. The request comes back and the instruction can be finished.
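
As a rough illustration of those six steps, here is a minimal sketch in Python rather than the simulator's C++. The `Tlb` class, `TlbMissFault`, `execute_load`, and the 4 KiB page size are all hypothetical stand-ins, not the real interfaces.

```python
PAGE_SIZE = 4096  # assumed page size for the sketch


class TlbMissFault(Exception):
    """Hypothetical fault raised when the TLB has no entry for a page."""


class Tlb:
    def __init__(self):
        self.entries = {}  # virtual page number -> physical page number

    def translate(self, vaddr):
        # Step 2: translate immediately, or fault.
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.entries:
            raise TlbMissFault(hex(vaddr))
        return self.entries[vpn] * PAGE_SIZE + offset


def execute_load(tlb, memory, vaddr):
    """Steps 1-6 for a single load; returns (value, fault)."""
    try:
        paddr = tlb.translate(vaddr)  # step 2
    except TlbMissFault as fault:
        return None, fault  # step 3: drop the request, hand back the fault
    return memory.get(paddr, 0), None  # steps 4-6: the memory access


tlb = Tlb()
tlb.entries[0x10] = 0x80
memory = {0x80 * PAGE_SIZE + 8: 42}
hit_value, hit_fault = execute_load(tlb, memory, 0x10 * PAGE_SIZE + 8)
miss_value, miss_fault = execute_load(tlb, memory, 0x20 * PAGE_SIZE)
```

The point to notice is that translation either succeeds on the spot or produces a fault; there is no state in which it is still in progress.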

The problem is that with the current ISAs other than x86, a TLB miss generates an
architected fault which gets handled in the normal way in step 3, and normal
execution fixes things up. In x86, though, a TLB miss triggers a hardware
mechanism which fixes things up, and the current instruction continues as if
nothing happened. In the case of a TLB miss, x86 would realistically do
something more like:
1. Instruction generates request.
2. CPU asks TLB to translate request possibly generating a fault.
2.5 Get coffee while page table walk happens.
3. If there's a fault, the request is dropped and the fault is handled.
4. If not, the translated request is sent to the memory system.
5. Get coffee while request is handled.
6. The request comes back and the instruction can be finished.
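
A sketch of that x86-style flow, again in Python with invented names (`translate_with_walk`, `PageFault`, dict-based TLB and page table), assuming a single-level table for brevity:

```python
PAGE_SIZE = 4096  # assumed page size for the sketch


class PageFault(Exception):
    """Hypothetical architected fault: the walk found no present mapping."""


def translate_with_walk(tlb, page_table, vaddr):
    """Return (paddr, walked). Raises PageFault if the walk fails."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    walked = False
    if vpn not in tlb:
        # Step 2.5: hardware walks the page table. Software sees no
        # fault unless the walk itself fails.
        walked = True
        if vpn not in page_table:
            raise PageFault(hex(vaddr))  # step 3
        tlb[vpn] = page_table[vpn]  # refill; the instruction continues
    return tlb[vpn] * PAGE_SIZE + offset, walked


tlb = {}
page_table = {0x10: 0x80}
first = translate_with_walk(tlb, page_table, 0x10 * PAGE_SIZE + 4)
second = translate_with_walk(tlb, page_table, 0x10 * PAGE_SIZE + 4)
try:
    translate_with_walk(tlb, page_table, 0x30 * PAGE_SIZE)
    faulted = False
except PageFault:
    faulted = True
```

The first access misses and walks, the second hits, and the third is the only case where a fault becomes visible. What the sketch leaves out is the time the walk takes, which is exactly the part that needs translation to be able to wait.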

What I've been doing to fake this is having the TLB miss fault itself fix up the
TLB when it's invoked. This sort of works, except when the walk itself turns up a
not-present page or encounters some other problem. By then you've already started
handling one fault, so there's nothing to do with the new one.

The two options I mentioned before were to either:
1. Invoke the new fault from the invoke method of the TLB miss.
2. Change the CPU models so that translation can put off finishing.
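
For option 1, a rough Python sketch of what chaining faults inside invoke might look like. The classes and the `log` list are hypothetical and only loosely modeled on the idea of a fault object with an invoke method.

```python
class PageFault:
    """Hypothetical fault for a not-present page."""

    def __init__(self, vpn):
        self.vpn = vpn

    def invoke(self, cpu):
        cpu.log.append(("page_fault", self.vpn))


class TlbMissFault:
    """Hypothetical fault whose invoke() performs the x86 table walk."""

    def __init__(self, vpn):
        self.vpn = vpn

    def invoke(self, cpu):
        cpu.log.append(("tlb_miss", self.vpn))
        if self.vpn in cpu.page_table:
            # Walk succeeded: fix up the TLB so the instruction can retry.
            cpu.tlb[self.vpn] = cpu.page_table[self.vpn]
        else:
            # Option 1: the walk failed, so chain straight into the new
            # fault. The CPU model only ever saw the TLB miss.
            PageFault(self.vpn).invoke(cpu)


class Cpu:
    def __init__(self, page_table):
        self.tlb = {}
        self.page_table = page_table
        self.log = []  # order in which faults were invoked


cpu = Cpu(page_table={0x10: 0x80})
TlbMissFault(0x10).invoke(cpu)
TlbMissFault(0x20).invoke(cpu)
```

Here the page fault for 0x20 is raised from inside the miss handler, so the CPU never gets a chance to account for it separately.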

Gabe

Quoting Korey Sewell <[email protected]>:

  
Gabe,
Can you step-by-step explain what's inaccurate about the current TLB
process?

On Wed, Jan 14, 2009 at 6:31 PM, <[email protected]> wrote:

    
Has anyone had a chance to give this some thought? Could Kevin/Korey comment on
how hard they think it would be and/or how much overhead there would be to make
translation deferrable in O3?

Gabe

Quoting [email protected]:

      
I've been putting off starting a discussion about this since I know some people
are otherwise occupied, but it would be useful for it to at least be in the
back of someone's mind. I haven't spent a huge amount of time thinking about
this recently, but I see two possible ways to handle it.

1. Translation is reworked so that it can be delayed like memory transactions.
In atomic mode it could be blocking and immediate, and in timing mode the CPU
would get a callback. This isn't ideal because it would require changes to the
CPU models which would potentially cause performance overhead for the other
ISAs, potentially break ARM (more?), and would be painful to add to O3 in the
long term. It's the most realistic, though, in terms of mimicking actual CPUs.

2. Make the TLB miss fault invoke whichever other faults may come up inside its
own invoke method. This would be comparatively easy, but would be inaccurate as
far as performance. It also goes behind the CPU's back as far as who is in
control of faults/exceptions, etc., and could cause problems with generic
statistics, for instance. I don't know if such statistics exist.

Gabe

_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev

        



      

--
----------
Korey L Sewell
Graduate Student - PhD Candidate
Computer Science & Engineering
University of Michigan

    






  
