I haven't looked at the code... how is the x86 page-table walk handled currently? Is it done in microcode or do we have a "hardware" state machine for it? It seems to me that in the long run we want an autonomous HW page-table walker, and that the idea of a "TLB miss fault" for x86 should go away.
One way to change the CPU/memory interface that might not be too disruptive to the CPU models and would also mirror a real HW implementation more closely (always a good sign, IMO) would be simply to push translation to the other side of the decoupled callback interface. In Gabe's model, this would be (for timing mode):

1. Instruction generates request.
2. CPU asks TLB/cache to translate and satisfy request. There is no immediate feedback.
3. Get coffee while request is handled.
4. The request comes back, possibly indicating a fault. If there's a fault, handle it; if not, finish the instruction.

Then all of the translate/page-table walk/skip cache on page fault etc. stuff happens concurrently in the memory system in step 3.

I haven't thought through what this implies in detail... is the TLB now a first-class memory-system object with a Port interface that sits between the CPU and the cache? If that's too much overhead, is there a better way to do it?

Steve
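
P.S. To make the four steps concrete, here is a rough, self-contained sketch of the idea in Python. It is not M5 code: every name in it (Request, TranslatingMemory, send, tick, CPU.complete) is made up for illustration, the page table is just a dict, and a "tick" stands in for whatever event scheduling the real timing mode would use. It only shows the shape of the interface, where the CPU hands over a virtual address and learns about the translation, or the fault, solely through the callback.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional

PAGE_SIZE = 4096  # assumed page size, for illustration only


@dataclass
class Request:
    vaddr: int
    paddr: Optional[int] = None
    data: Optional[int] = None
    fault: Optional[str] = None  # set by the memory side, never by the CPU


class TranslatingMemory:
    """Hypothetical TLB + page-table walker + cache behind one interface."""

    def __init__(self, page_table, memory):
        self.page_table = page_table  # virtual page number -> physical frame number
        self.memory = memory          # physical address -> value
        self.tlb = {}
        self.pending = deque()

    def send(self, req: Request, callback: Callable[[Request], None]) -> None:
        # Step 2: accept the request; the caller gets no immediate feedback.
        self.pending.append((req, callback))

    def tick(self) -> None:
        # Step 3: translation, walk and access all happen on this side.
        if not self.pending:
            return
        req, callback = self.pending.popleft()
        vpn, offset = divmod(req.vaddr, PAGE_SIZE)
        pfn = self.tlb.get(vpn)
        if pfn is None:
            pfn = self.page_table.get(vpn)  # stand-in for the HW walker
            if pfn is not None:
                self.tlb[vpn] = pfn
        if pfn is None:
            req.fault = "page fault"        # the cache access is skipped
        else:
            req.paddr = pfn * PAGE_SIZE + offset
            req.data = self.memory.get(req.paddr, 0)
        callback(req)                       # Step 4: the request comes back


class CPU:
    def __init__(self, mem: TranslatingMemory):
        self.mem = mem
        self.log = []

    def execute_load(self, vaddr: int) -> None:
        # Step 1: the instruction generates a request with a virtual address.
        self.mem.send(Request(vaddr), self.complete)

    def complete(self, req: Request) -> None:
        # Step 4: handle the fault, or finish the instruction.
        if req.fault:
            self.log.append(("fault", req.vaddr))
        else:
            self.log.append(("done", req.data))
```

Note that in this sketch there is no separate "TLB miss" outcome visible to the CPU at all: a miss is resolved inside tick(), and only a genuine page fault comes back through the callback.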
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
