Actually, the EAComp and MemAcc sub-instructions have been around since the beginning of time, as this is how the original SimpleScalar-based out-of-order CPU model split up loads and stores (since it did non-speculative memory disambiguation, it needed the addresses before it would issue the references to the memory system, IIRC). The switch to initiateAcc/completeAcc only came with the switch to real split-transaction memory accesses in timing mode, and it turns out that since O3 does speculative memory disambiguation, we don't need to split the effective address computation from the memory access initiation for any of our current models. However, in the general case there are up to three independent steps:

1. Calculate EA
2. Initiate access
3. Complete access

Right now execute() does 1-3, initiateAcc() does 1-2, completeAcc() does 3, EAComp::execute() does 1, and MemAcc::execute() does 2-3.

I'm fine with ditching MemAcc, since the EAComp/MemAcc split only makes sense for a more detailed pipeline model, and if your model is that detailed you should use timing-mode memory, which means that MemAcc won't work. So it's basically pointless. However, it sounds like Korey needs EAComp for his in-order pipeline, so I'm not sure why he's proposing to get rid of it. It would be nice to know more about what the in-order model needs so we can do the right thing there. If we do need to split out the effective address computation, then it may make sense to have a separate function for step 2 above as well, though the overhead of doing step 1 twice in an EAComp::execute/initiateAcc/completeAcc sequence is probably negligible.

As far as the purported bug, I don't think there is one; this code has worked since the dawn of time, so I'd be surprised if something turned up now (unless it was something that got tickled by some subtle difference between MIPS and Alpha). The key thing to note as far as Korey's observation about register indices is that EAComp and MemAcc are StaticInst objects in their own right, and have their own srcRegIdx and destRegIdx arrays. This was (is?) necessary since in the old SS OOO model these two sub-instructions got scheduled independently, so for example the EAComp of a store could execute as soon as the base address reg was available, regardless of whether the store data reg was available. So in Korey's example, even though Rb may be at index 1 in the top-level instruction's srcRegIdx array and EAComp is reading index 0, it should work OK since Rb should be at index 0 in the EAComp instruction's srcRegIdx array. You can look at the Foo::EAComp::EAComp constructors in decoder.cc in the Alpha ISA code to see this at work.
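
As a rough illustration of the register-index point, here is a toy sketch. It is not the actual M5 code: ToyInst, ToyStore, eaCompExecute() and eaViaTopLevel() are names I made up for the example, and the real StaticInst class looks quite different. The only thing it is meant to show is that each sub-instruction carries its own srcRegIdx array, so "index 0" in EAComp and "index 1" in the top-level instruction can name the same architectural register.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a StaticInst: only the register index lists.
struct ToyInst {
    std::vector<int> srcRegIdx;
    std::vector<int> destRegIdx;
};

// Hypothetical store "st Ra, disp(Rb)" with its two sub-instructions.
// Assumed layout, following the example in the text:
//   top-level sources: [Ra, Rb]  (Rb at index 1)
//   EAComp sources:    [Rb]      (Rb at index 0)
//   MemAcc sources:    [Ra]      (store data only)
struct ToyStore {
    ToyInst top;
    ToyInst eaComp;
    ToyInst memAcc;
    int64_t disp;

    ToyStore(int ra, int rb, int64_t d) : disp(d) {
        top.srcRegIdx = {ra, rb};
        eaComp.srcRegIdx = {rb};
        memAcc.srcRegIdx = {ra};
    }

    // Step 1 only, as EAComp would do it: reads ITS OWN source index 0.
    uint64_t eaCompExecute(const std::vector<uint64_t>& regs) const {
        return regs[eaComp.srcRegIdx[0]] + disp;
    }

    // The same address computed through the top-level arrays,
    // where the base register sits at index 1.
    uint64_t eaViaTopLevel(const std::vector<uint64_t>& regs) const {
        return regs[top.srcRegIdx[1]] + disp;
    }
};
```

With Ra = r3, Rb = r7, disp = 16 and r7 holding 0x1000, both eaCompExecute() and eaViaTopLevel() give 0x1010, and eaCompExecute() never looks at r3, which is why the EAComp half of a store could be scheduled before the store data was ready.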

Steve
