I'll admit that I'm not following this thread very closely, but the biggest question I have is: is it necessary to actually do the translation in the translation pipe stage? Can't it just be there to burn a cycle while you do the actual translation when executing the read?
Nate

On Tue, Apr 14, 2009 at 7:53 AM, Korey Sewell <[email protected]> wrote:
> Comments below:
>>
>> There's still one aspect that confuses me though: why does translation
>> need to be separated out of xc->read() and xc->write()? The model for
>> all the other CPUs is that translation is part of that process. I can
>> believe that in the InOrder model you may want to separate the
>> translation and the cache access into separate cycles, but I think
>> that could be done without involving the StaticInst at all. Basically
>> I'd think you could view the call to xc->read() or xc->write() from
>> initiateAcc "kicks off" the access, but whether the translation and
>> cache access both get started right away or they happen in separate
>> phases is up to the CPU model.
>
> Well, the quick story is I really need to explain the InOrder model in more
> detail and post to the M5sim Wiki! In the interim, the IOcpu allows
> instructions to create a schedule of the resources they need per pipeline
> stage. This gives you the flexibility to have an arbitrary number of
> pipeline stages and also experiment with instructions performing certain
> functions in different parts of the pipeline.
>
> So what happens here is that "TranslateTLB" and "InitiateCacheAccess" are
> two different resource requests from an instruction. I get what you are
> saying that typically the TLB translation just signals the imminent
> CacheAccess so why not join them? Just off the top of my head, you could
> have a case where say for instance you want to run a system "Bare Iron" and
> there is no virtual-to-physical translation necessary (embedded domain?). Or
> maybe you want to model a large, multicycle TLB access, but while an
> instruction is performing its TLB access it can amortize that latency in
> other pipeline stages (e.g. compute store data). To maintain that
> flexibility of how and if you want to use a TLB, the IOcpu allows that
> to be a separate resource request, and on a given stage an instruction has to
> separately ask for access to the TLB and then access to the Cache.
>
> This is why I think a "translate()" function would work well for
> memory-instruction objects and may work better than explicitly asking for
> the size and memaccessflags from the instruction.
>
>> OK, so as I wrote that last paragraph, one complication became
>> apparent to me: I think what I wrote applies for read(), but possibly
>> not for write(), as the EA computation and the translation could both
>> be kicked off before the store data is available. This leads me to my
>> second question: now that EAComp is not a separate sub-instruction
>> with its own source operand list, how do you distinguish the EA
>> operands from the store data operand to allow the EA computation to
>> possibly proceed before the store data is ready?
>
> The EA operand and the store data operand get saved inside the DynInst
> instruction based on their original instruction indexes. When those operands
> are ready, then the translation or the cache access can happen if the EA or
> Store Data are ready respectively. Currently, this is enforced by the
> instruction schedule, but I'm thinking now I should put an assert() or
> sanity-check to make sure you can't use data that hasn't been set yet.
>
> --
> ----------
> Korey L Sewell
> Graduate Student - PhD Candidate
> Computer Science & Engineering
> University of Michigan
>
> _______________________________________________
> m5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/m5-dev
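
The per-stage resource schedule described in the quoted message can be
sketched roughly as below. This is only an illustration under stated
assumptions, not the InOrder CPU's actual code: Resource, Request, ToyStore
and run() are hypothetical names, the "TLB" is an identity mapping, and the
store data is assumed to show up at a fixed stage. The asserts stand in for
the sanity check Korey mentions for operands that have not been set yet.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical resource kinds; the names echo the thread, not M5's classes.
enum class Resource { ComputeEA, TranslateTLB, InitiateCacheAccess };

// One schedule entry: which resource the instruction requests in which stage.
struct Request {
    int stage;
    Resource res;
};

// Toy store instruction: each operand stays empty until something produces it.
struct ToyStore {
    std::optional<std::uint64_t> ea;         // effective address
    std::optional<std::uint64_t> paddr;      // filled in by translation, if any
    std::optional<std::uint64_t> storeData;  // may arrive later than the EA
    std::vector<std::string> trace;          // what ran, and in which stage
};

// Walk the stages in order and service the requests scheduled for each one.
// dataReadyStage is an assumed stage at which the store data becomes valid.
std::vector<std::string> run(const std::vector<Request> &schedule,
                             int dataReadyStage, int numStages)
{
    ToyStore inst;
    for (int stage = 0; stage < numStages; ++stage) {
        if (stage == dataReadyStage)
            inst.storeData = 42;  // placeholder value

        for (const Request &req : schedule) {
            if (req.stage != stage)
                continue;
            const std::string at = "@" + std::to_string(stage);
            switch (req.res) {
              case Resource::ComputeEA:
                inst.ea = 0x1000;  // placeholder address
                inst.trace.push_back("ea" + at);
                break;
              case Resource::TranslateTLB:
                // Translation needs only the EA, not the store data.
                assert(inst.ea.has_value() && "EA not set before translation");
                inst.paddr = *inst.ea;  // identity mapping stands in for a TLB
                inst.trace.push_back("tlb" + at);
                break;
              case Resource::InitiateCacheAccess:
                // A "bare iron" schedule has no translation, so fall back to
                // the EA as the address in that case.
                assert((inst.paddr.has_value() || inst.ea.has_value()) &&
                       "no address before cache access");
                assert(inst.storeData.has_value() &&
                       "store data not set before cache access");
                inst.trace.push_back("cache" + at);
                break;
            }
        }
    }
    return inst.trace;
}
```

With a schedule of {0, ComputeEA}, {1, TranslateTLB}, {2, InitiateCacheAccess}
and the store data arriving in stage 2, translation runs a full stage before
the data exists. Dropping the TranslateTLB entry gives the "Bare Iron" case
without touching the instruction itself. Scheduling the cache access before
the data-ready stage trips the assert rather than silently using unset data.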
