I'll admit that I'm not following this thread very closely, but the
biggest question I have is: is it necessary to actually do the
translation in the translation pipe stage?  Can't it just be there to
burn a cycle while you do the actual translation when executing the
read?

  Nate

On Tue, Apr 14, 2009 at 7:53 AM, Korey Sewell <[email protected]> wrote:
> Comments below:
>>
>> There's still one aspect that confuses me though: why does translation
>> need to be separated out of xc->read() and xc->write()?  The model for
>> all the other CPUs is that translation is part of that process.  I can
>> believe that in the InOrder model you may want to separate the
>> translation and the cache access into separate cycles, but I think
>> that could be done without involving the StaticInst at all.  Basically
>> I'd think you could view the call to xc->read() or xc->write() from
>> initiateAcc "kicks off" the access, but whether the translation and
>> cache access both get started right away or they happen in separate
>> phases is up to the CPU model.
>
> Well, the quick story is I really need to explain the InOrder model in more
> detail and post to the M5sim Wiki! In the interim, the IOcpu allows instructions
> to create a schedule of the resources they need per pipeline stage. This
> gives you the flexibility to have an arbitrary number of pipeline stages and
> also experiment with instructions performing certain functions in different
> parts of the pipeline.
>
> So what happens here is that "TranslateTLB" and "InitiateCacheAccess" are
> two different resource requests from an instruction. I get what you are
> saying that typically the TLB translation just signals the imminent
> CacheAccess so why not join them? Just off the top of my head, you could
> have a case where you want to run a system "Bare Iron" and
> there is no virtual-to-physical translation necessary (embedded domain?). Or
> maybe you want to model a large, multicycle TLB access, but while an
> instruction is performing its TLB access it can amortize that latency in
> other pipeline stages (e.g. compute store data). To maintain that
> flexibility of how and if you want to use a TLB, the IOcpu allows that
> to be a separate resource request, and on a given stage an instruction has to
> separately ask for access to the TLB and then access to the Cache.
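
[Editorial sketch] The per-stage resource schedule described above might look roughly like the following. Only the request names "TranslateTLB" and "InitiateCacheAccess" come from the message; every type, function name, and stage number here is a hypothetical stand-in, not the InOrder model's actual code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical resource requests; only TranslateTLB and
// InitiateCacheAccess are named in the thread.
enum class Resource {
    Fetch, Decode, ComputeEA,
    TranslateTLB, InitiateCacheAccess, CompleteCacheAccess
};

// One entry of an instruction's schedule: "in this stage, ask for this".
struct ScheduleEntry {
    int stage;
    Resource res;
};

using Schedule = std::vector<ScheduleEntry>;

// Build an illustrative load schedule. With useTlb == false (the
// "Bare Iron" case) the TLB request is simply never scheduled.
Schedule makeLoadSchedule(bool useTlb)
{
    Schedule sched = {
        {0, Resource::Fetch}, {1, Resource::Decode}, {2, Resource::ComputeEA}
    };
    int stage = 3;
    if (useTlb)
        sched.push_back({stage++, Resource::TranslateTLB});
    sched.push_back({stage++, Resource::InitiateCacheAccess});
    sched.push_back({stage++, Resource::CompleteCacheAccess});
    return sched;
}

// Collect the requests an instruction makes in a given pipeline stage.
std::vector<Resource> requestsForStage(const Schedule &sched, int stage)
{
    assert(stage >= 0);
    std::vector<Resource> out;
    for (const ScheduleEntry &e : sched)
        if (e.stage == stage)
            out.push_back(e.res);
    return out;
}
```

Because the TLB request is just another schedule entry, dropping it or moving it to a different stage changes only the schedule, not the instruction definition.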
>
> This is why I think a "translate()" function would work well for
> memory-instruction objects and may work better than explicitly asking for
> the size and memaccessflags from the instruction.
>
>> OK, so as I wrote that last paragraph, one complication became
>> apparent to me: I think what I wrote applies for read(), but possibly
>> not for write(), as the EA computation and the translation could both
>> be kicked off before the store data is available.  This leads me to my
>> second question: now that EAComp is not a separate sub-instruction
>> with its own source operand list, how do you distinguish the EA
>> operands from the store data operand to allow the EA computation to
>> possibly proceed before the store data is ready?
>
> The EA operand and the store data operand get saved inside the DynInst
> instruction based on their original instruction indexes. The translation
> can happen once the EA operand is ready, and the cache access can happen
> once the store data is ready. Currently, this is enforced by the
> instruction schedule, but I'm thinking now I should put an assert() or
> sanity-check to make sure you can't use data that hasn't been set yet.
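
[Editorial sketch] A minimal illustration of the sanity check mentioned above, assuming operands are stored by source index with a per-operand ready flag. The class and member names are hypothetical and are not taken from the DynInst implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-operand storage: a value plus a "has been set" flag.
struct SrcOperand {
    std::uint64_t value = 0;
    bool ready = false;
};

// Hypothetical holder for a store's source operands, indexed by their
// original position in the instruction's source list.
class MemOperands {
  public:
    MemOperands(std::size_t numSrcs, std::size_t eaIdx, std::size_t dataIdx)
        : srcs_(numSrcs), eaIdx_(eaIdx), dataIdx_(dataIdx)
    {
        assert(eaIdx < numSrcs && dataIdx < numSrcs);
    }

    // Record a source operand value and mark it as available.
    void setSrc(std::size_t idx, std::uint64_t value)
    {
        assert(idx < srcs_.size());
        srcs_[idx].value = value;
        srcs_[idx].ready = true;
    }

    bool eaReady() const { return srcs_[eaIdx_].ready; }
    bool storeDataReady() const { return srcs_[dataIdx_].ready; }

    // Translation needs only the EA operand; assert it has been set.
    std::uint64_t effectiveAddr(std::int64_t offset) const
    {
        assert(eaReady() && "EA operand used before it was set");
        return srcs_[eaIdx_].value + offset;
    }

    // The cache access for a store additionally needs the store data.
    std::uint64_t storeData() const
    {
        assert(storeDataReady() && "store data used before it was set");
        return srcs_[dataIdx_].value;
    }

  private:
    std::vector<SrcOperand> srcs_;
    std::size_t eaIdx_;
    std::size_t dataIdx_;
};
```

With this shape, the EA computation and translation can proceed as soon as `eaReady()` holds, while reading the store data before it is written trips the assert instead of silently using a stale value.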
>
>
>
>
> --
> ----------
> Korey L Sewell
> Graduate Student - PhD Candidate
> Computer Science & Engineering
> University of Michigan
>
> _______________________________________________
> m5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/m5-dev
>
>