Re: [m5-dev] [PATCH 3 of 3] O3PCU: Split loads and stores that cross cache line boundaries

Steve Reinhardt Tue, 01 Dec 2009 12:31:13 -0800

Looking at just the part I've left below, it looks like you're
separating out the split vs non-split calls at the top, and then they
combine back into common functions at the bottom... I think it would
be cleaner if we got rid of the separate calls, and just used NULL
values for the split packets all the way down the call stack as a
signal that this isn't a split call.  That is, get rid of all the
    if (!isSplit) { } else { }
code since it all ends up at the same place anyway.


What would *really* be nice is if we could keep a common code path,
but compile out all the split-access code for ISAs that don't need it
(based on some const bool value in the TheISA namespace)... but that's
definitely extra credit :-).  I'm fine with just getting the current
patches checked in and leaving that for the indefinite future.  Just
thought I'd mention it in case you had any ideas.

Steve

On Mon, Nov 9, 2009 at 5:30 AM, Timothy M. Jones <[email protected]> wrote:
> # HG changeset patch
> # User Timothy M. Jones <[email protected]>
> # Date 1257772288 0
> # Node ID 1c63ee4b8afa271d3ae645419e37913bbb97fe6b
> # Parent  da27e67385cca6cf4dd6d18cdead5cfd54559afb
> O3PCU: Split loads and stores that cross cache line boundaries.
>
> When each load or store is sent to the LSQ, we check whether it will cross a
> cache line boundary and, if so, split it in two. This creates two TLB
> translations and two memory requests. Care has to be taken if the first
> packet of a split load is sent but the second blocks the cache. Similarly,
> for a store, if the first packet cannot be sent, we must store the second
> one somewhere to retry later.
>
> This modifies the DataTranslation class to support split translations. It
> also adds state into the LSQSenderState class to record both packets in a
> split load or store.
>
> diff --git a/src/cpu/base_dyn_inst.hh b/src/cpu/base_dyn_inst.hh
> --- a/src/cpu/base_dyn_inst.hh
> +++ b/src/cpu/base_dyn_inst.hh
> @@ -871,12 +879,24 @@
>     Request *req = new Request(asid, addr, sizeof(T), flags, this->PC,
>                                thread->contextId(), threadNumber);
>
> -    initiateTranslation(req, NULL, BaseTLB::Read);
> +    BaseTLB::Mode mode = BaseTLB::Read;
> +    Request *sreqLow = NULL;
> +    Request *sreqHigh = NULL;
> +
> +    bool isSplit = splitRequest(req, sreqLow, sreqHigh);
> +    if (!isSplit) {
> +        initiateTranslation(req, NULL, mode);
> +    } else {
> +        initiateSplitTranslation(req, sreqLow, sreqHigh, NULL, mode);
> +    }
>
>     effAddr = req->getVaddr();
>     effAddrValid = true;
>     if (fault == NoFault) {
> -        cpu->read(req, data, lqIdx);
> +        if (!isSplit)
> +            cpu->read(req, data, lqIdx);
> +        else
> +            cpu->read(true, req, sreqLow, sreqHigh, data, lqIdx);
>     } else {
>
>         // Return a fixed value to keep simulation deterministic even
> @@ -910,48 +930,103 @@
>     Request *req = new Request(asid, addr, sizeof(T), flags, this->PC,
>                                thread->contextId(), threadNumber);
>
> -    initiateTranslation(req, res, BaseTLB::Write);
> +    BaseTLB::Mode mode = BaseTLB::Write;
> +    Request *sreqLow = NULL;
> +    Request *sreqHigh = NULL;
> +
> +    bool isSplit = splitRequest(req, sreqLow, sreqHigh);
> +    if (!isSplit) {
> +        initiateTranslation(req, res, mode);
> +    } else {
> +        initiateSplitTranslation(req, sreqLow, sreqHigh, res, mode);
> +    }
>
>     effAddr = req->getVaddr();
>     effAddrValid = true;
>     if (fault == NoFault) {
> -        cpu->write(req, data, sqIdx);
> +        if (!isSplit)
> +            cpu->write(req, data, sqIdx);
> +        else
> +            cpu->write(true, req, sreqLow, sreqHigh, data, sqIdx);
>     }
>
>     return fault;
>  }
> @@ -445,10 +477,20 @@
>     template <class T>
>     Fault read(Request *req, T &data, int load_idx);
>
> +    /** Executes the split load at the given index. */
> +    template <class T>
> +    Fault read(bool isSplit, Request *req, Request *sreqLow,
> +               Request *sreqHigh, T &data, int load_idx);
> +
>     /** Executes the store at the given index. */
>     template <class T>
>     Fault write(Request *req, T &data, int store_idx);
>
> +    /** Executes the split store at the given index. */
> +    template <class T>
> +    Fault write(bool isSplit, Request *req, Request *sreqLow,
> +                Request *sreqHigh, T &data, int store_idx);
> +
>     /** Returns the index of the head load instruction. */
>     int getLoadHead() { return loadHead; }
>     /** Returns the sequence number of the head load instruction. */
> @@ -484,6 +526,15 @@
>  Fault
>  LSQUnit<Impl>::read(Request *req, T &data, int load_idx)
>  {
> +    return read(false, req, NULL, NULL, data, load_idx);
> +}
> +
> +template <class Impl>
> +template <class T>
> +Fault
> +LSQUnit<Impl>::read(bool isSplit, Request *req, Request *sreqLow,
> +                    Request *sreqHigh, T &data, int load_idx)
> +{
>     DynInstPtr load_inst = loadQueue[load_idx];
>
>     assert(load_inst);
> @@ -705,6 +826,15 @@
>  Fault
>  LSQUnit<Impl>::write(Request *req, T &data, int store_idx)
>  {
> +    return write(false, req, NULL, NULL, data, store_idx);
> +}
> +
> +template <class Impl>
> +template <class T>
> +Fault
> +LSQUnit<Impl>::write(bool isSplit, Request *req, Request *sreqLow,
> +                     Request *sreqHigh, T &data, int store_idx)
> +{
>     assert(storeQueue[store_idx].inst);
>
>     DPRINTF(LSQUnit, "Doing write to store idx %i, addr %#x data %#x"
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev

Re: [m5-dev] [PATCH 3 of 3] O3PCU: Split loads and stores that cross cache line boundaries

Reply via email to