Vanchinathan,

Off the top of my head, and /functionally/ speaking (I have no particular idea about timing when you do selective replay) :

As I said previously, you first need to :
- Modify src/cpu/o3/inst_queue_impl.hh and/or src/cpu/o3/dep_graph.hh so that wakeDependents in the former does not clear register dependencies.
- Ensure the same thing in src/cpu/o3/mem_dep_unit_impl.hh, but for memory dependencies.
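
To make the first point concrete, here is a minimal stand-alone sketch of a dependency graph that keeps its consumer lists after a wakeup. It is not gem5 code: ToyDepGraph, wake() and release() are hypothetical names, and the real dep_graph.hh is organized differently. It only shows the idea of separating "wake the consumers" from "forget the consumers".

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using InstSeqNum = std::uint64_t;  // hypothetical stand-in for a dynamic instruction id
using PhysReg = int;               // hypothetical stand-in for a physical register index

// Toy dependency graph: for each physical register, the instructions reading it.
class ToyDepGraph
{
  public:
    void addConsumer(PhysReg reg, InstSeqNum consumer)
    {
        consumers[reg].push_back(consumer);
    }

    // Return the consumers to wake up, WITHOUT erasing them, so that the
    // same list can be walked again if the producer has to be replayed.
    std::vector<InstSeqNum> wake(PhysReg reg) const
    {
        auto it = consumers.find(reg);
        return it == consumers.end() ? std::vector<InstSeqNum>{} : it->second;
    }

    // Drop the list only once the producer can no longer be replayed
    // (assumption for this sketch: at commit).
    void release(PhysReg reg) { consumers.erase(reg); }

  private:
    std::unordered_map<PhysReg, std::vector<InstSeqNum>> consumers;
};
```

In this sketch, calling wake() twice on the same register returns the same consumers both times; only release() removes them. When to release is a design choice that this sketch does not settle.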

And after you find your memory order violation, you need to :
- Replace the code that calls for a squash with a walk of the register and memory dependency graphs (in dep_graph.hh and mem_dep_unit_impl.hh respectively).
- During the walk, reset the DynInst flags as needed (clear the Issued flag for instance, I believe this is in src/cpu/base_dyn_inst_impl.hh) and restore dependencies as needed (this depends on how you choose to augment dep_graph.hh/inst_queue_impl.hh/mem_dep_unit_impl.hh). You might also want to take a look at what happens in src/cpu/o3/store_set.hh/cc when you do that.
- Add to the ready list of inst_queue_impl.hh the instructions that you need to replay and that are ready (probably the few depending on the store that just triggered the memory order violation) through addIfReady() or addReadyMemInst() (one may call the other, both are in inst_queue_impl.hh).
- Ensure that no instruction needing to be replayed gets committed (this should actually be done using the DynInst flags).
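
The steps above can be sketched as follows. This is a toy model written for illustration, not gem5 code: ToyInst, Window, markForReplay() and canCommit() are hypothetical names, and the sketch assumes that every instruction keeps the list of its register and memory consumers after it has issued. Timing is ignored entirely.

```cpp
#include <cstdint>
#include <map>
#include <queue>
#include <set>
#include <vector>

using InstSeqNum = std::uint64_t;  // hypothetical stand-in for a dynamic instruction id

struct ToyInst
{
    bool issued = false;
    bool executed = false;
    int pendingSrcs = 0;                 // producers that have not (re)executed yet
    std::vector<InstSeqNum> dependents;  // consumers, retained after issue (assumption)
};

using Window = std::map<InstSeqNum, ToyInst>;

// Walk the dependency graph from the load that violated memory order.
// Every instruction that already executed with a possibly wrong value has
// its flags cleared, and each edge out of it is restored by incrementing
// the consumer's pendingSrcs. Returns the instructions that can be
// re-issued right away, in program order.
std::vector<InstSeqNum> markForReplay(Window &window, InstSeqNum load)
{
    std::set<InstSeqNum> replayed;
    std::queue<InstSeqNum> work;
    work.push(load);

    while (!work.empty()) {
        InstSeqNum sn = work.front();
        work.pop();
        ToyInst &inst = window.at(sn);
        if (!inst.executed)
            continue;  // never ran, or already reset: nothing to undo

        inst.issued = false;
        inst.executed = false;
        replayed.insert(sn);

        for (InstSeqNum dep : inst.dependents) {
            ++window.at(dep).pendingSrcs;  // restore the dependency
            work.push(dep);
        }
    }

    std::vector<InstSeqNum> ready;
    for (InstSeqNum sn : replayed) {
        if (window.at(sn).pendingSrcs == 0)
            ready.push_back(sn);
    }
    return ready;
}

// An instruction whose Executed flag was cleared must not commit.
bool canCommit(const ToyInst &inst) { return inst.executed; }
```

With a load whose result feeds two younger instructions, one of which also feeds the other, only the load comes back as ready; the two consumers wait until their producers execute again, and unrelated instructions are left untouched. In a real implementation the returned instructions would be the ones handed to addIfReady() or addReadyMemInst().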

Caveats :
- This is not an exhaustive step-by-step how-to for selective replay in gem5, but it should point you to the files you need to change.
- This does not say anything about the time it takes to do selective replay.
- This does not address the issue of what you do when you issue an instruction : do you remove it from the IQ (then where do you replay it from? Reinsert it in the IQ from the ROB? A dedicated buffer?) or do you keep it until you are sure it executed correctly (so, virtually reducing the size of the IQ)?
- I believe some care has to be taken wrt nonSpeculative instructions and barriers, but I may be wrong.

Some related references on replay :

Understanding Scheduling Replay Schemes by Kim and Lipasti
Recovery mechanism for latency misprediction by Morancho et al.

They deal with latency misprediction and not memory order violation, but they do discuss selective replay. Hope it helps.


--
Arthur Perais
INRIA Bretagne Atlantique
Bâtiment 12E, Bureau E303, Campus de Beaulieu
35042 Rennes, France




On 08/12/2014 03:42, Vanchinathan Venkataramani wrote:
Hi Andreas and Arthur

It would be really helpful if you can provide some hints.

Thanks!

On Mon, Dec 1, 2014 at 10:56 PM, Vanchinathan Venkataramani wrote:

    Hi Arthur

    Thanks a lot for your reply.

    Your interpretation of LAS is what I require.

    I want to replay execution starting from the Load. It will be
    really helpful if you can give me hints on how to replay execution
    from this load instruction.

    Thanks


        On Mon, Dec 1, 2014 at 10:01 PM, Arthur Perais wrote:

            Okay, the next comments assume that you are talking about
            a load that executed before an older store writing to the
            same address executed, and therefore got the wrong value.
            If what you call LAS refers to something else, disregard that.

            From what I gathered, the only replay mechanism currently
            implemented in the o3 CPU is there to deal with partial
            matches with store-to-load forwarding.
            For instance, when a load needs data that is partly written
            by a store and partly in the dcache. In that case, the
            instruction is replayed when the store writes to the
            dcache (the mechanism is actually coarser than that but
            you get the idea).

            If you want selective replay for memory order violation
            (which is okay but quite complex in my opinion), you need
            to implement it yourself. This entails :
            - Getting all the instructions you need to replay (through
            register dependencies and memory dependencies).
            - Restoring their state (clearing the Issued flag, the
            Executed flag, and so on).
            - Restoring dependencies, which is non-trivial since
            wakeDependents in inst_queue_impl.hh clears dependencies
            in dep_graph.hh when waking up insts. This means that you
            need to retain dependencies even after instructions have
            issued. You also need to deal with memory dependencies.
            - Deciding where to replay from. From the IQ? If so, then
            you can't free the IQ entry upon issue. If not, then you
            need a dedicated buffer to replay instructions from.

            If you want non-selective replay, this should be easier,
            although dependencies still have to be restored and you
            have to deal with the question of where the instructions
            are replayed from.

            Hope this helps, and if anyone sees a gross mistake in
            what I said, do not hesitate.

            On 01/12/2014 14:47, Vanchinathan Venkataramani via
            gem5-users wrote:
            Hi Andreas

            In ARM O3CPU, when there is a load after store violation,
            the younger instructions are being squashed and
            re-fetched again.

            Is it possible to re-execute these instructions instead
            of squashing all the younger instructions?

            Thanks


            _______________________________________________
            gem5-users mailing list
            [email protected]
            http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users







