changeset 0733a1c08600 in /z/repo/gem5
details: http://repo.gem5.org/gem5?cmd=changeset;node=0733a1c08600
description:
cpu: Add TraceCPU to playback elastic traces

This patch defines a TraceCPU that replays traces generated using the
elastic trace probe attached to the O3 CPU model. The elastic trace is an
execution trace annotated with data dependencies and ordering
dependencies. The TraceCPU also replays the fixed-timestamp instruction
fetch trace that is generated by the same elastic trace probe.

The TraceCPU inherits from BaseCPU, as a result of which some methods
need to be defined. It has two port subclasses inherited from MasterPort,
for the instruction and data ports. It issues memory requests, deducing
the timing from the trace, without performing real execution of
micro-ops. As soon as the last dependency for an instruction is complete,
its computational delay, also provided in the input trace, is added. The
dependency-free nodes are maintained in a list, called 'ReadyList',
ordered by ready time. Instructions which depend on a load stall until
the responses for the read requests are received, thus achieving elastic
replay. If a dependency is not found when adding a new node, it is
assumed complete. Thus, if this node is found to be completely
dependency-free, its issue time is calculated and it is added to the
ready list immediately. This is encapsulated in the subclass
ElasticDataGen.
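
As a rough illustration of the ready-list idea above (not the code in
this patch), the following Python sketch adds the computational delay to
the tick at which the last dependency completed and keeps the list
sorted by the resulting tick. The function name, the tuple layout and
the use of bisect are assumptions made only for this example.

```python
import bisect


def add_to_ready_list(ready_list, node_id, last_dep_done_tick, comp_delay):
    """Insert a dependency-free node, keeping ready_list sorted by tick.

    ready_list holds (exec_tick, node_id) tuples. All names here are
    hypothetical and chosen for illustration only.
    """
    # The node becomes ready once its last dependency has completed and
    # its computational delay from the trace has elapsed.
    exec_tick = last_dep_done_tick + comp_delay
    bisect.insort(ready_list, (exec_tick, node_id))
    return exec_tick
```

A node whose dependencies are all assumed complete would simply be
passed the current tick as last_dep_done_tick.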

If ready nodes are issued in an unconstrained way, there can be more
nodes outstanding, which results in timing that diverges from the O3CPU.
Therefore, the Trace CPU also models hardware resources. A sub-class to
model hardware resources is added which contains the maximum sizes of the
load buffer, store buffer and ROB. If resources are not available, the
node is not issued. The 'depFreeQueue' structure holds nodes that are
pending issue.

The ROB size is arguably the most important of the resource limits
modelled in the Trace CPU. The ROB occupancy is estimated using the newly
added field 'robNum'. We need to use the ROB number because the sequence
number is at times much higher due to squashing, and trace replay is
focused on correct-path modeling.

A map called 'inFlightNodes' is added to track not only nodes that are in
the readyList but also load nodes that are executed (and thus removed
from the readyList) but are not complete. The readyList handles what to
execute next and when, while inFlightNodes is used for resource
modelling. The oldest ROB number is updated when any node occupies the
ROB or when an entry in the ROB is released. The ROB occupancy is equal
to the difference between the ROB number of the newly dependency-free
node and the oldest ROB number in flight.
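
The occupancy estimate described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
function names are hypothetical, the in-flight nodes are reduced to a
plain collection of ROB numbers, and the strict "less than" comparison
against the ROB size is a guess rather than something taken from the
patch. The default of 40 matches the sizeROB parameter default.

```python
def rob_occupancy(new_rob_num, in_flight_rob_nums):
    """Estimate ROB occupancy for a newly dependency-free node.

    in_flight_rob_nums is any collection of ROB numbers of nodes still
    in flight (hypothetical stand-in for the inFlightNodes map).
    """
    if not in_flight_rob_nums:
        # Nothing in flight, so nothing is occupying the ROB.
        return 0
    oldest_rob_num = min(in_flight_rob_nums)
    return new_rob_num - oldest_rob_num


def rob_has_space(new_rob_num, in_flight_rob_nums, size_rob=40):
    """Return True if the node fits (the comparison is an assumption)."""
    return rob_occupancy(new_rob_num, in_flight_rob_nums) < size_rob
```

With ROB numbers 12, 30 and 45 in flight, a new node with ROB number 50
sees an estimated occupancy of 38 and would still fit in a 40-entry ROB.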

If no node depends on a non load/store node then there is no reason to
track it in the dependency graph. We filter out such nodes but count them
and add a weight field to the subsequent node that we do include in the
trace. The weight field is used to model ROB occupancy during replay.

The depFreeQueue is chosen to be FIFO so that child nodes which are in
program order get pushed into it in that order and are thus issued in
program order, like in the O3CPU. This is also why the dependents
container is changed to a sequential container, from std::set to
std::vector. We only check the head of the depFreeQueue, as nodes are
issued in order and blocking on the head models that better than looping
over the entire queue. An alternative choice would be to inspect the top
N pending nodes, where N is the issue width. This is left for future work
as the timing correlation looks good as it is.
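
The head-only check can be pictured with the following Python sketch. It
is a simplified model, not the patch's implementation: the queue is a
collections.deque, and has_resources is a hypothetical callback standing
in for the load buffer, store buffer and ROB checks.

```python
from collections import deque


def issue_pending(dep_free_queue, has_resources):
    """Issue nodes from the head of a FIFO until the head is blocked.

    dep_free_queue: deque of pending nodes in program order.
    has_resources:  hypothetical predicate telling whether a node can
                    be issued with the resources currently available.
    """
    issued = []
    # Only the head is inspected; a blocked head stalls everything
    # behind it, which keeps issue in program order.
    while dep_free_queue and has_resources(dep_free_queue[0]):
        issued.append(dep_free_queue.popleft())
    return issued
```

If the second node in the queue cannot get resources, the first node is
issued and every later node waits, even if it could have been issued on
its own.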

At the start of an execution event, we first attempt to issue such
pending nodes by checking if the appropriate resources have become
available. If so, we compute the execute tick with respect to the current
time. Then we proceed to complete nodes from the readyList.

When a read response is received, sometimes a dependency on it that was
supposed to be released when the read was issued is still not released.
This occurs because the dependent gets added to the graph after the read
was sent. So the check is made less strict and the dependency is marked
complete on read response instead of insisting that it should have been
removed on read sent.

There is a check for requests spanning two cache lines, as this condition
triggers an assert failure in the L1 cache. If a request does span two
lines, its size is truncated to access only until the end of the first
line and the remainder is ignored.
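
The truncation rule amounts to simple address arithmetic, sketched here
in Python. The function name and the 64-byte default line size are
assumptions for the example; the actual line size comes from the
simulated system.

```python
def truncate_to_cache_line(addr, size, line_size=64):
    """Clamp a request so it does not cross a cache line boundary.

    Returns the size to actually access, starting at addr. The part of
    the request that falls in the next line is dropped.
    """
    # First address belonging to the next cache line.
    line_end = (addr // line_size + 1) * line_size
    return min(size, line_end - addr)
```

For example, an 8-byte request at address 60 with 64-byte lines is cut
down to 4 bytes, while an 8-byte request at address 56 is left unchanged.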

Strictly-ordered requests are skipped and the dependencies on such
requests are handled by simply marking them complete immediately.

The simulated seconds can be calculated from the difference between the
final_tick stat and the tickOffset stat. A CountedExitEvent that contains
a static int belonging to the Trace CPU class as a down counter is used
to implement multi Trace CPU simulation exit.
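
As a small worked example of the calculation above, the Python sketch
below converts the tick difference into seconds. The function name is
hypothetical, and the default of 10**12 ticks per second is an
assumption about the simulator's tick frequency; the value reported for
the actual run should be used instead.

```python
def simulated_seconds(final_tick, tick_offset, ticks_per_second=10**12):
    """Convert the replayed tick range into simulated seconds.

    final_tick and tick_offset correspond to the stats of the same
    names; ticks_per_second is an assumed default, not taken from the
    patch.
    """
    return (final_tick - tick_offset) / ticks_per_second
```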
diffstat:
src/cpu/trace/SConscript | 12 +
src/cpu/trace/TraceCPU.py | 71 ++
src/cpu/trace/trace_cpu.cc | 1454 ++++++++++++++++++++++++++++++++++++++++++++
src/cpu/trace/trace_cpu.hh | 1101 +++++++++++++++++++++++++++++++++
4 files changed, 2638 insertions(+), 0 deletions(-)
diffs (truncated from 2654 to 300 lines):
diff -r f6db1e80a878 -r 0733a1c08600 src/cpu/trace/SConscript
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/src/cpu/trace/SConscript Mon Dec 07 16:42:15 2015 -0600
@@ -0,0 +1,12 @@
+Import('*')
+
+if env['TARGET_ISA'] == 'null':
+ Return()
+
+# Only build TraceCPU if we have support for protobuf as TraceCPU relies on it
+if env['HAVE_PROTOBUF']:
+ SimObject('TraceCPU.py')
+ Source('trace_cpu.cc')
+
+DebugFlag('TraceCPUData')
+DebugFlag('TraceCPUInst')
diff -r f6db1e80a878 -r 0733a1c08600 src/cpu/trace/TraceCPU.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/src/cpu/trace/TraceCPU.py Mon Dec 07 16:42:15 2015 -0600
@@ -0,0 +1,71 @@
+# Copyright (c) 2013 - 2015 ARM Limited
+# All rights reserved.
+#
+# The license below extends only to copyright in the software and shall
+# not be construed as granting a license to any other intellectual
+# property including but not limited to intellectual property relating
+# to a hardware implementation of the functionality of the software
+# licensed hereunder. You may use the software subject to the license
+# terms below provided that you ensure that this notice is replicated
+# unmodified and in its entirety in all distributions of the software,
+# modified or unmodified, in source code or in binary form.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are
+# met: redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer;
+# redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in the
+# documentation and/or other materials provided with the distribution;
+# neither the name of the copyright holders nor the names of its
+# contributors may be used to endorse or promote products derived from
+# this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+# Authors: Radhika Jagtap
+# Andreas Hansson
+# Thomas Grass
+
+from m5.params import *
+from BaseCPU import BaseCPU
+
+class TraceCPU(BaseCPU):
+ """Trace CPU model which replays traces generated in a prior simulation
+ using DerivO3CPU or its derived classes. It interfaces with L1 caches.
+ """
+ type = 'TraceCPU'
+ cxx_header = "cpu/trace/trace_cpu.hh"
+
+ @classmethod
+ def memory_mode(cls):
+ return 'timing'
+
+ @classmethod
+ def require_caches(cls):
+ return True
+
+ def addPMU(self, pmu = None):
+ pass
+
+ @classmethod
+ def support_take_over(cls):
+ return True
+
+ instTraceFile = Param.String("", "Instruction trace file")
+ dataTraceFile = Param.String("", "Data dependency trace file")
+ sizeStoreBuffer = Param.Unsigned(16, "Number of entries in the store "\
+ "buffer")
+ sizeLoadBuffer = Param.Unsigned(16, "Number of entries in the load buffer")
+ sizeROB = Param.Unsigned(40, "Number of entries in the re-order buffer")
+
diff -r f6db1e80a878 -r 0733a1c08600 src/cpu/trace/trace_cpu.cc
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/src/cpu/trace/trace_cpu.cc Mon Dec 07 16:42:15 2015 -0600
@@ -0,0 +1,1454 @@
+/*
+ * Copyright (c) 2013 - 2015 ARM Limited
+ * All rights reserved
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder. You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Authors: Radhika Jagtap
+ * Andreas Hansson
+ * Thomas Grass
+ */
+
+#include "cpu/trace/trace_cpu.hh"
+
+#include "sim/sim_exit.hh"
+
+// Declare and initialize the static counter for number of trace CPUs.
+int TraceCPU::numTraceCPUs = 0;
+
+TraceCPU::TraceCPU(TraceCPUParams *params)
+ : BaseCPU(params),
+ icachePort(this),
+ dcachePort(this),
+ instMasterID(params->system->getMasterId(name() + ".inst")),
+ dataMasterID(params->system->getMasterId(name() + ".data")),
+ instTraceFile(params->instTraceFile),
+ dataTraceFile(params->dataTraceFile),
+ icacheGen(*this, ".iside", icachePort, instMasterID, instTraceFile),
+ dcacheGen(*this, ".dside", dcachePort, dataMasterID, dataTraceFile,
+ params->sizeROB, params->sizeStoreBuffer,
+ params->sizeLoadBuffer),
+ icacheNextEvent(this),
+ dcacheNextEvent(this),
+ oneTraceComplete(false),
+ firstFetchTick(0),
+ execCompleteEvent(nullptr)
+{
+ // Increment static counter for number of Trace CPUs.
+ ++TraceCPU::numTraceCPUs;
+
+ // Check that the python parameters for sizes of ROB, store buffer and load
+ // buffer do not overflow the corresponding C++ variables.
+ fatal_if(params->sizeROB > UINT16_MAX, "ROB size set to %d exceeds the "
+ "max. value of %d.\n", params->sizeROB, UINT16_MAX);
+ fatal_if(params->sizeStoreBuffer > UINT16_MAX, "Store buffer size set "
+ "to %d exceeds the max. value of %d.\n",
+ params->sizeStoreBuffer, UINT16_MAX);
+ fatal_if(params->sizeLoadBuffer > UINT16_MAX, "Load buffer size set to"
+ " %d exceeds the max. value of %d.\n",
+ params->sizeLoadBuffer, UINT16_MAX);
+}
+
+TraceCPU::~TraceCPU()
+{
+
+}
+
+TraceCPU*
+TraceCPUParams::create()
+{
+ return new TraceCPU(this);
+}
+
+void
+TraceCPU::takeOverFrom(BaseCPU *oldCPU)
+{
+ // Unbind the ports of the old CPU and bind the ports of the TraceCPU.
+ assert(!getInstPort().isConnected());
+ assert(oldCPU->getInstPort().isConnected());
+ BaseSlavePort &inst_peer_port = oldCPU->getInstPort().getSlavePort();
+ oldCPU->getInstPort().unbind();
+ getInstPort().bind(inst_peer_port);
+
+ assert(!getDataPort().isConnected());
+ assert(oldCPU->getDataPort().isConnected());
+ BaseSlavePort &data_peer_port = oldCPU->getDataPort().getSlavePort();
+ oldCPU->getDataPort().unbind();
+ getDataPort().bind(data_peer_port);
+}
+
+void
+TraceCPU::init()
+{
+ DPRINTF(TraceCPUInst, "Instruction fetch request trace file is \"%s\"."
+ "\n", instTraceFile);
+ DPRINTF(TraceCPUData, "Data memory request trace file is \"%s\".\n",
+ dataTraceFile);
+
+ BaseCPU::init();
+
+ // Get the send tick of the first instruction read request and schedule
+ // icacheNextEvent at that tick.
+ Tick first_icache_tick = icacheGen.init();
+ schedule(icacheNextEvent, first_icache_tick);
+
+ // Get the send tick of the first data read/write request and schedule
+ // dcacheNextEvent at that tick.
+ Tick first_dcache_tick = dcacheGen.init();
+ schedule(dcacheNextEvent, first_dcache_tick);
+
+ // The static counter for number of Trace CPUs is correctly set at this
+ // point so create an event and pass it.
+ execCompleteEvent = new CountedExitEvent("end of all traces reached.",
+ numTraceCPUs);
+ // Save the first fetch request tick to dump it as tickOffset
+ firstFetchTick = first_icache_tick;
+}
+
+void
+TraceCPU::schedIcacheNext()
+{
+ DPRINTF(TraceCPUInst, "IcacheGen event.\n");
+
+ // Try to send the current packet or a retry packet if there is one
+ bool sched_next = icacheGen.tryNext();
+ // If packet sent successfully, schedule next event
+ if (sched_next) {
+ DPRINTF(TraceCPUInst, "Scheduling next icacheGen event "
+ "at %d.\n", curTick() + icacheGen.tickDelta());
+ schedule(icacheNextEvent, curTick() + icacheGen.tickDelta());
+ ++numSchedIcacheEvent;
+ } else {
+ // check if traceComplete. If not, do nothing because sending failed
+ // and next event will be scheduled via RecvRetry()
+ if (icacheGen.isTraceComplete()) {
+ // If this is the first trace to complete, set the variable. If it
+ // is already set then both traces are complete to exit sim.
+ checkAndSchedExitEvent();
+ }
+ }
+ return;
+}
+
+void
+TraceCPU::schedDcacheNext()
+{
+ DPRINTF(TraceCPUData, "DcacheGen event.\n");
+
+ dcacheGen.execute();
+ if (dcacheGen.isExecComplete()) {
+ checkAndSchedExitEvent();
+ }
+}
+
+void
+TraceCPU::checkAndSchedExitEvent()
+{
+ if (!oneTraceComplete) {
+ oneTraceComplete = true;
+ } else {
+ // Schedule event to indicate execution is complete as both
+ // instruction and data access traces have been played back.
+ inform("%s: Execution complete.\n", name());
+
+ // Record stats which are computed at the end of simulation
+ tickOffset = firstFetchTick;
+ numCycles = (clockEdge() - firstFetchTick) / clockPeriod();
+ numOps = dcacheGen.getMicroOpCount();
+ schedule(*execCompleteEvent, curTick());
+ }
+}
+
+void
+TraceCPU::regStats()
+{
+
+ BaseCPU::regStats();
+
+ numSchedDcacheEvent
+ .name(name() + ".numSchedDcacheEvent")
+ .desc("Number of events scheduled to trigger data request generator")
+ ;
+
+ numSchedIcacheEvent
+ .name(name() + ".numSchedIcacheEvent")
+ .desc("Number of events scheduled to trigger instruction request generator")