[gem5-dev] Re: Upstreaming power-gem5
Hi Sandipan.

You are correct, except that I would say you don't need to force push, just regular push. If I were at the head of a branch I wanted to (re)upload to gerrit, I would run:

git push origin HEAD:refs/for/develop

Gerrit will look at the Change-Id field in the commit message and use that to identify reviews. If one already exists, it will create a new patchset version (you can look at and compare them in the review UI if you like), and if it doesn't it will make a new review. In the output of the git command it will tell you which ones are new and which ones are updated if you're curious.

While it can be mildly disorienting clicking around a series where the ordering of reviews has changed (gerrit tries to go off of the patchset version you're currently looking at), it has no problem keeping track of everything.

Gabe

On Wed, Feb 24, 2021 at 10:18 PM Sandipan Das wrote:

> Hello Boris, Gabe,
>
> I think I now have a good amount of changes to address from the initial
> posting of the patch series. In case of mailing list based reviews, we
> would typically post the whole series again with a V2 tag but I guess
> Gerrit tracks changes based on Change-Id.
>
> So as long as the Change-Id is preserved, force pushing the branch with
> revised patches will upload the new revision to Gerrit while still
> preserving all of the historical data such as review comments, etc.
> Is this correct?
>
> I am also planning to add a new patch that splits makeCRField() into
> signed and unsigned variants (like Gabe suggested) and that would now
> be the first patch of the series. Can that create any problems?
>
> - Sandipan
___
gem5-dev mailing list -- gem5-dev@gem5.org
To unsubscribe send an email to gem5-dev-le...@gem5.org
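The Change-Id matching described in this reply can be pictured with a small toy model. This is only a sketch of the behaviour as described in the thread, not Gerrit's implementation; the function name push_for_review, the dictionary layout, and the "unchanged" outcome for an identical commit are all assumptions made for illustration.

```python
# Toy model (NOT Gerrit's code): decide, per pushed commit, whether the
# Change-Id already has a review (new patchset) or needs a new review.
def push_for_review(reviews, commits):
    """reviews: dict mapping Change-Id -> list of patchset commit hashes.
    commits: list of (change_id, commit_hash) in branch order.
    Returns a list of (change_id, status) and updates `reviews` in place."""
    report = []
    for change_id, sha in commits:
        patchsets = reviews.get(change_id)
        if patchsets is None:
            # Unknown Change-Id: a new review is created.
            reviews[change_id] = [sha]
            report.append((change_id, "new"))
        elif patchsets[-1] != sha:
            # Known Change-Id with a different commit: new patchset version.
            patchsets.append(sha)
            report.append((change_id, "updated"))
        else:
            # Assumption: an identical commit adds nothing to the review.
            report.append((change_id, "unchanged"))
    return report
```

In this model, putting a brand-new patch at the front of the series simply yields one "new" entry followed by "updated" entries; the ordering of the series does not affect which review each commit lands in.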
[gem5-dev] Change in gem5/gem5[develop]: WIP
Gabe Black has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41894 ) Change subject: WIP .. WIP Change-Id: I0ccbd634cc3374d28c0e79ea82cef39f1ba2c141 --- M src/arch/generic/isa.hh M src/arch/x86/isa.hh M src/cpu/minor/execute.cc M src/cpu/minor/scoreboard.cc M src/cpu/minor/scoreboard.hh M src/cpu/simple_thread.cc M src/cpu/simple_thread.hh M src/cpu/thread_context.cc M src/cpu/thread_context.hh 9 files changed, 76 insertions(+), 31 deletions(-) diff --git a/src/arch/generic/isa.hh b/src/arch/generic/isa.hh index 2fc8df4..78303ac 100644 --- a/src/arch/generic/isa.hh +++ b/src/arch/generic/isa.hh @@ -45,6 +45,15 @@ class ThreadContext; +class RegisterClassInfo +{ + protected: +size_t _size = 0; + + public: +size_t size() const { return _size; } +}; + class BaseISA : public SimObject { protected: @@ -70,6 +79,8 @@ { return initVecRegRenameMode(); } + +const RegisterClassInfo (RegClass reg_class) const = 0; }; #endif // __ARCH_GENERIC_ISA_HH__ diff --git a/src/arch/x86/isa.hh b/src/arch/x86/isa.hh index 5d31d87..a653bbb 100644 --- a/src/arch/x86/isa.hh +++ b/src/arch/x86/isa.hh @@ -53,6 +53,16 @@ std::string vendorString; +RegisterClassInfo regClassInfo[NumRegClasses] = { +{ NumIntRegs }, +{ NumFloatRegs }, +{ 1 }, +{ 1 }, +{ 1 }, +{ NumCCRegs }, +{ NUM_MISCREGS } +}; + public: void clear(); @@ -115,6 +125,12 @@ void setThreadContext(ThreadContext *_tc) override; std::string getVendorString() const; + +const RegisterClassInfo & +registerClassInfo(RegClass reg_class) const override +{ +return regClassInfo[reg_class]; +} }; } diff --git a/src/cpu/minor/execute.cc b/src/cpu/minor/execute.cc index ed582ad..0f33954 100644 --- a/src/cpu/minor/execute.cc +++ b/src/cpu/minor/execute.cc @@ -168,6 +168,9 @@ for (ThreadID tid = 0; tid < params.numThreads; tid++) { std::string tid_str = std::to_string(tid); +ThreadContext *tc = cpu.threads[tid]->getTC(); +const int numRegs = ; + /* Input Buffers */ inputBuffer.push_back( InputBuffer( @@ 
-175,7 +178,11 @@ params.executeInputBufferSize)); /* Scoreboards */ -scoreboard.push_back(Scoreboard(name_ + ".scoreboard" + tid_str)); +scoreboard.emplace_back(name_ + ".scoreboard" + tid_str, +tc->registerClassInfo(IntRegClass).size(), +TheISA::NumCCRegs, TheISA::NumFloatRegs, +TheISA::NumVecRegs, TheISA::NumVecElemPerVecReg, +TheISA::NumVecPredRegs); /* In-flight instruction records */ executeInfo[tid].inFlightInsts = new Queue writingInst; public: -Scoreboard(const std::string ) : +Scoreboard(const std::string , +unsigned numIntRegs, unsigned numCcRegs, unsigned numFloatRegs, +unsigned numVecRegs, unsigned numVecElemPerReg, +unsigned numVecPredRegs) : Named(name), -numRegs(TheISA::NumIntRegs + TheISA::NumCCRegs + -TheISA::NumFloatRegs + -(TheISA::NumVecRegs * TheISA::NumVecElemPerVecReg) + -TheISA::NumVecPredRegs), +ccRegOffset(numIntRegs), +floatRegOffset(ccRegOffset + numCcRegs), +vecRegOffset(floatRegOffset + numFloatRegs), +vecPredRegOffset(vecRegOffset + numVecRegs), +numRegs(numIntRegs + numCcRegs + numFloatRegs + +(numVecRegs * numVecElemPerReg) + numVecPredRegs), numResults(numRegs, 0), numUnpredictableResults(numRegs, 0), fuIndices(numRegs, 0), diff --git a/src/cpu/simple_thread.cc b/src/cpu/simple_thread.cc index f15be91..d293ba0 100644 --- a/src/cpu/simple_thread.cc +++ b/src/cpu/simple_thread.cc @@ -67,6 +67,7 @@ Process *_process, BaseMMU *_mmu, BaseISA *_isa) : ThreadState(_cpu, _thread_num, _process), + intRegs(_isa->registerClassInfo(IntRegClass).size()), isa(dynamic_cast(_isa)), predicate(true), memAccPredicate(true), comInstEventQueue("instruction-based event queue"), @@ -80,6 +81,7 @@ SimpleThread::SimpleThread(BaseCPU *_cpu, int _thread_num, System *_sys, BaseMMU *_mmu, BaseISA *_isa) : ThreadState(_cpu, _thread_num, NULL), + intRegs(_isa->registerClassInfo(IntRegClass).size()), isa(dynamic_cast(_isa)), predicate(true), memAccPredicate(true), comInstEventQueue("instruction-based event queue"), diff --git a/src/cpu/simple_thread.hh 
b/src/cpu/simple_thread.hh index 7a13825..9d206b6 100644 --- a/src/cpu/simple_thread.hh +++ b/src/cpu/simple_thread.hh @@ -42,8 +42,6 @@ #ifndef __CPU_SIMPLE_THREAD_HH__ #define __CPU_SIMPLE_THREAD_HH__ -#include - #include "arch/decoder.hh"
[gem5-dev] Change in gem5/gem5[develop]: x86: Minor cleanup of the ISA class.
Gabe Black has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41893 ) Change subject: x86: Minor cleanup of the ISA class. .. x86: Minor cleanup of the ISA class. Remove namespace indentation, get rid of some unnecessary includes and class prototypes, and make members consistently private. Change-Id: If8e6375bf664c125f6776de62aefe44923f73c2e --- M src/arch/x86/isa.hh 1 file changed, 63 insertions(+), 65 deletions(-) diff --git a/src/arch/x86/isa.hh b/src/arch/x86/isa.hh index 584933d..5d31d87 100644 --- a/src/arch/x86/isa.hh +++ b/src/arch/x86/isa.hh @@ -37,88 +37,86 @@ #include "arch/x86/regs/misc.hh" #include "base/types.hh" #include "cpu/reg_class.hh" -#include "sim/sim_object.hh" -class Checkpoint; -class EventManager; class ThreadContext; struct X86ISAParams; namespace X86ISA { -class ISA : public BaseISA + +class ISA : public BaseISA +{ + private: +RegVal regVal[NUM_MISCREGS]; +void updateHandyM5Reg(Efer efer, CR0 cr0, +SegAttr csAttr, SegAttr ssAttr, RFLAGS rflags); + +std::string vendorString; + + public: +void clear(); + +using Params = X86ISAParams; + +ISA(const Params ); + +RegVal readMiscRegNoEffect(int miscReg) const; +RegVal readMiscReg(int miscReg); + +void setMiscRegNoEffect(int miscReg, RegVal val); +void setMiscReg(int miscReg, RegVal val); + +RegId +flattenRegId(const RegId& regId) const { - protected: -RegVal regVal[NUM_MISCREGS]; -void updateHandyM5Reg(Efer efer, CR0 cr0, -SegAttr csAttr, SegAttr ssAttr, RFLAGS rflags); - - public: -void clear(); - -using Params = X86ISAParams; - -ISA(const Params ); - -RegVal readMiscRegNoEffect(int miscReg) const; -RegVal readMiscReg(int miscReg); - -void setMiscRegNoEffect(int miscReg, RegVal val); -void setMiscReg(int miscReg, RegVal val); - -RegId -flattenRegId(const RegId& regId) const -{ -switch (regId.classValue()) { - case IntRegClass: -return RegId(IntRegClass, flattenIntIndex(regId.index())); - case FloatRegClass: -return RegId(FloatRegClass, 
flattenFloatIndex(regId.index())); - case CCRegClass: -return RegId(CCRegClass, flattenCCIndex(regId.index())); - case MiscRegClass: -return RegId(MiscRegClass, flattenMiscIndex(regId.index())); - default: -break; -} -return regId; +switch (regId.classValue()) { + case IntRegClass: +return RegId(IntRegClass, flattenIntIndex(regId.index())); + case FloatRegClass: +return RegId(FloatRegClass, flattenFloatIndex(regId.index())); + case CCRegClass: +return RegId(CCRegClass, flattenCCIndex(regId.index())); + case MiscRegClass: +return RegId(MiscRegClass, flattenMiscIndex(regId.index())); + default: +break; } +return regId; +} -int flattenIntIndex(int reg) const { return reg & ~IntFoldBit; } +int flattenIntIndex(int reg) const { return reg & ~IntFoldBit; } -int -flattenFloatIndex(int reg) const -{ -if (reg >= NUM_FLOATREGS) { -reg = FLOATREG_STACK(reg - NUM_FLOATREGS, - regVal[MISCREG_X87_TOP]); -} -return reg; +int +flattenFloatIndex(int reg) const +{ +if (reg >= NUM_FLOATREGS) { +reg = FLOATREG_STACK(reg - NUM_FLOATREGS, + regVal[MISCREG_X87_TOP]); } +return reg; +} -int flattenVecIndex(int reg) const { return reg; } -int flattenVecElemIndex(int reg) const { return reg; } -int flattenVecPredIndex(int reg) const { return reg; } -int flattenCCIndex(int reg) const { return reg; } -int flattenMiscIndex(int reg) const { return reg; } +int flattenVecIndex(int reg) const { return reg; } +int flattenVecElemIndex(int reg) const { return reg; } +int flattenVecPredIndex(int reg) const { return reg; } +int flattenCCIndex(int reg) const { return reg; } +int flattenMiscIndex(int reg) const { return reg; } -bool -inUserMode() const override -{ -HandyM5Reg m5reg = readMiscRegNoEffect(MISCREG_M5_REG); -return m5reg.cpl == 3; -} +bool +inUserMode() const override +{ +HandyM5Reg m5reg = readMiscRegNoEffect(MISCREG_M5_REG); +return m5reg.cpl == 3; +} -void serialize(CheckpointOut ) const override; -void unserialize(CheckpointIn ) override; +void serialize(CheckpointOut ) const override; 
+void unserialize(CheckpointIn
[gem5-dev] Change in gem5/gem5[develop]: misc: Adding 'make' to the compiler Dockerfiles
Bobby R. Bruce has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41873 )

Change subject: misc: Adding 'make' to the compiler Dockerfiles
..

misc: Adding 'make' to the compiler Dockerfiles

While gem5 will compile without make, LTO cannot link on multiple
threads without it.

Change-Id: Id5552aaa295e194789ab5f355bb62a3657384d38
---
M util/dockerfiles/ubuntu-18.04_clang-version/Dockerfile
M util/dockerfiles/ubuntu-18.04_gcc-version/Dockerfile
M util/dockerfiles/ubuntu-20.04_gcc-version/Dockerfile
3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/util/dockerfiles/ubuntu-18.04_clang-version/Dockerfile b/util/dockerfiles/ubuntu-18.04_clang-version/Dockerfile
index 97f3dbc..869a2c1 100644
--- a/util/dockerfiles/ubuntu-18.04_clang-version/Dockerfile
+++ b/util/dockerfiles/ubuntu-18.04_clang-version/Dockerfile
@@ -40,7 +40,7 @@
 RUN apt -y upgrade
 RUN apt -y install git m4 scons zlib1g zlib1g-dev clang-${version} \
 libprotobuf-dev protobuf-compiler libprotoc-dev libgoogle-perftools-dev \
-python3-dev python3 python3-six doxygen
+python3-dev python3 python3-six doxygen make

 RUN apt-get --purge -y remove gcc

diff --git a/util/dockerfiles/ubuntu-18.04_gcc-version/Dockerfile b/util/dockerfiles/ubuntu-18.04_gcc-version/Dockerfile
index 9f3da37..1723fd9 100644
--- a/util/dockerfiles/ubuntu-18.04_gcc-version/Dockerfile
+++ b/util/dockerfiles/ubuntu-18.04_gcc-version/Dockerfile
@@ -38,7 +38,7 @@
 RUN apt -y install git m4 scons zlib1g zlib1g-dev gcc-multilib \
 libprotobuf-dev protobuf-compiler libprotoc-dev libgoogle-perftools-dev \
 python3-dev python3 python3-six doxygen wget zip gcc-${version} \
-g++-${version}
+g++-${version} make

 RUN update-alternatives --install \
 /usr/bin/g++ g++ /usr/bin/g++-${version} 100
diff --git a/util/dockerfiles/ubuntu-20.04_gcc-version/Dockerfile b/util/dockerfiles/ubuntu-20.04_gcc-version/Dockerfile
index d2008b6..923fe63 100644
--- a/util/dockerfiles/ubuntu-20.04_gcc-version/Dockerfile
+++ b/util/dockerfiles/ubuntu-20.04_gcc-version/Dockerfile
@@ -38,7 +38,7 @@
 RUN apt -y install git m4 scons zlib1g zlib1g-dev libprotobuf-dev \
 protobuf-compiler libprotoc-dev libgoogle-perftools-dev python3-dev \
 python3-six python-is-python3 doxygen libboost-all-dev libhdf5-serial-dev \
-python3-pydot libpng-dev gcc-${version} g++-${version}
+python3-pydot libpng-dev gcc-${version} g++-${version} make

 RUN update-alternatives --install \
 /usr/bin/g++ g++ /usr/bin/g++-${version} 100

--
To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/41873
To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings

Gerrit-Project: public/gem5
Gerrit-Branch: develop
Gerrit-Change-Id: Id5552aaa295e194789ab5f355bb62a3657384d38
Gerrit-Change-Number: 41873
Gerrit-PatchSet: 1
Gerrit-Owner: Bobby R. Bruce
Gerrit-MessageType: newchange
[gem5-dev] Re: Upstreaming power-gem5
Hello Boris, Gabe,

I think I now have a good amount of changes to address from the initial posting of the patch series. In case of mailing list based reviews, we would typically post the whole series again with a V2 tag but I guess Gerrit tracks changes based on Change-Id.

So as long as the Change-Id is preserved, force pushing the branch with revised patches will upload the new revision to Gerrit while still preserving all of the historical data such as review comments, etc. Is this correct?

I am also planning to add a new patch that splits makeCRField() into signed and unsigned variants (like Gabe suggested) and that would now be the first patch of the series. Can that create any problems?

- Sandipan
[gem5-dev] Re: vector register indexing modes and renaming?
On Wed, Feb 24, 2021 at 8:05 AM Giacomo Travaglini <giacomo.travagl...@arm.com> wrote:

> > -----Original Message-----
> > From: Gabe Black
> > Sent: 24 February 2021 15:24
> > To: Giacomo Travaglini
> > Cc: gem5 Developer List
> > Subject: Re: [gem5-dev] vector register indexing modes and renaming?
> >
> > So, I started really diving into the interfaces in ThreadContext and
> > ExecContext and their various implementations. What I wanted to do was
> > to define a much narrower set of maybe 3 virtual functions that actually
> > implements the core of what's needed, and not 15-20 different independent
> > virtual methods that all need to be reimplemented every time. *That* was
> > quite the rabbit hole, and after a number of hours I decided I needed to
> > regroup and come at it from another angle. It definitely looks to me like
> > somebody came in with the idea to represent these registers using a data,
> > model, view architecture (or something like that) which would make sense
> > in other contexts with other types of data, but here I don't think it is
> > really the right way to go about this.
> >
> > Right now, I have two questions for you.
> >
> > 1. Are there tests which exercise this stuff? If I start chopping things
> > up, I would be a lot more comfortable if I can tell if/when I break
> > something.
>
> I will ask within Arm if there's something we can provide to you.
> In the meantime I had a quick look at NEON enabled libraries [1]; the
> Ne10 library provides a set of functions optimized for NEON and a set
> of examples making use of it [2] (e.g. FIR filter, GEMM, etc.).
>
> You could probably cross-compile those examples and use them in SE mode
> (I recommend using the O3 model).

Ok, thanks, I'll take a look. This might even be something we want in the testing infrastructure? I might look into that when I have a chance.

> > 2. What's the difference between a lane and an element? Those terms seem
> > like they should be synonyms and are treated as almost the same thing,
> > but there is clearly a difference between them. What is it, and why does
> > it exist?
> >
> > Gabe
>
> I have a hunch the vector lane logic is not really used.
> My understanding is that Lane/Elem differ in the O3 model only.
> The key point is that VecRegister and VecElems are represented by a
> different set of physical registers; you cannot access a vector element if
> the renaming is set to Full [3]; the physical vector register file will be
> made of valid entries, while the vector element register file will be
> empty. The vector lane getters/setters are probably a way to do a
> functional read of the element anyway [4].
> In a way we could think of VecReg/VecElem as being the interface to the
> vector file for a guest instruction, and of VecLane as being the interface
> for the host (even though it could be used by an instruction as well).
>
> This is my interpretation of the VecLane.

Ok, thanks. If there are things we can eliminate from the interfaces then that will make the whole problem simpler. Part of what makes this hard to work on is that there are so many things that need to move in parallel to keep everything working (whole registers, elements, lanes, ThreadContext, ExecContext, SimpleThread, dynamic inst classes, O3 register file and rename map, minor CPU and O3 scoreboard, parser implementation, operand definitions, instruction definitions). Finding a place to unravel a small part of this at a time has been tricky...

Gabe
[gem5-dev] Change in gem5/gem5[develop]: configs,mem-ruby: dumps network route profile
Tiago Mück has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41866 )

Change subject: configs,mem-ruby: dumps network route profile
..

configs,mem-ruby: dumps network route profile

If the trace_routes parameter is set when using SimpleNetwork, each
unique route used in the network is tracked and dumped to a text file
at the end of the simulation in the format:

delay_cy msg_cnt vnet src_port -> r0 -> .. -> rn -> dest_port

delay_cy is the average delay of the route. For each router the average
internal delays are also included. The dumped routes are ordered by:

delay_cy / #hops

The TraceRoutesDebug debug flag can also be used to print additional
routing information.

JIRA: https://gem5.atlassian.net/browse/GEM5-920

Change-Id: I15ac24ef28d28cf8e141522704bc47be0712b2c1
Signed-off-by: Tiago Mück
---
M configs/network/Network.py
M src/mem/ruby/network/BasicRouter.hh
M src/mem/ruby/network/MessageBuffer.cc
M src/mem/ruby/network/MessageBuffer.hh
A src/mem/ruby/network/RouteProfiler.cc
A src/mem/ruby/network/RouteProfiler.hh
M src/mem/ruby/network/SConscript
M src/mem/ruby/network/simple/PerfectSwitch.cc
M src/mem/ruby/network/simple/SimpleNetwork.cc
M src/mem/ruby/network/simple/SimpleNetwork.hh
M src/mem/ruby/network/simple/SimpleNetwork.py
M src/mem/ruby/network/simple/Switch.cc
M src/mem/ruby/network/simple/Switch.hh
M src/mem/ruby/network/simple/Throttle.cc
M src/mem/ruby/structures/TimerTable.hh
M src/mem/ruby/structures/WireBuffer.hh
M src/mem/slicc/symbols/StateMachine.py
17 files changed, 541 insertions(+), 5 deletions(-)

diff --git a/configs/network/Network.py b/configs/network/Network.py
index 4a708ea..d475c16 100644
--- a/configs/network/Network.py
+++ b/configs/network/Network.py
@@ -76,6 +76,9 @@
 action="store_true", default=False,
 help="""SimpleNetwork links uses a separate physical
 channel for each virtual network""")
+parser.add_option("--simple-trace-routes",
+ action="store_true", default=False,
+ help="""SimpleNetwork
traces latency for all routes""") def create_network(options, ruby): @@ -187,6 +190,7 @@ network.physical_vnets_channels = \ [1] * int(network.number_of_virtual_networks) network.setup_buffers() +network.trace_routes = options.simple_trace_routes if InterfaceClass != None: netifs = [InterfaceClass(id=i) \ diff --git a/src/mem/ruby/network/BasicRouter.hh b/src/mem/ruby/network/BasicRouter.hh index 9417342..2c5f80a 100644 --- a/src/mem/ruby/network/BasicRouter.hh +++ b/src/mem/ruby/network/BasicRouter.hh @@ -1,4 +1,16 @@ /* + * Copyright (c) 2021 ARM Limited + * All rights reserved. + * + * The license below extends only to copyright in the software and shall + * not be construed as granting a license to any other intellectual + * property including but not limited to intellectual property relating + * to a hardware implementation of the functionality of the software + * licensed hereunder. You may use the software subject to the license + * terms below provided that you ensure that this notice is replicated + * unmodified and in its entirety in all distributions of the software, + * modified or unmodified, in source code or in binary form. + * * Copyright (c) 2011 Advanced Micro Devices, Inc. * All rights reserved. 
* @@ -45,6 +57,9 @@ void init(); void print(std::ostream& out) const; + +uint32_t getID() const { return m_id; } + protected: // // ID in relation to other routers in the system diff --git a/src/mem/ruby/network/MessageBuffer.cc b/src/mem/ruby/network/MessageBuffer.cc index bde4de7..d16f70d 100644 --- a/src/mem/ruby/network/MessageBuffer.cc +++ b/src/mem/ruby/network/MessageBuffer.cc @@ -60,6 +60,7 @@ m_randomization(p.randomization), m_allow_zero_latency(p.allow_zero_latency), m_routing_priority(p.routing_priority), +m_is_inport(false), ADD_STAT(m_not_avail_count, "Number of times this buffer did not have " "N slots available"), ADD_STAT(m_buf_msgs, "Average number of messages in buffer"), diff --git a/src/mem/ruby/network/MessageBuffer.hh b/src/mem/ruby/network/MessageBuffer.hh index 8c6ceda..ae9339f 100644 --- a/src/mem/ruby/network/MessageBuffer.hh +++ b/src/mem/ruby/network/MessageBuffer.hh @@ -96,7 +96,7 @@ bool areNSlotsAvailable(unsigned int n, Tick curTime); int getPriority() { return m_priority_rank; } void setPriority(int rank) { m_priority_rank = rank; } -void setConsumer(Consumer* consumer) +void setConsumer(Consumer* consumer, bool is_inport = false) { DPRINTF(RubyQueue, "Setting consumer: %s\n", *consumer); if (m_consumer != NULL) { @@ -105,6 +105,7 @@ *consumer, *this, *m_consumer);
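The route dump format and ordering from the "dumps network route profile" commit message above can be sketched as follows. This is a minimal illustration, not the RouteProfiler code from the change: the Route class, the dump_routes helper, the two-decimal formatting, the definition of #hops as the number of routers traversed, and the descending sort order are all assumptions, since the commit message only names the fields and the delay_cy / #hops sort key.

```python
from dataclasses import dataclass


@dataclass
class Route:
    delay_cy: float   # average delay of the route, in cycles
    msg_cnt: int      # number of messages that used the route
    vnet: int         # virtual network id
    src_port: str
    routers: list     # router names r0..rn, in traversal order
    dest_port: str

    def hops(self):
        # Assumption: one hop per router on the path.
        return len(self.routers)

    def format(self):
        path = " -> ".join([self.src_port, *self.routers, self.dest_port])
        return f"{self.delay_cy:.2f} {self.msg_cnt} {self.vnet} {path}"


def dump_routes(routes):
    # Assumption: worst delay per hop is listed first.
    ordered = sorted(routes, key=lambda r: r.delay_cy / r.hops(), reverse=True)
    return [r.format() for r in ordered]
```

Sorting by delay per hop rather than by raw delay keeps long but healthy routes from hiding a short route that crosses one congested router.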
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: add priorities in SimpleNetwork routing
Tiago Mück has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41864 ) Change subject: mem-ruby: add priorities in SimpleNetwork routing .. mem-ruby: add priorities in SimpleNetwork routing Configurations can specify a routing priority for message buffers. This priority is used by SimpleNetwork when checking for messages in the routers' input ports. Higher priority ports are always checked first. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I7e2b35e2cae63086a76def1145f9b4b56220a2ba Signed-off-by: Tiago Mück --- M src/mem/ruby/network/MessageBuffer.cc M src/mem/ruby/network/MessageBuffer.hh M src/mem/ruby/network/MessageBuffer.py M src/mem/ruby/network/simple/PerfectSwitch.cc M src/mem/ruby/network/simple/PerfectSwitch.hh 5 files changed, 76 insertions(+), 26 deletions(-) diff --git a/src/mem/ruby/network/MessageBuffer.cc b/src/mem/ruby/network/MessageBuffer.cc index 5c8bf59..bde4de7 100644 --- a/src/mem/ruby/network/MessageBuffer.cc +++ b/src/mem/ruby/network/MessageBuffer.cc @@ -59,6 +59,7 @@ m_last_arrival_time(0), m_strict_fifo(p.ordered), m_randomization(p.randomization), m_allow_zero_latency(p.allow_zero_latency), +m_routing_priority(p.routing_priority), ADD_STAT(m_not_avail_count, "Number of times this buffer did not have " "N slots available"), ADD_STAT(m_buf_msgs, "Average number of messages in buffer"), diff --git a/src/mem/ruby/network/MessageBuffer.hh b/src/mem/ruby/network/MessageBuffer.hh index d940dcb..8c6ceda 100644 --- a/src/mem/ruby/network/MessageBuffer.hh +++ b/src/mem/ruby/network/MessageBuffer.hh @@ -152,6 +152,9 @@ void setIncomingLink(int link_id) { m_input_link_id = link_id; } void setVnet(int net) { m_vnet_id = net; } +int getIncomingLink() const { return m_input_link_id; } +int getVnet() const { return m_vnet_id; } + Port & getPort(const std::string &, PortID idx=InvalidPortID) override { @@ -181,6 +184,8 @@ return functionalAccess(pkt, true, ) == 1; } +int routingPriority() const { 
return m_routing_priority; } + private: void reanalyzeList(std::list &, Tick); @@ -264,6 +269,8 @@ const MessageRandomization m_randomization; const bool m_allow_zero_latency; +const int m_routing_priority; + int m_input_link_id; int m_vnet_id; diff --git a/src/mem/ruby/network/MessageBuffer.py b/src/mem/ruby/network/MessageBuffer.py index cb7f02d..d0161d6 100644 --- a/src/mem/ruby/network/MessageBuffer.py +++ b/src/mem/ruby/network/MessageBuffer.py @@ -69,3 +69,6 @@ max_dequeue_rate = Param.Unsigned(0, "Maximum number of messages that can \ be dequeued per cycle \ (0 allows dequeueing all ready messages)") +routing_priority = Param.Int(0, "Buffer priority when messages are \ + consumed by the network. Smaller value \ + means higher priority") diff --git a/src/mem/ruby/network/simple/PerfectSwitch.cc b/src/mem/ruby/network/simple/PerfectSwitch.cc index 201d091..f7a4313 100644 --- a/src/mem/ruby/network/simple/PerfectSwitch.cc +++ b/src/mem/ruby/network/simple/PerfectSwitch.cc @@ -82,11 +82,40 @@ in[i]->setConsumer(this); in[i]->setIncomingLink(port); in[i]->setVnet(i); +updatePriorityGroups(i, in[i]); } } } void +PerfectSwitch::updatePriorityGroups(int vnet, MessageBuffer* in_buf) +{ +while (m_in_prio.size() <= vnet) { +m_in_prio.emplace_back(); +m_in_prio_groups.emplace_back(); +} + +m_in_prio[vnet].push_back(in_buf); + +struct MessageBufferSort { +bool operator() (const MessageBuffer* i, + const MessageBuffer* j) +{ return i->routingPriority() < j->routingPriority(); } +} sortObj; +std::sort(m_in_prio[vnet].begin(), m_in_prio[vnet].end(), sortObj); + +// reset groups +m_in_prio_groups[vnet].clear(); +int cur_prio = m_in_prio[vnet].front()->routingPriority(); +m_in_prio_groups[vnet].emplace_back(); +for (auto buf : m_in_prio[vnet]) { +if (buf->routingPriority() != cur_prio) +m_in_prio_groups[vnet].emplace_back(); +m_in_prio_groups[vnet].back().push_back(buf); +} +} + +void PerfectSwitch::addOutPort(const std::vector& out, const NetDest& routing_table_entry, const 
PortDirection _inport, @@ -111,35 +140,38 @@ PerfectSwitch::operateVnet(int vnet) { if (m_pending_message_count[vnet] > 0) { -// first check the port with the oldest message -unsigned incoming = 0; -Tick lowest_tick = MaxTick; -for (int counter = 0; counter < m_in.size(); ++counter) { -
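The priority grouping that the "add priorities in SimpleNetwork routing" change describes (sort the input buffers of a vnet by routing priority, then bucket equal priorities together so higher-priority ports are checked first) can be sketched like this. The function name build_priority_groups and the (name, priority) tuples are hypothetical stand-ins for MessageBuffer objects; this sketch advances the current priority each time it opens a new group, which is one reasonable reading of the intent rather than a transcription of the patch.

```python
def build_priority_groups(buffers):
    """buffers: list of (name, routing_priority) pairs.
    A smaller routing_priority value means higher priority.
    Returns a list of groups, highest priority first; each group is a
    list of buffer names sharing the same priority."""
    # sorted() is stable, so buffers with equal priority keep their
    # insertion order inside a group.
    ordered = sorted(buffers, key=lambda b: b[1])
    groups = []
    cur_prio = None
    for name, prio in ordered:
        if not groups or prio != cur_prio:
            groups.append([])   # open a new group for this priority
            cur_prio = prio
        groups[-1].append(name)
    return groups
```

A consumer would then walk the groups in order and only fall through to a lower-priority group when the higher ones have nothing ready.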
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: additional SimpleNetwork stats
Tiago Mück has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41865 ) Change subject: mem-ruby: additional SimpleNetwork stats .. mem-ruby: additional SimpleNetwork stats Additional stats allow more detailed monitoring of switch bandwidth and stalls. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I56604f315024f19df5f89c6f6ea1e3aa0ea185ea Signed-off-by: Tiago Mück --- M src/mem/ruby/network/simple/Throttle.cc M src/mem/ruby/network/simple/Throttle.hh 2 files changed, 66 insertions(+), 14 deletions(-) diff --git a/src/mem/ruby/network/simple/Throttle.cc b/src/mem/ruby/network/simple/Throttle.cc index f3dd82c..ed9b26f 100644 --- a/src/mem/ruby/network/simple/Throttle.cc +++ b/src/mem/ruby/network/simple/Throttle.cc @@ -50,6 +50,7 @@ #include "mem/ruby/network/simple/Switch.hh" #include "mem/ruby/slicc_interface/Message.hh" #include "mem/ruby/system/RubySystem.hh" +#include "sim/stats.hh" const int MESSAGE_SIZE_MULTIPLIER = 1000; //const int BROADCAST_SCALING = 4; // Have a 16p system act like a 64p systems @@ -130,7 +131,7 @@ void Throttle::operateVnet(int vnet, int channel, int _bw_remaining, - bool _wakeup, + bool _saturated, bool _blocked, MessageBuffer *in, MessageBuffer *out) { if (out == nullptr || in == nullptr) { @@ -158,6 +159,7 @@ // Find the size of the message we are moving MsgPtr msg_ptr = in->peekMsgPtr(); Message *net_msg_ptr = msg_ptr.get(); +Tick msg_enqueue_time = msg_ptr->getLastEnqueueTime(); units_remaining = network_message_to_size(net_msg_ptr); DPRINTF(RubyNetwork, "throttle: %d my bw %d bw spent " @@ -173,6 +175,15 @@ // Count the message (*(throttleStats. 
m_msg_counts[net_msg_ptr->getMessageSize()]))[vnet]++; +throttleStats.m_total_msg_count += 1; +uint32_t total_size = + Network::MessageSizeType_to_int(net_msg_ptr->getMessageSize()); +throttleStats.m_total_msg_bytes += total_size; +total_size -= +Network::MessageSizeType_to_int(MessageSizeType_Control); +throttleStats.m_total_data_msg_bytes += total_size; +throttleStats.m_total_msg_wait_time += +current_time - msg_enqueue_time; DPRINTF(RubyNetwork, "%s\n", *out); } @@ -189,15 +200,14 @@ assert(bw_remaining >= 0); assert(total_bw_remaining >= 0); -// Make sure to continue work next cycle if +// Notify caller if // - we ran out of bandwith and still have stuff to do // - we had something to do but output queue was unavailable -if (((bw_remaining == 0) && - (in->isReady(current_time) || (units_remaining > 0))) || -(ready && !out->areNSlotsAvailable(1, current_time))) { -DPRINTF(RubyNetwork, "vnet: %d set schedule_wakeup\n", vnet); -schedule_wakeup = true; -} +bw_saturated = bw_saturated || +((bw_remaining == 0) && + (in->isReady(current_time) || (units_remaining > 0))); +output_blocked = output_blocked || +(ready && !out->areNSlotsAvailable(1, current_time)); } void @@ -208,7 +218,8 @@ int bw_remaining = getTotalLinkBandwidth(); m_wakeups_wo_switch++; -bool schedule_wakeup = false; +bool bw_saturated = false; +bool output_blocked = false; // variable for deciding the direction in which to iterate bool iteration_direction = false; @@ -223,13 +234,15 @@ if (iteration_direction) { for (int vnet = 0; vnet < m_vnets; ++vnet) { for (int channel = 0; channel < getChannelCnt(vnet); ++channel) -operateVnet(vnet, channel, bw_remaining, schedule_wakeup, +operateVnet(vnet, channel, bw_remaining, +bw_saturated, output_blocked, m_in[vnet], m_out[vnet]); } } else { for (int vnet = m_vnets-1; vnet >= 0; --vnet) { for (int channel = 0; channel < getChannelCnt(vnet); ++channel) -operateVnet(vnet, channel, bw_remaining, schedule_wakeup, +operateVnet(vnet, channel, bw_remaining, 
+bw_saturated, output_blocked, m_in[vnet], m_out[vnet]); } } @@ -245,7 +258,10 @@ // If ratio = 0, we used no bandwidth, if ratio = 1, we used all m_link_utilization_proxy += ratio; -if (!schedule_wakeup) { +if (bw_saturated) throttleStats.m_total_bw_sat_cy += 1; +if (output_blocked) throttleStats.m_total_stall_cy += 1; + +if (!bw_saturated && !output_blocked) { // We have extra bandwidth and our output buffer was // available, so we must not have anything else to do
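The split of the old schedule_wakeup flag into bw_saturated and output_blocked in the Throttle change above can be illustrated with a small model. This is a sketch under assumptions: vnet_status and account_cycle are hypothetical names, the "ready" condition is taken to mean the input buffer has a message ready, and the counter keys only loosely mirror m_total_bw_sat_cy and m_total_stall_cy.

```python
def vnet_status(bw_remaining, in_ready, units_remaining, out_has_slot):
    """Return (bw_saturated, output_blocked) for one vnet/channel."""
    # Ran out of bandwidth while there is still something to send.
    bw_saturated = bw_remaining == 0 and (in_ready or units_remaining > 0)
    # Had something to send but the output queue had no free slot.
    output_blocked = in_ready and not out_has_slot
    return bw_saturated, output_blocked


def account_cycle(stats, per_vnet):
    """per_vnet: list of (bw_saturated, output_blocked) pairs for one cycle.
    Updates the counters and returns True if work remains afterwards."""
    saturated = any(s for s, _ in per_vnet)
    blocked = any(b for _, b in per_vnet)
    if saturated:
        stats["bw_sat_cy"] += 1
    if blocked:
        stats["stall_cy"] += 1
    return saturated or blocked
```

Keeping the two causes separate is what lets the new stats distinguish cycles lost to link bandwidth from cycles lost to a full downstream buffer.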
[gem5-dev] Change in gem5/gem5[develop]: tests: extend ruby_mem_test
Tiago Mück has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41867 ) Change subject: tests: extend ruby_mem_test .. tests: extend ruby_mem_test Replace ruby_mem_test by these tests which run different configurations: ruby_mem_test-garnet: use Garnet ruby_mem_test-simple: use SimpleNetwork (same as original ruby_mem_test) ruby_mem_test-simple-extra: use SimpleNetwork with --simple-physical-channels and --simple-trace-routes options ruby_mem_test-simple-extra-multicore: same as previous using 4 cores JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I7716cd222dd56ddbf06f53f92ec9b568ed5a182c Signed-off-by: Tiago Mück --- M tests/gem5/memory/test.py 1 file changed, 22 insertions(+), 8 deletions(-) diff --git a/tests/gem5/memory/test.py b/tests/gem5/memory/test.py index db20ab5..8beca78 100644 --- a/tests/gem5/memory/test.py +++ b/tests/gem5/memory/test.py @@ -68,17 +68,31 @@ ) null_tests = [ -('garnet_synth_traffic', ['--sim-cycles', '500']), -('memcheck', ['--maxtick', '20', '--prefetchers']), -('ruby_mem_test', ['--abs-max-tick', '2000', -'--functional', '10']), -('ruby_random_test', ['--maxloads', '5000']), -('ruby_direct_test', ['--requests', '5']), +('garnet_synth_traffic', None, ['--sim-cycles', '500']), +('memcheck', None, ['--maxtick', '20', '--prefetchers']), +('ruby_mem_test-garnet', 'ruby_mem_test', +['--abs-max-tick', '2000', '--functional', '10', \ + '--network=garnet']), +('ruby_mem_test-simple', 'ruby_mem_test', +['--abs-max-tick', '2000', '--functional', '10', \ + '--network=simple']), +('ruby_mem_test-simple-extra', 'ruby_mem_test', +['--abs-max-tick', '2000', '--functional', '10', \ + '--network=simple', '--simple-physical-channels', + '--simple-trace-routes']), +('ruby_mem_test-simple-extra-multicore', 'ruby_mem_test', +['--abs-max-tick', '2000', '--functional', '10', \ + '--network=simple', '--simple-physical-channels', + '--simple-trace-routes', '--num-cpus=4']), +('ruby_random_test', None, 
['--maxloads', '5000']), +('ruby_direct_test', None, ['--requests', '5']), ] -for basename_noext, args in null_tests: +for test_name, basename_noext, args in null_tests: +if basename_noext == None: +basename_noext = test_name gem5_verify_config( -name=basename_noext, +name=test_name, fixtures=(), verifiers=(), config=joinpath(config.base_dir, 'configs', -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/41867 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I7716cd222dd56ddbf06f53f92ec9b568ed5a182c Gerrit-Change-Number: 41867 Gerrit-PatchSet: 1 Gerrit-Owner: Tiago Mück Gerrit-MessageType: newchange
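For readers skimming the flattened diff, the new tuple layout can be sketched roughly as follows. This is a minimal standalone illustration, not the test harness itself: `resolve_tests` is a hypothetical helper name, only a subset of the entries is reproduced, and the call to gem5_verify_config is left out because its remaining arguments are truncated in the message above.

```python
# Subset of the (test_name, config_basename, args) tuples from the patch.
# A config_basename of None means "same as test_name".
null_tests = [
    ('memcheck', None, ['--maxtick', '20', '--prefetchers']),
    ('ruby_mem_test-garnet', 'ruby_mem_test',
     ['--abs-max-tick', '2000', '--functional', '10', '--network=garnet']),
    ('ruby_mem_test-simple', 'ruby_mem_test',
     ['--abs-max-tick', '2000', '--functional', '10', '--network=simple']),
]


def resolve_tests(tests):
    """Hypothetical helper: fill in the default config basename.

    Returns a list of (test_name, config_basename, args) tuples in which
    config_basename is never None.
    """
    resolved = []
    for test_name, basename_noext, args in tests:
        if basename_noext is None:
            basename_noext = test_name
        resolved.append((test_name, basename_noext, args))
    return resolved


for name, basename, args in resolve_tests(null_tests):
    print(name, basename, args)
```

The point of the extra field is that several differently named tests can share one config script while still being reported under distinct names.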
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: fine tunning SimpleNetwork buffers
Tiago Mück has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41863 ) Change subject: mem-ruby: fine tunning SimpleNetwork buffers .. mem-ruby: fine tunning SimpleNetwork buffers If physical_vnets_channels is set we adjust the link buffer sizes and the max_dequeue_rate in order to achieve the expected maximum throughput assuming a fully pipelined link, i.e., throughput of 1 msg per cycle per channel (assuming the channels width matches the protocol logical message size, otherwise maximum throughput may be smaller). JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: Id99ab745ed54686d8ffcc630d622fb07ac0fc352 Signed-off-by: Tiago Mück --- M src/mem/ruby/network/simple/SimpleLink.py M src/mem/ruby/network/simple/SimpleNetwork.py 2 files changed, 28 insertions(+), 14 deletions(-) diff --git a/src/mem/ruby/network/simple/SimpleLink.py b/src/mem/ruby/network/simple/SimpleLink.py index 3d66375..5f31f4a 100644 --- a/src/mem/ruby/network/simple/SimpleLink.py +++ b/src/mem/ruby/network/simple/SimpleLink.py @@ -61,12 +61,25 @@ if hasattr(self, 'buffers') > 0: fatal("User should not manually set links' \ in_buffers or out_buffers") -# Note that all SimpleNetwork MessageBuffers are currently ordered # The network needs number_of_virtual_networks buffers per # in and out port buffers = [] for i in range(int(network.number_of_virtual_networks)): -buffers.append(MessageBuffer(ordered = True, -buffer_size = network.vnet_buffer_size(i))) +buffers.append(MessageBuffer(ordered = True)) + +# If physical_vnets_channels is set we adjust the buffer sizes and +# the max_dequeue_rate in order to achieve the expected thoughput +# assuming a fully pipelined link, i.e., throughput of 1 msg per cycle +# per channel (assuming the channels width matches the protocol +# logical message size, otherwise maximum thoughput may be smaller). 
+if len(network.physical_vnets_channels) != 0: +assert(len(network.physical_vnets_channels) == \ + int(network.number_of_virtual_networks)) +for i in range(int(network.number_of_virtual_networks)): +buffers[i].buffer_size = \ +network.physical_vnets_channels[i] * (self.latency + 1) +buffers[i].max_dequeue_rate = \ +network.physical_vnets_channels[i] + self.buffers = buffers diff --git a/src/mem/ruby/network/simple/SimpleNetwork.py b/src/mem/ruby/network/simple/SimpleNetwork.py index fbb5c8d..c11deaa 100644 --- a/src/mem/ruby/network/simple/SimpleNetwork.py +++ b/src/mem/ruby/network/simple/SimpleNetwork.py @@ -63,15 +63,6 @@ physical_vnets_bandwidth = VectorParam.Int([], "Assign a bandwidth for each vnet channel") -# Gets the size of the message buffers associated to a vnet -# If physical_vnets_channels is set we just multiply the size of the -# buffers as SimpleNetwork does not actually creates multiple physical -# channels per vnet -def vnet_buffer_size(self, vnet): -if len(self.physical_vnets_channels) == 0: -return self.buffer_size -else: -return self.buffer_size * self.physical_vnets_channels[vnet] def setup_buffers(self): # Setup internal buffers for links and routers @@ -116,6 +107,16 @@ "Routing strategy to be used") def setup_buffers(self, network): +# Gets the size of the message buffers associated with a vnet +# If physical_vnets_channels is set we just multiply the size of the +# buffers as each vnet buffer is shared between physical channels +def vnet_buffer_size(vnet): +if len(network.physical_vnets_channels) == 0: +return network.buffer_size +else: +return network.buffer_size * \ + network.physical_vnets_channels[vnet] + if hasattr(self, 'port_buffers') > 0: fatal("User should not manually set routers' port_buffers") router_buffers = [] @@ -126,7 +127,7 @@ for i in range(int(network.number_of_virtual_networks)): router_buffers.append(MessageBuffer(ordered = True, allow_zero_latency = True, -buffer_size = network.vnet_buffer_size(i))) +buffer_size = 
vnet_buffer_size(i))) # Add message buffers to routers for each external link connection for link in network.ext_links: @@ -135,6 +136,6 @@ for i in range(int(network.number_of_virtual_networks)):
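The link buffer sizing rule described in this change can be worked through with a small standalone sketch. It only restates the arithmetic from the SimpleLink.py hunk above (buffer_size = channels * (latency + 1), max_dequeue_rate = channels); the function name and the plain integer latency are assumptions made for illustration and are not part of the patch.

```python
def link_buffer_settings(channels_per_vnet, link_latency):
    """Return one (buffer_size, max_dequeue_rate) pair per virtual network.

    channels_per_vnet: number of physical channels assigned to each vnet
    link_latency:      link latency in cycles (plain int in this sketch)
    """
    settings = []
    for channels in channels_per_vnet:
        # Enough entries to keep every channel busy for the whole
        # link latency plus the cycle in which the message is dequeued.
        buffer_size = channels * (link_latency + 1)
        # At most one message per channel may leave the buffer per cycle.
        max_dequeue_rate = channels
        settings.append((buffer_size, max_dequeue_rate))
    return settings


# Three vnets with 1, 2 and 1 physical channels over a 2-cycle link.
print(link_buffer_settings([1, 2, 1], 2))
```

With these inputs the vnet that has two channels gets twice the buffering and twice the dequeue rate of the single-channel vnets.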
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: refactored SimpleNetwork routing
Tiago Mück has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41858 ) Change subject: mem-ruby: refactored SimpleNetwork routing .. mem-ruby: refactored SimpleNetwork routing The routing algorithm is encapsulated in a separate SimObject to allow user to implement different routing strategies. The default implementation (WeightBased) maintains the original behavior. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I5c8927f358b8b04b2da55e59679c2f629c7cd2f9 Signed-off-by: Tiago Mück --- M src/mem/ruby/network/simple/PerfectSwitch.cc M src/mem/ruby/network/simple/PerfectSwitch.hh M src/mem/ruby/network/simple/SConscript M src/mem/ruby/network/simple/SimpleNetwork.cc M src/mem/ruby/network/simple/SimpleNetwork.hh M src/mem/ruby/network/simple/SimpleNetwork.py M src/mem/ruby/network/simple/Switch.cc M src/mem/ruby/network/simple/Switch.hh A src/mem/ruby/network/simple/routing/BaseRoutingUnit.hh A src/mem/ruby/network/simple/routing/WeightBased.cc A src/mem/ruby/network/simple/routing/WeightBased.hh 11 files changed, 360 insertions(+), 102 deletions(-) diff --git a/src/mem/ruby/network/simple/PerfectSwitch.cc b/src/mem/ruby/network/simple/PerfectSwitch.cc index 38191b5..de3547d 100644 --- a/src/mem/ruby/network/simple/PerfectSwitch.cc +++ b/src/mem/ruby/network/simple/PerfectSwitch.cc @@ -53,13 +53,6 @@ const int PRIORITY_SWITCH_LIMIT = 128; -// Operator for helper class -bool -operator<(const LinkOrder& l1, const LinkOrder& l2) -{ -return (l1.m_value < l2.m_value); -} - PerfectSwitch::PerfectSwitch(SwitchID sid, Switch *sw, uint32_t virt_nets) : Consumer(sw, Switch::PERFECTSWITCH_EV_PRI), m_switch_id(sid), m_switch(sw) @@ -95,17 +88,15 @@ void PerfectSwitch::addOutPort(const std::vector& out, - const NetDest& routing_table_entry) + const NetDest& routing_table_entry, + const PortDirection _inport) { -// Setup link order -LinkOrder l; -l.m_value = 0; -l.m_link = m_out.size(); -m_link_order.push_back(l); - -// Add to 
routing table +// Add to routing unit +m_switch->getRoutingUnit().addOutPort(m_out.size(), + out, + routing_table_entry, + dst_inport); m_out.push_back(out); -m_routing_table.push_back(routing_table_entry); } PerfectSwitch::~PerfectSwitch() @@ -150,8 +141,7 @@ Message *net_msg_ptr = NULL; // temporary vectors to store the routing results -std::vector output_links; -std::vector output_link_destinations; +std::vector output_links; Tick current_time = m_switch->clockEdge(); while (buffer->isReady(current_time)) { @@ -162,72 +152,16 @@ net_msg_ptr = msg_ptr.get(); DPRINTF(RubyNetwork, "Message: %s\n", (*net_msg_ptr)); + output_links.clear(); -output_link_destinations.clear(); -NetDest msg_dsts = net_msg_ptr->getDestination(); - -// Unfortunately, the token-protocol sends some -// zero-destination messages, so this assert isn't valid -// assert(msg_dsts.count() > 0); - -assert(m_link_order.size() == m_routing_table.size()); -assert(m_link_order.size() == m_out.size()); - -if (m_network_ptr->getAdaptiveRouting()) { -if (m_network_ptr->isVNetOrdered(vnet)) { -// Don't adaptively route -for (int out = 0; out < m_out.size(); out++) { -m_link_order[out].m_link = out; -m_link_order[out].m_value = 0; -} -} else { -// Find how clogged each link is -for (int out = 0; out < m_out.size(); out++) { -int out_queue_length = 0; -for (int v = 0; v < m_virtual_networks; v++) { -out_queue_length += m_out[out][v]->getSize(current_time); -} -int value = -(out_queue_length << 8) | -random_mt.random(0, 0xff); -m_link_order[out].m_link = out; -m_link_order[out].m_value = value; -} - -// Look at the most empty link first -sort(m_link_order.begin(), m_link_order.end()); -} -} - -for (int i = 0; i < m_routing_table.size(); i++) { -// pick the next link to look at -int link = m_link_order[i].m_link; -NetDest dst = m_routing_table[link]; -DPRINTF(RubyNetwork, "dst: %s\n", dst); - -if (!msg_dsts.intersectionIsNotEmpty(dst)) -continue; - -// Remember what link we're using 
-output_links.push_back(link); - -
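The adaptive link ordering that this change removes from PerfectSwitch (and, according to the commit message, preserves in the default WeightBased routing unit) can be sketched as below. This is an illustrative model only: `order_links` is a hypothetical name, Python's random module stands in for gem5's random_mt, and the inclusive 0..0xff range is an assumption about that generator.

```python
import random


def order_links(queue_lengths, rng=random):
    """Order output links from least to most congested.

    queue_lengths[i] is the total number of queued messages on link i,
    summed over all virtual networks. The occupancy goes in the high
    bits of the sort key and a random byte in the low bits, so links
    with equal occupancy are visited in random order.
    """
    keyed = []
    for link, length in enumerate(queue_lengths):
        value = (length << 8) | rng.randint(0, 0xff)
        keyed.append((value, link))
    # Sort on the congestion value only, as the removed operator< did.
    keyed.sort(key=lambda entry: entry[0])
    return [link for _, link in keyed]


# Link 1 is empty, link 2 holds two messages, link 0 holds five.
print(order_links([5, 0, 2]))
```

Because the occupancy is shifted left by eight bits, the random byte can only break ties; it never moves a busier link ahead of an emptier one.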
[gem5-dev] Change in gem5/gem5[develop]: configs,mem-ruby: SimpleNetwork physical channels
Tiago Mück has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41854 ) Change subject: configs,mem-ruby: SimpleNetwork physical channels .. configs,mem-ruby: SimpleNetwork physical channels Setting the physical_vnets_channels parameter enables the emulation of the bandwidth impact of having multiple physical channels for each virtual network. This is implemented by computing bandwidth on a per-vnet/channel basis within Throttle objects. The size of the message buffers is also scaled according to this setting (when buffers are not unlimited). The physical_vnets_bandwidth parameter can be used to override the channel width set for each link and assign different widths for each virtual network. The --simple-physical-channels option can be used with the generic configuration scripts to automatically assign a single physical channel to each virtual network defined in the protocol. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: Ia8c9ec8651405eac8710d3f4d67f637a8054a76b Signed-off-by: Tiago Mück --- M configs/network/Network.py M src/mem/ruby/network/simple/SimpleNetwork.cc M src/mem/ruby/network/simple/SimpleNetwork.hh M src/mem/ruby/network/simple/SimpleNetwork.py M src/mem/ruby/network/simple/Switch.cc M src/mem/ruby/network/simple/Throttle.cc M src/mem/ruby/network/simple/Throttle.hh 7 files changed, 246 insertions(+), 49 deletions(-) diff --git a/configs/network/Network.py b/configs/network/Network.py index 8690912..4a708ea 100644 --- a/configs/network/Network.py +++ b/configs/network/Network.py @@ -72,6 +72,10 @@ parser.add_option("--garnet-deadlock-threshold", action="store", type="int", default=5, help="network-level deadlock threshold.") +parser.add_option("--simple-physical-channels", + action="store_true", default=False, + help="""SimpleNetwork links uses a separate physical + channel for each virtual network""") def create_network(options, ruby): @@ -179,6 +183,9 @@ extLink.int_cred_bridge = int_cred_bridges if
options.network == "simple": +if options.simple_physical_channels: +network.physical_vnets_channels = \ +[1] * int(network.number_of_virtual_networks) network.setup_buffers() if InterfaceClass != None: diff --git a/src/mem/ruby/network/simple/SimpleNetwork.cc b/src/mem/ruby/network/simple/SimpleNetwork.cc index 0f90565..0138060 100644 --- a/src/mem/ruby/network/simple/SimpleNetwork.cc +++ b/src/mem/ruby/network/simple/SimpleNetwork.cc @@ -68,6 +68,23 @@ m_int_link_buffers = p.int_link_buffers; m_num_connected_buffers = 0; + +const std::vector _vnets_channels = +params().physical_vnets_channels; +const std::vector _vnets_bandwidth = +params().physical_vnets_bandwidth; +bool physical_vnets = physical_vnets_channels.size() > 0; +int vnets = params().number_of_virtual_networks; + +fatal_if(physical_vnets && (physical_vnets_channels.size() != vnets), +"physical_vnets_channels must provide channel count for all vnets"); + +fatal_if(!physical_vnets && (physical_vnets_bandwidth.size() != 0), +"physical_vnets_bandwidth also requires physical_vnets_channels"); + +fatal_if((physical_vnets_bandwidth.size() != vnets) && + (physical_vnets_bandwidth.size() != 0), +"physical_vnets_bandwidth must provide BW for all vnets"); } void @@ -94,6 +111,13 @@ SimpleExtLink *simple_link = safe_cast(link); +// some destinations don't use all vnets, but Switch requires the size +// output buffer list to match the number of vnets +while (m_fromNetQueues[local_dest].size() < + params().number_of_virtual_networks) { +m_fromNetQueues[local_dest].push_back(nullptr); +} + m_switches[src]->addOutPort(m_fromNetQueues[local_dest], routing_table_entry[0], simple_link->m_latency, simple_link->m_bw_multiplier); diff --git a/src/mem/ruby/network/simple/SimpleNetwork.hh b/src/mem/ruby/network/simple/SimpleNetwork.hh index 55546a0..a83d7d2 100644 --- a/src/mem/ruby/network/simple/SimpleNetwork.hh +++ b/src/mem/ruby/network/simple/SimpleNetwork.hh @@ -59,6 +59,12 @@ SimpleNetwork(const Params ); 
~SimpleNetwork() = default; +const Params& +params() const +{ +return dynamic_cast(_params); +} + void init(); int getBufferSize() { return m_buffer_size; } diff --git a/src/mem/ruby/network/simple/SimpleNetwork.py b/src/mem/ruby/network/simple/SimpleNetwork.py index b4fd81f..2e45116 100644 --- a/src/mem/ruby/network/simple/SimpleNetwork.py +++ b/src/mem/ruby/network/simple/SimpleNetwork.py @@ -1,3 +1,15 @@ +# Copyright (c) 2021 ARM Limited +# All rights reserved. +# +#
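The three fatal_if checks added to the SimpleNetwork constructor amount to a small consistency rule between the two vector parameters. A rough Python restatement follows, assuming plain lists in place of the SimObject params; `check_physical_vnets` is a hypothetical name and ValueError stands in for gem5's fatal().

```python
def check_physical_vnets(channels, bandwidth, num_vnets):
    """Validate physical_vnets_channels / physical_vnets_bandwidth.

    channels:  per-vnet channel counts, or [] if the feature is off
    bandwidth: per-vnet channel widths, or [] to use the link default
    num_vnets: number_of_virtual_networks
    """
    physical_vnets = len(channels) > 0
    if physical_vnets and len(channels) != num_vnets:
        raise ValueError(
            "physical_vnets_channels must provide channel count for all vnets")
    if not physical_vnets and len(bandwidth) != 0:
        raise ValueError(
            "physical_vnets_bandwidth also requires physical_vnets_channels")
    if len(bandwidth) not in (0, num_vnets):
        raise ValueError(
            "physical_vnets_bandwidth must provide BW for all vnets")


# What --simple-physical-channels sets up: one channel per vnet.
check_physical_vnets([1] * 3, [], 3)
print("configuration accepted")
```

In short, both lists must be either empty or exactly one entry per virtual network, and a bandwidth list is only meaningful when channel counts are also given.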
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: refactor SimpleNetwork buffers
Tiago Mück has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41859 ) Change subject: mem-ruby: refactor SimpleNetwork buffers .. mem-ruby: refactor SimpleNetwork buffers This removes the int_link_buffers param from SimpleNetwork. Internal link buffers are now created as children of SimpleIntLink objects. This results in a cleaner configuration and simplifies some code in SimpleNetwork.cc. setup_buffers is also split between Switch.setup_buffers and SimpleIntLink.setup_buffers for clarity. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I68ad36ec0e682b8d5600c2950bcb56debe186af3 Signed-off-by: Tiago Mück --- M src/mem/ruby/network/simple/SimpleLink.cc M src/mem/ruby/network/simple/SimpleLink.hh M src/mem/ruby/network/simple/SimpleLink.py M src/mem/ruby/network/simple/SimpleNetwork.cc M src/mem/ruby/network/simple/SimpleNetwork.hh M src/mem/ruby/network/simple/SimpleNetwork.py 6 files changed, 103 insertions(+), 52 deletions(-) diff --git a/src/mem/ruby/network/simple/SimpleLink.cc b/src/mem/ruby/network/simple/SimpleLink.cc index 52d5822..12c311d 100644 --- a/src/mem/ruby/network/simple/SimpleLink.cc +++ b/src/mem/ruby/network/simple/SimpleLink.cc @@ -1,4 +1,16 @@ /* + * Copyright (c) 2021 ARM Limited + * All rights reserved. + * + * The license below extends only to copyright in the software and shall + * not be construed as granting a license to any other intellectual + * property including but not limited to intellectual property relating + * to a hardware implementation of the functionality of the software + * licensed hereunder. You may use the software subject to the license + * terms below provided that you ensure that this notice is replicated + * unmodified and in its entirety in all distributions of the software, + * modified or unmodified, in source code or in binary form. + * * Copyright (c) 2011 Advanced Micro Devices, Inc. * All rights reserved. 
* @@ -52,6 +64,8 @@ // endpoint bandwidth multiplier - message size multiplier ratio, // determines the link bandwidth in bytes m_bw_multiplier = p.bandwidth_factor; + +m_buffers = p.buffers; } void diff --git a/src/mem/ruby/network/simple/SimpleLink.hh b/src/mem/ruby/network/simple/SimpleLink.hh index d3050b2..52c21cb 100644 --- a/src/mem/ruby/network/simple/SimpleLink.hh +++ b/src/mem/ruby/network/simple/SimpleLink.hh @@ -1,4 +1,16 @@ /* + * Copyright (c) 2021 ARM Limited + * All rights reserved. + * + * The license below extends only to copyright in the software and shall + * not be construed as granting a license to any other intellectual + * property including but not limited to intellectual property relating + * to a hardware implementation of the functionality of the software + * licensed hereunder. You may use the software subject to the license + * terms below provided that you ensure that this notice is replicated + * unmodified and in its entirety in all distributions of the software, + * modified or unmodified, in source code or in binary form. + * * Copyright (c) 2011 Advanced Micro Devices, Inc. * All rights reserved. * @@ -67,6 +79,7 @@ void print(std::ostream& out) const; int m_bw_multiplier; +std::vector m_buffers; }; inline std::ostream& diff --git a/src/mem/ruby/network/simple/SimpleLink.py b/src/mem/ruby/network/simple/SimpleLink.py index 89d823b..3d66375 100644 --- a/src/mem/ruby/network/simple/SimpleLink.py +++ b/src/mem/ruby/network/simple/SimpleLink.py @@ -1,3 +1,15 @@ +# Copyright (c) 2021 ARM Limited +# All rights reserved. +# +# The license below extends only to copyright in the software and shall +# not be construed as granting a license to any other intellectual +# property including but not limited to intellectual property relating +# to a hardware implementation of the functionality of the software +# licensed hereunder. 
You may use the software subject to the license +# terms below provided that you ensure that this notice is replicated +# unmodified and in its entirety in all distributions of the software, +# modified or unmodified, in source code or in binary form. +# # Copyright (c) 2011 Advanced Micro Devices, Inc. # All rights reserved. # @@ -26,8 +38,10 @@ from m5.params import * from m5.proxy import * +from m5.util import fatal from m5.SimObject import SimObject from m5.objects.BasicLink import BasicIntLink, BasicExtLink +from m5.objects.MessageBuffer import MessageBuffer class SimpleExtLink(BasicExtLink): type = 'SimpleExtLink' @@ -36,3 +50,23 @@ class SimpleIntLink(BasicIntLink): type = 'SimpleIntLink' cxx_header = "mem/ruby/network/simple/SimpleLink.hh" + +# Buffers for this internal link. +# One buffer is allocated per vnet when setup_buffers is called. +# These are created by setup_buffers and the user should not +# set these manually. +buffers =
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: int/ext SimpleNetwork routing latency
Tiago Mück has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41861 ) Change subject: mem-ruby: int/ext SimpleNetwork routing latency .. mem-ruby: int/ext SimpleNetwork routing latency One now may specify separate routing latencies for internal and external links using the router's int_routing_latency and ext_routing_latency, respectively. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I5532668bf23fc61d02b978bfd9479023a6ce2b16 Signed-off-by: Tiago Mück --- M src/mem/ruby/network/simple/PerfectSwitch.cc M src/mem/ruby/network/simple/PerfectSwitch.hh M src/mem/ruby/network/simple/SimpleNetwork.cc M src/mem/ruby/network/simple/SimpleNetwork.py M src/mem/ruby/network/simple/Switch.cc M src/mem/ruby/network/simple/Switch.hh 6 files changed, 26 insertions(+), 8 deletions(-) diff --git a/src/mem/ruby/network/simple/PerfectSwitch.cc b/src/mem/ruby/network/simple/PerfectSwitch.cc index 63203bd..201d091 100644 --- a/src/mem/ruby/network/simple/PerfectSwitch.cc +++ b/src/mem/ruby/network/simple/PerfectSwitch.cc @@ -90,6 +90,7 @@ PerfectSwitch::addOutPort(const std::vector& out, const NetDest& routing_table_entry, const PortDirection _inport, + Tick routing_latency, int link_weight) { // Add to routing unit @@ -99,6 +100,7 @@ dst_inport, link_weight); m_out.push_back(out); +m_out_latencies.push_back(routing_latency); } PerfectSwitch::~PerfectSwitch() @@ -220,7 +222,7 @@ incoming, vnet, outgoing, vnet); m_out[outgoing][vnet]->enqueue(msg_ptr, current_time, - m_switch->latencyTicks()); + m_out_latencies[outgoing]); } } } diff --git a/src/mem/ruby/network/simple/PerfectSwitch.hh b/src/mem/ruby/network/simple/PerfectSwitch.hh index fe32c7f..d4f35e3 100644 --- a/src/mem/ruby/network/simple/PerfectSwitch.hh +++ b/src/mem/ruby/network/simple/PerfectSwitch.hh @@ -75,6 +75,7 @@ void addOutPort(const std::vector& out, const NetDest& routing_table_entry, const PortDirection _inport, +Tick routing_latency, int link_weight); int 
getInLinks() const { return m_in.size(); } @@ -102,6 +103,9 @@ std::vector > m_in; std::vector > m_out; +// latency for routing to each out port +std::vector m_out_latencies; + uint32_t m_virtual_networks; int m_wakeups_wo_switch; diff --git a/src/mem/ruby/network/simple/SimpleNetwork.cc b/src/mem/ruby/network/simple/SimpleNetwork.cc index c5da257..97164f8 100644 --- a/src/mem/ruby/network/simple/SimpleNetwork.cc +++ b/src/mem/ruby/network/simple/SimpleNetwork.cc @@ -117,7 +117,7 @@ m_switches[src]->addOutPort(m_fromNetQueues[local_dest], routing_table_entry[0], simple_link->m_latency, 0, -simple_link->m_bw_multiplier); +simple_link->m_bw_multiplier, true); } // From an endpoint node to a switch @@ -145,6 +145,7 @@ simple_link->m_latency, simple_link->m_weight, simple_link->m_bw_multiplier, +false, dst_inport); // Maitain a global list of buffers (used for functional accesses only) m_int_link_buffers.insert(m_int_link_buffers.end(), diff --git a/src/mem/ruby/network/simple/SimpleNetwork.py b/src/mem/ruby/network/simple/SimpleNetwork.py index 3bea9e2..fbb5c8d 100644 --- a/src/mem/ruby/network/simple/SimpleNetwork.py +++ b/src/mem/ruby/network/simple/SimpleNetwork.py @@ -99,6 +99,11 @@ virt_nets = Param.Int(Parent.number_of_virtual_networks, "number of virtual networks") +int_routing_latency = Param.Cycles(BasicRouter.latency, +"Routing latency to internal links") +ext_routing_latency = Param.Cycles(BasicRouter.latency, +"Routing latency to external links") + # Internal port buffers used between the PerfectSwitch and # Throttle objects. There is one buffer per virtual network # and per output port. diff --git a/src/mem/ruby/network/simple/Switch.cc b/src/mem/ruby/network/simple/Switch.cc index 224f8bc..2b7ac2f 100644 --- a/src/mem/ruby/network/simple/Switch.cc +++ b/src/mem/ruby/network/simple/Switch.cc @@ -52,7 +52,9 @@ Switch::Switch(const Params ) : BasicRouter(p), -perfectSwitch(m_id, this, p.virt_nets), m_latency(p.latency), +perfectSwitch(m_id,
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: fix SimpleNetwork WeightBased routing
Tiago Mück has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41860 ) Change subject: mem-ruby: fix SimpleNetwork WeightBased routing .. mem-ruby: fix SimpleNetwork WeightBased routing Individual link weights are propagated to the routing algorithms and WeightBased routing now uses this information to select the output link when multiple routing options exist. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I86a4deb610a1b94abf745e9ef249961fb52e9800 Signed-off-by: Tiago Mück --- M src/mem/ruby/network/simple/PerfectSwitch.cc M src/mem/ruby/network/simple/PerfectSwitch.hh M src/mem/ruby/network/simple/SimpleNetwork.cc M src/mem/ruby/network/simple/Switch.cc M src/mem/ruby/network/simple/Switch.hh M src/mem/ruby/network/simple/routing/BaseRoutingUnit.hh M src/mem/ruby/network/simple/routing/WeightBased.cc M src/mem/ruby/network/simple/routing/WeightBased.hh 8 files changed, 38 insertions(+), 17 deletions(-) diff --git a/src/mem/ruby/network/simple/PerfectSwitch.cc b/src/mem/ruby/network/simple/PerfectSwitch.cc index de3547d..63203bd 100644 --- a/src/mem/ruby/network/simple/PerfectSwitch.cc +++ b/src/mem/ruby/network/simple/PerfectSwitch.cc @@ -89,13 +89,15 @@ void PerfectSwitch::addOutPort(const std::vector& out, const NetDest& routing_table_entry, - const PortDirection _inport) + const PortDirection _inport, + int link_weight) { // Add to routing unit m_switch->getRoutingUnit().addOutPort(m_out.size(), out, routing_table_entry, - dst_inport); + dst_inport, + link_weight); m_out.push_back(out); } diff --git a/src/mem/ruby/network/simple/PerfectSwitch.hh b/src/mem/ruby/network/simple/PerfectSwitch.hh index 9b67527..fe32c7f 100644 --- a/src/mem/ruby/network/simple/PerfectSwitch.hh +++ b/src/mem/ruby/network/simple/PerfectSwitch.hh @@ -74,7 +74,8 @@ void addInPort(const std::vector& in); void addOutPort(const std::vector& out, const NetDest& routing_table_entry, -const PortDirection _inport); +const PortDirection 
_inport, +int link_weight); int getInLinks() const { return m_in.size(); } int getOutLinks() const { return m_out.size(); } diff --git a/src/mem/ruby/network/simple/SimpleNetwork.cc b/src/mem/ruby/network/simple/SimpleNetwork.cc index aa618d0..c5da257 100644 --- a/src/mem/ruby/network/simple/SimpleNetwork.cc +++ b/src/mem/ruby/network/simple/SimpleNetwork.cc @@ -115,7 +115,8 @@ } m_switches[src]->addOutPort(m_fromNetQueues[local_dest], -routing_table_entry[0], simple_link->m_latency, +routing_table_entry[0], +simple_link->m_latency, 0, simple_link->m_bw_multiplier); } @@ -142,6 +143,7 @@ m_switches[dest]->addInPort(simple_link->m_buffers); m_switches[src]->addOutPort(simple_link->m_buffers, routing_table_entry[0], simple_link->m_latency, +simple_link->m_weight, simple_link->m_bw_multiplier, dst_inport); // Maitain a global list of buffers (used for functional accesses only) diff --git a/src/mem/ruby/network/simple/Switch.cc b/src/mem/ruby/network/simple/Switch.cc index 0f0849c..224f8bc 100644 --- a/src/mem/ruby/network/simple/Switch.cc +++ b/src/mem/ruby/network/simple/Switch.cc @@ -79,7 +79,8 @@ void Switch::addOutPort(const std::vector& out, const NetDest& routing_table_entry, - Cycles link_latency, int bw_multiplier, + Cycles link_latency, int link_weight, + int bw_multiplier, PortDirection dst_inport) { const std::vector _vnets_channels = @@ -117,7 +118,7 @@ // Hook the queues to the PerfectSwitch perfectSwitch.addOutPort(intermediateBuffers, routing_table_entry, -dst_inport); + dst_inport, link_weight); // Hook the queues to the Throttle throttles.back().addLinks(intermediateBuffers, out); diff --git a/src/mem/ruby/network/simple/Switch.hh b/src/mem/ruby/network/simple/Switch.hh index 2b2ba5f..50a9eac 100644 --- a/src/mem/ruby/network/simple/Switch.hh +++ b/src/mem/ruby/network/simple/Switch.hh @@ -86,7 +86,7 @@ void addInPort(const std::vector& in); void addOutPort(const std::vector& out, const NetDest& routing_table_entry, -
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: fixed SimpleNetwork starvation
Tiago Mück has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41857 ) Change subject: mem-ruby: fixed SimpleNetwork starvation .. mem-ruby: fixed SimpleNetwork starvation The round-robin scheduling seed is shared across all ports and vnets in the router, and it's possible that, under certain heavy traffic scenarios, the same port will always fill the input buffers before any other port is checked. This patch removes the round-robin scheduling. The port to be checked first is always the one with the oldest message. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I918694d46faa0abd00ce9180bc98c58a9b5af0b5 Signed-off-by: Tiago Mück --- M src/mem/ruby/network/MessageBuffer.cc M src/mem/ruby/network/MessageBuffer.hh M src/mem/ruby/network/simple/PerfectSwitch.cc M src/mem/ruby/network/simple/PerfectSwitch.hh 4 files changed, 51 insertions(+), 25 deletions(-) diff --git a/src/mem/ruby/network/MessageBuffer.cc b/src/mem/ruby/network/MessageBuffer.cc index d67b678..2ecefad 100644 --- a/src/mem/ruby/network/MessageBuffer.cc +++ b/src/mem/ruby/network/MessageBuffer.cc @@ -485,6 +485,15 @@ (m_prio_heap.front()->getLastEnqueueTime() <= current_time)); } +Tick +MessageBuffer::readyTime() const +{ +if (m_prio_heap.empty()) +return MaxTick; +else +return m_prio_heap.front()->getLastEnqueueTime(); +} + uint32_t MessageBuffer::functionalAccess(Packet *pkt, bool is_read, WriteMask *mask) { diff --git a/src/mem/ruby/network/MessageBuffer.hh b/src/mem/ruby/network/MessageBuffer.hh index 4d70f30..16a520b 100644 --- a/src/mem/ruby/network/MessageBuffer.hh +++ b/src/mem/ruby/network/MessageBuffer.hh @@ -80,6 +80,9 @@ // TRUE if head of queue timestamp <= SystemTime bool isReady(Tick current_time) const; +// earliest tick the head of queue will be ready, or MaxTick if empty +Tick readyTime() const; + void delayHead(Tick current_time, Tick delta) { diff --git a/src/mem/ruby/network/simple/PerfectSwitch.cc
b/src/mem/ruby/network/simple/PerfectSwitch.cc index a821344..38191b5 100644 --- a/src/mem/ruby/network/simple/PerfectSwitch.cc +++ b/src/mem/ruby/network/simple/PerfectSwitch.cc @@ -1,5 +1,5 @@ /* - * Copyright (c) 2020 ARM Limited + * Copyright (c) 2020-2021 ARM Limited * All rights reserved. * * The license below extends only to copyright in the software and shall @@ -64,7 +64,6 @@ : Consumer(sw, Switch::PERFECTSWITCH_EV_PRI), m_switch_id(sid), m_switch(sw) { -m_round_robin_start = 0; m_wakeups_wo_switch = 0; m_virtual_networks = virt_nets; } @@ -116,32 +115,28 @@ void PerfectSwitch::operateVnet(int vnet) { -// This is for round-robin scheduling -int incoming = m_round_robin_start; -m_round_robin_start++; -if (m_round_robin_start >= m_in.size()) { -m_round_robin_start = 0; -} - if (m_pending_message_count[vnet] > 0) { -// for all input ports, use round robin scheduling -for (int counter = 0; counter < m_in.size(); counter++) { -// Round robin scheduling -incoming++; -if (incoming >= m_in.size()) { -incoming = 0; -} - -// Is there a message waiting? -if (m_in[incoming].size() <= vnet) { +// first check the port with the oldest message +unsigned incoming = 0; +Tick lowest_tick = MaxTick; +for (int counter = 0; counter < m_in.size(); ++counter) { +MessageBuffer *buffer = inBuffer(counter, vnet); +if (buffer == nullptr) continue; +if (buffer->readyTime() < lowest_tick){ +lowest_tick = buffer->readyTime(); +incoming = counter; } - -MessageBuffer *buffer = m_in[incoming][vnet]; -if (buffer == nullptr) { +} +DPRINTF(RubyNetwork, "vnet %d: %d pending msgs. 
" + "Checking port %d first\n", +vnet, m_pending_message_count[vnet], incoming); +// check all ports starting with the one with the oldest message +for (int counter = 0; counter < m_in.size(); + ++counter, incoming = (incoming + 1) % m_in.size()) { +MessageBuffer *buffer = inBuffer(incoming, vnet); +if (buffer == nullptr) continue; -} - operateMessageBuffer(buffer, incoming, vnet); } } diff --git a/src/mem/ruby/network/simple/PerfectSwitch.hh b/src/mem/ruby/network/simple/PerfectSwitch.hh index 12d5e46..7f6e36f 100644 --- a/src/mem/ruby/network/simple/PerfectSwitch.hh +++ b/src/mem/ruby/network/simple/PerfectSwitch.hh @@ -1,4 +1,16 @@ /* + * Copyright (c) 2021 ARM Limited + * All rights reserved. + * + * The license below extends only to copyright in the software and shall + *
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: Optionally set Consumer ev. priority
Tiago Mück has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41855 ) Change subject: mem-ruby: Optionally set Consumer ev. priority .. mem-ruby: Optionally set Consumer ev. priority JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I62dc6656bbed4e7f4d575a6a82ac254382294ed1 Signed-off-by: Tiago Mück --- M src/mem/ruby/common/Consumer.cc M src/mem/ruby/common/Consumer.hh 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/src/mem/ruby/common/Consumer.cc b/src/mem/ruby/common/Consumer.cc index fcaa132..bf11756 100644 --- a/src/mem/ruby/common/Consumer.cc +++ b/src/mem/ruby/common/Consumer.cc @@ -1,5 +1,5 @@ /* - * Copyright (c) 2020 ARM Limited + * Copyright (c) 2020-2021 ARM Limited * All rights reserved. * * The license below extends only to copyright in the software and shall @@ -40,9 +40,9 @@ #include "mem/ruby/common/Consumer.hh" -Consumer::Consumer(ClockedObject *_em) +Consumer::Consumer(ClockedObject *_em, Event::Priority ev_prio) : m_wakeup_event([this]{ processCurrentEvent(); }, -"Consumer Event", false), +"Consumer Event", false, ev_prio), em(_em) { } diff --git a/src/mem/ruby/common/Consumer.hh b/src/mem/ruby/common/Consumer.hh index 2c7065b..2d3c358 100644 --- a/src/mem/ruby/common/Consumer.hh +++ b/src/mem/ruby/common/Consumer.hh @@ -1,5 +1,5 @@ /* - * Copyright (c) 2020 ARM Limited + * Copyright (c) 2020-2021 ARM Limited * All rights reserved. 
* * The license below extends only to copyright in the software and shall @@ -55,7 +55,8 @@ class Consumer { public: -Consumer(ClockedObject *_em); +Consumer(ClockedObject *em, + Event::Priority ev_prio = Event::Default_Pri); virtual ~Consumer() -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/41855 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I62dc6656bbed4e7f4d575a6a82ac254382294ed1 Gerrit-Change-Number: 41855 Gerrit-PatchSet: 1 Gerrit-Owner: Tiago Mück Gerrit-MessageType: newchange ___ gem5-dev mailing list -- gem5-dev@gem5.org To unsubscribe send an email to gem5-dev-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: dequeue rate limit for message buffers
Tiago Mück has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41862 ) Change subject: mem-ruby: dequeue rate limit for message buffers .. mem-ruby: dequeue rate limit for message buffers The 'max_dequeue_rate' parameter limits the number of messages that can be dequeued in a single cycle. When set, 'isReady' returns false once max_dequeue_rate messages have been dequeued in the current cycle. This can be used to fine tune the performance of cache controllers. For the record, other ways of achieving a similar effect could be: 1) Modifying the SLICC compiler to limit message consumption in the generated wakeup() function 2) Setting the buffer size to max_dequeue_rate. This can potentially cut the expected throughput in half. For instance, if a producer can enqueue every cycle and a consumer can dequeue every cycle, a message can only actually be enqueued every two cycles (assuming buffer_size=1), since the buffer entries freed by a dequeue only become visible in the next cycle (even if the consumer executes before the producer). 
JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I3a446c7276b80a0e3f409b4fbab0ab65ff5c1f81 Signed-off-by: Tiago Mück --- M src/mem/ruby/network/MessageBuffer.cc M src/mem/ruby/network/MessageBuffer.hh M src/mem/ruby/network/MessageBuffer.py 3 files changed, 30 insertions(+), 5 deletions(-) diff --git a/src/mem/ruby/network/MessageBuffer.cc b/src/mem/ruby/network/MessageBuffer.cc index 2ecefad..5c8bf59 100644 --- a/src/mem/ruby/network/MessageBuffer.cc +++ b/src/mem/ruby/network/MessageBuffer.cc @@ -52,8 +52,9 @@ using m5::stl_helpers::operator<<; MessageBuffer::MessageBuffer(const Params &p) -: SimObject(p), m_stall_map_size(0), -m_max_size(p.buffer_size), m_time_last_time_size_checked(0), +: SimObject(p), m_stall_map_size(0), m_max_size(p.buffer_size), +m_max_dequeue_rate(p.max_dequeue_rate), m_dequeues_this_cy(0), +m_time_last_time_size_checked(0), m_time_last_time_enqueue(0), m_time_last_time_pop(0), m_last_arrival_time(0), m_strict_fifo(p.ordered), m_randomization(p.randomization), @@ -290,7 +291,11 @@ m_size_at_cycle_start = m_prio_heap.size(); m_stalled_at_cycle_start = m_stall_map_size; m_time_last_time_pop = current_time; +m_dequeues_this_cy = 0; } +++m_dequeues_this_cy; +assert((m_max_dequeue_rate == 0) || + (m_dequeues_this_cy <= m_max_dequeue_rate)); pop_heap(m_prio_heap.begin(), m_prio_heap.end(), std::greater<MsgPtr>()); m_prio_heap.pop_back(); @@ -481,8 +486,17 @@ bool MessageBuffer::isReady(Tick current_time) const { -return ((m_prio_heap.size() > 0) && -(m_prio_heap.front()->getLastEnqueueTime() <= current_time)); +assert(m_time_last_time_pop <= current_time); +bool can_dequeue = (m_max_dequeue_rate == 0) || + (m_time_last_time_pop < current_time) || + (m_dequeues_this_cy < m_max_dequeue_rate); +bool is_ready = (m_prio_heap.size() > 0) && + (m_prio_heap.front()->getLastEnqueueTime() <= current_time); +if (!can_dequeue && is_ready) { +// Make sure the Consumer executes next cycle to dequeue the ready msg +m_consumer->scheduleEvent(Cycles(1)); +} 
+return can_dequeue && is_ready; } Tick diff --git a/src/mem/ruby/network/MessageBuffer.hh b/src/mem/ruby/network/MessageBuffer.hh index 16a520b..d940dcb 100644 --- a/src/mem/ruby/network/MessageBuffer.hh +++ b/src/mem/ruby/network/MessageBuffer.hh @@ -237,6 +237,14 @@ */ const unsigned int m_max_size; +/** + * When != 0, isReady returns false once m_max_dequeue_rate + * messages have been dequeued in the same cycle. + */ +const unsigned int m_max_dequeue_rate; + +unsigned int m_dequeues_this_cy; + Tick m_time_last_time_size_checked; unsigned int m_size_last_time_size_checked; diff --git a/src/mem/ruby/network/MessageBuffer.py b/src/mem/ruby/network/MessageBuffer.py index 807ffb4..cb7f02d 100644 --- a/src/mem/ruby/network/MessageBuffer.py +++ b/src/mem/ruby/network/MessageBuffer.py @@ -1,4 +1,4 @@ -# Copyright (c) 2020 ARM Limited +# Copyright (c) 2020-2021 ARM Limited # All rights reserved. # # The license below extends only to copyright in the software and shall @@ -66,3 +66,6 @@ master = DeprecatedParam(out_port, '`master` is now called `out_port`') in_port = ResponsePort("Response port from MessageBuffer sender") slave = DeprecatedParam(in_port, '`slave` is now called `in_port`') +max_dequeue_rate = Param.Unsigned(0, "Maximum number of messages that can \ + be dequeued per cycle \ +(0 allows dequeueing all ready messages)") -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/41862 To unsubscribe, or for help
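For illustration, the per-cycle limit described in the commit message above can be sketched outside gem5 roughly as follows. This is a minimal sketch under stated assumptions, not the MessageBuffer code: RateLimitedBuffer, its int payload, and the one-tick-per-cycle Tick type are hypothetical stand-ins, and the wakeup scheduling done by the real isReady() is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for gem5's Tick; one "cycle" is one tick here.
using Tick = std::uint64_t;

// Simplified model of the dequeue rate limit; not the gem5 MessageBuffer.
class RateLimitedBuffer
{
  public:
    // max_dequeue_rate == 0 disables the limit, as in the patch.
    explicit RateLimitedBuffer(unsigned max_dequeue_rate)
        : maxDequeueRate(max_dequeue_rate) {}

    void enqueue(int msg) { msgs.push_back(msg); }

    // True if a message is queued and the per-cycle budget is not used up.
    bool
    isReady(Tick now) const
    {
        const bool can_dequeue = (maxDequeueRate == 0) ||
                                 (lastPop < now) ||
                                 (dequeuesThisCycle < maxDequeueRate);
        return can_dequeue && !msgs.empty();
    }

    int
    dequeue(Tick now)
    {
        assert(isReady(now));
        if (lastPop < now) {
            // First pop of a new cycle: restart the per-cycle counter.
            lastPop = now;
            dequeuesThisCycle = 0;
        }
        ++dequeuesThisCycle;
        const int msg = msgs.front();
        msgs.pop_front();
        return msg;
    }

  private:
    const unsigned maxDequeueRate;
    unsigned dequeuesThisCycle = 0;
    Tick lastPop = 0;
    std::deque<int> msgs;
};
```

With a limit of 2, a third message queued in the same cycle stays in the buffer until the next cycle, which is the behaviour the commit message describes for isReady.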
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: fix MI_example functional read
Tiago Mück has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41853 ) Change subject: mem-ruby: fix MI_example functional read .. mem-ruby: fix MI_example functional read Change AccessPermission to Read_Write for transient states that wait on memory when transitioning to or from Invalid. In all cases the memory will have the latest data, so this also modifies functionalRead to always send the access to memory. Change-Id: I99f557539b4f9d0d2f99558752b7ddb7e85ab3c6 Signed-off-by: Tiago Mück --- M src/mem/ruby/protocol/MI_example-dir.sm 1 file changed, 25 insertions(+), 12 deletions(-) diff --git a/src/mem/ruby/protocol/MI_example-dir.sm b/src/mem/ruby/protocol/MI_example-dir.sm index ed315e8..11d2862 100644 --- a/src/mem/ruby/protocol/MI_example-dir.sm +++ b/src/mem/ruby/protocol/MI_example-dir.sm @@ -1,4 +1,16 @@ /* + * Copyright (c) 2021 ARM Limited + * All rights reserved. + * + * The license below extends only to copyright in the software and shall + * not be construed as granting a license to any other intellectual + * property including but not limited to intellectual property relating + * to a hardware implementation of the functionality of the software + * licensed hereunder. You may use the software subject to the license + * terms below provided that you ensure that this notice is replicated + * unmodified and in its entirety in all distributions of the software, + * modified or unmodified, in source code or in binary form. + * * Copyright (c) 2009-2012 Mark D. Hill and David A. Wood * Copyright (c) 2010-2012 Advanced Micro Devices, Inc. * All rights reserved. 
@@ -56,13 +68,17 @@ M_DRD, AccessPermission:Busy, desc="Blocked on an invalidation for a DMA read"; M_DWR, AccessPermission:Busy, desc="Blocked on an invalidation for a DMA write"; -M_DWRI, AccessPermission:Busy, desc="Intermediate state M_DWR-->I"; -M_DRDI, AccessPermission:Busy, desc="Intermediate state M_DRD-->I"; +M_DWRI, AccessPermission:Read_Write, desc="Intermediate state M_DWR-->I"; +M_DRDI, AccessPermission:Read_Write, desc="Intermediate state M_DRD-->I"; -IM, AccessPermission:Busy, desc="Intermediate state I-->M"; -MI, AccessPermission:Busy, desc="Intermediate state M-->I"; -ID, AccessPermission:Busy, desc="Intermediate state for DMA_READ when in I"; -ID_W, AccessPermission:Busy, desc="Intermediate state for DMA_WRITE when in I"; +IM, AccessPermission:Read_Write, desc="Intermediate state I-->M"; +MI, AccessPermission:Read_Write, desc="Intermediate state M-->I"; +ID, AccessPermission:Read_Write, desc="Intermediate state for DMA_READ when in I"; +ID_W, AccessPermission:Read_Write, desc="Intermediate state for DMA_WRITE when in I"; + +// Note: busy states when we wait for memory in transitions from or to 'I' +// have AccessPermission:Read_Write so this controller can get the latest +// data from memory during a functionalRead } // Events @@ -180,12 +196,9 @@ } void functionalRead(Addr addr, Packet *pkt) { -TBE tbe := TBEs[addr]; -if(is_valid(tbe)) { - testAndRead(addr, tbe.DataBlk, pkt); -} else { - functionalMemoryRead(pkt); -} +// if this is called; state is always either invalid or data was just been WB +// to memory (and we are waiting for an ack), so go directly to memory +functionalMemoryRead(pkt); } int functionalWrite(Addr addr, Packet *pkt) { -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/41853 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I99f557539b4f9d0d2f99558752b7ddb7e85ab3c6 Gerrit-Change-Number: 
41853 Gerrit-PatchSet: 1 Gerrit-Owner: Tiago Mück Gerrit-MessageType: newchange
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: SimpleNetwork router latencies
Tiago Mück has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41856 ) Change subject: mem-ruby: SimpleNetwork router latencies .. mem-ruby: SimpleNetwork router latencies SimpleNetwork takes into account the network router latency parameter. The latency may be set to zero. PerfectSwitch and Throttle events were assigned different priorities to ensure they always execute in the same order for zero-latency forwarding. JIRA: https://gem5.atlassian.net/browse/GEM5-920 Change-Id: I6cae6a0fc22b25078c27a1e2f71744c08efd7753 Signed-off-by: Tiago Mück --- M src/mem/ruby/network/simple/PerfectSwitch.cc M src/mem/ruby/network/simple/SimpleNetwork.py M src/mem/ruby/network/simple/Switch.cc M src/mem/ruby/network/simple/Switch.hh M src/mem/ruby/network/simple/Throttle.cc 5 files changed, 32 insertions(+), 5 deletions(-) diff --git a/src/mem/ruby/network/simple/PerfectSwitch.cc b/src/mem/ruby/network/simple/PerfectSwitch.cc index b90fd73..a821344 100644 --- a/src/mem/ruby/network/simple/PerfectSwitch.cc +++ b/src/mem/ruby/network/simple/PerfectSwitch.cc @@ -1,4 +1,16 @@ /* + * Copyright (c) 2020 ARM Limited + * All rights reserved. + * + * The license below extends only to copyright in the software and shall + * not be construed as granting a license to any other intellectual + * property including but not limited to intellectual property relating + * to a hardware implementation of the functionality of the software + * licensed hereunder. You may use the software subject to the license + * terms below provided that you ensure that this notice is replicated + * unmodified and in its entirety in all distributions of the software, + * modified or unmodified, in source code or in binary form. + * * Copyright (c) 1999-2008 Mark D. Hill and David A. Wood * All rights reserved. 
* @@ -49,7 +61,8 @@ } PerfectSwitch::PerfectSwitch(SwitchID sid, Switch *sw, uint32_t virt_nets) -: Consumer(sw), m_switch_id(sid), m_switch(sw) +: Consumer(sw, Switch::PERFECTSWITCH_EV_PRI), + m_switch_id(sid), m_switch(sw) { m_round_robin_start = 0; m_wakeups_wo_switch = 0; @@ -275,7 +288,7 @@ incoming, vnet, outgoing, vnet); m_out[outgoing][vnet]->enqueue(msg_ptr, current_time, - m_switch->cyclesToTicks(Cycles(1))); + m_switch->latencyTicks()); } } } diff --git a/src/mem/ruby/network/simple/SimpleNetwork.py b/src/mem/ruby/network/simple/SimpleNetwork.py index 2e45116..fa01ca2 100644 --- a/src/mem/ruby/network/simple/SimpleNetwork.py +++ b/src/mem/ruby/network/simple/SimpleNetwork.py @@ -95,6 +95,7 @@ if link.dst_node == router: for i in range(int(self.number_of_virtual_networks)): router_buffers.append(MessageBuffer(ordered = True, + allow_zero_latency = True, buffer_size = self.vnet_buffer_size(i))) # Add message buffers to routers for each external link connection @@ -103,6 +104,7 @@ if link.int_node in self.routers: for i in range(int(self.number_of_virtual_networks)): router_buffers.append(MessageBuffer(ordered = True, + allow_zero_latency = True, buffer_size = self.vnet_buffer_size(i))) router.port_buffers = router_buffers diff --git a/src/mem/ruby/network/simple/Switch.cc b/src/mem/ruby/network/simple/Switch.cc index 501b619..fe04ed0 100644 --- a/src/mem/ruby/network/simple/Switch.cc +++ b/src/mem/ruby/network/simple/Switch.cc @@ -52,8 +52,8 @@ Switch::Switch(const Params &p) : BasicRouter(p), -perfectSwitch(m_id, this, p.virt_nets), m_num_connected_buffers(0), -switchStats(this) +perfectSwitch(m_id, this, p.virt_nets), m_latency(p.latency), +m_num_connected_buffers(0), switchStats(this) { m_port_buffers.reserve(p.port_buffers.size()); for (auto& buffer : p.port_buffers) { diff --git a/src/mem/ruby/network/simple/Switch.hh b/src/mem/ruby/network/simple/Switch.hh index 271d090..180696d 100644 --- a/src/mem/ruby/network/simple/Switch.hh +++ 
b/src/mem/ruby/network/simple/Switch.hh @@ -71,6 +71,12 @@ class Switch : public BasicRouter { public: + +// Makes sure throttle sends messages to the links after the switch is +// done forwarding the messages in the same cycle +static const Event::Priority PERFECTSWITCH_EV_PRI = Event::Default_Pri; +static const Event::Priority THROTTLE_EV_PRI = Event::Default_Pri + 1; + typedef SwitchParams Params; Switch(const Params &p); ~Switch() = default; @@ -94,6 +100,10 @@ bool functionalRead(Packet *, WriteMask&); uint32_t
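To illustrate why the two priorities in the patch above give a fixed order for zero-latency forwarding, here is a toy event queue, written as a sketch only. It assumes that, for events scheduled at the same tick, the one with the lower numeric priority runs first, which is what the comment in the patch implies. ToyEvent, ToyEventQueue and the three constants are hypothetical and are not part of gem5's EventQueue API.

```cpp
#include <cstdint>
#include <queue>
#include <string>
#include <tuple>
#include <vector>

using Tick = std::uint64_t;
using Priority = int;

// Hypothetical values mirroring the relationship between the constants
// in the patch: the throttle priority is one above the switch priority.
constexpr Priority DefaultPri = 0;
constexpr Priority PerfectSwitchPri = DefaultPri;
constexpr Priority ThrottlePri = DefaultPri + 1;

struct ToyEvent
{
    Tick when;
    Priority pri;
    std::uint64_t seq;   // insertion order, used as the final tie-breaker
    std::string name;
};

// Orders the heap so that the earliest (when, pri, seq) triple is on top.
struct Later
{
    bool
    operator()(const ToyEvent &a, const ToyEvent &b) const
    {
        return std::tie(a.when, a.pri, a.seq) >
               std::tie(b.when, b.pri, b.seq);
    }
};

class ToyEventQueue
{
  public:
    void
    schedule(Tick when, Priority pri, const std::string &name)
    {
        events.push(ToyEvent{when, pri, nextSeq++, name});
    }

    // Pops every event and returns the names in execution order.
    std::vector<std::string>
    drain()
    {
        std::vector<std::string> order;
        while (!events.empty()) {
            order.push_back(events.top().name);
            events.pop();
        }
        return order;
    }

  private:
    std::priority_queue<ToyEvent, std::vector<ToyEvent>, Later> events;
    std::uint64_t nextSeq = 0;
};
```

Even if the throttle event is scheduled before the switch event for the same tick, the switch event is drained first, so a message forwarded with zero latency is already in the output buffer when the throttle runs.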
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: removed Message copy constructors
Tiago Mück has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/41813 ) Change subject: mem-ruby: removed Message copy constructors .. mem-ruby: removed Message copy constructors Prevents error with deprecated implicitly-declared operator= when Message assignment operator is used. The copy constructor in the Message class and the ones generated from SLICC are not doing anything special so use the compiler-generated ones instead. Change-Id: I0edec4a44cbb7858f07ed2f2f189455994055c33 Signed-off-by: Tiago Mück Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41813 Reviewed-by: Jason Lowe-Power Maintainer: Jason Lowe-Power Tested-by: kokoro --- M src/mem/ruby/slicc_interface/Message.hh M src/mem/slicc/symbols/Type.py 2 files changed, 3 insertions(+), 24 deletions(-) Approvals: Jason Lowe-Power: Looks good to me, approved; Looks good to me, approved kokoro: Regressions pass diff --git a/src/mem/ruby/slicc_interface/Message.hh b/src/mem/ruby/slicc_interface/Message.hh index d7acd2c..b8449e7 100644 --- a/src/mem/ruby/slicc_interface/Message.hh +++ b/src/mem/ruby/slicc_interface/Message.hh @@ -62,12 +62,7 @@ m_DelayedTicks(0), m_msg_counter(0) { } -Message(const Message &other) -: m_time(other.m_time), - m_LastEnqueueTime(other.m_LastEnqueueTime), - m_DelayedTicks(other.m_DelayedTicks), - m_msg_counter(other.m_msg_counter) -{ } +Message(const Message &other) = default; virtual ~Message() { } diff --git a/src/mem/slicc/symbols/Type.py b/src/mem/slicc/symbols/Type.py index 4e064b5..85a3c41 100644 --- a/src/mem/slicc/symbols/Type.py +++ b/src/mem/slicc/symbols/Type.py @@ -1,4 +1,4 @@ -# Copyright (c) 2020 ARM Limited +# Copyright (c) 2020-2021 ARM Limited # All rights reserved. 
# # The license below extends only to copyright in the software and shall @@ -259,23 +259,7 @@ code('}') # Copy constructor -if not self.isGlobal: -code('${{self.c_ident}}(const ${{self.c_ident}}&other)') - -# Call superclass constructor -if "interface" in self: -code(': ${{self["interface"]}}(other)') - -code('{') -code.indent() - -for dm in self.data_members.values(): -code('m_${{dm.ident}} = other.m_${{dm.ident}};') - -code.dedent() -code('}') -else: -code('${{self.c_ident}}(const ${{self.c_ident}}&) = default;') +code('${{self.c_ident}}(const ${{self.c_ident}}&) = default;') # Assignment operator -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/41813 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I0edec4a44cbb7858f07ed2f2f189455994055c33 Gerrit-Change-Number: 41813 Gerrit-PatchSet: 2 Gerrit-Owner: Tiago Mück Gerrit-Reviewer: Jason Lowe-Power Gerrit-Reviewer: Jason Lowe-Power Gerrit-Reviewer: Tiago Mück Gerrit-Reviewer: kokoro Gerrit-CC: Giacomo Travaglini Gerrit-MessageType: merged
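As a standalone illustration of the change above, the sketch below shows a class whose copy constructor is defaulted on its first declaration, so that it is user-declared but not user-provided. As far as I know, GCC's -Wdeprecated-copy only fires when the copy constructor or copy assignment operator is user-provided, which is why replacing a hand-written member-wise copy with "= default" removes the diagnostic while keeping the same behaviour. ToyMessage is a hypothetical stand-in with two of the fields named in the diff; it is not the gem5 Message class.

```cpp
#include <cstdint>

// Hypothetical stand-in for the Message base class touched by the patch.
class ToyMessage
{
  public:
    ToyMessage(std::uint64_t time, std::uint64_t counter)
        : m_time(time), m_msg_counter(counter) {}

    // Defaulted on first declaration: member-wise copy, not user-provided,
    // so the implicitly-declared operator= below is not flagged.
    ToyMessage(const ToyMessage &other) = default;

    virtual ~ToyMessage() {}

    std::uint64_t getTime() const { return m_time; }
    std::uint64_t getCounter() const { return m_msg_counter; }

  private:
    std::uint64_t m_time;
    std::uint64_t m_msg_counter;
};
```

Both copy construction and assignment copy every member here. Explicitly defaulting operator= as well would be another option, but the patch only touches the copy constructors.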
[gem5-dev] Change in gem5/gem5[develop]: mem-ruby: RubyRequest getter for request ptr
Tiago Mück has submitted this change. ( https://gem5-review.googlesource.com/c/public/gem5/+/41814 ) Change subject: mem-ruby: RubyRequest getter for request ptr .. mem-ruby: RubyRequest getter for request ptr Change-Id: Ib3d12c9030d18d96388dd66f0a409b42543ee9a8 Signed-off-by: Tiago Mück Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/41814 Reviewed-by: Jason Lowe-Power Maintainer: Jason Lowe-Power Tested-by: kokoro --- M src/mem/ruby/protocol/RubySlicc_Exports.sm M src/mem/ruby/protocol/RubySlicc_Types.sm M src/mem/ruby/slicc_interface/RubyRequest.hh 3 files changed, 5 insertions(+), 1 deletion(-) Approvals: Jason Lowe-Power: Looks good to me, approved; Looks good to me, approved kokoro: Regressions pass diff --git a/src/mem/ruby/protocol/RubySlicc_Exports.sm b/src/mem/ruby/protocol/RubySlicc_Exports.sm index 0eb10a7..c2f2c9d 100644 --- a/src/mem/ruby/protocol/RubySlicc_Exports.sm +++ b/src/mem/ruby/protocol/RubySlicc_Exports.sm @@ -50,6 +50,7 @@ external_type(Addr, primitive="yes"); external_type(Cycles, primitive="yes", default="Cycles(0)"); external_type(Tick, primitive="yes", default="0"); +external_type(RequestPtr, primitive="yes", default="nullptr"); structure(WriteMask, external="yes", desc="...") { void clear(); diff --git a/src/mem/ruby/protocol/RubySlicc_Types.sm b/src/mem/ruby/protocol/RubySlicc_Types.sm index 339e99a..c3a2f2d 100644 --- a/src/mem/ruby/protocol/RubySlicc_Types.sm +++ b/src/mem/ruby/protocol/RubySlicc_Types.sm @@ -1,5 +1,5 @@ /* - * Copyright (c) 2020 ARM Limited + * Copyright (c) 2020-2021 ARM Limited * All rights reserved * * The license below extends only to copyright in the software and shall @@ -170,6 +170,8 @@ PacketPtr pkt, desc="Packet associated with this request"; bool htmFromTransaction, desc="Memory request originates within a HTM transaction"; int htmTransactionUid, desc="Used to identify the unique HTM transaction that produced this request"; + + RequestPtr getRequestPtr(); } structure(AbstractCacheEntry, 
primitive="yes", external = "yes") { diff --git a/src/mem/ruby/slicc_interface/RubyRequest.hh b/src/mem/ruby/slicc_interface/RubyRequest.hh index 55b645e..3a2f486 100644 --- a/src/mem/ruby/slicc_interface/RubyRequest.hh +++ b/src/mem/ruby/slicc_interface/RubyRequest.hh @@ -154,6 +154,7 @@ const RubyAccessMode& getAccessMode() const { return m_AccessMode; } const int& getSize() const { return m_Size; } const PrefetchBit& getPrefetch() const { return m_Prefetch; } +RequestPtr getRequestPtr() const { return m_pkt->req; } void print(std::ostream& out) const; bool functionalRead(Packet *pkt); -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/41814 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: Ib3d12c9030d18d96388dd66f0a409b42543ee9a8 Gerrit-Change-Number: 41814 Gerrit-PatchSet: 2 Gerrit-Owner: Tiago Mück Gerrit-Reviewer: Jason Lowe-Power Gerrit-Reviewer: Jason Lowe-Power Gerrit-Reviewer: Tiago Mück Gerrit-Reviewer: kokoro Gerrit-CC: Giacomo Travaglini Gerrit-MessageType: merged
[gem5-dev] Re: vector register indexing modes and renaming?
> -Original Message- > From: Gabe Black > Sent: 24 February 2021 15:24 > To: Giacomo Travaglini > Cc: gem5 Developer List > Subject: Re: [gem5-dev] vector register indexing modes and renaming? > > So, I started really diving into the interfaces in ThreadContext and > ExecContext > and their various implementations. What I wanted to do was to define a much > narrower set of maybe 3 virtual functions that actually implements the core of > what's needed, and not 15-20 different independent virtual methods that all > need to be reimplemented every time. *That* was quite the rabbit hole, and > after a number of hours I decided I needed to regroup and come at it from > another angle. It definitely looks to me like somebody came in with the idea > to > represent these registers using a data, model, view architecture (or something > like that) which would make sense in other contexts with other types of data, > but here I don't think is really the right way to go about this. > > Right now, I have two questions for you. > > 1. Are there tests which exercise this stuff? If I start chopping things up, I > would be a lot more comfortable if I can tell if/when I break something. I will ask within Arm if there's something we can provide to you. In the meantime I had a quick look at NEON-enabled libraries [1]; the Ne10 library provides a set of functions optimized for NEON and a set of examples making use of it [2] (e.g. FIR filter, GEMM, etc.). You could probably cross-compile those examples and use them in SE mode (I recommend using the O3 model). > 2. What's the difference between a lane and an element? Those terms seem > like they should be synonyms and are treated as almost the same thing, but > there is clearly a difference between them. What is it, and why does it exist? > > Gabe > I have a hunch that the vector lane logic is not really used. My understanding is that Lane/Elem differ in the O3 model only. 
The key point is that VecRegister and VecElems are represented by different sets of physical registers; you cannot access a vector element if the renaming is set to Full [3]; the physical vector register file will be made of valid entries, while the vector element register file will be empty. The vector lane getters/setters are probably a way to do a functional read of the element anyway [4]. In a way, we could think of VecReg/VecElem as the interface to the vector file for a guest instruction, and of VecLane as the interface for the host (even though it could be used by an instruction as well). This is my interpretation of VecLane. Kind Regards Giacomo [1]: https://developer.arm.com/architectures/instruction-sets/simd-isas/neon [2]: https://github.com/projectNe10/Ne10/tree/master/samples [3]: https://github.com/gem5/gem5/blob/stable/src/cpu/o3/rename_map.hh#L282 [4]: https://github.com/gem5/gem5/blob/stable/src/cpu/o3/regfile.hh#L229
[gem5-dev] Re: vector register indexing modes and renaming?
So, I started really diving into the interfaces in ThreadContext and ExecContext and their various implementations. What I wanted to do was to define a much narrower set of maybe 3 virtual functions that actually implements the core of what's needed, and not 15-20 different independent virtual methods that all need to be reimplemented every time. *That* was quite the rabbit hole, and after a number of hours I decided I needed to regroup and come at it from another angle. It definitely looks to me like somebody came in with the idea to represent these registers using a data, model, view architecture (or something like that) which would make sense in other contexts with other types of data, but here I don't think is really the right way to go about this. Right now, I have two questions for you. 1. Are there tests which exercise this stuff? If I start chopping things up, I would be a lot more comfortable if I can tell if/when I break something. 2. What's the difference between a lane and an element? Those terms seem like they should be synonyms and are treated as almost the same thing, but there is clearly a difference between them. What is it, and why does it exist? Gabe On Tue, Feb 23, 2021 at 4:21 AM Gabe Black wrote: > That said, the first would avoid adding another register file while that > would still mean plumbing new interfaces all over the place for all the > ThreadContext and ExecContexts, etc. Once all that code is generic and you > can add or remove register files willy-nilly, it might make sense to switch > to the second option. > > Gabe > > On Tue, Feb 23, 2021 at 4:15 AM Gabe Black wrote: > >> >>> > Hey ARM folks. Could someone please explain to me what the deal is >>> with the >>> > vector registers and renaming modes? What is fundamentally going on >>> there? 
>>> > My best guess is that the granularity that the registers are being >>> renamed at >>> > changes between the modes, or in other words you index by and rename by >>> > entire registers in one mode, and in the other mode you index by and >>> rename >>> > by just the "elements" within the registers? >>> >>> Yes that is correct, let me know if you need further info on this >>> >>> >>> >> Focusing just on this part for now (not to dismiss the other part), this >> brings me back to an idea in a proposal I sent out a while ago (you >> commented on it, I think) where there are "normal" register files for >> integers, etc, which use uint64_ts as entries, and then register files >> which are for other things which are opaque blobs. Those later register >> files would be basically an array of bytes with an index scaled by some >> arbitrary value and sized based on the scale and some register count. The >> "registers" would be passed around by pointer and cast/copied locally so >> the accessors can be generic. It sounds like the effect of changing between >> element/register indexing could be generically implemented by making it >> possible to reset the scale value for those register files. Another option >> would be to have two different register files, and then just copy things >> over to update the new one when switching. That would make the register >> files themselves simpler, and you have to do something kind of like that >> anyway to make the elements contiguous when switching from element indexing >> to register indexing. Which do you think makes more sense? I'm feeling like >> the second option makes the most sense since it would be easier to >> implement on the CPU side and would push the part that cares about indexing >> semantics and what maps equivalently to what into the thing doing the >> switch which is (presumably) already ISA specific. 
>> >> Gabe >> >> Gabe >> >
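For concreteness, the first option discussed in the quoted text (one byte array whose index scale can be reset) could look roughly like the sketch below. It is only an illustration of the idea under stated assumptions: BlobRegFile and its methods are hypothetical names, it assumes the elements of a register are already contiguous in storage, and it says nothing about how renaming would interact with it. The thread itself leans towards the second option, two register files with a copy on mode switch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical sketch of an "opaque blob" register file: a flat array of
// bytes indexed in units of a scale that can be changed at run time.
class BlobRegFile
{
  public:
    BlobRegFile(std::size_t reg_bytes, std::size_t num_regs)
        : scale(reg_bytes), bytes(reg_bytes * num_regs, 0) {}

    std::size_t numRegs() const { return bytes.size() / scale; }

    // Reinterpret the same storage with a different register size, e.g.
    // switch between whole-register and per-element indexing.
    void
    setScale(std::size_t reg_bytes)
    {
        assert(reg_bytes != 0 && bytes.size() % reg_bytes == 0);
        scale = reg_bytes;
    }

    // Registers are passed by pointer and copied, so the accessors stay
    // generic with respect to the register type.
    void
    set(std::size_t idx, const void *src)
    {
        assert(idx < numRegs());
        std::memcpy(bytes.data() + idx * scale, src, scale);
    }

    void
    get(std::size_t idx, void *dst) const
    {
        assert(idx < numRegs());
        std::memcpy(dst, bytes.data() + idx * scale, scale);
    }

  private:
    std::size_t scale;
    std::vector<std::uint8_t> bytes;
};
```

With two 16-byte registers, writing register 1 and then calling setScale(4) makes the same bytes visible as eight 4-byte elements, so element 6 is the third 32-bit word of what was register 1.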
[gem5-dev] Change in gem5/gem5[develop]: base-stats: Fixed System "work_item" stat name
Bobby R. Bruce has uploaded this change for review. ( https://gem5-review.googlesource.com/c/public/gem5/+/41833 ) Change subject: base-stats: Fixed System "work_item" stat name .. base-stats: Fixed System "work_item" stat name The name of this stat was prefixed with 'system.', which is unnecessary and undesirable for the stats output. Change-Id: I873a77927e1ae6bb52f66e9c935e91ef43649dcd --- M src/sim/system.cc 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/sim/system.cc b/src/sim/system.cc index 93902af..4a9c6cd 100644 --- a/src/sim/system.cc +++ b/src/sim/system.cc @@ -483,7 +483,7 @@ std::stringstream namestr; ccprintf(namestr, "work_item_type%d", j); workItemStats[j]->init(20) - .name(name() + "." + namestr.str()) + .name(namestr.str()) .desc("Run time stat for" + namestr.str()) .prereq(*workItemStats[j]); } -- To view, visit https://gem5-review.googlesource.com/c/public/gem5/+/41833 To unsubscribe, or for help writing mail filters, visit https://gem5-review.googlesource.com/settings Gerrit-Project: public/gem5 Gerrit-Branch: develop Gerrit-Change-Id: I873a77927e1ae6bb52f66e9c935e91ef43649dcd Gerrit-Change-Number: 41833 Gerrit-PatchSet: 1 Gerrit-Owner: Bobby R. Bruce Gerrit-MessageType: newchange