Hello and thanks for your response.

Following your guidelines, over the past few days I have been examining the source code of both the simulators and the individual components, to get a more in-depth understanding of the implementations.

In particular, I have been examining the CMP.L2SharedNUCA.Inorder model that comes with flexus-4.0. For the rest of this e-mail, I will be referring to this model.

---------------------------------------------

First of all, I have drawn a layout of the architecture, trying to understand the interconnections between the different components. It is based on the simulator's wiring.cpp file and can be found here:

https://pithos.grnet.gr/pithos/rest/[email protected]/files/SimFlex/CMP.L2SharedNUCA.Inorder-layout.pdf

If I am correct:

* The Feeder provides the instructions.
* The Fetcher fetches the instructions and forwards them to the L1 Instruction Cache and to BPWarm.
* The BPWarm component must be the Branch Predictor.
* The Execute component - obviously - executes the instructions and requests data from the L1 Data Cache.

* L1 Instruction / Data cache components: obvious.
* L2 cache: the component I have to configure in order to implement a shared NUCA cache (correct?).
* NetMapper: the splitter that distributes requests among the components.
* Memory: must be the memory below L2 (i.e. RAM).

Have I understood the architecture correctly?
Moreover, what is the purpose of the "NIC" and "Network" components?

As I have seen:
* The Network component is an instance of a "NetShim/MemoryNetwork" component, though it is not obvious how it relates to the L2 cache.

* Concerning the NIC, I guess it stands for Network Interface Controller. I have taken a look in the MultiNic component folder and seen that it has multiple implementations: MultiNic1, MultiNic2, MultiNic3 and MultiNic4, plus a generic MultiNicX, which must hold the shared implementation. The various implementations define different values for FLEXUS_MULTI_NIC_NUMPORTS: does this correspond to the number of components the NIC is connected to?

---------------------------------------------

On the L2 cache:
I have seen a sample configuration in wiring.cpp of L2SharedNUCA.Inorder, as well as flexus-4.0/components/CMPCache/CMPCache.hpp.

In wiring.cpp, the parameter theL2Cfg.Cores.initialize(64) initializes 64 cores. What are these cores, and how do they relate to the CPU cores or to the 64 banks of the cache that are initialized by theL2Cfg.Banks.initialize(64)?

What is more, I am trying to figure out where the following are defined:
* The mapping between CPU cores and L2 banks, that is, to which L2 bank each CPU core is mapped.
* The replacement/migration policies. I have only noticed that the coherence policy is in flexus-4.0/components/CMPCache/NonInclusiveMESIPolicy.cpp, if I am correct.


Finally, I have found in flexus-4.0/components/CMPCache/RTDirectory.hpp the following scheme:

Physical address layout:
+-----+--------------+------+-------------+--------------+-------------+
| Tag | R Index High | Bank | R Index Low | RegionOffset | BlockOffset |
+-----+--------------+------+-------------+--------------+-------------+
|<------ setLowShift ------>|
                            |<----------->|
                               setMaskLow

I have not understood the purpose of the "R Index High" and "R Index Low" fields there.

I presume, by the way, that due to the presence of the "Bank" field, the placement policy for data in the corresponding NUCA banks must be static, i.e. every block will *initially* always be placed in the same bank, according to its address.

---------------------------------------------

On the Network component:

As I have seen in the wiring.cpp file, the parameter "theNetworkCfg.NetworkTopologyFile.initialize()" selects the topology file that will be used for the network.

An example file is 16x3-mesh.topology (it can be found in the CMP.L2SharedNUCA.OoO folder), which defines a 4x4 grid of "switches", where each switch has 4 ports to interconnect with other switches and 3 ports that connect it to nodes (so (4x4)x3 = 48 nodes in total).

I have understood the topology and the routing tables that are defined, but I have not understood how these nodes and switches are related to the L2 NUCA cache, if there is any relationship at all.

---------------------------------------------


Thank you in advance for your help.
I will be glad to provide any additional information you might need.

-George


On Wed, 23 Mar 2011 16:47:14 +0000, Djordje Jevdjic wrote:
Hello,

Thanks for your message.

Concerning your first question: yes, all the messages exchanged
through this list are in one of those archives. For technical reasons
we decided to split them into two separate archives (the old and the
new archive).

Regarding your second question: I don't think you need to implement
anything to have a NUCA simulator. NUCA systems have
been already implemented (actually, almost all simulators we use in
Flexus are NUCA simulators).
The ones you listed below (
flexus-4.0/simulators/CMP.L2SharedNUCA.Inorder and
flexus-4.0/simulators/CMP.L2SharedNUCA.OoO)
are examples of such architectures with a shared and tiled L2 cache.
So, things have already been implemented, there's no need to
reimplement it.

However, if you are interested to know more details of the
implementations, you can look at the source code and find some useful
comments there.
If you are examining the source code, it's a good idea to look at the
code of individual components included in the simulator, not the
simulator directory itself.
You might also want to check the getting started guide on our
website. Besides that and the Simflex publications, we don't maintain
any further documentation.

Also, keep in mind that the current version of Flexus works only with
Simics 3. Whatever you try to do with Simics 4 highly likely will not
work.
We are planning to move to Simics 4 soon.

Regards,
Djordje
