Dear Mahmood,
Attached please find the topology.cpp file that I use to generate topology
files. You should compile this file and use the binary to generate topology
files. In order to generate a 4x4 mesh topology, you should use a command like
./topology -n 12 -m -w 3 4x3-mesh.topology
where "-m" option indicates that the interconnection network is a mesh (for a
torus interconnection network you should use "-t"), "-w" option specifies three
nodes/switch (one for the core, one for the L2 slice, and one for the memory
controller), "-n" option specifies the total number of nodes connected to the
interconnection network (which is the multiplication of the number of nodes ,in
this case four, and the number that is specified in the "-w" option), and
finally the last parameter is the name of the generated topology file.
Regards,
Pejman
________________________________________
From: Mahmood Naderan [[email protected]]
Sent: Wednesday, April 27, 2011 3:29 PM
To: Song Liu
Cc: simflex
Subject: Re: segmentation fault in CMP.L2SharedNUCA.OoO
Thanks for that
// Naderan *Mahmood;
----- Original Message -----
From: Song Liu <[email protected]>
To: Mahmood Naderan <[email protected]>
Cc: simflex <[email protected]>
Sent: Wednesday, April 27, 2011 5:56 PM
Subject: Re: segmentation fault in CMP.L2SharedNUCA.OoO
There are a few topology files in other simulators' folders. You can
copy them to the simulator you are using.
Song
On Wed, Apr 27, 2011 at 8:23 AM, Mahmood Naderan <[email protected]> wrote:
>>For 4 cores, you should use 4x3 topology.
> By default there is no 4x3. However I want to create that. Problem is I
> don't know "Top Switch" and "Route Switch" sections. Do you have one for that?
>
>
> // Naderan *Mahmood;
>
>
> ----- Original Message -----
> From: Song Liu <[email protected]>
> To: Mahmood Naderan <[email protected]>
> Cc: Volos Stavros <[email protected]>; simflex <[email protected]>
> Sent: Wednesday, April 27, 2011 5:17 PM
> Subject: Re: segmentation fault in CMP.L2SharedNUCA.OoO
>
> For 4 cores, you should use 4x3 topology.
>
> Memory map is used for NUMA, not NUCA. In your case, you should use 1
> node for the memory map.
>
> Song
>
> On Wed, Apr 27, 2011 at 2:16 AM, Mahmood Naderan <[email protected]> wrote:
>> Hi Stavros,
>> Thanks for your reply.
>> using CMP.L2SharedNUCA.OoO with 4 cores, I have set
>>
>> flexus.set "-network:nodes" "12" # "Number of Nodes" (NumNodes)
>>
>> and used 16x3 topology. Is there any relation between them? Is that correct?
>>
>> also what about flexus.set "-memory-map:nodes" "1" # "Number of Nodes" (NumNodes)
>>
>> should it be 1 as default?
>>
>> // Naderan *Mahmood;
>>
>>
>> ----- Original Message -----
>> From: Volos Stavros <[email protected]>
>> To: Mahmood Naderan <[email protected]>; simflex <[email protected]>
>> Cc:
>> Sent: Wednesday, April 27, 2011 12:48 AM
>> Subject: Re: segmentation fault in CMP.L2SharedNUCA.OoO
>>
>> Dear Mahmood,
>>
>> You are trying to simulate a quad-core system where each core consists of
>> four hardware threads. This is
>> because you have set the variable "-fag:threads" to "4".
>>
>> Please note that each SIMICS cpu is a hardware thread.
>>
>> Your workload has only four hardware threads (i.e., 4 simics CPUs). As such
>> with the current workload you can
>> simulate a system that supports only four hardware threads (e.g., a
>> system with 4 single-threaded cores).
>> In order to simulate a quad-core 4-way multithreaded system, you need a
>> workload with 16 hardware threads.
>>
>> Please change the variable as follows
>>
>> flexus.set "-fag:threads" "1"
>>
>> In case you changed other parameters, please also make sure that you are
>> simulating single-threaded cores.
>>
>> flexus.set "-ufetch:threads" "1" # "Number of threads under control of this uFetch" (Threads)
>> flexus.set "-fag:threads" "1" # "Number of threads under control of this FAG" (Threads)
>> flexus.set "-decoder:multithread" "0" # "Enable multi-threaded execution" (Multithread)
>> flexus.set "-uarch:multithread" "0" # "Enable multi-threaded execution" (Multithread)
>> flexus.set "-L1d:cores" "1" # "Number of threads (cores)" (Cores)
>> flexus.set "-L1i:mt_width" "1" # "Number of threads sharing this cache" (MTWidth)
>> flexus.set "-L1d:mt_width" "1" # "Number of threads sharing this cache" (MTWidth)
>>
>>
>>
>> Regards,
>> -Stavros.
>> ________________________________________
>> From: Mahmood Naderan [[email protected]]
>> Sent: Tuesday, April 19, 2011 10:21 AM
>> To: simflex
>> Subject: segmentation fault in CMP.L2SharedNUCA.OoO
>>
>> Hi there,
>> After running a successful trace with CMP.L2Shared.Trace, now I get a
>> segmentation fault at the beginning of CMP.L2SharedNUCA.OoO
>>
>> 34 <ComponentManager.cpp:95> {0}- Initalizing components...
>> 35 <ComponentManager.cpp:99> {0}- Initalizing sys-white-box
>> 36 <ComponentManager.cpp:99> {0}- Initalizing 00-fag
>> 37 <mai_api.cpp:274> {0}- Searching 4 cpus.
>> 38 <mai_api.cpp:278> {0}- Processor 0: cpu0 - CPU 0
>> 39 <mai_api.cpp:297> {0}- Found CPU: '' - 0
>> 40 <mai_api.cpp:278> {0}- Processor 1: cpu1 - CPU 1
>> 41 <mai_api.cpp:297> {0}- Found CPU: '' - 1
>> 42 <mai_api.cpp:278> {0}- Processor 2: cpu2 - CPU 2
>> 43 <mai_api.cpp:297> {0}- Found CPU: '' - 2
>> 44 <mai_api.cpp:278> {0}- Processor 3: cpu3 - CPU 3
>> 45 <mai_api.cpp:297> {0}- Found CPU: '' - 3
>> 46 <mai_api.cpp:314> {0}- Found 4 Flexus CPUs and 0 Client CPUs in 0 VMs
>> 47 <mai_api.cpp:353> {0}- VMS per row = 1, CPVM = 4, NVMR = 2, NumRow = 2
>> 48 <mai_api.cpp:380> {0}- theProcMap[0] = (0, 0) (abs_index = 0)
>> 49 <mai_api.cpp:380> {0}- theProcMap[1] = (1, 0) (abs_index = 1)
>> 50 <mai_api.cpp:380> {0}- theProcMap[2] = (2, 0) (abs_index = 2)
>> 51 <mai_api.cpp:380> {0}- theProcMap[3] = (3, 0) (abs_index = 3)
>> 52 <mai_api.cpp:385> {0}- Finished creating Processor Mapper.
>> 53 <FetchAddressGenerateImpl.cpp:93> {0}- Thread[0.0] connected to cpu0 Initial PC: v:0ff293010
>> 54 <FetchAddressGenerateImpl.cpp:93> {0}- Thread[0.1] connected to cpu1 Initial PC: v:0ff2f7e8c
>> 55 <FetchAddressGenerateImpl.cpp:93> {0}- Thread[0.2] connected to cpu2 Initial PC: v:0ff0bebf8
>> 56 <FetchAddressGenerateImpl.cpp:93> {0}- Thread[0.3] connected to cpu3 Initial PC: v:000011408
>> 57 <ComponentManager.cpp:99> {0}- Initalizing 01-fag
>>
>>
>>
>> (gdb) backtrace
>> #0 0x00007f727b6884c3 in SIM_get_program_counter ()
>> from /home/mahmood/simics-3.0.29/amd64-linux/bin/libsimics-common.so
>> #1 0x00007f7270b2da16 in
>> nFetchAddressGenerate::FetchAddressGenerateComponent::initialize() ()
>> from
>> /home/mahmood/flexus-4.0/results/blackscholes-timing_v9-CMP.L2SharedNUCA.OoO-19Apr11-120014/blackscholes/blackscholes_000_001/libflexus_CMP.L2SharedNUCA.OoO_v9_iface_gcc.so
>>
>> #2 0x00007f7270d846a2 in
>> Flexus::Core::aux_::ComponentManagerImpl::initComponents (
>> this=<value optimized out>) at components/ComponentManager.cpp:100
>> #3 0x00007f7270e886c9 in Flexus::Core::FlexusImpl::initializeComponents
>> (this=0x31d2dc0)
>> at flexus.cpp:246
>> #4 0x00007f7270e901c2 in Flexus::Core::FlexusImpl::doLoad (this=0x31d2dc0,
>> aDirName=...)
>> at flexus.cpp:538
>>
>> The config is:
>>
>> flexus.set "-fag:threads" "4"
>> flexus.set "-L1d:cores" "4"
>>
>> flexus.set "-L2:cores" "4"
>> flexus.set "-net-mapper:Cores" "4"
>>
>> 00-fag is ok but when it comes to the next fag, it receives SIGSEGV. This
>> fault occurs before entering interactive mode. Any idea about that?
>>
>>
>> // Naderan *Mahmood;
>>
>
>
#include <iostream>
#include <fstream>
#include <string>
#include <string.h>
#include <stdlib.h>
#include <assert.h>
#include <math.h>
#include <unistd.h>
#include <getopt.h>
using namespace std;
#define DIMS 2
int numNodes = 0;
int numSwitches = 0;
int nodesPerDim[DIMS] = { 0 };
int fattness = 1;
bool isTorus = false;
bool isAdaptive = false;
char * outFilename;
ofstream outFile;
bool processArguments ( int argc, char ** argv );
bool generateSwitches ( void );
bool generateMeshTopology ( int nodeId );
bool generateTorusTopology ( int nodeId );
bool generateDeadlockFreeMeshRoute ( int currNode,
int destNode );
bool generateDeadlockFreeTorusRoute ( int currNode,
int destNode );
enum Direction {
NORTH,
SOUTH,
EAST,
WEST,
LOCAL
};
int main ( int argc, char ** argv ) {
int i, j;
std::string progname = argv[0];
if ( processArguments ( argc, argv ) ) {
cerr << "Usage: " << progname << " -n <numnodes> [-m|-t] [-a] [-w <width>] <output filename>" << endl;
cerr << "\t-t specifies a 2D torus" << endl;
cerr << "\t-m specifies a 2D mesh" << endl;
cerr << "\t-a specifies adaptive routing tables (not implemented)" << endl;
cerr << "\t-w specifies a fat 2D torus/mesh with <width> nodes per router" << endl;
return 1;
}
// Generate the boilerplate parameters
outFile << "# Boilerplate stuff" << endl;
outFile << "ChannelLatency 1" << endl;
outFile << "ChannelLatencyData 4" << endl;
outFile << "ChannelLatencyControl 1" << endl;
outFile << "LocalChannelLatencyDivider 4" << endl;
outFile << "SwitchInputBuffers 1" << endl;
outFile << "SwitchOutputBuffers 1" << endl;
outFile << "SwitchInternalBuffersPerVC 1" << endl;
outFile << endl;
// Output all of the node->switch connections
if ( generateSwitches() )
return 1;
// Make topology
if ( isTorus ) {
outFile << endl << "# Topology for a " << numNodes << " node TORUS with " << fattness << " nodes per router" << endl;
} else {
outFile << endl << "# Topology for a " << numNodes << " node MESH with " << fattness << " nodes per router" << endl;
}
for ( i = 0; i < numSwitches; i++ ) {
if ( isTorus ) {
if ( generateTorusTopology ( i ) )
return 1;
} else {
if ( generateMeshTopology ( i ) )
return 1;
}
}
outFile << endl << "# Deadlock-free routing tables" << endl;
// For each switch, generate a routing table to each destination
for ( i = 0; i < numSwitches; i++ ) {
outFile << endl << "# Switch " << i << " -> *" << endl;
// For each destination
for ( j = 0; j < numNodes; j++ ) {
if ( isTorus ) {
if ( generateDeadlockFreeTorusRoute ( i, j ) )
return 1;
} else {
if ( generateDeadlockFreeMeshRoute ( i, j ) )
return 1;
}
}
}
outFile.close();
return 0;
}
bool processArguments ( int argc, char ** argv ) {
static struct option long_options[] = {
{ "torus", 0, NULL, 't'},
{ "mesh", 0, NULL, 'm'},
{ "adaptive", 0, NULL, 'a'},
{ "width", 1, NULL, 'w'},
{ "nodes", 1, NULL, 'n'},
{ "file", 1, NULL, 'f'},
// getopt_long() requires an all-zero entry to terminate the array
{ NULL, 0, NULL, 0 }
};
int c, index;
while ( (c = getopt_long(argc, argv, "af:hmn:tw:", long_options, &index)) >= 0) {
switch (c) {
case 'a':
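// Record the request; adaptive routing itself is not implemented,
// so this flag is currently unused.
isAdaptive = true;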
break;
case 'f':
outFilename = optarg;
break;
case 'h':
return true;
case 'm':
isTorus = false;
break;
case 'n':
numNodes = atoi(optarg);
break;
case 't':
isTorus = true;
break;
case 'w':
fattness = atoi(optarg);
break;
case '?':
// optopt is an int; cast it so the option character is printed
cout << "Unrecognized option '" << static_cast<char>(optopt) << "'" << endl;
return true;
}
}
if (optind < argc) {
outFilename = argv[optind];
}
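if ( fattness <= 0 || numNodes % fattness != 0 ) {
cerr << "Width (-w) must be greater than zero and divide the number of nodes" << endl;
return true;
}
numSwitches = numNodes / fattness;
// Validate before deriving the dimensions: a zero switch count would
// otherwise cause a division by zero below.
if ( numSwitches <= 0 || ( numSwitches & ( numSwitches - 1 ) ) ) {
cerr << "NumSwitches must be greater than zero and a power of two" << endl;
return true;
}
nodesPerDim[0] = (int)sqrt ( (float)numSwitches );
// Round this down to a power of two
while ( nodesPerDim[0] & (nodesPerDim[0] - 1) )
nodesPerDim[0]--;
nodesPerDim[1] = numSwitches / nodesPerDim[0];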
for ( int i = 0; i < DIMS; i++ ) {
cout << nodesPerDim[i] << " switches per dimension " << i << endl;
}
if ( outFilename == NULL ) {
cerr << "ERROR: no output file specified" << endl;
return true;
}
outFile.open ( outFilename );
if ( !outFile.good() ) {
cerr << "ERROR opening output file: " << outFilename << endl;
return true;
}
return false;
}
bool generateSwitches ( void ) {
int i;
outFile << "# Basic Switch/Node connections" << endl;
outFile << "NumNodes " << numNodes << endl;
outFile << "NumSwitches " << numSwitches << endl;
outFile << "SwitchPorts " << (4 + fattness) << endl;
outFile << "SwitchBandwidth 4" << endl;
outFile << endl;
for ( i = 0; i < numNodes; i++ ) {
outFile << "Top Node " << i << " -> Switch " << (i % numSwitches) << ":" << (int)(i / numSwitches) << endl;
}
return false;
}
int getXCoord ( int nodeId ) {
return ( nodeId % nodesPerDim[0] );
}
int getYCoord ( int nodeId ) {
return ( nodeId / nodesPerDim[0] );
}
int getNodeIdCoord ( int x, int y ) {
int
id;
id = ( x + ( y * nodesPerDim[0] ) );
if ( id < 0 || id >= numSwitches ) {
cerr << "ERROR: node coordinates out of bounds: " << x << ", " << y << endl;
exit ( 1 );
}
return id;
}
int getNodeIdOffset ( int nodeId, Direction dir ) {
int
x,
y;
x = getXCoord ( nodeId );
y = getYCoord ( nodeId );
switch ( dir ) {
case NORTH:
y--;
break;
case SOUTH:
y++;
break;
case EAST:
x++;
break;
case WEST:
x--;
break;
default:
cerr << "Invalid direction: " << (int)dir << endl;
exit ( 1 );
}
if ( isTorus ) {
if ( x < 0 )
x = nodesPerDim[0] - 1;
if ( x >= nodesPerDim[0] )
x = 0;
if ( y < 0 )
y = nodesPerDim[1] - 1;
if ( y >= nodesPerDim[1] )
y = 0;
} else {
// Invalid offset for a mesh
if ( x < 0 || y < 0 ||
x >= nodesPerDim[0] ||
y >= nodesPerDim[1] ) {
cerr << "ERROR: invalid offset for a mesh!" << endl;
return -1;
}
}
return getNodeIdCoord ( x, y );
}
ostream & operator << ( ostream & out, const Direction dir ) {
switch ( dir ) {
case NORTH:
out << (fattness + 1);
break;
case SOUTH:
out << (fattness + 3);
break;
case EAST:
out << (fattness + 2);
break;
case WEST:
out << (fattness + 0);
break;
case LOCAL:
out << 0;
break;
};
return out;
}
bool writeS2S ( const int node1,
const Direction dir1,
const int node2,
const Direction dir2 ) {
outFile << "Top Switch "
<< node1 << ":" << dir1 << " -> Switch "
<< node2 << ":" << dir2 << endl;
return false;
}
/* A switch looks like this:
 * (port numbers on inside, ports 0 to n are the local nodes,
 *  where n = fattness - 1)
 *
 *                  N
 *                  |
 *         +-----------------+
 *         |       n+2       |
 *         |                 |
 *   W --  | n+1   0-n   n+3 | -- E
 *         |                 |
 *         |       n+4       |
 *         +-----------------+
 *                  |
 *                  S
 */
bool generateMeshTopology ( int nodeId ) {
int x, y;
x = getXCoord ( nodeId );
y = getYCoord ( nodeId );
// East
if ( x != nodesPerDim[0] - 1 )
writeS2S ( getNodeIdCoord ( x, y ),
EAST,
getNodeIdCoord ( x + 1, y ),
WEST );
// South
if ( y != nodesPerDim[1] - 1 )
writeS2S ( getNodeIdCoord ( x, y ),
SOUTH,
getNodeIdCoord ( x, y + 1 ),
NORTH );
return false;
}
bool generateTorusTopology ( int nodeId ) {
int
x,
y;
x = getXCoord ( nodeId );
y = getYCoord ( nodeId );
// East
writeS2S ( getNodeIdCoord ( x, y ),
EAST,
getNodeIdCoord ( (x + 1) % nodesPerDim[0], y ),
WEST );
// South
writeS2S ( getNodeIdCoord ( x, y ),
SOUTH,
getNodeIdCoord ( x, (y + 1) % nodesPerDim[1] ),
NORTH );
return false;
}
bool writeBasicRoute ( int currNode,
int destNode,
int outPort,
int outVC ) {
outFile << "Route Switch " << currNode << " -> " << destNode
<< " { " << outPort << ":" << outVC << " } " << endl;
return false;
}
bool writeBasicRoute ( int currNode,
int destNode,
Direction outPort,
int outVC ) {
outFile << "Route Switch " << currNode << " -> " << destNode
<< " { " << outPort << ":" << outVC << " } " << endl;
return false;
}
#define ABS(X) ((X) > 0 ? (X) : -(X))
bool generateDeadlockFreeTorusRoute ( int currNode,
int destNode ) {
int
xoff,
yoff;
// Trivial case, output to port 0, VC 0
if ( currNode == (destNode % numSwitches) ) {
return writeBasicRoute ( currNode,
destNode,
(int)(destNode / numSwitches),
0 );
}
xoff =
getXCoord ( (destNode % numSwitches) ) -
getXCoord ( currNode );
yoff =
getYCoord ( (destNode % numSwitches) ) -
getYCoord ( currNode );
// The virtual channel is 1 when the destination switch has the larger
// coordinate in the dimension being routed, and 0 otherwise. The offsets
// are computed from the destination switch (destNode % numSwitches), so
// they are used here instead of the coordinates of the raw node id.
if ( xoff != 0 ) {
if ( ( xoff > 0 && xoff < (nodesPerDim[0] / 2) ) ||
xoff < (-nodesPerDim[0] / 2) ) {
// Go EAST
return writeBasicRoute ( currNode,
destNode,
EAST,
( xoff > 0 ) ? 1 : 0 );
} else {
// Go WEST
return writeBasicRoute ( currNode,
destNode,
WEST,
( xoff > 0 ) ? 1 : 0 );
}
}
if ( ( yoff > 0 && yoff < (nodesPerDim[1] / 2) ) ||
yoff < (-nodesPerDim[1] / 2) ) {
// Go SOUTH
return writeBasicRoute ( currNode,
destNode,
SOUTH,
( yoff > 0 ) ? 1 : 0 );
} else {
// Go NORTH
return writeBasicRoute ( currNode,
destNode,
NORTH,
( yoff > 0 ) ? 1 : 0 );
}
return false;
}
bool generateDeadlockFreeMeshRoute ( int currNode,
int destNode ) {
int xoff, yoff;
xoff =
getXCoord ( (destNode % numSwitches) ) -
getXCoord ( currNode );
yoff =
getYCoord ( (destNode % numSwitches) ) -
getYCoord ( currNode );
// Trivial case, output to port 0, VC 0
if ( currNode == (destNode % numSwitches) ) {
return writeBasicRoute ( currNode,
destNode,
(int)(destNode / numSwitches),
0 );
}
if ( xoff < 0 ) {
return writeBasicRoute ( currNode,
destNode,
WEST,
0 );
}
if ( xoff > 0 ) {
return writeBasicRoute ( currNode,
destNode,
EAST,
0 );
}
if ( yoff < 0 ) {
return writeBasicRoute ( currNode,
destNode,
NORTH,
0 );
}
if ( yoff > 0 ) {
return writeBasicRoute ( currNode,
destNode,
SOUTH,
0 );
}
cerr << "ERROR: mesh routing found no route from " << currNode << " to " << destNode << " offset is " << xoff << ", " << yoff << endl;
return true;
}