Re: [gem5-users] vcs per vnet and ni_flit_size and link width in garnet2

2017-08-18 Thread Tushar Krishna
Depending on the coherence protocol, there is a certain number of “virtual
networks” (vnets) or “message classes”, such as request, response, forward, etc.
These exist to carry the different classes of coherence messages.
Within each vnet, the NoC can have one or more VCs.
So together, num_vnets x vcs_per_vnet is the total number of VCs *at every 
input port* of each router.
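
As a rough worked example (a sketch only, assuming 3 vnets, vcs_per_vnet = 4, and,
for simplicity, 5 input ports on every router of a 4x4 mesh, even though edge
routers have fewer):

    num_vnets = 3                  # e.g. request / forward / response message classes
    vcs_per_vnet = 4               # the vcs_per_vnet parameter
    vcs_per_input_port = num_vnets * vcs_per_vnet            # 12 VCs at each input port

    routers = 4 * 4                # 4x4 mesh
    ports_per_router = 5           # N/S/E/W + local NI port (assumed for every router)
    total_router_vcs = routers * ports_per_router * vcs_per_input_port
    print(vcs_per_input_port, total_router_vcs)              # 12 960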


> On Aug 19, 2017, at 2:16 AM, matild breo  wrote:
> 
> Hi, thank you.
> 
> Yes, I saw that page, but I still cannot understand the meaning of vcs_per_vnet.
> If vcs_per_vnet is 4, how many VCs does a 4x4 network have in total across the
> whole network?
> 
> thanks
> 
> On Sat, Aug 19, 2017 at 10:19 AM, matild breo wrote:
> Hi everybody,
> I have some questions:
> 
> 1. What is the default link width in garnet2, in bytes?
> 
> 2. If ni_flit_size is 16 bytes, does that mean 2 flits pass over every link
> in one cycle (for example, if the link width is 128 bits)?
> 
> 3. What does vcs_per_vnet mean?
> 
> 4. How can I determine the number of VCs per link?
> 
> Thank you


Re: [gem5-users] vcs per vnet and ni_flit_size and link width in garnet2

2017-08-18 Thread matild breo
Hi, thank you.

Yes, I saw that page, but I still cannot understand the meaning of vcs_per_vnet.
If vcs_per_vnet is 4, how many VCs does a 4x4 network have in total across the
whole network?

thanks

On Sat, Aug 19, 2017 at 10:19 AM, matild breo  wrote:

> Hi everybody,
> I have some questions:
>
> 1. What is the default link width in garnet2, in bytes?
>
> 2. If ni_flit_size is 16 bytes, does that mean 2 flits pass over every link
> in one cycle (for example, if the link width is 128 bits)?
>
> 3. What does vcs_per_vnet mean?
>
> 4. How can I determine the number of VCs per link?
>
> Thank you
>

Re: [gem5-users] vcs per vnet and ni_flit_size and link width in garnet2

2017-08-18 Thread Tushar Krishna
You can find the answers here:
http://www.gem5.org/Garnet2.0 


> On Aug 19, 2017, at 1:49 AM, matild breo  wrote:
> 
> Hi everybody,
> I have some questions:
> 
> 1. What is the default link width in garnet2, in bytes?
> 
> 2. If ni_flit_size is 16 bytes, does that mean 2 flits pass over every link
> in one cycle (for example, if the link width is 128 bits)?
> 
> 3. What does vcs_per_vnet mean?
> 
> 4. How can I determine the number of VCs per link?
> 
> Thank you


[gem5-users] vcs per vnet and ni_flit_size and link width in garnet2

2017-08-18 Thread matild breo
Hi everybody,
I have some questions:

1. What is the default link width in garnet2, in bytes?

2. If ni_flit_size is 16 bytes, does that mean 2 flits pass over every link
in one cycle (for example, if the link width is 128 bits)?

3. What does vcs_per_vnet mean?

4. How can I determine the number of VCs per link?

Thank you

[gem5-users] Branch Misprediction penalty

2017-08-18 Thread Aitzaz Alam
Hi,
My question is about configuring a branch misprediction penalty in the gem5
detailed (O3) CPU. Assuming that I want to model a pipeline with a 12-cycle
branch misprediction penalty, how should I change the delay parameters in
O3CPU.py? Will the following configuration work:


fetchToDecodeDelay = 3
decodeToRenameDelay = 3
renameToIEWDelay = 4
issueToExecuteDelay = 1
iewToCommitDelay = 3

Is the branch misprediction penalty in gem5 equal to only the front-end length
(up to the execute stage), or to the entire pipeline length?
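
For concreteness, here is a minimal sketch of how I would set these from a config
script rather than editing O3CPU.py directly (it assumes a plain DerivO3CPU
instance; the parameter names are the ones listed above, and the values are only
my proposal from this question):

    from m5.objects import DerivO3CPU

    # Override the O3 stage-to-stage delay parameters on a CPU instance.
    cpu = DerivO3CPU()
    cpu.fetchToDecodeDelay = 3
    cpu.decodeToRenameDelay = 3
    cpu.renameToIEWDelay = 4
    cpu.issueToExecuteDelay = 1
    cpu.iewToCommitDelay = 3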

Thank you

[gem5-users] Performance of Mutex and Semaphore locks in FS

2017-08-18 Thread Sumanth Sridhar
Hello,

I am trying to compare the performance of mutexes vs. semaphores in a gem5
full-system simulation. I'm running very simple test programs, and the numbers
I get aren't what I expect. These are the steps I took:

1. scons -j 3 build/X86/gem5.opt PROTOCOL=MOESI_hammer
2. build/X86/gem5.opt configs/example/fs.py -n 4
--kernel=x86_64-vmlinux-2.6.22.9.smp --disk-image=linux-x86.img
--script=configs/boot/hack_back_ckpt.rcS
3. build/X86/gem5.opt configs/example/fs.py -r1 -n4
--kernel=x86_64-vmlinux-2.6.22.9.smp --disk-image=linux-x86.img
--script=mutex.rcS --ruby --l1d_size=32kB --l1i_size=32kB --l2_size=256kB
--l1d_assoc=8 --l1i_assoc=8 --l2_assoc=4 --cpu-type=TimingSimpleCPU
--restore-with-cpu=TimingSimpleCPU

mutex.rcS:
m5 resetstats
./mutex-test 100
m5 dumpstats

m5 resetstats
m5 exit

Similarly for sem-test.

Results (gem5, sim_ticks):

                                                    mutex-test     sem-test       mutex/sem
Total sim_ticks to sum to 2 million (2 threads)     1,114.46e9     864.46e9       1.29
Approx. sim_ticks per increment
  (total sim_ticks / 2 million)                     557,229.74     432,228.04
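
For reference, the per-increment numbers and the ratio follow directly from the
totals above (a small sketch, assuming 2 threads x 1,000,000 rounds = 2,000,000
total increments, as in the table):

    mutex_ticks = 1114.46e9          # total sim_ticks, mutex-test
    sem_ticks = 864.46e9             # total sim_ticks, sem-test
    increments = 2 * 1000 * 1000     # 2 threads summing to 2 million

    print(mutex_ticks / increments)      # ~557,230 ticks per increment
    print(sem_ticks / increments)        # ~432,230 ticks per increment
    print(mutex_ticks / sem_ticks)       # ~1.29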

I executed the same binaries on my local system by running:
/usr/bin/time -f "\t%E real" ./mutex-test 10
/usr/bin/time -f "\t%E real" ./sem-test 10

Please find attached the source files for these test programs.

Results (native, seconds):

                                                    mutex-test     sem-test       mutex/sem
Total time to sum to 2 million (2 threads)          0.19           0.30           0.62
Approx. time per increment
  (total time / 2 million)                          9.30e-08       15.1e-08
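
The same arithmetic for the native run (a small sketch; the printed values differ
slightly from the table because the totals above are rounded to two decimals):

    mutex_seconds = 0.19             # total wall-clock time, mutex-test
    sem_seconds = 0.30               # total wall-clock time, sem-test
    increments = 2 * 1000 * 1000

    print(mutex_seconds / increments)    # ~9.5e-08 s per increment (table: 9.30e-08)
    print(sem_seconds / increments)      # ~1.5e-07 s per increment (table: 15.1e-08)
    print(mutex_seconds / sem_seconds)   # ~0.63 (table: 0.62)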

I get similar results even if I don't pin threads to cores.

Could someone please shed some light on what is going on? Am I missing
something?

Thanks in advance!
Sumanth

/* --- mutex variant (mutex-test) --- */

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

// shared variable that both threads sum to
int count = 0;
// number of times the count variable should be incremented by each thread
int rounds = 0;
pthread_mutex_t mutexLock;

// pins the current thread to a particular CPU
void pinThread(int cpuId)
{
    assert(cpuId < 4);
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(cpuId, &cpuset);

    pthread_t tid = pthread_self();
    int res = pthread_setaffinity_np(tid, sizeof(cpu_set_t), &cpuset);

    assert(res == 0);
}

// the spawned thread
// increments the 'count' variable 'rounds' times
void* threadFn(void* arg)
{
    // ensure that each thread runs on a single CPU by pinning
    int cpuId = (int)(long) arg;
    pinThread(cpuId);

    int i;
    for (i = 0; i < rounds; i++)
    {
        // use mutex locks to atomically increment 'count'
        pthread_mutex_lock(&mutexLock);
        count++;
        pthread_mutex_unlock(&mutexLock);
    }
    return NULL;
}

int main(int argc, char const *argv[])
{
    // user should provide the number of times each thread should increment 'count'
    if (argc != 2)
    {
        printf("usage: %s <rounds>\n", argv[0]);
        return 0;
    }
    rounds = atol(argv[1]);

    // initialize the mutex
    if (pthread_mutex_init(&mutexLock, NULL) != 0)
    {
        printf("\n mutex init failed\n");
        return 1;
    }

    int i = 0;

    // create 2 threads and pin them to CPUs 0 and 1
    pthread_t tid[2];
    while (i < 2)
    {
        pthread_create(&(tid[i]), NULL, &threadFn, (void *)(long) i);
        i++;
    }

    pthread_join(tid[0], NULL);
    pthread_join(tid[1], NULL);
    pthread_mutex_destroy(&mutexLock);

    // verify that the threads did their job
    if (count != 2 * rounds)
    {
        printf("Failed: Final count = %d, Correct value = %d\n", count,
               2 * rounds);
    }
    else
    {
        printf("Success, count = %d\n", count);
    }
    return 0;
}

/* --- semaphore variant (sem-test) of the same program --- */

#define _GNU_SOURCE
#include <pthread.h>
#include <semaphore.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

// shared variable that both threads sum to
int count = 0;
// number of times the count variable should be incremented by each thread
int rounds = 0;
sem_t semLock;

// pins the current thread to a particular CPU
void pinThread(int cpuId)
{
    assert(cpuId < 4);
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(cpuId, &cpuset);

    pthread_t tid = pthread_self();
    int res = pthread_setaffinity_np(tid, sizeof(cpu_set_t), &cpuset);

    assert(res == 0);
}

// the spawned thread
// increments the 'count' variable 'rounds' times
void* threadFn(void* arg)
{
    // ensure that each thread runs on a single CPU by pinning
    int cpuId = (int)(long) arg;
    pinThread(cpuId);

    int i;
    for (i = 0; i < rounds; i++)
    {
        // use a semaphore to atomically increment 'count'
        sem_wait(&semLock);
        count++;
        sem_post(&semLock);
    }
    return NULL;
}

int main(int argc, char const *argv[])
{
    // user should provide the number of times each thread should increment 'count'
    if (argc != 2)
    {
        printf("usage: %s <rounds>\n", argv[0]);
        return 0;
    }
    rounds = atol(argv[1]);

    // initialize the semaphore (shared between threads, initial value 1)
    if (sem_init(&semLock, 0, 1) != 0)
    {
        printf("\nsemaphore init failed\n");
        return 1;
    }

    int i = 0;

    // create 2 threads and pin them to CPUs 0 and 1
    pthread_t tid[2];
    while (i < 2)
    {
        pthread_create(&(tid[i]), NULL, &threadFn, (void *)(long) i);
        i++;
    }

    pthread_join(tid[0], NULL);
    pthread_join(tid[1], NULL);
    sem_destroy(&semLock);

    // verify that the threads did their job
    if (count != 2 * rounds)
    {
        printf("Failed: Final count = %d, Correct value = %d\n", count,
               2 * rounds);
    }
    else
    {
        printf("Success, count = %d\n", count);
    }
    return 0;
}

[gem5-users] Running on RISCV SE mode

2017-08-18 Thread Vanchinathan Venkataramani
Dear Forum members

I am trying to run a hello world application compiled for RISCV in SE mode of
gem5.

I used https://github.com/riscv/riscv-gnu-toolchain to obtain the cross compiler
for RISCV and compiled the hello world program with:
/opt/riscv/bin/riscv64-unknown-elf-gcc -static -O2 hello.c -o hello

When I use "file hello", I get the following output:
hello: ELF 64-bit LSB executable, version 1 (SYSV), statically linked, not
stripped

When I try to run this executable on gem5 using the following command:
./build/RISCV_SE/gem5.opt ./configs/example/se.py -c
/home/vanchi/srjkvr_riscv/benchmarks/hello --cpu-type=atomic -n 1

I get the error: fatal: Object file architecture does not match compiled
ISA (RISCV).
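
In case it helps with debugging, a small sketch that prints the ELF header's
e_machine field, which is roughly what gem5's loader compares against its
compiled ISA (plain Python with the standard struct module, nothing
gem5-specific; "hello" is the binary above, and EM_RISCV is 243):

    import struct

    # Read the start of the ELF header and extract e_machine
    # (2 bytes at offset 18, little-endian for an LSB executable).
    with open("hello", "rb") as f:
        header = f.read(20)

    assert header[:4] == b"\x7fELF", "not an ELF file"
    e_machine = struct.unpack_from("<H", header, 18)[0]
    print("e_machine =", e_machine, "(243 = RISC-V)")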

It would be really helpful if you could point me to the right version of the
cross compiler, or the gem5 command line, that would let me run an application
compiled for RISCV on gem5.

Thanks a lot in advance!

Re: [gem5-users] ARM Architecture with Ruby Memory model

2017-08-18 Thread Nikos Nikoleris
Hi Amit,


We've recently made some changes, and you should now be able to run full-system
simulations using ARM with Ruby as the memory model. I believe you should be able
to boot Linux using the MOESI_CMP_directory Ruby protocol, and possibly other
protocols as well.


Nikos



From: gem5-users  on behalf of Amit Joshi 

Sent: 17 August 2017 08:23
To: gem5-users@gem5.org
Subject: [gem5-users] ARM Architecture with Ruby Memory model

Can we use the ARM architecture with the Ruby memory model in a full-system simulation?


AMIT
