On Thu, July 14, 2011 1:38 am, Hamid Reza Khaleghzadeh wrote:
> Hello all,
>
> I have simulated an 8-core CMP that consists of 4 chips; each chip has 2
> cores and one shared L2. MOESI-CMP-directory is the coherence protocol.
>
>   Core0  Core1    Core2  Core3    Core4  Core5    Core6  Core7
>     |------|        |------|        |------|        |------|
>        |               |               |               |
>    L2 __ Dir0      L2 __ Dir1      L2 __ Dir2      L2 __ Dir3
>        |               |               |               |
>        |---------------|---------------|---------------|
>                                |
>                              Memory
>
> I have run the application below twice. In the first run, Thread1 and
> Thread2 are mapped to cores 2 and 3 (which share an L2). In the second
> run, I bound the two threads to cores 2 and 4 (which do not share an L2).
> My problem is this: when the number of iterations of the for loop is
> increased, the difference between the execution times of run 1 and run 2
> increases too. But for this application, it is clear that the coherence
> cost does not increase when the number of loop iterations is increased.
> Could you tell me why this happens?
>
> for (i = 0; i < 5; i++)
> {
>     THREAD1; // thread1 *reads* a large array; the array is smaller than the L2 cache
>     THREAD2; // thread2 *reads* the array that was read by thread1
> }
>
What is meant by execution time? If you're talking about wall-clock time,
that is probably not an indicator of anything. If the difference in cycles
taken is not as expected, then try looking at the breakdown of where the
cycles are being spent.
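
One rough way to look at that breakdown is to diff the statistics of the two
runs and see which counters move the most. The sketch below is only an
illustration, not an official gem5 tool: it assumes the usual
"name value # description" layout of the stats.txt file that gem5 writes to
its output directory, and the stat names and numbers in the sample text are
made-up placeholders. Real stat names depend on the protocol and the gem5
version.

```python
def parse_stats(text):
    """Parse 'name value # description' lines into a dict of floats."""
    stats = {}
    for line in text.splitlines():
        # Drop the trailing description, then split name and value.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        name, value = fields[0], fields[1]
        try:
            stats[name] = float(value)
        except ValueError:
            continue  # separator lines and non-numeric entries are skipped
    return stats


def compare_runs(run1, run2):
    """Return (name, run1, run2, delta) rows for stats present in both runs,
    ordered by largest absolute delta first."""
    common = run1.keys() & run2.keys()
    rows = [(n, run1[n], run2[n], run2[n] - run1[n]) for n in common]
    rows.sort(key=lambda r: (-abs(r[3]), r[0]))
    return rows


# Placeholder data: hypothetical stat names and invented values, used only
# to exercise the functions. Replace with the contents of two real files.
RUN1_TEXT = """\
---------- Begin Simulation Statistics ----------
example.total_cycles    1000    # hypothetical stat
example.l2_misses       10      # hypothetical stat
---------- End Simulation Statistics   ----------
"""
RUN2_TEXT = """\
---------- Begin Simulation Statistics ----------
example.total_cycles    1400    # hypothetical stat
example.l2_misses       60      # hypothetical stat
---------- End Simulation Statistics   ----------
"""

rows = compare_runs(parse_stats(RUN1_TEXT), parse_stats(RUN2_TEXT))
for name, v1, v2, delta in rows:
    print(f"{name:30s} {v1:12.0f} {v2:12.0f} {delta:+12.0f}")
```

With real runs, the two sample strings would be replaced by the text of each
run's stats.txt (for example, read with open(path).read()); the counters at
the top of the list are the first places to look.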
--
Nilay
_______________________________________________
gem5-users mailing list
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users