On Thu, July 14, 2011 1:38 am, Hamid Reza Khaleghzadeh wrote:
> Hello all,
>
> I have simulated an 8-core CMP that consists of 4 chips; each chip has 2
> cores and one shared L2. MOESI-CMP-directory is the coherence protocol.
>
> Core0  Core1    Core2  Core3    Core4  Core5    Core6  Core7
>   |------|        |------|        |------|        |------|
>      |               |               |               |
>  L2 __ Dir0      L2 __ Dir1      L2 __ Dir2      L2 __ Dir3
>      |---------------|---------------|---------------|
>                              |
>                            Memory
>
> I have run the application below twice. In the first run, Thread1 and
> Thread2 are mapped to cores 2 and 3 (which share an L2). In the second
> run, I bound the two threads to cores 2 and 4 (which do not share an L2).
> I have a problem with this application: when the number of iterations of
> the for loop is increased, the difference between the execution times of
> run 1 and run 2 increases too. But for this application, it is clear that
> the coherence cost does not increase when the number of loop iterations
> is increased. Could you tell me why this happens?
>
> for (i=0;i<5;i++)
> {
>       THREAD1;     // thread1 *reads* a large array; the array is
>                    // smaller than the L2 cache
>       THREAD2;     // thread2 *reads* the array that thread1 read
> }
>

What do you mean by execution time? If you're talking about wall-clock time,
that is probably not an indicator of anything. If the difference in simulated
cycles is not what you expect, then try looking at the breakdown of where the
cycles are being spent.
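
One rough way to get such a breakdown is to diff the statistics files of the
two runs and see which counters moved the most. The sketch below assumes each
run wrote a plain-text stats file with one "name value # description" entry
per line. The function names and the file paths on the command line are
hypothetical, and the stat names in your own output will depend on your
configuration.

```python
import math
import sys


def parse_stats(text):
    """Parse 'name value # description' lines into a {name: float} dict."""
    stats = {}
    for line in text.splitlines():
        # Drop the trailing description, then split into columns.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank or single-token line
        try:
            value = float(fields[1])
        except ValueError:
            continue  # banner lines and other non-numeric entries
        if math.isfinite(value):  # ignore nan/inf entries
            stats[fields[0]] = value
    return stats


def largest_differences(run_a, run_b, top=10):
    """Return (name, a, b, b - a) for the stats that differ the most."""
    rows = []
    for name in run_a.keys() & run_b.keys():
        delta = run_b[name] - run_a[name]
        if delta != 0:
            rows.append((name, run_a[name], run_b[name], delta))
    # Largest absolute change first; ties broken by name for stable output.
    rows.sort(key=lambda row: (-abs(row[3]), row[0]))
    return rows[:top]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python diff_stats.py run1/stats.txt run2/stats.txt
    with open(sys.argv[1]) as f_a, open(sys.argv[2]) as f_b:
        run_a, run_b = parse_stats(f_a.read()), parse_stats(f_b.read())
    for name, a, b, delta in largest_differences(run_a, run_b):
        print(f"{name:50s} {a:16.2f} {b:16.2f} {delta:+16.2f}")
```

Sorting by absolute change is only a first pass. Counters with large values
will dominate, so it may be worth filtering the output down to the cache,
directory, and network stats before comparing.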

--
Nilay

_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
