Hi,

Thanks for your interest. 

Our microbenchmark traverses an array whose size depends on the size of the
cache you want to pollute.

The access pattern is determined by the array contents: the value of each
accessed element is the index of the next element to access. For example, in
an array with A[2] = 7 and A[7] = 40, the access pattern starting from
element 2 is 2->7->40.
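
To make the value-driven access pattern concrete, here is a minimal sketch in
C. The function name chase and its parameters are made up for illustration;
they are not taken from the actual microbenchmark.

```c
#include <assert.h>
#include <stddef.h>

/* Follow the chain for `steps` hops, starting at index `start`.
 * Each element holds the index of the next element to visit.
 * (`chase` is a hypothetical helper name used only in this sketch.) */
size_t chase(const size_t *a, size_t n, size_t start, size_t steps)
{
    size_t idx = start;
    assert(start < n);
    for (size_t i = 0; i < steps; i++) {
        idx = a[idx];          /* the loaded value is the next index */
        assert(idx < n);
    }
    return idx;
}
```

With A[2] = 7 and A[7] = 40, chase(A, n, 2, 1) returns 7 and
chase(A, n, 2, 2) returns 40. Each load depends on the result of the previous
one, so the loads of a single chain cannot overlap.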

Now, the tricky part is how to initialize the array so that each cache block
of the array is re-accessed only after all the other cache blocks have been
accessed. At the same time, we want the access pattern to be random (not
captured by the existing on-chip prefetchers) so that the accesses miss in
the L1 and L2 caches.

Initialization:
a) Initialize the array with 1, 2, 3, 4, ..., 0 (that is, A[i] = i+1, with
the last element pointing back to element 0).
b) For every element of the array, choose a random element and swap their
values.
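
Steps a) and b) could look roughly like the sketch below. The name
init_chain, the seed parameter, and the use of rand() are assumptions made
for illustration. The sketch also uses one array element per index, whereas
a real polluter would probably size or pad elements so that each one occupies
its own cache block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: fills a[0..n-1] following steps a) and b). */
void init_chain(size_t *a, size_t n, unsigned seed)
{
    assert(n > 0);

    /* Step a): 1, 2, 3, ..., 0 (each element points to its successor). */
    for (size_t i = 0; i < n; i++)
        a[i] = (i + 1) % n;

    /* Step b): swap the value of every element with that of a randomly
     * chosen element. Note: on some platforms rand() covers a range
     * smaller than n; a wider generator would be needed for big arrays. */
    srand(seed);
    for (size_t i = 0; i < n; i++) {
        size_t j = (size_t)rand() % n;
        size_t tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

The result is always a permutation of 0..n-1, but random swaps alone do not
guarantee that the permutation forms a single cycle covering the whole array.
That is one reason to verify fairness before running experiments.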

Traversal: Initiate as many streams as the MLP (memory-level parallelism) you
want to achieve (to ensure that cache blocks are not evicted by the
application). Give each stream its own variable. Look at the assembly code to
ensure that the body of the loop has as many assembly instructions as the
number of streams you want to have.
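
A traversal with several independent streams might be sketched as follows.
NSTREAMS, traverse, and the choice to walk one shared array from different
starting indices are all assumptions for illustration; the actual benchmark
may organize its streams differently.

```c
#include <assert.h>
#include <stddef.h>

#define NSTREAMS 4   /* hypothetical MLP target for this sketch */

/* Advance NSTREAMS independent chains by `iters` hops each.
 * pos[] holds the starting indices on entry and the final indices on
 * exit, which also keeps the compiler from discarding the loads. */
void traverse(const size_t *a, size_t n, size_t pos[NSTREAMS], size_t iters)
{
    size_t p0 = pos[0], p1 = pos[1], p2 = pos[2], p3 = pos[3];
    assert(p0 < n && p1 < n && p2 < n && p3 < n);

    for (size_t i = 0; i < iters; i++) {
        /* One dependent load per stream; the four chains do not depend
         * on each other, so their misses can be outstanding together. */
        p0 = a[p0];
        p1 = a[p1];
        p2 = a[p2];
        p3 = a[p3];
    }

    pos[0] = p0; pos[1] = p1; pos[2] = p2; pos[3] = p3;
}
```

Keeping each stream in its own local variable makes it easy to check in the
generated assembly that the loop body contains one load per stream.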

Before running any experiment, make sure that your access pattern is fair
(every cache block is accessed once before any cache block is accessed a
second time) and that your microbenchmark has a hit ratio close to 100% in
the LLC when running alongside the application of interest.
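
The fairness condition can be checked in software before any measurement. The
sketch below is one possible check, with is_fair as a hypothetical name; it
treats each array element as one block. The LLC hit ratio, by contrast, has
to be measured with hardware performance counters and is not covered here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if following the chain from index 0
 * visits all n elements before coming back to 0, and 0 otherwise. */
int is_fair(const size_t *a, size_t n)
{
    size_t idx = 0;
    assert(n > 0);

    for (size_t steps = 1; steps <= n; steps++) {
        if (idx >= n)
            return 0;              /* out-of-range index: broken chain */
        idx = a[idx];
        if (idx == 0)
            return steps == n;     /* fair only if the cycle has length n */
    }
    return 0;                      /* never returned to 0 within n hops */
}
```

If the check fails, the array can simply be re-initialized with a different
random seed and checked again.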

Regards,
-Stavros.

________________________________________
From: suixiufeng [[email protected]]
Sent: Saturday, May 19, 2012 9:06 AM
To: [email protected]
Subject: The cache-polluting threads

Hi,
    You perform a cache sensitivity analysis by dedicating two cores to 
cache-polluting threads. I want to know how to write the polluter threads. 
Would you please give me an example? Thank you!
