On Monday 19 September 2011, zap foster wrote:
> How can the results be virtually identical?  I would have thought the "false 
> sharing" version would be significantly higher than the other, but they are 
> nearly identical.
> 
> Is there something I am doing wrong with the tool?

No.

You did not take into account the model that cachegrind simulates:
a single 2-level cache hierarchy shared by all threads, i.e. all data
accesses of all threads go through the same L1 data cache.
Because of that, false sharing can never happen in the simulation, and
nearly identical results are expected.
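
For reference, a minimal sketch of the kind of test under discussion (not
your actual program, which I have not seen): two threads each increment
their own counter, once with both counters in the same cache line and once
with each counter padded to its own line. The 64-byte line size, the
iteration count and all names here are assumptions for illustration.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64            /* assumed line size; check your CPU */
#define ITERATIONS 1000000UL     /* arbitrary, for illustration only */

/* Both counters live in one cache line: candidates for false sharing. */
struct packed_counters {
    volatile uint64_t a;
    volatile uint64_t b;
};

/* Each counter is aligned to its own cache line. */
struct padded_counters {
    _Alignas(CACHE_LINE) volatile uint64_t a;
    _Alignas(CACHE_LINE) volatile uint64_t b;
};

/* Thread body: increment the counter passed in, ITERATIONS times. */
static void *bump(void *arg)
{
    volatile uint64_t *counter = arg;
    for (unsigned long i = 0; i < ITERATIONS; i++)
        (*counter)++;
    return NULL;
}

/* Run two threads, one per counter. Returns 0 on success, -1 on error. */
static int run_pair(volatile uint64_t *x, volatile uint64_t *y)
{
    pthread_t t1, t2;

    if (pthread_create(&t1, NULL, bump, (void *)x) != 0)
        return -1;
    if (pthread_create(&t2, NULL, bump, (void *)y) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```

Calling run_pair(&p.a, &p.b) on a packed_counters and on a padded_counters
gives the two variants. On real hardware with the threads on different cores
the packed variant is usually slower, but under cachegrind's single shared
cache both variants touch the simulated cache in the same way.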

Similarly, if you run your code constrained to the 2 hyper-threads of 1 core
of an Intel processor, there will be no false sharing, as both hyper-threads
share that core's L1 cache.

> I want to be able to 
> measure the false-sharing so that I can find and improve it in my target 
> program.

To detect that, cachegrind would need an extended model with private
L1 caches per core and a corresponding coherence protocol. Further, some
kind of scheduling/mapping between threads and cores of that model
has to be in place (I would vote for a fixed 1:1 mapping).
It really would be cool to have that.
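
To make the idea concrete, here is a toy sketch of such an extended model.
It is not cachegrind code and not a proposal for the real implementation:
one direct-mapped private L1 per core, a simple write-invalidate scheme,
and a miss counted as false sharing when the line was invalidated by a
remote write to a different word. All names and sizes are made up.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_CORES 2     /* fixed 1:1 thread-to-core mapping assumed */
#define NUM_SETS  64    /* direct-mapped, hypothetical size */
#define LINE_SIZE 64
#define WORD_SIZE 8

enum line_state { EMPTY = 0, VALID, INVALIDATED };

struct cache_line {
    enum line_state state;
    uint64_t tag;         /* line number held (or last held) */
    uint64_t inval_word;  /* word whose remote write invalidated us */
};

struct model {
    struct cache_line l1[NUM_CORES][NUM_SETS];
    unsigned long misses[NUM_CORES];
    unsigned long false_sharing[NUM_CORES];
};

static void model_init(struct model *m)
{
    memset(m, 0, sizeof *m);   /* all lines EMPTY, all counters zero */
}

/* Simulate one data access by `core` at byte address `addr`. */
static void model_access(struct model *m, int core, uint64_t addr,
                         bool is_write)
{
    uint64_t line = addr / LINE_SIZE;
    uint64_t word = addr / WORD_SIZE;
    struct cache_line *e = &m->l1[core][line % NUM_SETS];

    if (!(e->state == VALID && e->tag == line)) {
        m->misses[core]++;
        /* We had this line, lost it to a remote write, and that write
         * touched a different word: count it as false sharing. */
        if (e->state == INVALIDATED && e->tag == line &&
            e->inval_word != word)
            m->false_sharing[core]++;
        e->state = VALID;
        e->tag = line;
    }

    if (is_write) {
        /* Write-invalidate: drop the line from every other core's L1. */
        for (int c = 0; c < NUM_CORES; c++) {
            struct cache_line *o = &m->l1[c][line % NUM_SETS];
            if (c != core && o->state == VALID && o->tag == line) {
                o->state = INVALIDATED;
                o->inval_word = word;
            }
        }
    }
}
```

Feeding it alternating writes from core 0 to address 0 and from core 1 to
address 8 makes every access after the first round a false sharing miss,
while addresses 0 and 64 give only the two cold misses.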

However, with the way Valgrind is currently implemented, you still would not
see much false sharing, as Valgrind serialises threads: only one thread runs
at a time. A false sharing event can happen only at scheduling points. But we
could make the time slice quantum changeable via a command-line option.
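
A rough sketch of why the quantum matters, under simplifying assumptions
that are mine and not how Valgrind's scheduler is actually written: two
threads write to different words of one cache line, run strictly round-robin,
and the quantum is counted in writes. A thread takes a coherence miss only
on its first write after the other thread has run.

```c
#include <stdbool.h>

/* Count coherence misses for two serialised threads sharing one line.
 * `quantum` is the number of writes a thread performs before the
 * (hypothetical) scheduler switches to the other thread. */
static unsigned long coherence_misses(unsigned long writes_per_thread,
                                      unsigned long quantum)
{
    unsigned long remaining[2] = { writes_per_thread, writes_per_thread };
    bool has_line[2] = { false, false };   /* line valid in private L1? */
    bool ever_had[2] = { false, false };   /* to skip the cold miss */
    unsigned long misses = 0;
    int t = 0;

    if (quantum == 0)
        return 0;   /* no progress possible; treat as nothing to count */

    while (remaining[0] > 0 || remaining[1] > 0) {
        unsigned long burst =
            remaining[t] < quantum ? remaining[t] : quantum;

        for (unsigned long i = 0; i < burst; i++) {
            if (!has_line[t]) {
                if (ever_had[t])
                    misses++;          /* lost the line to the other thread */
                has_line[t] = true;
                ever_had[t] = true;
            }
            has_line[1 - t] = false;   /* our write invalidates the other copy */
        }
        remaining[t] -= burst;
        t = 1 - t;                     /* scheduling point */
    }
    return misses;
}
```

With 1000 writes per thread, a quantum of 1000 or more gives no coherence
misses at all in this sketch, a quantum of 100 gives one per thread at each
switch back, and a quantum of 1 makes nearly every write a miss. So a
smaller quantum would make false sharing much more visible in the model.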

Josef
