To expand slightly on Patrick's last comment:
> Cache prefetching is slightly
> more efficient on local socket, so closer to reader may be a bit better.
Ideally one polls from cache, but if the line is evicted (or invalidated),
the next poll after that eviction pays a lower cost when the memory is
close to the reader.
-Paul
Patrick Geoffray wrote:
> Richard Graham wrote:
> > Yes - it is polling volatile memory, so has to load from memory on
> > every read.
> Actually, it will poll in cache, and only load from memory when the
> cache coherency protocol invalidates the cache line. Volatile semantics
> only prevent compiler optimizations.
> It does not matter much where the pages are (closer to reader or
> receiver) on NUMA systems, as long as they are equally distributed among
> all sockets (i.e. the choice is consistent). Cache prefetching is slightly
> more efficient on the local socket, so closer to the reader may be a bit
> better.
> Patrick
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
--
Paul H. Hargrove phhargr...@lbl.gov
Future Technologies Group
HPC Research Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900