To expand slightly on Patrick's last comment:

>  Cache prefetching is slightly
> more efficient on local socket, so closer to reader may be a bit better.

Ideally one polls from cache, but if the line is evicted, the next poll after the eviction pays a lower cost when the memory is close to the reader.

-Paul


Patrick Geoffray wrote:
Richard Graham wrote:
Yes - it is polling volatile memory, so has to load from memory on every read.

Actually, it will poll in cache, and only load from memory when the cache-coherency protocol invalidates the cache line. The volatile semantic only prevents compiler optimizations; it does not bypass the cache.

On NUMA machines it does not matter much where the pages are (closer to reader or receiver), as long as they are equally distributed among all sockets (i.e., the choice is consistent). Cache prefetching is slightly more efficient on the local socket, so closer to the reader may be a bit better.

Patrick
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


--
Paul H. Hargrove                          phhargr...@lbl.gov
Future Technologies Group
HPC Research Department                   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
