Jeff Squyres wrote:
FWIW, George found what looks like a race condition in the sm init code today -- it looks like we don't call maffinity anywhere in the sm btl startup, so we're not actually guaranteed that the memory is local to any particular process(or) (!). This race shouldn't cause segvs, though; it should only mean that memory is potentially farther away than we intended.
Is this that business that came up recently on one of these mailing lists about setting the memory node to -1 rather than using the value we know it should be? In mca_mpool_sm_alloc(), I do see a call to opal_maffinity_base_bind().
The central question is: does "first touch" mean both read and write? I.e., is the first process that either reads *or* writes to a given location considered "first touch"? Or is it only the first write?
So, maybe the strategy is to create the shared area, have each process initialize its portion (FIFOs and free lists), have all processes sync, and then move on. That way, you know all memory will be written by the appropriate owner before it's read by anyone else. First-touch ownership will be proper and we won't be dependent on zero-filled pages.
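To make that strategy concrete, here is a minimal sketch, not the actual sm BTL code. Everything in it is hypothetical: the names (first_touch_demo, shared_header_t), the layout (one header page followed by page-aligned slices), and the counter-based barrier. It assumes a POSIX system with anonymous shared mmap() and lock-free C11 atomics that work across processes. It makes no maffinity calls; it only illustrates the ordering "owner writes, everyone syncs, then anyone reads".

```c
#define _DEFAULT_SOURCE
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical header; it lives in its own page so the parent's
 * writes to it never touch a page that belongs to a slice. */
typedef struct {
    atomic_int arrived;  /* processes that finished initializing */
    atomic_int failed;   /* set by the parent if a fork() fails */
} shared_header_t;

/* Returns 0 if every process found its neighbor's slice fully
 * initialized after the barrier, -1 otherwise. */
int first_touch_demo(int nprocs, size_t slice_bytes)
{
    long page = sysconf(_SC_PAGESIZE);
    if (nprocs <= 0 || slice_bytes == 0 || page <= 0)
        return -1;

    /* Round each slice up to whole pages so no page has two owners. */
    size_t pg = (size_t)page;
    size_t slice_len = ((slice_bytes + pg - 1) / pg) * pg;
    size_t total = pg + (size_t)nprocs * slice_len;

    void *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    shared_header_t *hdr = base;
    unsigned char *slices = (unsigned char *)base + pg;
    atomic_init(&hdr->arrived, 0);
    atomic_init(&hdr->failed, 0);

    int spawned = 0;
    for (int rank = 0; rank < nprocs; rank++) {
        pid_t pid = fork();
        if (pid < 0) {
            atomic_store(&hdr->failed, 1);  /* release any waiters */
            break;
        }
        if (pid == 0) {
            /* Step 1: the owner is the first to write its slice. */
            memset(slices + (size_t)rank * slice_len, rank + 1, slice_len);

            /* Step 2: sync. Nobody proceeds until all have written. */
            atomic_fetch_add(&hdr->arrived, 1);
            while (atomic_load(&hdr->arrived) < nprocs &&
                   !atomic_load(&hdr->failed))
                sched_yield();

            /* Step 3: only now read memory owned by someone else. */
            int next = (rank + 1) % nprocs;
            unsigned char *theirs = slices + (size_t)next * slice_len;
            int ok = !atomic_load(&hdr->failed);
            for (size_t i = 0; ok && i < slice_len; i++)
                ok = (theirs[i] == (unsigned char)(next + 1));
            _exit(ok ? 0 : 1);
        }
        spawned++;
    }

    int result = (spawned == nprocs) ? 0 : -1;
    for (int i = 0; i < spawned; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) ||
            WEXITSTATUS(status) != 0)
            result = -1;
    }
    munmap(base, total);
    return result;
}
```

For example, first_touch_demo(4, 4096) forks four processes and returns 0 once each has verified its neighbor's slice. Because every slice is written by its owner before the barrier, the result does not depend on whether a read counts as "first touch", nor on the pages starting out zero-filled.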
The big concern in my mind remains that we don't seem to know how to reproduce the failure (segv) that we're trying to fix. Personally, I am reluctant to put fixes into the code for problems I can't observe.