Patrick Geoffray wrote:

Jeff Squyres wrote:

Why not? The "owning" process can do the touch; then it'll be affinity'ed properly. Right?

Yes, that's what I meant by forcing allocation. From the thread, it looked like nobody touched the pages of the mapped file. If it's already done, no need to write in the whole file.

The shared area is used for two kinds of data structures: FIFOs and fragments. Fragments are first touched (written) by their senders. FIFOs are complicated data structures that, up through 1.3.1, were mapped all over the place: some parts local to the sender and some parts local to the receiver. Receivers would touch their parts. Once senders believed the receivers had set up their parts, the senders would initialize their own.

The stuff that occurs "0.01%" of the time that Jeff and Terry saw looked to me like a memory race condition. That is, a receiver would initialize some memory and then publish a pointer. A sender, upon seeing the pointer, would assume the corresponding memory was initialized. But there weren't a whole lot of memory barriers anywhere, and I've wondered whether the sender might see the memory as it was before initialization. I just don't know.
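
To make the suspected race concrete, here is a minimal sketch of the publish/observe pattern with the barriers made explicit. It assumes C11 atomics; demo_fifo_t, receiver_publish and sender_peek are hypothetical names, not the actual shared-memory code under discussion. A real cross-process version would publish an offset into the mapped file rather than a raw pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a receiver-owned FIFO (not the real layout). */
typedef struct {
    int head;
    int tail;
    int payload;
} demo_fifo_t;

static demo_fifo_t fifo_storage;

/* NULL means "receiver has not published yet". */
static _Atomic(demo_fifo_t *) published_fifo = NULL;

/* Receiver side: initialize the memory first, then publish the pointer.
 * The release store keeps the initializing writes from being reordered
 * after the pointer becomes visible. */
void receiver_publish(int value)
{
    assert(value >= 0);            /* -1 is reserved as "not ready" below */
    fifo_storage.head = 0;
    fifo_storage.tail = 0;
    fifo_storage.payload = value;
    atomic_store_explicit(&published_fifo, &fifo_storage,
                          memory_order_release);
}

/* Sender side: returns the payload, or -1 if nothing is published yet.
 * The acquire load pairs with the release store above, so a non-NULL
 * pointer implies the initialized contents are visible too. */
int sender_peek(void)
{
    demo_fifo_t *f = atomic_load_explicit(&published_fifo,
                                          memory_order_acquire);
    if (f == NULL)
        return -1;
    return f->payload;
}
```

If the store and load were plain (non-atomic) accesses, nothing would stop the compiler or a weakly ordered CPU from letting the sender see the pointer before the initialized contents, which is the kind of failure I was wondering about.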

The stuff that occurs "1%" of the time (e.g., in MTT logs noted by Ralph recently) might be something else.

Anyhow, the first touch should all be happening properly from an affinity point of view, and the reason we want zerofill is so that the sender/receiver coordination happens properly (there may be other ways of addressing that). And, most of all, lots of mysteries remain.
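
For what "the owning process does the touch" could look like, here is a rough sketch, assuming the region is split into equal page-aligned slices per rank. The helper first_touch_slice and that slicing scheme are my own illustration, not how the FIFOs and fragments are actually laid out. It writes zeros, so it only makes sense on a fresh region that holds no data yet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: write one byte into every page of the slice owned
 * by `rank`, so the OS allocates those pages on behalf of the calling
 * process. Returns the number of pages touched. */
size_t first_touch_slice(volatile unsigned char *base, size_t len,
                         size_t page_size, int rank, int nranks)
{
    assert(page_size > 0 && nranks > 0 && rank >= 0 && rank < nranks);

    size_t npages   = (len + page_size - 1) / page_size;   /* round up */
    size_t per_rank = (npages + (size_t)nranks - 1) / (size_t)nranks;
    size_t first    = (size_t)rank * per_rank;
    size_t last     = first + per_rank;
    if (last > npages)
        last = npages;

    size_t touched = 0;
    for (size_t p = first; p < last; p++) {
        base[p * page_size] = 0;   /* the write is what forces allocation */
        touched++;
    }
    return touched;
}
```

Each process would call this with its own rank on the mapped region right after mapping it, before anyone else writes there. Under a first-touch placement policy, the pages then land near whoever touched them.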
