On Fri, May 1, 2009 at 4:24 AM, arkady kanevsky <[email protected]> wrote:
> Jie,
> it sounds to me that either the variable is not volatile or compiler
> optimization causes some problem. I would check for these first.
> Arkady
>

Agreed, it is definitely a caching issue.

Atomics are InfiniBand-specific, and there are some fairly complex rules governing how much caching the HCA can do. The gotcha is that those rules provide some cache-coherency guarantees within the context of a connection, but not much between connections or with respect to local applications.

That said, it would be rare for HCA caching to cause anything worse than some unexpected ordering. Adapters cache when they have to, but would really rather not allocate or track a lot of resources; updating real physical memory ASAP is much simpler. Compilers, on the other hand, *love* optimizing.

The key thing to understand is that the HCA is another processor, one that is at least as distant as any other CPU core. Any and all techniques used when sharing memory with another processor apply. Completions hide all of that from the application by promising that specific things are coherent when the user invokes the verbs to reap a completion. So whenever you do without completions, you are dealing with an arbitrary multi-processor memory coherence problem.
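
To make the "treat it like another processor" point concrete, here is a minimal sketch in C11 of the usual flag-plus-payload handoff between two agents sharing memory. Everything in it is an assumption for illustration: `struct mailbox`, `mailbox_init`, `mailbox_publish` and `mailbox_poll` are hypothetical names, and the writer is an ordinary CPU thread standing in for the remote side. It does not model how an HCA actually places data, whose ordering is governed by the transport rules and not by the C memory model.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: a payload plus a flag that the writer sets last. */
struct mailbox {
    uint64_t payload;
    atomic_uint ready;
};

static void mailbox_init(struct mailbox *mb)
{
    mb->payload = 0;
    atomic_init(&mb->ready, 0);
}

/* Writer side (stand-in for the other processor): data first, then flag.
 * The release store keeps the payload write from moving after the flag. */
static void mailbox_publish(struct mailbox *mb, uint64_t value)
{
    mb->payload = value;
    atomic_store_explicit(&mb->ready, 1, memory_order_release);
}

/* Reader side: a single poll attempt. Returns true and fills *out once the
 * flag is visible. The acquire load forces a fresh read of the flag on every
 * call and keeps the payload read from moving ahead of it. */
static bool mailbox_poll(struct mailbox *mb, uint64_t *out)
{
    if (atomic_load_explicit(&mb->ready, memory_order_acquire) == 0)
        return false;
    *out = mb->payload;
    return true;
}
```

A reader would typically spin with something like `while (!mailbox_poll(&mb, &v)) { }`. Declaring the flag `volatile` would stop the compiler from hoisting the load out of such a loop, but by itself it does not order the payload read against the flag read, which is why the sketch uses acquire/release instead.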
