Here is a related case.
If I remember correctly, the HPCC pingpong test synchronizes
occasionally by having one process send a zero-byte broadcast to all
other processes. What does a zero-byte broadcast do? Well, some MPIs
apparently send no data but still give the call synchronization
semantics. (No non-root process can exit before the root process has
entered.) Other MPIs treat a zero-byte broadcast as a no-op: there is no
synchronization, and the timing results from the HPCC pingpong test are
then very misleading. So far as I can tell, the MPI standard doesn't address
which behavior is correct. The test strikes me as deficient: it would
have been just as easy to use a single-word broadcast to get the
synchronization they were looking for.
Sigh.
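
To make the two behaviors concrete, here is a toy model in Python. It is
only a sketch of the semantics described above, not MPI code and not
anything taken from HPCC: the function names (exit_times, max_lead) and
the arrival times are made up for illustration. It assumes the
"synchronizing" flavor means exactly what the parenthetical says (a
non-root rank cannot leave before the root has entered) and that the
"no-op" flavor returns immediately.

```python
def exit_times(arrivals, root=0, synchronizing=True):
    """Return the time at which each rank leaves the broadcast.

    arrivals[i] is the time rank i enters the call (arbitrary units).
    """
    if not synchronizing:
        # No-op model: every rank returns the moment it calls in.
        return list(arrivals)
    root_entry = arrivals[root]
    # Synchronizing model: the root may leave at once, but a non-root
    # rank cannot leave before the root has entered.
    return [t if rank == root else max(t, root_entry)
            for rank, t in enumerate(arrivals)]


def max_lead(arrivals, root=0, synchronizing=True):
    """Largest amount by which a non-root rank leaves before the root enters."""
    root_entry = arrivals[root]
    exits = exit_times(arrivals, root, synchronizing)
    leads = [root_entry - t for rank, t in enumerate(exits) if rank != root]
    return max([0.0] + leads)


if __name__ == "__main__":
    arrivals = [5.0, 1.0, 3.0]  # made-up entry times; rank 0 is the root
    print(exit_times(arrivals, synchronizing=True))   # [5.0, 5.0, 5.0]
    print(exit_times(arrivals, synchronizing=False))  # [5.0, 1.0, 3.0]
    print(max_lead(arrivals, synchronizing=False))    # 4.0
```

In the no-op case rank 1 runs 4 time units ahead of the root, so any
timer it starts right after the "synchronization" measures that skew
along with the communication. Note that even the synchronizing flavor is
not a barrier: if the root arrives first, nobody waits for anybody.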