As part of writing a course, I was trying to investigate how Open MPI
handles transfers when using bog-standard Linux and Ethernet (which I
assume means TCP/IP). Having failed to track down the actual transfer call,
I ran a simple test program under 'strace -f' but, in between two
diagnostic calls (used to pinpoint the MPI_Ssend), there was nothing but
poll() calls! Now that ain't possible ....
Clearly, something odd is going on. My test is sufficiently simple (and
checked) that I can't see much room for a trivial error, though that is
still the most likely explanation. Any suggestions welcome - especially
pointers to the actual transfer call!
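
For anyone wanting to try the same marker technique outside MPI, here is a
minimal sketch of the idea, not my actual test program. It assumes a local
socket pair as a stand-in for the TCP connection, and the function name
marked_transfer and the marker strings are made up for illustration. The
point is only that a socket-based transfer should show up under strace as
send/recv-family system calls sitting between the two marker writes.

```python
import os
import socket


def marked_transfer(payload: bytes, marker_fd: int = 2) -> bytes:
    """Send payload over a local socket pair, bracketed by marker writes.

    Hypothetical helper: the markers are plain write() calls, so they are
    easy to find in strace output, and whatever system calls perform the
    transfer appear between them.
    """
    # Assumption: a local socket pair stands in for the TCP connection.
    sender, receiver = socket.socketpair()
    try:
        os.write(marker_fd, b"MARK: before transfer\n")

        # One thread does both ends, so the payload must fit in the
        # socket buffer or sendall() would block forever.
        sender.sendall(payload)
        sender.shutdown(socket.SHUT_WR)

        chunks = []
        while True:
            chunk = receiver.recv(65536)
            if not chunk:  # empty read: the sender has shut down
                break
            chunks.append(chunk)

        os.write(marker_fd, b"MARK: after transfer\n")
    finally:
        sender.close()
        receiver.close()
    return b"".join(chunks)


if __name__ == "__main__":
    marked_transfer(b"x" * 1024)
```

Saved under some name such as sketch.py (hypothetical) and run as
'strace -f python3 sketch.py', the two MARK writes should bracket the
socket calls. That is the pattern I expected from the MPI_Ssend and did not
get.
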
Incidentally, the issue I am investigating is how the MPI transfers are
likely to use the cache, and hence how much they are likely to interfere
when overlapped with memory-bound computation or GPU use, especially when
using lots of cores. That's a long-standing and not-pretty problem with
most MPI implementations.
Regards,
Nick Maclaren.