TotalviewTech brought to my attention that their new ReplayEngine
debugger does not support shared memory. To enforce this, they
intercept mmap() and make it fail with ENOMEM to indicate that shared
memory is not available.
In OMPI, if mmap() fails, we unconditionally emit some
opal_output(0, ...) messages. I think we should remove these
opal_output() calls, perhaps replacing them with verbose equivalents
(i.e., emitting those messages only conditionally). We do still fail
over to other transports, as expected.
However, the failover gives a somewhat false sense of security: the sm
BTL's mmap() occurs during BTL add_procs(), not during component or
module startup. So you can run:
    mpirun --mca btl tcp,sm,self ...
and yet never actually use sm at all, even if multiple procs are on
the same node. Specifically, the sm component and module init succeed
(so no error is reported), but every attempt to add a proc to the sm
BTL fails. This can be problematic for transports that do not support
same-host loopback (e.g., iWARP): you'll actually end up with
"unreachable" errors, even though you're supposedly using the sm BTL.
That would be very confusing to a user and difficult to diagnose.
Should we add some kind of trivial mmap() test during the sm BTL
component/module init to see if shared memory is available at all?
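
As a rough sketch of what such a test might look like (sm_mmap_probe()
is a hypothetical name, not an existing OMPI function, and the /tmp
backing file is only an assumption for illustration; the real sm BTL
would presumably use its own backing file location):

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try to create one page of file-backed shared
 * memory.  Returns 0 if mmap(MAP_SHARED) works, otherwise an errno
 * value describing the first failure. */
static int sm_mmap_probe(void)
{
    char path[] = "/tmp/sm_probe_XXXXXX";   /* assumed location */
    long page = sysconf(_SC_PAGESIZE);
    int fd, err = 0;
    void *addr;

    if (page <= 0) {
        return EINVAL;
    }

    fd = mkstemp(path);
    if (fd < 0) {
        return errno;
    }
    unlink(path);                 /* file disappears once fd is closed */

    if (ftruncate(fd, (off_t) page) != 0) {
        err = errno;
        close(fd);
        return err;
    }

    addr = mmap(NULL, (size_t) page, PROT_READ | PROT_WRITE,
                MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        err = errno;              /* e.g., ENOMEM under an interposer */
    } else {
        ((volatile char *) addr)[0] = 1;   /* touch the mapping */
        munmap(addr, (size_t) page);
    }

    close(fd);
    return err;
}
```

If a check like this returned non-zero during component/module init,
the sm BTL could decline to be selected up front, rather than failing
later on every add_procs().
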
--
Jeff Squyres
Cisco Systems