On May 29, 2008, at 3:27 AM, Pavel Shamis (Pasha) wrote:
I got some more feedback from Roland off-list explaining that if
/sys/class/infiniband does exist and is non-empty and
/sys/class/infiniband_verbs/abi_version does not exist, then this is
definitely a case where we want to warn because it implies that config
is screwed up -- RDMA devices are present but not usable.
Is it possible for the /sys/class/infiniband directory to exist but be
empty? In which cases?
Roland consistently said "...and not empty" in e-mails to me, so
that's what I assumed.
However, Pasha just did a test: on a machine with a ConnectX HCA, he
manually removed the mlx4 driver and started the openibd service.
/sys/class/infiniband was created, but it was empty.
I guess this is a situation that we want to warn about -- we can
simplify the whole deal by making one overriding assumption: if the
drivers are loaded (such that /sys/class/infiniband/ exists at all),
OMPI should expect to find some RDMA devices. If it doesn't find any,
it should issue a warning.
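
A rough sketch of the combined check, covering both Roland's case and
the empty-directory case. This is Python purely for illustration (OMPI
itself is C), and the function name rdma_sysfs_warning and the
sysfs_root parameter are made up for this sketch; they are not anything
in the OMPI code base.

```python
import os


def rdma_sysfs_warning(sysfs_root="/sys/class"):
    """Return a warning string if the RDMA setup looks broken, else None.

    sysfs_root is a hypothetical parameter so the check can be pointed
    at a fake directory tree for testing.
    """
    ib_dir = os.path.join(sysfs_root, "infiniband")
    abi_file = os.path.join(sysfs_root, "infiniband_verbs", "abi_version")

    # No infiniband directory: drivers are not loaded, so stay quiet.
    if not os.path.isdir(ib_dir):
        return None

    # Drivers are loaded but no devices showed up (the mlx4 test case).
    if not os.listdir(ib_dir):
        return ib_dir + " exists but is empty: no RDMA devices found"

    # Devices are listed but the verbs ABI file is missing (Roland's case).
    if not os.path.exists(abi_file):
        return abi_file + " is missing: RDMA devices present but not usable"

    return None
```

Calling rdma_sysfs_warning() with no arguments checks the real
/sys/class tree; any non-None return value is the text to warn with.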
--
Jeff Squyres
Cisco Systems