On May 29, 2008, at 3:27 AM, Pavel Shamis (Pasha) wrote:

I got some more feedback from Roland off-list explaining that if
/sys/class/infiniband does exist and is non-empty and
/sys/class/infiniband_verbs/abi_version does not exist, then this is
definitely a case where we want to warn because it implies that config
is screwed up -- RDMA devices are present but not usable.

Is it possible for the /sys/class/infiniband directory to exist but be
empty? In which cases?

Roland consistently said "...and not empty" in e-mails to me, so that's what I assumed.

However, Pasha just did a test: on a machine with a ConnectX HCA, he manually removed the mlx4 driver and started the openibd service. /sys/class/infiniband was created, but it was empty.

I guess this is a situation that we want to warn about -- we can simplify the whole deal by making the overriding assumption: if the drivers are loaded at all (such that /sys/class/infiniband/ exists at all), OMPI should expect to be able to find some RDMA devices. If it doesn't find any, it should issue a warning.
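
To make the rule concrete, here is a rough sketch of the check in Python. It is only an illustration of the logic discussed in this thread, not OMPI's actual code: the function name rdma_sysfs_warning, the sysfs_root parameter, and the warning strings are all made up for this example.

```python
import os


def rdma_sysfs_warning(sysfs_root="/sys/class"):
    """Return a warning string if the drivers look loaded but no usable
    RDMA device is visible; return None if there is nothing to warn about.

    Hypothetical helper: the name, parameter, and messages are assumptions.
    """
    ib_dir = os.path.join(sysfs_root, "infiniband")
    abi_file = os.path.join(sysfs_root, "infiniband_verbs", "abi_version")

    # No infiniband directory at all: drivers are not loaded, so stay quiet.
    if not os.path.isdir(ib_dir):
        return None

    # Directory exists but is empty (the case Pasha reproduced).
    if not os.listdir(ib_dir):
        return "drivers appear loaded, but no RDMA devices were found"

    # Devices are listed, but the verbs ABI file is missing (Roland's case).
    if not os.path.exists(abi_file):
        return "RDMA devices are present, but abi_version is missing"

    return None


if __name__ == "__main__":
    message = rdma_sysfs_warning()
    if message is not None:
        print("WARNING: " + message)
```

The sysfs_root parameter exists only so the sketch can be pointed at a scratch directory instead of the real /sys/class when trying it out.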

--
Jeff Squyres
Cisco Systems
