On May 28, 2008, at 8:02 AM, Jeff Squyres wrote:
> Note that the two /sys checks may be redundant; I'm not entirely sure
> how the two files relate to each other. libibverbs will complain
> about the first if it is not present; the second is used to indicate
> that the kernel drivers are loaded.
I got some more feedback from Roland off-list explaining that if
/sys/class/infiniband does exist and is non-empty and
/sys/class/infiniband_verbs/abi_version does not exist, then this is
definitely a case where we want to warn because it implies that config
is screwed up -- RDMA devices are present but not usable.
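To make the condition concrete, here is a minimal Python sketch of the check Roland describes. It is only an illustration, not OMPI or libibverbs code: the function name rdma_present_but_unusable and the sysfs_root parameter are made up here so the check can be pointed at a scratch directory.

```python
import os


def rdma_present_but_unusable(sysfs_root="/sys"):
    """Return True if RDMA devices are listed but the verbs ABI file is missing.

    Hypothetical helper for illustration; sysfs_root is a parameter only so
    that the check can be exercised against a fake directory tree.
    """
    ib_dir = os.path.join(sysfs_root, "class", "infiniband")
    abi_file = os.path.join(
        sysfs_root, "class", "infiniband_verbs", "abi_version"
    )

    # "Present" means the class directory exists and has at least one entry.
    devices_present = os.path.isdir(ib_dir) and len(os.listdir(ib_dir)) > 0

    # Devices are listed, but the verbs ABI file is missing.
    return devices_present and not os.path.exists(abi_file)
```

Calling rdma_present_but_unusable() with no arguments checks the real /sys tree.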
In this case, I think the warning that libibverbs itself prints is
suitable ("Fatal: couldn't read..."). So let's just eliminate that
check in OMPI and go with something like the following (pretty much
exactly what was proposed a while ago by Pasha :-) ):
    # If $sysfsdir/class/infiniband does not exist, the driver was not
    # started. Therefore: assume that the user does not want RDMA
    # hardware support -- do *not* print a warning message (unless the
    # user asked to always see warnings).
    if (! -d "$sysfsdir/class/infiniband") {
        if ($always_want_to_see_warnings) {
            print "Warning: $sysfsdir/class/infiniband does not exist\n";
        }
        return SKIP_THIS_BTL;
    }

    # If we get to this point, the drivers are loaded and therefore we
    # will assume that there is supposed to be at least one RDMA device
    # present. Warn if we don't find any.
    $list = ibv_get_device_list();
    if (empty($list)) {
        print "Warning: couldn't find any RDMA devices -- if you have " .
              "no RDMA devices, stop the driver to avoid this warning message\n";
        return SKIP_THIS_BTL;
    }

    # ...continue with initialization; warnings and errors are
    # *always* displayed after this point
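
For anyone who wants to play with the logic outside of OMPI, the flow above can be sketched as runnable Python. This is only a sketch under stated assumptions, not the actual implementation: the real device query is libibverbs' ibv_get_device_list() in C, so here it is stood in for by a caller-supplied get_device_list callable. The names check_rdma, CONTINUE_INIT, verbose, and warn are hypothetical.

```python
import os

SKIP_THIS_BTL = "skip"      # stands in for the pseudocode's SKIP_THIS_BTL
CONTINUE_INIT = "continue"  # hypothetical: "go on with initialization"


def check_rdma(sysfsdir, get_device_list, verbose=False, warn=print):
    """Decide whether to skip the BTL, warning only when appropriate.

    get_device_list is a stand-in for ibv_get_device_list(); it should
    return a (possibly empty) sequence of devices.
    """
    ib_dir = os.path.join(sysfsdir, "class", "infiniband")

    # No class directory: the driver was not started, so stay quiet
    # unless the user asked to always see warnings.
    if not os.path.isdir(ib_dir):
        if verbose:
            warn(f"Warning: {ib_dir} does not exist")
        return SKIP_THIS_BTL

    # Driver is loaded, so at least one device is expected; warn if none.
    if not get_device_list():
        warn("Warning: couldn't find any RDMA devices -- if you have "
             "no RDMA devices, stop the driver to avoid this warning message")
        return SKIP_THIS_BTL

    # Warnings and errors are always displayed after this point.
    return CONTINUE_INIT
```

For example, check_rdma("/sys", lambda: []) returns SKIP_THIS_BTL; it is silent on a machine where the driver is not loaded and prints the "couldn't find any RDMA devices" warning where the driver is loaded but no devices are found.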
--
Jeff Squyres
Cisco Systems