I think I've found the problem: it is the OFED Roll that I'm using. When a node
is first built it recompiles the OFED modules for the current kernel. I'm still
deciphering the actual sequence of events, but I think that I need to add a
reboot at the end of the process.
Mike
On Apr 14
Hey Mike,
That's pretty odd. It looks like the o2ib module has a symbol mismatch
with the OFED driver. I'm surprised it works at all. Can you send the
dmesg output after modprobe lustre + mounting, as well as the lctl
list_nids output?
Thanks,
Kit
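
For anyone gathering the same information, here is a minimal sketch that
captures the two outputs Kit asks for in one paste-able report. It assumes a
Python 3 interpreter is available on the client and that "modprobe lustre" and
the mount attempt have already been run; the function name collect and the
header format are made up for illustration.

```python
import subprocess

# The two commands requested in the thread.
COMMANDS = [
    ["dmesg"],
    ["lctl", "list_nids"],
]


def collect(commands=COMMANDS):
    """Run each command and return its output under a header line."""
    sections = []
    for cmd in commands:
        try:
            result = subprocess.run(
                cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,  # keep error text with the output
                universal_newlines=True,
            )
            body = result.stdout
        except OSError as exc:  # e.g. lctl is not installed on this host
            body = "could not run: {}\n".format(exc)
        sections.append("==== {} ====\n{}".format(" ".join(cmd), body))
    return "\n".join(sections)
```

Calling print(collect()) as root right after the failed mount should give a
single block of text to send to the list.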
On 4/14/2010 1:42 PM, Michael Robbert wrote:
Michael Robbert wrote:
> Kit,
> I thought that it may be a timing issue, but I added mount commands to
> rc.local and it didn't help.
Robert,
I'm not sure of the root cause of your mount problems, but we were also
hitting a timing problem when mounting file systems over Infiniband at
boot time.
Kit,
I thought that it may be a timing issue, but I added mount commands to rc.local
and it didn't help. The odd thing is that it does seem to work on subsequent
reboots. I haven't done extensive testing to see if that works all the time or
not. The other odd thing is that if the FSs don't mount
Hey Mike,
Are there any messages in dmesg on boot? I've occasionally seen the IB
take a second to actually start. If that's the case, you might need to
add mounts to rc.local, or try to get openibd to start earlier.
- Kit
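
One way to act on the rc.local suggestion is to wait until an InfiniBand port
reports ACTIVE before mounting, rather than mounting blindly. The sketch below
is only an illustration under several assumptions: a Python 3 interpreter is
available (which may not be the case on a stock CentOS 5.4 install), the port
state is exposed under /sys/class/infiniband, and the Lustre filesystems are
listed in /etc/fstab. The function names and the 60-second timeout are
hypothetical.

```python
import glob
import subprocess
import time

# sysfs files reporting each InfiniBand port's link state,
# e.g. "4: ACTIVE" once the port is up.
IB_STATE_GLOB = "/sys/class/infiniband/*/ports/*/state"


def ib_port_active(pattern=IB_STATE_GLOB):
    """Return True if any state file matching pattern reports ACTIVE."""
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                if "ACTIVE" in f.read():
                    return True
        except OSError:
            continue  # port vanished or is unreadable; try the next one
    return False


def wait_for_ib(timeout=60.0, interval=1.0, pattern=IB_STATE_GLOB):
    """Poll until a port is ACTIVE or timeout seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if ib_port_active(pattern):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def mount_when_ready():
    """Mount all lustre entries from /etc/fstab once the IB link is up."""
    if wait_for_ib():
        return subprocess.call(["mount", "-a", "-t", "lustre"])
    print("InfiniBand port never became ACTIVE; not mounting")
    return 1
```

A script that calls mount_when_ready() could be invoked from rc.local. Note
that this only addresses a slow link; it would not help if the modules
themselves fail to load.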
On 4/12/2010 7:33 PM, Michael Robbert wrote:
> I am trying to c
I am trying to configure a Lustre 1.8.2 client on a CentOS 5.4 machine. I have
compiled from source into RPMs and all 4 RPMs are installed (lustre, -modules,
-tests, and -source). The lustre module will load fine manually with "modprobe
lustre", but I cannot get the filesystem to automatically