I think I've found the problem: it's the OFED Roll that I'm using. When a node is first built, it recompiles the OFED modules for the current kernel. I'm still deciphering the actual sequence of events, but I think I need to add a reboot at the end of the process.
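For anyone who hits the same thing, here is a minimal sketch of how a node could check for the mismatch before trying to mount. It only scans kernel log text for the "disagrees about version of symbol" lines quoted further down in this thread. The function name and the sample text are made up for illustration, and I have not run this on the cluster.

```python
import re
from collections import defaultdict

# Matches kernel log lines such as:
#   ko2iblnd: disagrees about version of symbol ib_fmr_pool_unmap
# An optional leading "[timestamp]" is tolerated (assumption: some kernels add one).
MISMATCH_RE = re.compile(
    r"^\s*(?:\[[^\]]*\]\s*)?(\S+): disagrees about version of symbol (\S+)\s*$"
)


def find_symbol_mismatches(dmesg_text):
    """Return {module: sorted list of symbols} for every version-mismatch line."""
    found = defaultdict(set)
    for line in dmesg_text.splitlines():
        match = MISMATCH_RE.match(line)
        if match:
            module, symbol = match.groups()
            found[module].add(symbol)
    return {module: sorted(symbols) for module, symbols in found.items()}


# Hypothetical sample, copied from the dmesg excerpt in this thread.
SAMPLE = """\
Lustre: Lustre Version: 1.8.2
ko2iblnd: disagrees about version of symbol ib_fmr_pool_unmap
ko2iblnd: Unknown symbol ib_fmr_pool_unmap
ko2iblnd: disagrees about version of symbol ib_fmr_pool_map_phys
ko2iblnd: Unknown symbol ib_fmr_pool_map_phys
"""

print(find_symbol_mismatches(SAMPLE))
```

If the result is non-empty after the first boot of a freshly built node, that would suggest ko2iblnd was loaded against OFED modules it was not built for, which fits the idea that a reboot is needed after the OFED rebuild.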
Mike

On Apr 14, 2010, at 10:21 PM, Kit Westneat wrote:

> Hey Mike,
>
> That's pretty odd, it looks like the o2ib module has a symbol mismatch
> with the ofed driver. I'm surprised it works at all...can you send the
> dmesg output after modprobe lustre + mounting, as well as the lctl
> list_nids output?
>
> Thanks,
> Kit
>
> On 4/14/2010 1:42 PM, Michael Robbert wrote:
>> Kit,
>> I thought that it may be a timing issue, but I added mount commands to
>> rc.local and it didn't help. The odd thing is that it does seem to work on
>> subsequent reboots. I haven't done extensive testing to see if that works
>> all the time or not. The other odd thing is that if the FSs don't mount on
>> boot a manual mount command does not work without doing "modprobe
>> lustre" first. This is what I see in that case:
>>
>> [r...@compute-2-1 ~]# mount -a
>> mount.lustre: mount 172.16.3...@o2ib:/home at /lustre/home failed: No such device
>> Are the lustre modules loaded?
>> Check /etc/modprobe.conf and /proc/filesystems
>> Note 'alias lustre llite' should be removed from modprobe.conf
>> mount.lustre: mount 172.16.3...@o2ib:/scratch at /lustre/scratch failed: No such device
>> Are the lustre modules loaded?
>> Check /etc/modprobe.conf and /proc/filesystems
>> Note 'alias lustre llite' should be removed from modprobe.conf
>>
>> Here are some dmesg entries from a boot that does not mount the FSs:
>>
>> ADDRCONF(NETDEV_UP): eth0: link is not ready
>> bnx2: eth0 NIC Copper Link is Up, 1000 Mbps full duplex
>> ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
>> ADDRCONF(NETDEV_UP): ib0: link is not ready
>> ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready
>> Lustre: OBD class driver, http://www.lustre.org/
>> Lustre: Lustre Version: 1.8.2
>> Lustre: Build Version: 1.8.2-20100122190848-PRISTINE-2.6.18-164.15.1.el5
>> ko2iblnd: disagrees about version of symbol ib_fmr_pool_unmap
>> ko2iblnd: Unknown symbol ib_fmr_pool_unmap
>> ...
>> Lots more ko2iblnd errors here (Is this part of the problem or a red
>> herring?)
>> ...
>> ko2iblnd: disagrees about version of symbol ib_fmr_pool_map_phys
>> ko2iblnd: Unknown symbol ib_fmr_pool_map_phys
>> LustreError: 3288:0:(api-ni.c:1043:lnet_startup_lndnis()) Can't load LND o2ib, module ko2iblnd, rc=256
>> LustreError: 3288:0:(events.c:729:ptlrpc_init_portals()) network initialisation failed
>> LustreError: 165-2: Nothing registered for client mount! Is the 'lustre' module loaded?
>> LustreError: 3381:0:(obd_mount.c:2042:lustre_fill_super()) Unable to mount (-19)
>>
>>
>> Thanks,
>> Mike
>>
>> On Apr 12, 2010, at 10:07 PM, Kit Westneat wrote:
>>
>>
>>> Hey Mike,
>>>
>>> Are there any messages in dmesg on boot? I've seen it on occasion where
>>> the IB takes a second to actually start. If that's the case, you might
>>> need to add mounts to rc.local, or try to get openibd to start earlier.
>>>
>>> - Kit
>>>
>>> On 4/12/2010 7:33 PM, Michael Robbert wrote:
>>>
>>>> I am trying to configure a Lustre 1.8.2 client on a CentOS 5.4 machine. I
>>>> have compiled from source into RPMS and all 4 RPMS are installed (lustre,
>>>> -modules, -tests, and -source). The lustre module will load fine manually
>>>> with "modprobe lustre", but I can not get the filesystem to automatically
>>>> mount on boot up. I have added the following to /etc/modprobe.conf
>>>>
>>>> options lnet networks=o2ib0(ib0)
>>>>
>>>> and these are the entries in my /etc/fstab
>>>>
>>>> 172.16.3...@o2ib:/home /lustre/home lustre auto,_netdev 1 2
>>>> 172.16.3...@o2ib:/scratch /lustre/scratch lustre auto,_netdev 1 2
>>>>
>>>> I have a similar setup with Lustre 1.6.7.2 client running on RHEL 4.5 and
>>>> it loads fine there.
>>>>
>>>> What am I missing?
>>>>
>>>> Thanks,
>>>> Mike Robbert
>>>>
>>>> _______________________________________________
>>>> Lustre-discuss mailing list
>>>> Lustre-discuss@lists.lustre.org
>>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>>>
>>>>
>>>
>>> --
>>> ---
>>> Kit Westneat
>>> kwestn...@datadirectnet.com
>>> 812-484-8485
>>>
>>>
>>
>
>
> --
> ---
> Kit Westneat
> kwestn...@datadirectnet.com
> 812-484-8485
>