Chris Worley wrote:
> Does anybody have any clues, or do I need to rebuild the entire FS from
> scratch?
First, what is in your client modprobe.conf? It should list only 'tcp' for
TCP-only clients.
Second, I don't think you can use an IPoIB address as a TCP connection.
If it's IPoIB, LNET is going to use o2ib.
cliffw

>
> On Mon, Apr 21, 2008 at 9:31 PM, Chris Worley <[EMAIL PROTECTED]> wrote:
>> On Mon, Apr 21, 2008 at 9:22 PM, Chris Worley <[EMAIL PROTECTED]> wrote:
>> > The only configuration error on my OSS was: I initially only had
>> > "o2ib0(ib0)" in my modprobe.conf. After unmounting all the OSTs, and
>> > getting the modprobe.conf right:
>> >
>> > options lnet networks=o2ib0(ib0),tcp0(eth0)
>> >
>> > ...and remounting from scratch, both ksocklnd and ko2iblnd are now
>> > loaded properly.
>> >
>> > But, I still can't mount the partition on the ethernet-only client nodes.
>> >
>> > They get the error:
>> >
>> > LustreError: 8439:0:(events.c:401:ptlrpc_uuid_to_peer()) No NID found
>> > for [EMAIL PROTECTED]
>> > LustreError: 8439:0:(client.c:58:ptlrpc_uuid_to_connection()) cannot
>> > find peer [EMAIL PROTECTED]
>> > LustreError: 8439:0:(ldlm_lib.c:312:client_obd_setup()) can't add
>> > initial connection
>> > LustreError: 8439:0:(obd_config.c:325:class_setup()) setup
>> > lfs-OST0026-osc-0000010753919000 failed (-2)
>> > LustreError: 8439:0:(obd_config.c:1062:class_config_llog_handler())
>> > Err -2 on cfg command:
>> > Lustre: cmd=cf003 0:lfs-OST0026-osc 1:lfs-OST0026_UUID 2:[EMAIL PROTECTED]
>> > LustreError: 15c-8: [EMAIL PROTECTED]: The configuration from log
>> > 'lfs-client' failed (-2).
>> >
>> > The 36.102.29.4 is the IPoIB address of the added OSS. It shouldn't
>> > want it "@o2ib".
>> >
>> > I've also unmounted all Lustre mounts on the MGS/MDS, unloaded all the
>> > modules and remounted. Still no joy.
>> >
>>
>> From this point forward, every time I say "OST" I mean "OSS"...
>>
>> > The file systems were created on the new OST, just as on all the others:
>> >
>> > for i in b c d e f g h i j k l; do mkfs.lustre --ost
>> > --mgsnode="[EMAIL PROTECTED],[EMAIL PROTECTED]" --fsname=lfs --param
>> > sys.timeout=40 --param lov.stripesize=2M /dev/sd$i & done
>> >
>> > The client has the right modprobe.conf, which worked before the
>> > additional OST:
>> >
>> > options lnet networks=tcp0(eth0)
>> >
>> > ... and I'm using the same mount command that worked previously:
>> >
>> > mount -t lustre [EMAIL PROTECTED]:/lfs /lfs
>> >
>> > From the OST I can ping the client:
>> >
>> > # lctl list_nids
>> > [EMAIL PROTECTED]
>> > [EMAIL PROTECTED]
>> > # lctl ping [EMAIL PROTECTED]
>> > [EMAIL PROTECTED]
>> > [EMAIL PROTECTED]
>> >
>> > From the client, I can ping the OST and MDS/MGS:
>> >
>> > # lctl list_nids
>> > [EMAIL PROTECTED]
>> > # lctl ping [EMAIL PROTECTED]
>> > [EMAIL PROTECTED]
>> > [EMAIL PROTECTED]
>> > [EMAIL PROTECTED]
>> > # lctl ping [EMAIL PROTECTED]
>> > [EMAIL PROTECTED]
>> > [EMAIL PROTECTED]
>> > [EMAIL PROTECTED]
>> >
>> > So, somehow, not having the right modprobe.conf the first time I
>> > mounted the partitions on the new OST has made it permanently not want
>> > to mount properly on Ethernet clients (it mounts fine on IB clients).
>> >
>> > Any ideas?
>> >
>> > Thanks,
>> >
>> > Chris
>> >
>> > _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
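
The modprobe.conf check discussed in this thread can be done mechanically. Below is a minimal sketch in Python, not a Lustre tool: it parses an "options lnet networks=..." line and tests whether a given NID's network is one the client has configured. The helper names (normalize, parse_lnet_networks, nid_reachable) are hypothetical, the NID 36.102.29.4@o2ib is reconstructed from the message text because the archive masks the NIDs, and treating an unnumbered network such as "o2ib" as "o2ib0" is an assumption made here.

```python
import re


def normalize(name):
    """Treat a network name without a number as network 0.

    Assumption for this sketch: 'tcp' means 'tcp0', 'o2ib' means 'o2ib0'.
    """
    return name if name[-1:].isdigit() else name + "0"


def parse_lnet_networks(line):
    """Return the network names from an 'options lnet networks=...' line."""
    match = re.search(r"networks=(\S+)", line)
    if match is None:
        return []
    spec = match.group(1).strip("\"'")
    # Entries look like 'tcp0(eth0)'; drop the interface list in parentheses.
    spec = re.sub(r"\([^)]*\)", "", spec)
    return [normalize(n) for n in spec.split(",") if n]


def nid_reachable(nid, client_line):
    """True if the NID's network (the part after '@') is configured on the client."""
    _, _, net = nid.partition("@")
    return normalize(net) in parse_lnet_networks(client_line)


if __name__ == "__main__":
    client = "options lnet networks=tcp0(eth0)"
    server = "options lnet networks=o2ib0(ib0),tcp0(eth0)"
    print(parse_lnet_networks(server))                  # ['o2ib0', 'tcp0']
    print(nid_reachable("36.102.29.4@o2ib", client))    # False
    print(nid_reachable("36.102.29.4@o2ib", server))    # True
```

With the tcp-only client line from the thread, an @o2ib NID comes back as not reachable, which matches the kind of "No NID found" failure reported above. The sketch only compares network names; it says nothing about what the MGS has recorded for the OSTs.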
