I think your client modprobe.conf lnet option should be this:

options lnet networks=o2ib(ib0)

(not 'o2ib0'). Another thing to try, if that doesn't work, is to lctl ping your MDS/MGS/OSS nids, like so:

lctl ping [EMAIL PROTECTED]

Cheers,
Craig

Chris Worley wrote:
> More issues. Now, on the clients.
>
> The MDT/MGS/OST's are all up and mounted, showing:
>
> # lctl list_nids
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
>
> Now, when I go to mount on the IB-based clients, I get:
>
> # mount -t lustre [EMAIL PROTECTED]:/ddnlfs /lfs
> mount.lustre: mount [EMAIL PROTECTED]:/ddnlfs at /lfs failed: No
> such file or directory
> Is the MGS specification correct?
> Is the filesystem name correct?
> If upgrading, is the copied client log valid? (see upgrade docs)
>
> The modprobe.conf contains:
>
> options lnet networks=o2ib0(ib0)
>
> And lctl looks good:
>
> # lctl list_nids
> [EMAIL PROTECTED]
>
> But dmesg shows that it wants to go over the 36.121.x.x (tcp) network
> (36.12[12].255.201 is the MGS/MDS server):
>
> LustreError: 10001:0:(events.c:401:ptlrpc_uuid_to_peer()) No NID found
> for [EMAIL PROTECTED]
> LustreError: 10001:0:(client.c:58:ptlrpc_uuid_to_connection()) cannot
> find peer [EMAIL PROTECTED]
> LustreError: 10001:0:(ldlm_lib.c:312:client_obd_setup()) can't add
> initial connection
> LustreError: 9836:0:(connection.c:142:ptlrpc_put_connection()) NULL connection
> LustreError: 10001:0:(obd_config.c:325:class_setup()) setup
> ddnlfs-MDT0000-mdc-0000010430913c00 failed (-2)
> LustreError: 10001:0:(obd_config.c:1062:class_config_llog_handler())
> Err -2 on cfg command:
> Lustre: cmd=cf003 0:ddnlfs-MDT0000-mdc 1:ddnlfs-MDT0000_UUID
> 2:[EMAIL PROTECTED]
> LustreError: 15c-8: [EMAIL PROTECTED]: The configuration from log
> 'ddnlfs-client' failed (-2). This may be the result of communication
> errors between this node and the MGS, a bad configuration, or other
> errors. See the syslog for more information.
> LustreError: 10001:0:(llite_lib.c:1021:ll_fill_super()) Unable to
> process log: -2
> LustreError: 10001:0:(obd_config.c:392:class_cleanup()) Device 2 not setup
> Lustre: client 0000010430913c00 umount complete
> LustreError: 10001:0:(obd_mount.c:1924:lustre_fill_super()) Unable to
> mount (-2)
>
> Note that this setup works fine in the non-multihomed setup, so I
> don't think ko2iblnd is to blame (the setup on the clients hasn't
> changed at all).
>
> What am I doing wrong?
>
> Thanks,
>
> Chris
>
> On Fri, Mar 7, 2008 at 7:41 AM, Chris Worley <[EMAIL PROTECTED]> wrote:
>> I changed my modprobe.conf to look exactly as yours, and it worked. I
>> hadn't been using all the quotes until the doc said to... but they may
>> have indeed been the problem.
>>
>> Thanks!
>>
>> Chris
>>
>> On Fri, Mar 7, 2008 at 3:40 AM, Charles Taylor <[EMAIL PROTECTED]> wrote:
>> >
>> >
>> > Do "lctl list_nids" on your mds and oss's. They should look
>> > something like this.
>> >
>> > [EMAIL PROTECTED] ~]# lctl list_nids
>> > [EMAIL PROTECTED]
>> > [EMAIL PROTECTED]
>> >
>> > Then your clients should have a nid on one or the other.
>> >
>> > Check your dmesg output after loading lnet. The complaints are
>> > pretty useful. Your modprobe.conf line looks correct although we
>> > found we did not need all the quoting so you should check that as
>> > well. Ours looks like...
>> >
>> > options lnet networks=o2ib(ib0),tcp(eth0)
>> >
>> > My guess is that it either cannot find or does not like your ko2iblnd
>> > module.
>> >
>> > ct
>> >
>> >
>> >
>> > On Mar 7, 2008, at 12:46 AM, Chris Worley wrote:
>> >
>> > > Most everything is over IB, but I have a few systems I'd like to mount
>> > > the Lustre fs over GigE.
>> > >
>> > > I think I've followed the Multihomed instructions correctly, in:
>> > >
>> > > http://dlc.sun.com/pdf/820-3681/820-3681.pdf
>> > >
>> > > My /etc/modprobe.conf on mds/mgs/oss servers (which all have both
>> > > Ethernet and IB) includes:
>> > >
>> > > options lnet 'networks="tcp0(eth0),o2ib0(ib0)"'
>> > >
>> > > I make and mount the mdt with (which has both IB and Ethernet, subnet
>> > > 36.122.x.x is IB, 36.121.x.x is Ethernet):
>> > >
>> > > # mkfs.lustre --mdt --mgs
>> > > --mgsnode="[EMAIL PROTECTED],[EMAIL PROTECTED]" <... > /dev/md0
>> > > # mount -t lustre /dev/md0 /lfs/mdtb
>> > >
>> > > But, at this point, the ksocklnd module is loaded rather than the
>> > > ko2iblnd module!
>> > >
>> > > On the OSS, I make the fs w/ the same "mgsnode", but, when I try to
>> > > mount it, it correctly uses the IB interface, but can't contact the
>> > > MDS:
>> > >
>> > > LustreError: 27520:0:(events.c:401:ptlrpc_uuid_to_peer()) No NID found
>> > > for [EMAIL PROTECTED]
>> > > LustreError: 27520:0:(client.c:58:ptlrpc_uuid_to_connection()) cannot
>> > > find peer [EMAIL PROTECTED]
>> > > LustreError: 27520:0:(ldlm_lib.c:312:client_obd_setup()) can't add
>> > > initial connection
>> > > LustreError: 17126:0:(connection.c:142:ptlrpc_put_connection())
>> > > NULL connection
>> > > LustreError: 27520:0:(obd_config.c:325:class_setup()) setup
>> > > [EMAIL PROTECTED] failed (-2)
>> > > LustreError: 27520:0:(obd_mount.c:454:lustre_start_simple())
>> > > [EMAIL PROTECTED] setup error -2
>> > > LustreError: 27520:0:(obd_mount.c:1368:server_put_super()) no obd
>> > > ddnlfs-OSTffff
>> > > LustreError: 27520:0:(obd_mount.c:119:server_deregister_mount())
>> > > ddnlfs-OSTffff not registered
>> > >
>> > > It too has loaded the ksocklnd module, and not the ko2iblnd module. I
>> > > guess that both modules should be loaded in a multihomed case?
>> > >
>> > > What am I doing wrong?
>> > >
>> > > Thanks,
>> > >
>> > > Chris
>> > > _______________________________________________
>> > > Lustre-discuss mailing list
>> > > [email protected]
>> > > http://lists.lustre.org/mailman/listinfo/lustre-discuss
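
As a recap of the settings discussed in this thread, here is a minimal sketch of the two modprobe.conf variants. The server line is the unquoted form Charles posted, which Chris reported as working once he copied it. The client line is Craig's suggestion and was not confirmed in the thread. The interface names ib0 and eth0 are taken from the posts and are assumptions about any other site.

```
# /etc/modprobe.conf on the MDS/MGS/OSS servers (both IB and Ethernet),
# as posted by Charles; no extra quoting around the networks value
options lnet networks=o2ib(ib0),tcp(eth0)

# /etc/modprobe.conf on the IB-only clients, as suggested by Craig
# (not confirmed as the fix in this thread)
options lnet networks=o2ib(ib0)
```

After reloading lnet, "lctl list_nids" on each node shows which NIDs it actually came up with, and "lctl ping" against a server NID checks that the client can reach it, as suggested in the replies above.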
