Chris,

Perhaps you need to perform some write_conf-like command. I'm not sure if this is needed in 1.6 or not.
Shane

----- Original Message -----
From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
To: lustre-discuss <[email protected]>
Sent: Fri Mar 07 12:03:17 2008
Subject: Re: [Lustre-discuss] Multihomed question: want Lustre over IB and Ethernet

On Fri, Mar 7, 2008 at 9:39 AM, Craig Prescott <[EMAIL PROTECTED]> wrote:
>
> I think your client modprobe.conf lnet option
> should be this:
>
> options lnet networks=o2ib(ib0)
>
> (not 'o2ib0').

It still seems to want the TCP connection:

Lustre: Added LNI [EMAIL PROTECTED] [8/64]
Lustre: Lustre Client File System; [EMAIL PROTECTED]
LustreError: 11043:0:(events.c:401:ptlrpc_uuid_to_peer()) No NID found for [EMAIL PROTECTED]
LustreError: 11043:0:(client.c:58:ptlrpc_uuid_to_connection()) cannot find peer [EMAIL PROTECTED]
LustreError: 11043:0:(ldlm_lib.c:312:client_obd_setup()) can't add initial connection
LustreError: 11043:0:(obd_config.c:325:class_setup()) setup ddnlfs-MDT0000-mdc-0000010430934400 failed (-2)
LustreError: 11043:0:(obd_config.c:1062:class_config_llog_handler()) Err -2 on cfg command:
LustreError: 11141:0:(connection.c:142:ptlrpc_put_connection()) NULL connection
Lustre: cmd=cf003 0:ddnlfs-MDT0000-mdc 1:ddnlfs-MDT0000_UUID 2:[EMAIL PROTECTED]
LustreError: 15c-8: [EMAIL PROTECTED]: The configuration from log 'ddnlfs-client' failed (-2). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
LustreError: 11043:0:(llite_lib.c:1021:ll_fill_super()) Unable to process log: -2
LustreError: 11043:0:(obd_config.c:392:class_cleanup()) Device 2 not setup
Lustre: client 0000010430934400 umount complete
LustreError: 11043:0:(obd_mount.c:1924:lustre_fill_super()) Unable to mount (-2)

>
> Another thing to try, if that doesn't work lctl
> ping your MDS/MGS/OSS nids, like so:
>
> lctl ping [EMAIL PROTECTED]

Before and after the change it looks the same:

# lctl ping [EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

If I change my modprobe.conf to look as on the MDS/OSS's:

options lnet networks=o2ib0(ib0),tcp0(eth0)

Then, mount just specifying o2ib:

# mount -t lustre [EMAIL PROTECTED]:/ddnlfs /lfs

It works, but, both ko2iblnd and ksocklnd are loaded. The dmesg output is:

Lustre: OBD class driver, [EMAIL PROTECTED]
Lustre Version: 1.6.4.2
Build Version: 1.6.4.2-19691231190000-PRISTINE-.usr.src.linux-2.6.9-67.0.4.EL-Lustre-1.6.4.2
Lustre: Added LNI [EMAIL PROTECTED] [8/64]
Lustre: Added LNI [EMAIL PROTECTED] [8/256]
Lustre: Accept secure, port 988
Lustre: Lustre Client File System; [EMAIL PROTECTED]
Lustre: ddnlfs-clilov-000001042f8b7c00.lov: set parameter stripesize=2M
Lustre: Client ddnlfs-client has started

Can I be certain it'll use IB for LFS on this client?

Thanks,

Chris

>
> Cheers,
> Craig
>
> Chris Worley wrote:
> > More issues. Now, on the clients.
> >
> > The MDT/MGS/OST's are all up and mounted, showing:
> >
> > # lctl list_nids
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> >
> > Now, when I go to mount on the IB-based clients, I get:
> >
> > # mount -t lustre [EMAIL PROTECTED]:/ddnlfs /lfs
> > mount.lustre: mount [EMAIL PROTECTED]:/ddnlfs at /lfs failed: No such file or directory
> > Is the MGS specification correct?
> > Is the filesystem name correct?
> > If upgrading, is the copied client log valid? (see upgrade docs)
> >
> > The modprobe.conf contains:
> >
> > options lnet networks=o2ib0(ib0)
> >
> > And lctl looks good:
> >
> > # lctl list_nids
> > [EMAIL PROTECTED]
> >
> > But dmesg shows that it wants to go over the 36.121.x.x (tcp) network
> > (36.12[12].255.201 is the MGS/MDS server):
> >
> > LustreError: 10001:0:(events.c:401:ptlrpc_uuid_to_peer()) No NID found for [EMAIL PROTECTED]
> > LustreError: 10001:0:(client.c:58:ptlrpc_uuid_to_connection()) cannot find peer [EMAIL PROTECTED]
> > LustreError: 10001:0:(ldlm_lib.c:312:client_obd_setup()) can't add initial connection
> > LustreError: 9836:0:(connection.c:142:ptlrpc_put_connection()) NULL connection
> > LustreError: 10001:0:(obd_config.c:325:class_setup()) setup ddnlfs-MDT0000-mdc-0000010430913c00 failed (-2)
> > LustreError: 10001:0:(obd_config.c:1062:class_config_llog_handler()) Err -2 on cfg command:
> > Lustre: cmd=cf003 0:ddnlfs-MDT0000-mdc 1:ddnlfs-MDT0000_UUID 2:[EMAIL PROTECTED]
> > LustreError: 15c-8: [EMAIL PROTECTED]: The configuration from log 'ddnlfs-client' failed (-2). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
> > LustreError: 10001:0:(llite_lib.c:1021:ll_fill_super()) Unable to process log: -2
> > LustreError: 10001:0:(obd_config.c:392:class_cleanup()) Device 2 not setup
> > Lustre: client 0000010430913c00 umount complete
> > LustreError: 10001:0:(obd_mount.c:1924:lustre_fill_super()) Unable to mount (-2)
> >
> > Note that this setup works fine in the non-multihomed setup, so I
> > don't think ko2iblnd is to blame (the setup on the clients hasn't
> > changed at all).
> >
> > What am I doing wrong?
> >
> > Thanks,
> >
> > Chris
> >
> > On Fri, Mar 7, 2008 at 7:41 AM, Chris Worley <[EMAIL PROTECTED]> wrote:
> >> I changed my modprobe.conf to look exactly as yours, and it worked. I
> >> hadn't been using all the quotes until the doc said to... but they may
> >> have indeed been the problem.
> >>
> >> Thanks!
> >>
> >> Chris
> >>
> >> On Fri, Mar 7, 2008 at 3:40 AM, Charles Taylor <[EMAIL PROTECTED]> wrote:
> >> >
> >> > Do "lctl list_nids" on your mds and oss's. They should look
> >> > something like this.
> >> >
> >> > [EMAIL PROTECTED] ~]# lctl list_nids
> >> > [EMAIL PROTECTED]
> >> > [EMAIL PROTECTED]
> >> >
> >> > Then your clients should have a nid on one or the other.
> >> >
> >> > Check your dmesg output after loading lnet. The complaints are
> >> > pretty useful. Your modprobe.conf line looks correct although we
> >> > found we did not need all the quoting so you should check that as
> >> > well. Ours looks like...
> >> >
> >> > options lnet networks=o2ib(ib0),tcp(eth0)
> >> >
> >> > My guess is that it either cannot find or does not like your ko2iblnd
> >> > module.
> >> >
> >> > ct
> >> >
> >> > On Mar 7, 2008, at 12:46 AM, Chris Worley wrote:
> >> >
> >> > > Most everything is over IB, but I have a few systems I'd like to mount
> >> > > the Lustre fs over GigE.
> >> > >
> >> > > I think I've followed the Multihomed instructions correctly, in:
> >> > >
> >> > > http://dlc.sun.com/pdf/820-3681/820-3681.pdf
> >> > >
> >> > > My /etc/modprobe.conf on mds/mgs/oss servers (which all have both
> >> > > Ethernet and IB) includes:
> >> > >
> >> > > options lnet 'networks="tcp0(eth0),o2ib0(ib0)"'
> >> > >
> >> > > I make and mount the mdt with (which has both IB and Ethernet, subnet
> >> > > 36.122.x.x is IB, 36.121.x.x is Ethernet):
> >> > >
> >> > > # mkfs.lustre --mdt --mgs
> >> > > --mgsnode="[EMAIL PROTECTED],[EMAIL PROTECTED]" <... > /dev/md0
> >> > > # mount -t lustre /dev/md0 /lfs/mdtb
> >> > >
> >> > > But, at this point, the ksocklnd module is loaded rather than the
> >> > > ko2iblnd module!
> >> > >
> >> > > On the OSS, I make the fs w/ the same "mgsnode", but, when I try to
> >> > > mount it, it correctly uses the IB interface, but can't contact the
> >> > > MDS:
> >> > >
> >> > > LustreError: 27520:0:(events.c:401:ptlrpc_uuid_to_peer()) No NID found for [EMAIL PROTECTED]
> >> > > LustreError: 27520:0:(client.c:58:ptlrpc_uuid_to_connection()) cannot find peer [EMAIL PROTECTED]
> >> > > LustreError: 27520:0:(ldlm_lib.c:312:client_obd_setup()) can't add initial connection
> >> > > LustreError: 17126:0:(connection.c:142:ptlrpc_put_connection()) NULL connection
> >> > > LustreError: 27520:0:(obd_config.c:325:class_setup()) setup [EMAIL PROTECTED] failed (-2)
> >> > > LustreError: 27520:0:(obd_mount.c:454:lustre_start_simple()) [EMAIL PROTECTED] setup error -2
> >> > > LustreError: 27520:0:(obd_mount.c:1368:server_put_super()) no obd ddnlfs-OSTffff
> >> > > LustreError: 27520:0:(obd_mount.c:119:server_deregister_mount()) ddnlfs-OSTffff not registered
> >> > >
> >> > > It too has loaded the ksocklnd module, and not the ko2iblnd module. I
> >> > > guess that both modules should be loaded in a multihomed case?
> >> > >
> >> > > What am I doing wrong?
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Chris
> >> > > _______________________________________________
> >> > > Lustre-discuss mailing list
> >> > > [email protected]
> >> > > http://lists.lustre.org/mailman/listinfo/lustre-discuss
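
One possible reading of the failures quoted in this thread is that a node can only resolve a NID whose network also appears in its own lnet "networks" option. The Python sketch below models just that check; it is not how LNet is implemented. The helper names (normalize_net, parse_networks, nid_reachable) and the 10.0.0.1 addresses are hypothetical, and it assumes that a network name without a number (o2ib, tcp) refers to network 0 (o2ib0, tcp0).

```python
import re

# One entry of the lnet "networks" option, e.g. "o2ib0(ib0)" or "tcp".
ENTRY = re.compile(r"([a-z0-9]+)(?:\(([^)]*)\))?")


def normalize_net(name):
    """Split 'o2ib0' into ('o2ib', 0). ASSUMPTION: a missing number means 0."""
    base = name.rstrip("0123456789")
    digits = name[len(base):]
    return base, (int(digits) if digits else 0)


def parse_networks(option):
    """Map each configured network to its interface string ('' if none given)."""
    return {normalize_net(net): iface for net, iface in ENTRY.findall(option)}


def nid_reachable(nid, option):
    """True if the NID's network (the part after '@') is configured locally."""
    _address, _, net = nid.rpartition("@")
    return normalize_net(net) in parse_networks(option)


if __name__ == "__main__":
    # Hypothetical addresses; the real NIDs are redacted in the thread.
    print(nid_reachable("10.0.0.1@tcp", "o2ib0(ib0)"))
    print(nid_reachable("10.0.0.1@tcp", "o2ib0(ib0),tcp0(eth0)"))
    print(nid_reachable("10.0.0.1@o2ib", "o2ib(ib0)"))
```

Under this model, an o2ib-only client cannot resolve a tcp NID, which would be consistent with the "No NID found" errors on the IB-only client and with the mount succeeding once tcp0(eth0) was added. It says nothing about which network carries the traffic once both are configured.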
