More issues, now on the clients. The MDT/MGS/OSTs are all up and mounted, showing:

# lctl list_nids
[EMAIL PROTECTED]
[EMAIL PROTECTED]

Now, when I go to mount on the IB-based clients, I get:

# mount -t lustre [EMAIL PROTECTED]:/ddnlfs /lfs
mount.lustre: mount [EMAIL PROTECTED]:/ddnlfs at /lfs failed: No such file or directory
Is the MGS specification correct?
Is the filesystem name correct?
If upgrading, is the copied client log valid? (see upgrade docs)

The modprobe.conf contains:

options lnet networks=o2ib0(ib0)

And lctl looks good:

# lctl list_nids
[EMAIL PROTECTED]

But dmesg shows that it wants to go over the 36.121.x.x (tcp) network (36.12[12].255.201 is the MGS/MDS server):

LustreError: 10001:0:(events.c:401:ptlrpc_uuid_to_peer()) No NID found for [EMAIL PROTECTED]
LustreError: 10001:0:(client.c:58:ptlrpc_uuid_to_connection()) cannot find peer [EMAIL PROTECTED]
LustreError: 10001:0:(ldlm_lib.c:312:client_obd_setup()) can't add initial connection
LustreError: 9836:0:(connection.c:142:ptlrpc_put_connection()) NULL connection
LustreError: 10001:0:(obd_config.c:325:class_setup()) setup ddnlfs-MDT0000-mdc-0000010430913c00 failed (-2)
LustreError: 10001:0:(obd_config.c:1062:class_config_llog_handler()) Err -2 on cfg command:
Lustre: cmd=cf003 0:ddnlfs-MDT0000-mdc 1:ddnlfs-MDT0000_UUID 2:[EMAIL PROTECTED]
LustreError: 15c-8: [EMAIL PROTECTED]: The configuration from log 'ddnlfs-client' failed (-2). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
LustreError: 10001:0:(llite_lib.c:1021:ll_fill_super()) Unable to process log: -2
LustreError: 10001:0:(obd_config.c:392:class_cleanup()) Device 2 not setup
Lustre: client 0000010430913c00 umount complete
LustreError: 10001:0:(obd_mount.c:1924:lustre_fill_super()) Unable to mount (-2)

Note that this setup works fine in the non-multihomed setup, so I don't think ko2iblnd is to blame (the setup on the clients hasn't changed at all).

What am I doing wrong?

Thanks,

Chris

On Fri, Mar 7, 2008 at 7:41 AM, Chris Worley <[EMAIL PROTECTED]> wrote:
>
> I changed my modprobe.conf to look exactly as yours, and it worked. I
> hadn't been using all the quotes until the doc said to... but they may
> have indeed been the problem.
>
> Thanks!
>
> Chris
>
> On Fri, Mar 7, 2008 at 3:40 AM, Charles Taylor <[EMAIL PROTECTED]> wrote:
> >
> > Do "lctl list_nids" on your mds and oss's. They should look
> > something like this.
> >
> > [EMAIL PROTECTED] ~]# lctl list_nids
> > [EMAIL PROTECTED]
> > [EMAIL PROTECTED]
> >
> > Then your clients should have a nid on one or the other.
> >
> > Check your dmesg output after loading lnet. The complaints are
> > pretty useful. Your modprobe.conf line looks correct although we
> > found we did not need all the quoting so you should check that as
> > well. Ours looks like...
> >
> > options lnet networks=o2ib(ib0),tcp(eth0)
> >
> > My guess is that it either cannot find or does not like your ko2iblnd
> > module.
> >
> > ct
> >
> > On Mar 7, 2008, at 12:46 AM, Chris Worley wrote:
> >
> > > Most everything is over IB, but I have a few systems I'd like to mount
> > > the Lustre fs over GigE.
> > >
> > > I think I've followed the Multihomed instructions correctly, in:
> > >
> > > http://dlc.sun.com/pdf/820-3681/820-3681.pdf
> > >
> > > My /etc/modprobe.conf on mds/mgs/oss servers (which all have both
> > > Ethernet and IB) includes:
> > >
> > > options lnet 'networks="tcp0(eth0),o2ib0(ib0)"'
> > >
> > > I make and mount the mdt with (which has both IB and Ethernet, subnet
> > > 36.122.x.x is IB, 36.121.x.x is Ethernet):
> > >
> > > # mkfs.lustre --mdt --mgs
> > > --mgsnode="[EMAIL PROTECTED],[EMAIL PROTECTED]" <... > /dev/md0
> > > # mount -t lustre /dev/md0 /lfs/mdtb
> > >
> > > But, at this point, the ksocklnd module is loaded rather than the
> > > ko2iblnd module!
> > >
> > > On the OSS, I make the fs w/ the same "mgsnode", but, when I try to
> > > mount it, it correctly uses the IB interface, but can't contact the
> > > MDS:
> > >
> > > LustreError: 27520:0:(events.c:401:ptlrpc_uuid_to_peer()) No NID found
> > > for [EMAIL PROTECTED]
> > > LustreError: 27520:0:(client.c:58:ptlrpc_uuid_to_connection()) cannot
> > > find peer [EMAIL PROTECTED]
> > > LustreError: 27520:0:(ldlm_lib.c:312:client_obd_setup()) can't add
> > > initial connection
> > > LustreError: 17126:0:(connection.c:142:ptlrpc_put_connection())
> > > NULL connection
> > > LustreError: 27520:0:(obd_config.c:325:class_setup()) setup
> > > [EMAIL PROTECTED] failed (-2)
> > > LustreError: 27520:0:(obd_mount.c:454:lustre_start_simple())
> > > [EMAIL PROTECTED] setup error -2
> > > LustreError: 27520:0:(obd_mount.c:1368:server_put_super()) no obd
> > > ddnlfs-OSTffff
> > > LustreError: 27520:0:(obd_mount.c:119:server_deregister_mount())
> > > ddnlfs-OSTffff not registered
> > >
> > > It too has loaded the ksocklnd module, and not the ko2iblnd module. I
> > > guess that both modules should be loaded in a multihomed case?
> > >
> > > What am I doing wrong?
> > >
> > > Thanks,
> > >
> > > Chris
> > > _______________________________________________
> > > Lustre-discuss mailing list
> > > [email protected]
> > > http://lists.lustre.org/mailman/listinfo/lustre-discuss
