Hello,

Scene:
We have Lustre 1.6 set up and running over TCP and IB (separate networks),
on CentOS 5.1.

I have a new node that I want to install with the newer kernel
(2.6.18-164.11.1.el5). I have installed the stock kernel and the appropriate IB
modules, and am running openib on it.
I have also installed the client modules and client tools
(lustre-client-modules-1.8.2-2.6.18_164.11.1.el5_lustre.1.8.2
and lustre-client-1.8.2-2.6.18_164.11.1.el5_lustre.1.8.2), downloaded as RPMs
from the Lustre website.

My difficulty:
I CAN mount over TCP without a problem.
I CANNOT mount over infiniband. I get:
--------------------------------
# mount -t lustre nas-ib-...@o2ib:/scratch /scratch
mount.lustre: mount nas-ib-...@o2ib:/scratch at /scratch failed: Cannot send
after transport endpoint shutdown
---------------------------------

# cat /etc/modprobe.conf
alias scsi_hostadapter aacraid
alias scsi_hostadapter1 ata_piix
alias eth0 e1000e
alias ib0 ib_ipoib
options lnet ip2nets="o2ib0(ib0) 192.168.*.*; tcp(eth0) 10.1.*.*"
-----------------------------------------

# mount -t lustre nas-...@tcp:/scratch /scratch
# df -h /scratch
Filesystem            Size  Used Avail Use% Mounted on
nas-...@tcp:/scratch   22T  8.0T   13T  39% /scratch
--------------------------------------------

# tail /var/log/messages
Mar 15 10:49:01 compute-1-1 kernel: LustreError:
6539:0:(lib-move.c:2436:LNetPut()) Error sending PUT to
12345-192.168.1...@tcp: -113
Mar 15 10:49:01 compute-1-1 kernel: LustreError:
6539:0:(events.c:66:request_out_callback()) @@@ type 4, status -113
 r...@ffff81025c861400 x1330300123611200/t0
o250->m...@mgc192.168.1.95@tcp_0:26/25
lens 368/584 e 0 to 1 dl 1268675346 ref 2 fl Rpc:N/0/0 rc 0/0
Mar 15 10:49:01 compute-1-1 kernel: LustreError:
7075:0:(client.c:848:ptlrpc_import_delay_req()) @@@ IMP_INVALID
 r...@ffff81025c861000 x1330300123611201/t0
o101->m...@mgc192.168.1.95@tcp_0:26/25
lens 296/544 e 0 to 1 dl 0 ref 1 fl Rpc:/0/0 rc 0/0
Mar 15 10:49:01 compute-1-1 kernel: LustreError: 15c-8: mgc192.168.1...@tcp:
The configuration from log 'scratch-client' failed (-108). This may be the
result of communication errors between this node and the MGS, a bad
configuration, or other errors. See the syslog for more information.
Mar 15 10:49:01 compute-1-1 kernel: LustreError:
7075:0:(llite_lib.c:1176:ll_fill_super()) Unable to process log: -108
Mar 15 10:49:01 compute-1-1 kernel: LustreError:
7075:0:(obd_mount.c:2042:lustre_fill_super()) Unable to mount  (-108)

-------------------------------------
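
For reference when reading the log above: the negative numbers (-113, -108)
are kernel errno values, and -108 is the same error that mount.lustre prints
as text. A minimal Python sketch that decodes them, assuming the standard
Linux errno numbering; the table and the describe_status helper are
illustrative only and are not part of Lustre:

```python
# Linux errno values (standard numbering) for the two codes seen in the log.
LINUX_ERRNO = {
    108: ("ESHUTDOWN", "Cannot send after transport endpoint shutdown"),
    113: ("EHOSTUNREACH", "No route to host"),
}


def describe_status(status):
    """Return a readable description for a negative kernel status code."""
    name, text = LINUX_ERRNO.get(abs(status), ("UNKNOWN", "not in this table"))
    return f"{status}: {name} ({text})"


if __name__ == "__main__":
    # Status codes copied from the LustreError lines above.
    for status in (-113, -108):
        print(describe_status(status))
```

So the LNetPut() failure is "No route to host" and the mount failure is
"Cannot send after transport endpoint shutdown".
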

Any ideas on troubleshooting this would be greatly appreciated.

Brian Andrus