Kevin L. Buterbaugh wrote:
#
# Configure networking
#
lmc -m config.xml --add net --node lustrem --nid lustrem --nettype tcp
lmc -m config.xml --add net --node lustre1 --nid lustre1 --nettype tcp
lmc -m config.xml --add net --node lustre2 --nid lustre2 --nettype tcp
And from the MDS (lustrem):
3894:0:(client.c:940:ptlrpc_expire_one_request()) @@@ timeout (sent at
1170442057, 5s ago) [EMAIL PROTECTED] x1/t0
o8->[EMAIL PROTECTED]:6 lens 240/272 ref 1 fl Rpc:/0/0 rc 0/0
Feb 2 12:48:07 lustrem kernel: LustreError:
3894:0:(client.c:940:ptlrpc_expire_one_request()) @@@ timeout (sent at
1170442082, 5s ago) [EMAIL PROTECTED] x4/t0
o8->[EMAIL PROTECTED]:6 lens 240/272 ref 1 fl Rpc:/0/0 rc 0/0
Feb 2 12:48:07 lustrem kernel: LustreError:
These messages indicate failure to connect to the OSTs (op 8 = OST_CONNECT).
One or more of your nodes may not be resolving these hostnames
correctly. In general, I recommend using the actual IP address in
place of the hostname:
lmc -m config.xml --add net --node lustrem --nid [EMAIL PROTECTED] \
    --nettype lnet
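To test the name-resolution theory before touching the config, a quick check along these lines can be run on each node. This is only a sketch: check_hosts is a made-up helper name, and it assumes getent is available (it is on typical glibc-based systems).

```shell
#!/bin/sh
# Sketch: report how each given hostname resolves on this node.
# check_hosts is a hypothetical helper; `getent hosts` is assumed to exist.
check_hosts() {
    rc=0
    for host in "$@"; do
        if addr=$(getent hosts "$host"); then
            # getent prints "<address> <names...>"; keep only the address.
            echo "OK   $host -> ${addr%% *}"
        else
            echo "FAIL $host does not resolve"
            rc=1
        fi
    done
    return $rc
}

# Check whatever hostnames were passed on the command line.
check_hosts "$@"
```

Saved as, say, check_hosts.sh, it could be run as `sh check_hosts.sh lustrem lustre1 lustre2` on every node. A FAIL, or an address that differs from node to node, would point at name resolution rather than at Lustre itself.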
Also, check that every node can ping every other node. First, on each node:
modprobe lnet
lctl network up
lctl list_nids
Then on each node:
lctl ping <nids from other nodes>
lctl network down    # so you'll be able to remove the module
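With more than a few nodes, the ping step can be wrapped in a small loop. The following is a rough sketch rather than anything shipped with Lustre: ping_nids is a made-up helper name, it expects lnet to be loaded and the network to be up already, and it assumes lctl ping exits non-zero when the peer does not answer.

```shell
#!/bin/sh
# Sketch: run `lctl ping` against each NID given and summarise the result.
# ping_nids is a hypothetical helper; run it after `lctl network up`.
ping_nids() {
    failed=0
    for nid in "$@"; do
        # Assumption: lctl ping returns non-zero if the peer is unreachable.
        if lctl ping "$nid" >/dev/null 2>&1; then
            echo "OK   $nid"
        else
            echo "FAIL $nid"
            failed=$((failed + 1))
        fi
    done
    echo "$failed of $# NIDs unreachable"
    # Succeed only if every NID answered.
    [ "$failed" -eq 0 ]
}

# Ping whatever NIDs were passed on the command line.
ping_nids "$@"
```

Pass it the NIDs that lctl list_nids printed on the other nodes. Any FAIL line identifies a pair of nodes that cannot talk to each other over LNET, which would explain the connect timeouts.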