Andreas,

I made the changes Nathan suggested to the networking setup in config.sh. I also checked the LUNs, and you were correct: sda on lustre1 is sdb on lustre2 and vice versa, so I changed config.sh to use sda1 on both OSTs.
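
(In case anyone wants to double-check a cross-mapping like this: since ldiskfs is ext3-based, one way is to compare the filesystem UUIDs from each node, assuming the OST devices have already been formatted, e.g.:

# run on lustre1, then again on lustre2, and match the UUIDs up
tune2fs -l /dev/sda1 | grep UUID
tune2fs -l /dev/sdb1 | grep UUID

If lustre1's sda1 UUID matches lustre2's sdb1 UUID and vice versa, the LUNs are cross-mapped.)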

However, I still get exactly the same error when I try to mount the client (and yes, it's still ENODEV, but why?):

[EMAIL PROTECTED] ~]# mount -v -t lustre lustrem:/mds-test/client /mnt/lustre
verbose: 1
arg[0] = /sbin/mount.lustre
arg[1] = lustrem:/mds-test/client
arg[2] = /mnt/lustre
arg[3] = -v
arg[4] = -o
arg[5] = rw
mds nid 0:       [EMAIL PROTECTED]
mds name:        mds-test
profile:         client
options:         rw
retry:           0
mount.lustre: mount(lustrem:/mds-test/client, /mnt/lustre) failed: Input/output error
mds nid 0:       [EMAIL PROTECTED]
mds name:        mds-test
profile:         client
options:         rw
retry:           0
[EMAIL PROTECTED] ~]#

MDS (lustrem) /var/log/messages:

Feb  2 16:17:18 lustrem kernel: Lustre: OBD class driver Build Version: 1.4.8-19691231170000-PRISTINE-.testsuite.tmp.lbuild-boulder.lbuild-v1_4_8_RC8-2.6-rhel4-x86_64.lbuild.BUILD.lustre-kernel-2.6.9.lustre.linux-2.6.9-42.0.3.EL_lustre.1.4.8smp, [EMAIL PROTECTED]
Feb  2 16:17:19 lustrem kernel: Lustre: Added LNI [EMAIL PROTECTED] [8/256]
Feb  2 16:17:19 lustrem kernel: Lustre: Accept secure, port 988
Feb  2 16:17:19 lustrem kernel: loop: loaded (max 8 devices)
Feb  2 16:17:21 lustrem kernel: kjournald starting. Commit interval 5 seconds
Feb  2 16:17:21 lustrem kernel: LDISKFS FS on loop0, internal journal
Feb  2 16:17:21 lustrem kernel: LDISKFS-fs: mounted filesystem with ordered data mode.
Feb  2 16:17:21 lustrem kernel: Lustre: 3518:0:(mds_fs.c:239:mds_init_server_data()) mds-test: initializing new last_rcvd
Feb  2 16:17:21 lustrem kernel: Lustre: MDT mds-test now serving /dev/loop0 (b505d8f0-d424-4bf8-a8cd-8bfa8af0cf36) with recovery enabled
Feb  2 16:17:21 lustrem kernel: Lustre: MDT mds-test has stopped.
Feb  2 16:17:22 lustrem kernel: kjournald starting. Commit interval 5 seconds
Feb  2 16:17:22 lustrem kernel: LDISKFS FS on loop0, internal journal
Feb  2 16:17:22 lustrem kernel: LDISKFS-fs: mounted filesystem with ordered data mode.
Feb  2 16:17:22 lustrem kernel: Lustre: Binding irq 185 to CPU 0 with cmd: echo 1 > /proc/irq/185/smp_affinity
Feb  2 16:17:27 lustrem kernel: LustreError: 3882:0:(client.c:940:ptlrpc_expire_one_request()) @@@ timeout (sent at 1170454642, 5s ago) [EMAIL PROTECTED] x1/t0 o8->[EMAIL PROTECTED]:6 lens 240/272 ref 1 fl Rpc:/0/0 rc 0/0
Feb  2 16:17:28 lustrem kernel: LustreError: 3680:0:(ldlm_lib.c:541:target_handle_connect()) @@@ UUID 'mds-test' is not available for connect (not set up) [EMAIL PROTECTED] x27/t0 o38-><?>@<?>:-1 lens 240/0 ref 0 fl Interpret:/0/0 rc 0/0
Feb  2 16:17:28 lustrem kernel: LustreError: 3680:0:(ldlm_lib.c:1288:target_send_reply_msg()) @@@ processing error (-19) [EMAIL PROTECTED] x27/t0 o38-><?>@<?>:-1 lens 240/0 ref 0 fl Interpret:/0/0 rc -19/0

OST (lustre1) /var/log/messages:

Feb  2 16:16:30 lustre1 kernel: Lustre: OBD class driver Build Version: 1.4.8-19691231170000-PRISTINE-.testsuite.tmp.lbuild-boulder.lbuild-v1_4_8_RC8-2.6-rhel4-x86_64.lbuild.BUILD.lustre-kernel-2.6.9.lustre.linux-2.6.9-42.0.3.EL_lustre.1.4.8smp, [EMAIL PROTECTED]
Feb  2 16:16:30 lustre1 kernel: Lustre: Added LNI [EMAIL PROTECTED] [8/256]
Feb  2 16:16:30 lustre1 kernel: Lustre: Accept secure, port 988
Feb  2 16:16:31 lustre1 kernel: Lustre: Filtering OBD driver; [EMAIL PROTECTED]
Feb  2 16:17:00 lustre1 kernel: Lustre: Binding irq 185 to CPU 0 with cmd: echo 1 > /proc/irq/185/smp_affinity
Feb  2 16:17:00 lustre1 kernel: Lustre: 3521:0:(lib-move.c:1627:lnet_parse_put()) Dropping PUT from [EMAIL PROTECTED] portal 6 match 1 offset 0 length 240: 2
Feb  2 16:17:25 lustre1 kernel: Lustre: 3521:0:(lib-move.c:1627:lnet_parse_put()) Dropping PUT from [EMAIL PROTECTED] portal 6 match 4 offset 0 length 240: 2
Feb  2 16:17:50 lustre1 kernel: Lustre: 3521:0:(lib-move.c:1627:lnet_parse_put()) Dropping PUT from [EMAIL PROTECTED] portal 6 match 6 offset 0 length 240: 2
Feb  2 16:18:15 lustre1 kernel: Lustre: 3521:0:(lib-move.c:1627:lnet_parse_put()) Dropping PUT from [EMAIL PROTECTED] portal 6 match 8 offset 0 length 240: 2
Feb  2 16:18:40 lustre1 kernel: Lustre: 3521:0:(lib-move.c:1627:lnet_parse_put()) Dropping PUT from [EMAIL PROTECTED] portal 6 match 10 offset 0 length 240: 2

OST (lustre2) /var/log/messages:

Feb  2 16:16:28 lustre2 kernel: Lustre: OBD class driver Build Version: 1.4.8-19691231170000-PRISTINE-.testsuite.tmp.lbuild-boulder.lbuild-v1_4_8_RC8-2.6-rhel4-x86_64.lbuild.BUILD.lustre-kernel-2.6.9.lustre.linux-2.6.9-42.0.3.EL_lustre.1.4.8smp, [EMAIL PROTECTED]
Feb  2 16:16:28 lustre2 kernel: Lustre: Added LNI [EMAIL PROTECTED] [8/256]
Feb  2 16:16:28 lustre2 kernel: Lustre: Accept secure, port 988
Feb  2 16:16:28 lustre2 kernel: Lustre: Filtering OBD driver; [EMAIL PROTECTED]
Feb  2 16:16:53 lustre2 kernel: Lustre: Binding irq 185 to CPU 0 with cmd: echo 1 > /proc/irq/185/smp_affinity
Feb  2 16:16:53 lustre2 kernel: Lustre: 3528:0:(lib-move.c:1627:lnet_parse_put()) Dropping PUT from [EMAIL PROTECTED] portal 6 match 2 offset 0 length 240: 2
Feb  2 16:17:18 lustre2 kernel: Lustre: 3528:0:(lib-move.c:1627:lnet_parse_put()) Dropping PUT from [EMAIL PROTECTED] portal 6 match 5 offset 0 length 240: 2
Feb  2 16:17:43 lustre2 kernel: Lustre: 3528:0:(lib-move.c:1627:lnet_parse_put()) Dropping PUT from [EMAIL PROTECTED] portal 6 match 7 offset 0 length 240: 2

Client (scnode01) /var/log/messages:

Feb  2 16:17:10 scnode01 kernel: LustreError: 19745:0:(client.c:576:ptlrpc_check_status()) @@@ type == PTL_RPC_MSG_ERR, err == -19 [EMAIL PROTECTED] x27/t0 o38->[EMAIL PROTECTED]@tcp:12 lens 240/272 ref 1 fl Rpc:R/0/0 rc 0/-19
Feb  2 16:17:10 scnode01 kernel: LustreError: mdc_dev: The configuration 'client' could not be read from the MDS 'mds-test'. This may be the result of communication errors between the client and the MDS, or if the MDS is not running.
Feb  2 16:17:10 scnode01 kernel: LustreError: 19742:0:(llite_lib.c:936:lustre_fill_super()) Unable to process log: client
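
Since both OSTs are logging those "Dropping PUT ... portal 6" messages, my next step is to sanity-check LNET connectivity from the MDS and the client to each OST. Treat this as a sketch; I'm assuming lctl ping and lctl list_nids are available in 1.4.8, and <ost1-nid>/<ost2-nid> are just placeholders for whatever NID each OST reports in its "Added LNI" syslog line:

# run on each node to confirm its local NID matches what config.xml expects
lctl list_nids
# run from lustrem and from scnode01 against each OST's NID
lctl ping <ost1-nid>
lctl ping <ost2-nid>

If those pings fail, this is an LNET/networking problem rather than a config.xml problem.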

config.sh:

#!/bin/sh
# config.sh
#
rm -f config.xml
#
# Create nodes
# Trying to get this to work with 1 MDS, 2 OSTs, and 1 client. Will add the
# others when I get this working. - klb, 2/2/07
#
lmc -m config.xml --add node --node lustrem
lmc -m config.xml --add node --node lustre1
lmc -m config.xml --add node --node lustre2
lmc -m config.xml --add node --node client
#
# Configure networking
#
lmc -m config.xml --add net --node lustrem --nid [EMAIL PROTECTED] --nettype lnet
lmc -m config.xml --add net --node lustre1 --nid [EMAIL PROTECTED] --nettype lnet
lmc -m config.xml --add net --node lustre2 --nid [EMAIL PROTECTED] --nettype lnet
lmc -m config.xml --add net --node client --nid '*' --nettype lnet
#lmc -m config.xml --add net --node lustrem --nid lustrem --nettype tcp
#lmc -m config.xml --add net --node lustre1 --nid lustre1 --nettype tcp
#lmc -m config.xml --add net --node lustre2 --nid lustre2 --nettype tcp
#lmc -m config.xml --add net --node client --nid '*' --nettype tcp
#
# Configure MDS
#
lmc -m config.xml --add mds --node lustrem --mds mds-test --fstype ldiskfs --dev /tmp/mds-test --size 50000
#
# Configure OSTs - testing with 2 initially - klb, 2/1/2007
#
lmc -m config.xml --add lov --lov lov-test --mds mds-test --stripe_sz 1048576 --stripe_cnt 0 --stripe_pattern 0
lmc -m config.xml --add ost --node lustre1 --lov lov-test --ost ost1-test --fstype ldiskfs --dev /dev/sda1
lmc -m config.xml --add ost --node lustre2 --lov lov-test --ost ost2-test --fstype ldiskfs --dev /dev/sda1
#
# Configure client (this is a 'generic' client used for all client mounts)
# testing with 1 client initially - klb, 2/1/2007
#
lmc -m config.xml --add mtpt --node client --path /mnt/lustre --mds mds-test --lov lov-test
#
# Copy config.xml to all the other nodes in the cluster - klb, 2/1/07
#
for i in `seq 1 4`
 do
   echo "Copying config.xml to OST lustre$i..."
   rcp -p config.xml [EMAIL PROTECTED]:~/lustre
done

for i in `seq -w 1 14`
 do
   echo "Copying config.xml to client scnode$i..."
   rcp -p config.xml [EMAIL PROTECTED]:~/lustre
done
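
In case I'm doing the startup itself wrong, this is the lconf sequence I believe is right (OSTs first, then the MDS; --reformat only on the very first run, since it wipes the devices):

# on lustre1:
lconf --reformat --node lustre1 config.xml
# on lustre2:
lconf --reformat --node lustre2 config.xml
# on lustrem:
lconf --reformat --node lustrem config.xml
# the client is then mounted with "mount -t lustre" as shown above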


Andreas Dilger wrote:
> On Feb 02, 2007  13:16 -0600, Kevin L. Buterbaugh wrote:
>> Sorry, meant to include that. Here's the relevant information from the client (scnode01):
>>
>> Feb  2 12:48:15 scnode01 kernel: LustreError: 16536:0:(client.c:576:ptlrpc_check_status()) @@@ type == PTL_RPC_MSG_ERR, err == -19 [EMAIL PROTECTED] x13/t0 o38->[EMAIL PROTECTED]@tcp:12 lens 240/272 ref 1 fl Rpc:R/0/0 rc 0/-19
>> Feb  2 12:48:15 scnode01 kernel: LustreError: mdc_dev: The configuration 'client' could not be read from the MDS 'mds-test'. This may be the result of communication errors between the client and the MDS, or if the MDS is not running.

> Client couldn't connect to the MDS.  -19 = -ENODEV

>> And from the MDS (lustrem):
>>
>> 3894:0:(client.c:940:ptlrpc_expire_one_request()) @@@ timeout (sent at 1170442057, 5s ago) [EMAIL PROTECTED] x1/t0 o8->[EMAIL PROTECTED]:6 lens 240/272 ref 1 fl Rpc:/0/0 rc 0/0
>> Feb  2 12:48:07 lustrem kernel: LustreError: 3894:0:(client.c:940:ptlrpc_expire_one_request()) @@@ timeout (sent at 1170442082, 5s ago) [EMAIL PROTECTED] x4/t0 o8->[EMAIL PROTECTED]:6 lens 240/272 ref 1 fl Rpc:/0/0 rc 0/0
>> Feb  2 12:48:07 lustrem kernel: LustreError:

> These messages indicate failure to connect to the OSTs (op 8 = OST_CONNECT).
> What is in the OST syslog?  Are you positive that /dev/sda1 and /dev/sdb1
> on the two nodes are set up the same way, so that e.g. lustre1+sda1 isn't
> talking to the same disk as lustre2+sdb1?

> Also, minor nit: you don't need to have a partition table at all; it can
> hurt performance on some RAID setups because of the 512-byte offset of
> I/Os due to the DOS partition table.


> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.


--

Kevin L. Buterbaugh
Advanced Computing Center for Research & Education - Vanderbilt University
www.accre.vanderbilt.edu - (615)343-0288 - [EMAIL PROTECTED]

_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
