Hi,

I'm encountering a problem when starting the "local" example (one
MDS, LOV, OST, and client, all on node "sun-n1-console").

# lmc -m test.xml --batch test.txt
# cat test.txt
--add node --node sun-n1-console
--add net --node sun-n1-console --nettype lnet --nid [EMAIL PROTECTED]
--add mds --node sun-n1-console --mds mds1 --fstype ldiskfs --dev /tmp/mds1-sun-n1-console --size 400000
--add lov --lov lov1 --mds mds1 --stripe_sz 1048576 --stripe_cnt 1 --stripe_pattern 0
--add ost --node sun-n1-console --lov lov1 --ost ost1-sun-n1-console --fstype ldiskfs --dev /tmp/ost1-sun-n1-console --size 400000
--add mtpt --node sun-n1-console --path /mnt/lustre --mds mds1 --lov lov1



The node has two Ethernet interfaces, eth0 and eth1, on separate subnets.
I deploy all Lustre components on eth1 (IP: 192.168.123.45, hostname:
sun-n1-console).

# cat /etc/hosts
127.0.0.1               localhost.localdomain   localhost
xxx.yyy.zzz.ab          public-host
192.168.123.45          sun-n1-console
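
Because the node has two interfaces, it may be worth confirming that the node name maps to the eth1 address in this file. A minimal sketch, assuming a hosts-format file on stdin; `hosts_lookup` is a hypothetical helper name, not part of Lustre:

```shell
# Hypothetical helper: print the first address that a hosts-format file
# (read on stdin) assigns to the given name; comment lines are skipped.
hosts_lookup() {
    awk -v name="$1" '
        /^[ \t]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    '
}

# Usage: hosts_lookup sun-n1-console < /etc/hosts
```

For the file above, this should print the eth1 address rather than the loopback or public one.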


When eth0 is down, the "local" example deploys successfully.
Lustre fails to start only when eth0 is up (see attachment).

The error messages in /var/log/messages indicate that the MDS does
not respond (see below). I don't believe a firewall is the cause, because
I've switched it off:

# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
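
The same check can be scripted. This is only a sketch, assuming the default `iptables -L` text layout shown above; `firewall_is_open` is a hypothetical helper name, and it looks at the filter table only, so it says nothing about other packet-filtering or access-control mechanisms:

```shell
# Hypothetical helper: reads `iptables -L` output on stdin and succeeds only
# if every chain has policy ACCEPT and no rule lines are listed.
firewall_is_open() {
    awk '
        /^Chain / { if ($0 !~ /\(policy ACCEPT\)/) bad = 1; next }
        /^target[ \t]/ || /^[ \t]*$/ { next }   # column header or blank line
        { bad = 1 }                              # anything else is a rule
        END { exit bad + 0 }
    '
}

# Usage: iptables -L | firewall_is_open && echo "no filter rules"
```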




And here are the error messages:

# tail /var/log/messages
Apr 20 17:37:35 sun-n1-console kernel: LustreError: 6840:0:(events.c:53:request_out_callback()) @@@ type 4, status -5  [EMAIL PROTECTED] x22/t0 o8->[EMAIL PROTECTED]:6 lens 240/272 ref 2 fl Rpc:/0/0 rc 0/0
Apr 20 17:37:35 sun-n1-console kernel: LustreError: 6840:0:(client.c:947:ptlrpc_expire_one_request()) @@@ timeout (sent at 1177061855, 0s ago)  [EMAIL PROTECTED] x22/t0 o8->[EMAIL PROTECTED]:6 lens 240/272 ref 1 fl Rpc:/0/0 rc 0/0
Apr 20 17:37:35 sun-n1-console kernel: LustreError: 6840:0:(client.c:947:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
Apr 20 17:38:00 sun-n1-console kernel: LustreError: 6840:0:(events.c:53:request_out_callback()) @@@ type 4, status -5  [EMAIL PROTECTED] x23/t0 o8->[EMAIL PROTECTED]:6 lens 240/272 ref 2 fl Rpc:/0/0 rc 0/0
Apr 20 17:38:25 sun-n1-console kernel: audit(1177061905.683:64): avc:  denied  { rawip_recv } for  pid=6537 comm="socknal_cd03" saddr=192.168.123.45 src=1023 daddr=192.168.123.45 dest=988 netif=lo scontext=system_u:object_r:unlabeled_t tcontext=system_u:object_r:netif_lo_t tclass=netif
Apr 20 17:38:25 sun-n1-console kernel: audit(1177061905.884:65): avc:  denied  { rawip_recv } for  saddr=192.168.123.45 src=1023 daddr=192.168.123.45 dest=988 netif=lo scontext=system_u:object_r:unlabeled_t tcontext=system_u:object_r:netif_lo_t tclass=netif
Apr 20 17:38:26 sun-n1-console kernel: audit(1177061906.286:66): avc:  denied  { rawip_recv } for  saddr=192.168.123.45 src=1023 daddr=192.168.123.45 dest=988 netif=lo scontext=system_u:object_r:unlabeled_t tcontext=system_u:object_r:netif_lo_t tclass=netif
Apr 20 17:38:27 sun-n1-console kernel: audit(1177061907.090:67): avc:  denied  { rawip_recv } for  saddr=192.168.123.45 src=1023 daddr=192.168.123.45 dest=988 netif=lo scontext=system_u:object_r:unlabeled_t tcontext=system_u:object_r:netif_lo_t tclass=netif
Apr 20 17:38:28 sun-n1-console kernel: audit(1177061908.698:68): avc:  denied  { rawip_recv } for  saddr=192.168.123.45 src=1023 daddr=192.168.123.45 dest=988 netif=lo scontext=system_u:object_r:unlabeled_t tcontext=system_u:object_r:netif_lo_t tclass=netif
Apr 20 17:38:30 sun-n1-console kernel: LustreError: 6539:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.123.45
Apr 20 17:38:30 sun-n1-console kernel: audit(1177061910.683:69): avc:  denied  { rawip_send } for  pid=6539 comm="acceptor_988" saddr=192.168.123.45 src=988 daddr=192.168.123.45 dest=1023 netif=lo scontext=system_u:object_r:unlabeled_t tcontext=system_u:object_r:netif_lo_t tclass=netif
Apr 20 17:38:30 sun-n1-console kernel: LustreError: 6537:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -104 reading HELLO from 192.168.123.45
Apr 20 17:38:30 sun-n1-console kernel: LustreError: Connection to [EMAIL PROTECTED] at host 192.168.123.45 on port 988 was reset: is it running a compatible version of Lustre and is [EMAIL PROTECTED] one of its NIDs?
Apr 20 17:38:50 sun-n1-console kernel: LustreError: 6840:0:(events.c:53:request_out_callback()) @@@ type 4, status -5  [EMAIL PROTECTED] x25/t0 o8->[EMAIL PROTECTED]:6 lens 240/272 ref 2 fl Rpc:/0/0 rc 0/0
Apr 20 17:39:15 sun-n1-console kernel: LustreError: 6840:0:(events.c:53:request_out_callback()) @@@ type 4, status -5  [EMAIL PROTECTED] x26/t0 o8->[EMAIL PROTECTED]:6 lens 240/272 ref 2 fl Rpc:/0/0 rc 0/0
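
The `avc:  denied` records mixed into the log are SELinux audit denials on the loopback interface, logged alongside the LNET connection errors. To see at a glance which permissions are being denied and how often, something like the following may help. It is a sketch only: `summarize_avc` is a hypothetical helper name, and it assumes each audit record occupies a single line, as it does in the log file itself.

```shell
# Hypothetical helper: summarize SELinux AVC denials read from stdin.
# Prints one line per (permission, interface) pair followed by its count.
summarize_avc() {
    grep 'avc: *denied' |
        sed -n 's/.*{ \([a-z_]*\) }.*netif=\([a-z0-9]*\).*/\1 \2/p' |
        sort | uniq -c | awk '{ print $2, $3, $1 }'
}

# Usage: summarize_avc < /var/log/messages
```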



Any advice on how to make this simple example work?


Regards,
Verdi


[EMAIL PROTECTED] tmp]# lconf --reformat --verbose hoho.xml
configuring for host:  ['sun-n1-console']
setting /proc/sys/net/core/rmem_max to at least 16777216
setting /proc/sys/net/core/wmem_max to at least 16777216
Service: network NET_sun-n1-console_lnet NET_sun-n1-console_lnet_UUID
loading module: libcfs srcdir None devdir libcfs
+ /sbin/modprobe libcfs
loading module: lnet srcdir None devdir lnet
+ /sbin/modprobe lnet
+ /sbin/modprobe lnet
loading module: ksocklnd srcdir None devdir klnds/socklnd
+ /sbin/modprobe ksocklnd
Service: ldlm ldlm ldlm_UUID
loading module: lvfs srcdir None devdir lvfs
+ /sbin/modprobe lvfs
loading module: obdclass srcdir None devdir obdclass
+ /sbin/modprobe obdclass
loading module: ptlrpc srcdir None devdir ptlrpc
+ /sbin/modprobe ptlrpc
Service: osd OSD_ost1-sun-n1-console_sun-n1-console 
-n1-console_sun-n1-console_UUID
loading module: ost srcdir None devdir ost
+ /sbin/modprobe ost
loading module: ldiskfs srcdir None devdir ldiskfs
+ /sbin/modprobe ldiskfs
loading module: fsfilt_ldiskfs srcdir None devdir lvfs
+ /sbin/modprobe fsfilt_ldiskfs
loading module: obdfilter srcdir None devdir obdfilter
+ /sbin/modprobe obdfilter
Service: mdsdev MDD_mds1_sun-n1-console MDD_mds1_sun-n1-console_UUID
original inode_size  0
stripe_count  1  inode_size  512
loading module: mdc srcdir None devdir mdc
+ /sbin/modprobe mdc
loading module: osc srcdir None devdir osc
+ /sbin/modprobe osc
loading module: lov srcdir None devdir lov
+ /sbin/modprobe lov
loading module: mds srcdir None devdir mds
+ /sbin/modprobe mds
Service: mountpoint MNT_sun-n1-console MNT_sun-n1-console_UUID
get_lov_tgts failed, using get_refs
dbg LOV __init__: [(<__main__.OSC instance at 0xb7cd952c>, 0, 1, 1)] 
[u'ost1-sun-n1-console_UUID'] 1
loading module: llite srcdir None devdir llite
+ /sbin/modprobe llite
+ sysctl lnet/debug_path /tmp/lustre-log-sun-n1-console
+ /usr/sbin/lctl  modules > /tmp/ogdb-sun-n1-console
Service: network NET_sun-n1-console_lnet NET_sun-n1-console_lnet_UUID
NETWORK: NET_sun-n1-console_lnet NET_sun-n1-console_lnet_UUID lnet [EMAIL 
PROTECTED]
Service: ldlm ldlm ldlm_UUID
Service: osd OSD_ost1-sun-n1-console_sun-n1-console 
-n1-console_sun-n1-console_UUID
OSD: ost1-sun-n1-console ost1-sun-n1-console_UUID obdfilter 
/tmp/ost1-sun-n1-console 400000 ldiskfs no 0 256
+ losetup /dev/loop0
+ losetup /dev/loop1
+ losetup /dev/loop2
+ losetup /dev/loop3
+ losetup /dev/loop4
+ losetup /dev/loop5
+ losetup /dev/loop6
+ losetup /dev/loop7
+ dd if=/dev/zero bs=1k count=0 seek=400000 of=/tmp/ost1-sun-n1-console
+ mkfs.ext2 -j -b 4096  -F    -I 256 /tmp/ost1-sun-n1-console 100000
+ tune2fs -O dir_index /tmp/ost1-sun-n1-console
+ losetup /dev/loop0
+ losetup /dev/loop0 /tmp/ost1-sun-n1-console
+ dumpe2fs -f -h /dev/loop0
no external journal found for /dev/loop0
OST mount options: errors=remount-ro
+ /usr/sbin/lctl
  attach obdfilter ost1-sun-n1-console ost1-sun-n1-console_UUID
  quit
+ /usr/sbin/lctl
  cfg_device ost1-sun-n1-console
  setup /dev/loop0 ldiskfs f errors=remount-ro
  quit
+ /usr/sbin/lctl
  attach ost OSS OSS_UUID
  quit
+ /usr/sbin/lctl
  cfg_device OSS
  setup
  quit
Service: mdsdev MDD_mds1_sun-n1-console MDD_mds1_sun-n1-console_UUID
original inode_size  0
stripe_count  1  inode_size  512
MDSDEV: mds1 mds1_UUID /tmp/mds1-sun-n1-console ldiskfs no
+ losetup /dev/loop0
+ losetup /dev/loop1
+ losetup /dev/loop2
+ losetup /dev/loop3
+ losetup /dev/loop4
+ losetup /dev/loop5
+ losetup /dev/loop6
+ losetup /dev/loop7
+ dd if=/dev/zero bs=1k count=0 seek=400000 of=/tmp/mds1-sun-n1-console
+ mkfs.ext2 -j -b 4096  -F  -i 4096   -I 512 /tmp/mds1-sun-n1-console 100000
+ tune2fs -O dir_index /tmp/mds1-sun-n1-console
+ losetup /dev/loop0
+ losetup /dev/loop1
+ losetup /dev/loop1 /tmp/mds1-sun-n1-console
+ /usr/sbin/lctl
  attach mds mds1 mds1_UUID
  quit
+ /usr/sbin/lctl
  cfg_device mds1
  setup /dev/loop1 ldiskfs
  quit
recording clients for filesystem: FS_fsname_UUID
get_lov_tgts failed, using get_refs
dbg LOV __init__: [(<__main__.OSC instance at 0xb7cd988c>, 0, 1, 1)] 
[u'ost1-sun-n1-console_UUID'] 1
+ /usr/sbin/lctl
  device $mds1
  probe
  clear_log mds1
  quit
Recording log mds1 on mds1
dbg LOV prepare
dbg LOV prepare: [(<__main__.OSC instance at 0xb7cd988c>, 0, 1, 1)] 
[u'ost1-sun-n1-console_UUID']
LOV: lov_mds1 4300b_lov_mds1_fe6fd41018 mds1_UUID 1 1048576 0 0 
[u'ost1-sun-n1-console_UUID'] mds1
+ /usr/sbin/lctl
    device $mds1
    record mds1

  attach lov lov_mds1 4300b_lov_mds1_fe6fd41018
  lov_setup lov1_UUID 1 1048576 0 0
  quit
OSC: OSC_sun-n1-console_ost1-sun-n1-console_mds1 4300b_lov_mds1_fe6fd41018 
ost1-sun-n1-console_UUID
dbg CLIENT __prepare__: ost1-sun-n1-console_UUID [<__main__.Network instance at 
0xb7cd9c6c>]
+ /usr/sbin/lctl
    device $mds1
    record mds1

  add_uuid sun-n1-console_UUID [EMAIL PROTECTED]
ost1-sun-n1-console_UUID active
+ /usr/sbin/lctl
    device $mds1
    record mds1

  attach osc OSC_sun-n1-console_ost1-sun-n1-console_mds1 
4300b_lov_mds1_fe6fd41018
  quit
+ /usr/sbin/lctl
    device $mds1
    record mds1

  cfg_device OSC_sun-n1-console_ost1-sun-n1-console_mds1
  setup ost1-sun-n1-console_UUID sun-n1-console_UUID
  quit
+ /usr/sbin/lctl
    device $mds1
    record mds1

  cfg_device lov_mds1
  lov_modify_tgts add lov_mds1 ost1-sun-n1-console_UUID 0 1
  quit
+ /usr/sbin/lctl
    device $mds1
    record mds1

  mount_option mds1 lov_mds1
  quit
End recording log mds1 on mds1
Recording log sun-n1-console on mds1
+ /usr/sbin/lconf   -v --record --nomod --old_conf --record_log sun-n1-console 
--record_device mds1 --node sun-n1-console hoho.xml
record>  configuring for host:  ['sun-n1-console']
record>  Checking XML modification time
record>  + debugfs -c -R 'stat /LOGS' /tmp/mds1-sun-n1-console 2>&1 | grep mtime
record>  Can not get mtime info of MDS LOGS directory
record>  + /usr/sbin/lctl
record>  device $mds1
record>  probe
record>  clear_log sun-n1-console
record>  quit
record>  Recording log sun-n1-console on mds1
record>  Service: network NET_sun-n1-console_lnet NET_sun-n1-console_lnet_UUID
record>  Service: ldlm ldlm ldlm_UUID
record>  Service: osd OSD_ost1-sun-n1-console_sun-n1-console 
-n1-console_sun-n1-console_UUID
record>  Service: mdsdev MDD_mds1_sun-n1-console MDD_mds1_sun-n1-console_UUID
record>  original inode_size  0
record>  stripe_count  1  inode_size  512
record>  Service: mountpoint MNT_sun-n1-console MNT_sun-n1-console_UUID
record>  get_lov_tgts failed, using get_refs
record>  dbg LOV __init__: [(<__main__.OSC instance at 0xb7cf64cc>, 0, 1, 1)] 
[u'ost1-sun-n1-console_UUID'] 1
record>  dbg LOV prepare
record>  dbg LOV prepare: [(<__main__.OSC instance at 0xb7cf64cc>, 0, 1, 1)] 
[u'ost1-sun-n1-console_UUID']
record>  LOV: lov1 028ec_lov1_fa9d4fa5b7 mds1_UUID 1 1048576 0 0 
[u'ost1-sun-n1-console_UUID'] mds1
record>  + /usr/sbin/lctl
record>  device $mds1
record>  record sun-n1-console
record>
record>  attach lov lov1 028ec_lov1_fa9d4fa5b7
record>  lov_setup lov1_UUID 1 1048576 0 0
record>  quit
record>  OSC: OSC_sun-n1-console_ost1-sun-n1-console_MNT_sun-n1-console 
028ec_lov1_fa9d4fa5b7 ost1-sun-n1-console_UUID
record>  dbg CLIENT __prepare__: ost1-sun-n1-console_UUID [<__main__.Network 
instance at 0xb7cf66cc>]
record>  + /usr/sbin/lctl
record>  device $mds1
record>  record sun-n1-console
record>
record>  add_uuid sun-n1-console_UUID [EMAIL PROTECTED]
record>  ost1-sun-n1-console_UUID active
record>  + /usr/sbin/lctl
record>  device $mds1
record>  record sun-n1-console
record>
record>  attach osc OSC_sun-n1-console_ost1-sun-n1-console_MNT_sun-n1-console 
028ec_lov1_fa9d4fa5b7
record>  quit
record>  + /usr/sbin/lctl
record>  device $mds1
record>  record sun-n1-console
record>
record>  cfg_device OSC_sun-n1-console_ost1-sun-n1-console_MNT_sun-n1-console
record>  setup ost1-sun-n1-console_UUID sun-n1-console_UUID
record>  quit
record>  + /usr/sbin/lctl
record>  device $mds1
record>  record sun-n1-console
record>
record>  cfg_device lov1
record>  lov_modify_tgts add lov1 ost1-sun-n1-console_UUID 0 1
record>  quit
record>  MDC: MDC_sun-n1-console_mds1_MNT_sun-n1-console 
0cf7b_MNT_sun-n1-console_dd8b963906 mds1_UUID
record>  dbg CLIENT __prepare__: mds1_UUID [<__main__.Network instance at 
0xb7cf6a4c>]
record>  + /usr/sbin/lctl
record>  device $mds1
record>  record sun-n1-console
record>
record>  add_uuid sun-n1-console_UUID [EMAIL PROTECTED]
record>  mds1_UUID active
record>  + /usr/sbin/lctl
record>  device $mds1
record>  record sun-n1-console
record>
record>  attach mdc MDC_sun-n1-console_mds1_MNT_sun-n1-console 
0cf7b_MNT_sun-n1-console_dd8b963906
record>  quit
record>  + /usr/sbin/lctl
record>  device $mds1
record>  record sun-n1-console
record>
record>  cfg_device MDC_sun-n1-console_mds1_MNT_sun-n1-console
record>  setup mds1_UUID sun-n1-console_UUID
record>  quit
record>  MTPT: MNT_sun-n1-console MNT_sun-n1-console_UUID /mnt/lustre mds1_UUID 
lov1_UUID
record>  + /usr/sbin/lctl
record>  device $mds1
record>  record sun-n1-console
record>
record>  mount_option sun-n1-console lov1 
MDC_sun-n1-console_mds1_MNT_sun-n1-console
record>  quit
record>  End recording log sun-n1-console on mds1
+ /usr/sbin/lctl
  ignore_errors
  cfg_device $mds1
  cleanup
  detach
  quit
+ losetup /dev/loop0
+ losetup /dev/loop1
+ losetup -d /dev/loop1
changing mtime of LOGS to 1177060884
+ mktemp /tmp/lustre-cmd.XXXXXXXX
+ debugfs -w -R "mi /LOGS" </tmp/lustre-cmd.mEPL5082 /tmp/mds1-sun-n1-console
MDSDEV: mds1 mds1_UUID /tmp/mds1-sun-n1-console ldiskfs 400000 no
+ losetup /dev/loop0
+ losetup /dev/loop1
+ losetup /dev/loop2
+ losetup /dev/loop3
+ losetup /dev/loop4
+ losetup /dev/loop5
+ losetup /dev/loop6
+ losetup /dev/loop7
+ losetup /dev/loop0
+ losetup /dev/loop1
+ losetup /dev/loop1 /tmp/mds1-sun-n1-console
+ /usr/sbin/lctl
  attach mdt MDT MDT_UUID
  quit
+ /usr/sbin/lctl
  cfg_device MDT
  setup
  quit
+ dumpe2fs -f -h /dev/loop1
no external journal found for /dev/loop1
MDS mount options: errors=remount-ro
+ /usr/sbin/lctl
  attach mds mds1 mds1_UUID
  quit
+ /usr/sbin/lctl
  cfg_device mds1
  setup /dev/loop1 ldiskfs mds1 errors=remount-ro
  quit