Hi Alexey,
I'm still encountering a problem even after disabling SELinux.
# cat /proc/cmdline
ro root=LABEL=/ splash=0 rhgb selinux=0 quiet
# grep ^SELINUX /etc/selinux/config
SELINUX=disabled
SELINUXTYPE=targeted
Below is a snippet of /var/log/messages (a more complete log is attached):
==========
Apr 23 12:57:06 sun-n1-console kernel: Lustre: OBD class driver Build Version:
1.4.10-19691231170000-PRISTINE-.testsuite.tmp.lbuild-boulder.lbuild-v1_4_10_RC2-2.6-rhel4-i686.lbuild.BUILD.lustre-kernel-2.6.9.lustre.linux-2.6.9-42.0.10.EL_lustre.1.4.10smp,
[EMAIL PROTECTED]
Apr 23 12:57:07 sun-n1-console kernel: Lustre: Added LNI [EMAIL PROTECTED]
[8/256]
Apr 23 12:57:07 sun-n1-console kernel: Lustre: Accept secure, port 988
Apr 23 12:57:12 sun-n1-console kernel: LustreError: Refusing connection from
192.168.123.45 for [EMAIL PROTECTED]: No matching NI
Apr 23 12:57:12 sun-n1-console kernel: LustreError:
4416:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -104 reading HELLO from
192.168.123.45
Apr 23 12:57:12 sun-n1-console kernel: LustreError: Connection to [EMAIL
PROTECTED] at host 192.168.123.45 on port 988 was reset: is it running a
compatible version of Lustre and is [EMAIL PROTECTED] one of its NIDs?
Apr 23 12:57:12 sun-n1-console kernel: Lustre:
10:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall
/usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,[EMAIL PROTECTED],down,1177304206
Apr 23 12:57:17 sun-n1-console kernel: LustreError:
4854:0:(client.c:947:ptlrpc_expire_one_request()) @@@ timeout (sent at
1177304232, 5s ago) [EMAIL PROTECTED] x1/t0 o8->[EMAIL PROTECTED]:6 lens
240/272 ref 1 fl Rpc:/0/0 rc 0/0
Apr 23 12:57:31 sun-n1-console kernel: LustreError:
5170:0:(mds_lov.c:589:mds_lov_start_synchronize()) mds1: error starting
mds_lov_synchronize: -4
Apr 23 12:57:31 sun-n1-console kernel: LustreError:
5170:0:(quota_master.c:1103:mds_quota_recovery()) Cannot start quota recovery
thread: rc -4
Apr 23 12:57:37 sun-n1-console kernel: LustreError: Refusing connection from
192.168.123.45 for [EMAIL PROTECTED]: No matching NI
Apr 23 12:57:37 sun-n1-console kernel: LustreError:
4417:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -104 reading HELLO from
192.168.123.45
Apr 23 12:57:37 sun-n1-console kernel: LustreError: Connection to [EMAIL
PROTECTED] at host 192.168.123.45 on port 988 was reset: is it running a
compatible version of Lustre and is [EMAIL PROTECTED] one of its NIDs?
Apr 23 12:57:42 sun-n1-console kernel: LustreError:
4854:0:(client.c:947:ptlrpc_expire_one_request()) @@@ timeout (sent at
1177304257, 5s ago) [EMAIL PROTECTED] x3/t0 o8->[EMAIL PROTECTED]:6 lens
240/272 ref 1 fl Rpc:/0/0 rc 0/0
==========
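For anyone reading along: the negative numbers in these messages look like kernel-style errno return codes. Below is a minimal sketch for decoding the ones that appear in this thread, assuming the standard Linux numbering. The table and the decode_errno helper are my own illustration and not part of Lustre.

```python
# Hypothetical helper: map the negative return codes seen in the Lustre
# log lines to errno names. The table assumes standard Linux numbering
# and only covers the codes that show up in this thread.
LINUX_ERRNO = {
    4: ("EINTR", "Interrupted system call"),
    5: ("EIO", "Input/output error"),
    11: ("EAGAIN", "Resource temporarily unavailable"),
    104: ("ECONNRESET", "Connection reset by peer"),
}


def decode_errno(rc):
    """Return 'NAME (description)' for a kernel-style negative return code."""
    name, text = LINUX_ERRNO.get(abs(rc), ("UNKNOWN", "not in table"))
    return f"{name} ({text})"


if __name__ == "__main__":
    # Codes from the snippet above: -104 (reading HELLO) and -4 (mds_lov).
    for rc in (-104, -4):
        print(rc, decode_errno(rc))
```

Read this way, "Error -104 reading HELLO" would mean the connection was reset by the peer, which fits the "Refusing connection ... No matching NI" line just before it.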
It looks to me as though there is some confusion over which network
interface to use (eth0 = 129.158.130.75, eth1 = 192.168.123.45).
I intended to deploy the MDS on eth1; this is specified by IP address
when creating the node:
--add net --node sun-n1-console --nettype lnet --nid [EMAIL PROTECTED]
I've emptied /etc/resolv.conf to ensure that "sun-n1-console"
resolves to 192.168.123.45:
# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
192.168.123.45 sun-n1-console
129.158.130.75 public-host
# hostname -f ; hostname -i
sun-n1-console
192.168.123.45
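The same check can be made against the hosts file contents directly. This is only a rough sketch, assuming the usual /etc/hosts layout (address first, then names). lookup_hosts is a hypothetical helper, and it ignores DNS and any resolver ordering, so it says nothing about what Lustre itself resolves.

```python
def lookup_hosts(hosts_text, name):
    """Return the addresses that a hosts-file text maps `name` to, in file order."""
    addresses = []
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into address + names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and name in fields[1:]:
            addresses.append(fields[0])
    return addresses


# Copy of the /etc/hosts content shown above.
SAMPLE = """\
127.0.0.1 localhost.localdomain localhost
192.168.123.45 sun-n1-console
129.158.130.75 public-host
"""

if __name__ == "__main__":
    print(lookup_hosts(SAMPLE, "sun-n1-console"))
```

With the file as shown, the name maps to exactly one address, the eth1 one.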
And the output of ifconfig:
eth0 Link encap:Ethernet HWaddr 00:07:E9:06:AC:5C
inet addr:129.158.130.75 Bcast:129.158.130.255 Mask:255.255.255.0
eth1 Link encap:Ethernet HWaddr 00:07:E9:06:AC:5D
inet addr:192.168.123.45 Bcast:192.168.123.255 Mask:255.255.255.0
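To double-check that the two interfaces really sit on separate subnets, here is a small sketch using Python's standard ipaddress module with the addresses and masks from the ifconfig output. The same_subnet helper is just for illustration.

```python
import ipaddress

# Addresses and netmasks copied from the ifconfig output.
eth0 = ipaddress.ip_interface("129.158.130.75/255.255.255.0")
eth1 = ipaddress.ip_interface("192.168.123.45/255.255.255.0")


def same_subnet(a, b):
    """True if both interface addresses belong to the same IP network."""
    return a.network == b.network


print("eth0 network:", eth0.network)
print("eth1 network:", eth1.network)
print("same subnet:", same_subnet(eth0, eth1))
```

The two networks come out different, so a connection addressed to 192.168.123.45 should only match eth1.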
Is there anything else that I missed?
Regards,
Verdi
Alexey Lyashkov wrote:
> It looks like you need to disable SELinux.
> ===
> Apr 20 17:38:26 sun-n1-console kernel: audit(1177061906.286:66): avc:
> denied { rawip_recv } for saddr=192.168.123.45 src=1023
> daddr=192.168.123.45 dest=988 netif=lo
> ==
>
>
> On Fri, 2007-04-20 at 14:04, Verdi March wrote:
> > Hi,
> >
> > I'm encountering a problem when starting the "local" example (one
> > MDS, LOV, OST, and client, all on node "sun-n1-console").
> >
> > # lmc -m test.xml --batch test.txt
> > # cat test.txt
> > --add node --node sun-n1-console
> > --add net --node sun-n1-console --nettype lnet --nid [EMAIL PROTECTED]
> > --add mds --node sun-n1-console --mds mds1 --fstype ldiskfs --dev /tmp/mds1-sun-n1-console --size 400000
> > --add lov --lov lov1 --mds mds1 --stripe_sz 1048576 --stripe_cnt 1 --stripe_pattern 0
> > --add ost --node sun-n1-console --lov lov1 --ost ost1-sun-n1-console --fstype ldiskfs --dev /tmp/ost1-sun-n1-console --size 400000
> > --add mtpt --node sun-n1-console --path /mnt/lustre --mds mds1 --lov lov1
> >
> >
> >
> > The node has two Ethernet interfaces, eth0 and eth1, on separate subnets.
> > I deploy all Lustre components on eth1 (IP: 192.168.123.45, hostname:
> > sun-n1-console).
> >
> > # cat /etc/hosts
> > 127.0.0.1 localhost.localdomain localhost
> > xxx.yyy.zzz.ab public-host
> > 192.168.123.45 sun-n1-console
> >
> >
> > When eth0 is down, I can deploy the "local" example successfully.
> > Lustre fails to start only when eth0 is up (see attachment).
> >
> > The error messages in /var/log/messages indicate that the MDS does
> > not respond (see below). I believe it is not caused by the firewall,
> > because I've switched it off:
> >
> > # iptables -L
> > Chain INPUT (policy ACCEPT)
> > target prot opt source destination
> >
> > Chain FORWARD (policy ACCEPT)
> > target prot opt source destination
> >
> > Chain OUTPUT (policy ACCEPT)
> > target prot opt source destination
> >
> >
> >
> >
> > And here are the error messages:
> >
> > # tail /var/log/messages
> > Apr 20 17:37:35 sun-n1-console kernel: LustreError:
> 6840:0:(events.c:53:request_out_callback()) @@@ type 4, status -5 [EMAIL
> PROTECTED] x22/t0
> o8->[EMAIL PROTECTED]:6 lens 240/272 ref 2 fl Rpc:/0/0
> rc 0/0
> > Apr 20 17:37:35 sun-n1-console kernel: LustreError:
> 6840:0:(client.c:947:ptlrpc_expire_one_request()) @@@ timeout (sent at
> 1177061855, 0s ago)
> [EMAIL PROTECTED] x22/t0 o8->[EMAIL PROTECTED]:6 lens
> 240/272 ref 1 fl Rpc:/0/0 rc 0/0
> > Apr 20 17:37:35 sun-n1-console kernel: LustreError:
> 6840:0:(client.c:947:ptlrpc_expire_one_request()) Skipped 2 previous similar
> messages
> > Apr 20 17:38:00 sun-n1-console kernel: LustreError:
> 6840:0:(events.c:53:request_out_callback()) @@@ type 4, status -5 [EMAIL
> PROTECTED] x23/t0
> o8->[EMAIL PROTECTED]:6 lens 240/272 ref 2 fl Rpc:/0/0
> rc 0/0
> > Apr 20 17:38:25 sun-n1-console kernel: audit(1177061905.683:64): avc:
> denied { rawip_recv } for pid=6537 comm="socknal_cd03"
> saddr=192.168.123.45 src=1023 daddr=192.168.123.45 dest=988 netif=lo
> scontext=system_u:object_r:unlabeled_t tcontext=system_u:object_r:netif_lo_t
> tclass=netif
> > Apr 20 17:38:25 sun-n1-console kernel: audit(1177061905.884:65): avc:
> denied { rawip_recv } for saddr=192.168.123.45 src=1023
> daddr=192.168.123.45 dest=988 netif=lo scontext=system_u:object_r:unlabeled_t
> tcontext=system_u:object_r:netif_lo_t tclass=netif
> > Apr 20 17:38:26 sun-n1-console kernel: audit(1177061906.286:66): avc:
> denied { rawip_recv } for saddr=192.168.123.45 src=1023
> daddr=192.168.123.45 dest=988 netif=lo scontext=system_u:object_r:unlabeled_t
> tcontext=system_u:object_r:netif_lo_t tclass=netif
> > Apr 20 17:38:27 sun-n1-console kernel: audit(1177061907.090:67): avc:
> denied { rawip_recv } for saddr=192.168.123.45 src=1023
> daddr=192.168.123.45 dest=988 netif=lo scontext=system_u:object_r:unlabeled_t
> tcontext=system_u:object_r:netif_lo_t tclass=netif
> > Apr 20 17:38:28 sun-n1-console kernel: audit(1177061908.698:68): avc:
> denied { rawip_recv } for saddr=192.168.123.45 src=1023
> daddr=192.168.123.45 dest=988 netif=lo scontext=system_u:object_r:unlabeled_t
> tcontext=system_u:object_r:netif_lo_t tclass=netif
> > Apr 20 17:38:30 sun-n1-console kernel: LustreError:
> 6539:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request
> from
> 192.168.123.45
> > Apr 20 17:38:30 sun-n1-console kernel: audit(1177061910.683:69): avc:
> denied { rawip_send } for pid=6539 comm="acceptor_988"
> saddr=192.168.123.45 src=988 daddr=192.168.123.45 dest=1023 netif=lo
> scontext=system_u:object_r:unlabeled_t tcontext=system_u:object_r:netif_lo_t
> tclass=netif
> > Apr 20 17:38:30 sun-n1-console kernel: LustreError:
> 6537:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -104 reading HELLO
> from 192.168.123.45
> > Apr 20 17:38:30 sun-n1-console kernel: LustreError: Connection to
> [EMAIL PROTECTED] at host 192.168.123.45 on port 988 was reset: is it running
> a
> compatible version of Lustre and is [EMAIL PROTECTED] one of its NIDs?
> > Apr 20 17:38:50 sun-n1-console kernel: LustreError:
> 6840:0:(events.c:53:request_out_callback()) @@@ type 4, status -5 [EMAIL
> PROTECTED] x25/t0
> o8->[EMAIL PROTECTED]:6 lens 240/272 ref 2 fl Rpc:/0/0
> rc 0/0
> > Apr 20 17:39:15 sun-n1-console kernel: LustreError:
> 6840:0:(events.c:53:request_out_callback()) @@@ type 4, status -5 [EMAIL
> PROTECTED] x26/t0
> o8->[EMAIL PROTECTED]:6 lens 240/272 ref 2 fl Rpc:/0/0
> rc 0/0
> >
> >
> >
> > Any advice on how to make this simple example work?
> >
> >
> > Regards,
> > Verdi
> --
> Alexey Lyashkov <[EMAIL PROTECTED]>
> Beaver team
_______________________________________________
Lustre-discuss mailing list
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss