Hi Alexey,

I'm still encountering a problem even after disabling SELinux.

# cat /proc/cmdline
ro root=LABEL=/ splash=0 rhgb selinux=0 quiet

# grep ^SELINUX /etc/selinux/config
SELINUX=disabled
SELINUXTYPE=targeted


Below is a snippet of /var/log/messages (more complete log is attached):
==========
Apr 23 12:57:06 sun-n1-console kernel: Lustre: OBD class driver Build Version: 1.4.10-19691231170000-PRISTINE-.testsuite.tmp.lbuild-boulder.lbuild-v1_4_10_RC2-2.6-rhel4-i686.lbuild.BUILD.lustre-kernel-2.6.9.lustre.linux-2.6.9-42.0.10.EL_lustre.1.4.10smp, [EMAIL PROTECTED]
Apr 23 12:57:07 sun-n1-console kernel: Lustre: Added LNI [EMAIL PROTECTED] [8/256]
Apr 23 12:57:07 sun-n1-console kernel: Lustre: Accept secure, port 988

Apr 23 12:57:12 sun-n1-console kernel: LustreError: Refusing connection from 192.168.123.45 for [EMAIL PROTECTED]:  No matching NI
Apr 23 12:57:12 sun-n1-console kernel: LustreError: 4416:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -104 reading HELLO from 192.168.123.45
Apr 23 12:57:12 sun-n1-console kernel: LustreError: Connection to [EMAIL PROTECTED] at host 192.168.123.45 on port 988 was reset: is it running a compatible version of Lustre and is [EMAIL PROTECTED] one of its NIDs?
Apr 23 12:57:12 sun-n1-console kernel: Lustre: 10:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall /usr/lib/lustre/lnet_upcall ROUTER_NOTIFY,[EMAIL PROTECTED],down,1177304206
Apr 23 12:57:17 sun-n1-console kernel: LustreError: 4854:0:(client.c:947:ptlrpc_expire_one_request()) @@@ timeout (sent at 1177304232, 5s ago)  [EMAIL PROTECTED] x1/t0 o8->[EMAIL PROTECTED]:6 lens 240/272 ref 1 fl Rpc:/0/0 rc 0/0
Apr 23 12:57:31 sun-n1-console kernel: LustreError: 5170:0:(mds_lov.c:589:mds_lov_start_synchronize()) mds1: error starting mds_lov_synchronize: -4
Apr 23 12:57:31 sun-n1-console kernel: LustreError: 5170:0:(quota_master.c:1103:mds_quota_recovery()) Cannot start quota recovery thread: rc -4
Apr 23 12:57:37 sun-n1-console kernel: LustreError: Refusing connection from 192.168.123.45 for [EMAIL PROTECTED]:  No matching NI
Apr 23 12:57:37 sun-n1-console kernel: LustreError: 4417:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -104 reading HELLO from 192.168.123.45
Apr 23 12:57:37 sun-n1-console kernel: LustreError: Connection to [EMAIL PROTECTED] at host 192.168.123.45 on port 988 was reset: is it running a compatible version of Lustre and is [EMAIL PROTECTED] one of its NIDs?
Apr 23 12:57:42 sun-n1-console kernel: LustreError: 4854:0:(client.c:947:ptlrpc_expire_one_request()) @@@ timeout (sent at 1177304257, 5s ago)  [EMAIL PROTECTED] x3/t0 o8->[EMAIL PROTECTED]:6 lens 240/272 ref 1 fl Rpc:/0/0 rc 0/0
==========

It looks to me as though there is confusion over which network interface
to use (eth0 = 129.158.130.75, eth1 = 192.168.123.45).
I intended to deploy the MDS on eth1; this is specified by IP address
when adding the network entry for the node:
  --add net --node sun-n1-console --nettype lnet --nid [EMAIL PROTECTED]


I've emptied /etc/resolv.conf to ensure that "sun-n1-console"
resolves to 192.168.123.45:

# cat /etc/hosts
127.0.0.1               localhost.localdomain   localhost
192.168.123.45          sun-n1-console
129.158.130.75          public-host

# hostname -f ; hostname -i
sun-n1-console
192.168.123.45
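
The same name-resolution check can be scripted so it is easy to repeat after each configuration change. The following is only a minimal sketch: the function names `hosts_lookup` and `check_host_ip` are hypothetical helpers made up for illustration, and the script reads a hosts-format file directly rather than going through the system resolver, which is a reasonable stand-in here only because /etc/resolv.conf has been emptied.

```shell
#!/bin/sh
# hosts_lookup NAME [FILE]  (hypothetical helper)
# Print the address of the first entry in a hosts-format file whose
# canonical name or alias equals NAME. FILE defaults to /etc/hosts.
hosts_lookup() {
    awk -v n="$1" '
        {
            sub(/#.*/, "")                 # drop comments; comment-only lines end up with NF == 0
            for (i = 2; i <= NF; i++)      # field 1 is the address, the rest are names
                if ($i == n) { print $1; exit }
        }' "${2:-/etc/hosts}"
}

# check_host_ip NAME EXPECTED_IP [FILE]  (hypothetical helper)
# Return 0 if NAME maps to EXPECTED_IP in the hosts file, 1 otherwise.
check_host_ip() {
    actual=$(hosts_lookup "$1" "${3:-/etc/hosts}")
    if [ "$actual" = "$2" ]; then
        echo "OK: $1 -> $actual"
    else
        echo "MISMATCH: $1 -> ${actual:-<none>} (expected $2)"
        return 1
    fi
}
```

On this node it would be invoked as `check_host_ip sun-n1-console 192.168.123.45`. It only confirms what /etc/hosts says; it does not show which interface Lustre itself binds to.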

And the output of ifconfig:
eth0      Link encap:Ethernet  HWaddr 00:07:E9:06:AC:5C
          inet addr:129.158.130.75  Bcast:129.158.130.255  Mask:255.255.255.0

eth1      Link encap:Ethernet  HWaddr 00:07:E9:06:AC:5D
          inet addr:192.168.123.45  Bcast:192.168.123.255  Mask:255.255.255.0


Is there anything else that I missed?


Regards,
Verdi

Alexey Lyashkov wrote:
> It looks like you need to disable SELinux.
> ===
> Apr 20 17:38:26 sun-n1-console kernel: audit(1177061906.286:66): avc: 
> denied  { rawip_recv } for  saddr=192.168.123.45 src=1023
> daddr=192.168.123.45 dest=988 netif=lo
> ==
> 
> 
> On Fri, 2007-04-20 at 14:04, Verdi March wrote:
> > Hi,
> > 
> > I'm encountering a problem when starting the "local" example (one
> > MDS, LOV, OST, and client, all on node "sun-n1-console").
> > 
> > # lmc -m test.xml --batch test.txt
> > # cat test.txt
> > --add node --node sun-n1-console
> > --add net --node sun-n1-console --nettype lnet --nid [EMAIL PROTECTED]
> > --add mds --node sun-n1-console --mds mds1 --fstype ldiskfs --dev /tmp/mds1-sun-n1-console --size 400000
> > --add lov --lov lov1 --mds mds1 --stripe_sz 1048576 --stripe_cnt 1 --stripe_pattern 0
> > --add ost --node sun-n1-console --lov lov1 --ost ost1-sun-n1-console --fstype ldiskfs --dev /tmp/ost1-sun-n1-console --size 400000
> > --add mtpt --node sun-n1-console --path /mnt/lustre --mds mds1 --lov lov1
> > 
> > 
> > 
> > The node has two Ethernet interfaces, eth0 and eth1, on separate subnets.
> > I deploy all Lustre components on eth1 (IP: 192.168.123.45, hostname:
> > sun-n1-console).
> > 
> > # cat /etc/hosts
> > 127.0.0.1               localhost.localdomain   localhost
> > xxx.yyy.zzz.ab          public-host
> > 192.168.123.45          sun-n1-console
> > 
> > 
> > When eth0 is down, I can deploy the "local" example successfully.
> > Lustre fails to start only when eth0 is up (see attachment).
> > 
> > The error messages from /var/log/messages indicate that the MDS does
> > not respond (see below). I believe it's not caused by the firewall,
> > because I've switched it off:
> > 
> > # iptables -L
> > Chain INPUT (policy ACCEPT)
> > target     prot opt source               destination
> > 
> > Chain FORWARD (policy ACCEPT)
> > target     prot opt source               destination
> > 
> > Chain OUTPUT (policy ACCEPT)
> > target     prot opt source               destination
> > 
> > 
> > 
> > 
> > And here are the error messages:
> > 
> > # tail /var/log/messages
> > Apr 20 17:37:35 sun-n1-console kernel: LustreError: 6840:0:(events.c:53:request_out_callback()) @@@ type 4, status -5  [EMAIL PROTECTED] x22/t0 o8->[EMAIL PROTECTED]:6 lens 240/272 ref 2 fl Rpc:/0/0 rc 0/0
> > Apr 20 17:37:35 sun-n1-console kernel: LustreError: 6840:0:(client.c:947:ptlrpc_expire_one_request()) @@@ timeout (sent at 1177061855, 0s ago) [EMAIL PROTECTED] x22/t0 o8->[EMAIL PROTECTED]:6 lens 240/272 ref 1 fl Rpc:/0/0 rc 0/0
> > Apr 20 17:37:35 sun-n1-console kernel: LustreError: 6840:0:(client.c:947:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
> > Apr 20 17:38:00 sun-n1-console kernel: LustreError: 6840:0:(events.c:53:request_out_callback()) @@@ type 4, status -5  [EMAIL PROTECTED] x23/t0 o8->[EMAIL PROTECTED]:6 lens 240/272 ref 2 fl Rpc:/0/0 rc 0/0
> > Apr 20 17:38:25 sun-n1-console kernel: audit(1177061905.683:64): avc: denied  { rawip_recv } for  pid=6537 comm="socknal_cd03" saddr=192.168.123.45 src=1023 daddr=192.168.123.45 dest=988 netif=lo scontext=system_u:object_r:unlabeled_t tcontext=system_u:object_r:netif_lo_t tclass=netif
> > Apr 20 17:38:25 sun-n1-console kernel: audit(1177061905.884:65): avc: denied  { rawip_recv } for  saddr=192.168.123.45 src=1023 daddr=192.168.123.45 dest=988 netif=lo scontext=system_u:object_r:unlabeled_t tcontext=system_u:object_r:netif_lo_t tclass=netif
> > Apr 20 17:38:26 sun-n1-console kernel: audit(1177061906.286:66): avc: denied  { rawip_recv } for  saddr=192.168.123.45 src=1023 daddr=192.168.123.45 dest=988 netif=lo scontext=system_u:object_r:unlabeled_t tcontext=system_u:object_r:netif_lo_t tclass=netif
> > Apr 20 17:38:27 sun-n1-console kernel: audit(1177061907.090:67): avc: denied  { rawip_recv } for  saddr=192.168.123.45 src=1023 daddr=192.168.123.45 dest=988 netif=lo scontext=system_u:object_r:unlabeled_t tcontext=system_u:object_r:netif_lo_t tclass=netif
> > Apr 20 17:38:28 sun-n1-console kernel: audit(1177061908.698:68): avc: denied  { rawip_recv } for  saddr=192.168.123.45 src=1023 daddr=192.168.123.45 dest=988 netif=lo scontext=system_u:object_r:unlabeled_t tcontext=system_u:object_r:netif_lo_t tclass=netif
> > Apr 20 17:38:30 sun-n1-console kernel: LustreError: 6539:0:(acceptor.c:442:lnet_acceptor()) Error -11 reading connection request from 192.168.123.45
> > Apr 20 17:38:30 sun-n1-console kernel: audit(1177061910.683:69): avc: denied  { rawip_send } for  pid=6539 comm="acceptor_988" saddr=192.168.123.45 src=988 daddr=192.168.123.45 dest=1023 netif=lo scontext=system_u:object_r:unlabeled_t tcontext=system_u:object_r:netif_lo_t tclass=netif
> > Apr 20 17:38:30 sun-n1-console kernel: LustreError: 6537:0:(socklnd_cb.c:2160:ksocknal_recv_hello()) Error -104 reading HELLO from 192.168.123.45
> > Apr 20 17:38:30 sun-n1-console kernel: LustreError: Connection to [EMAIL PROTECTED] at host 192.168.123.45 on port 988 was reset: is it running a compatible version of Lustre and is [EMAIL PROTECTED] one of its NIDs?
> > Apr 20 17:38:50 sun-n1-console kernel: LustreError: 6840:0:(events.c:53:request_out_callback()) @@@ type 4, status -5  [EMAIL PROTECTED] x25/t0 o8->[EMAIL PROTECTED]:6 lens 240/272 ref 2 fl Rpc:/0/0 rc 0/0
> > Apr 20 17:39:15 sun-n1-console kernel: LustreError: 6840:0:(events.c:53:request_out_callback()) @@@ type 4, status -5  [EMAIL PROTECTED] x26/t0 o8->[EMAIL PROTECTED]:6 lens 240/272 ref 2 fl Rpc:/0/0 rc 0/0
> > 
> > 
> > 
> > Any advice on how to make this simple example work?
> > 
> > 
> > Regards,
> > Verdi
> -- 
> Alexey Lyashkov <[EMAIL PROTECTED]>
> Beaver team


