On Wed, Jan 16, 2008 at 11:23:30AM -0800, Herb Wartens wrote:
> Andrew,
> I have not used lustre-1.6.4.X yet, but in previous versions (and most
> likely the version you are using) Lustre actually listens on all
> interfaces no matter what you specify in modprobe.conf.  You can verify
> this by checking the netstat output for port 988 to see which addresses
> it is listening on.  We here at LLNL regularly use multiple interfaces.
> I believe that the issue you are referring to is a bug in the lctl ping
> code where the ping only responds over the first network device
> specified for a particular lnd.  As long as you have properly configured
> your host routes so that you can ping both interfaces from the other
> node you should be fine.  IMHO this should just be fixed in lnet so you
> can do an lctl ping from any endpoint to any other endpoint.

I don't think it's an lctl ping bug.

> 
> # ilc6 /root > cat /etc/modprobe.conf
> options lnet networks="tcp0(eth2,eth3)"

This config gives the node only one NID: [EMAIL PROTECTED]. You can
verify it by running 'lctl list_nids' on the node.

> 
> # ilc6 /root > netstat -a -t -n | grep 988 | grep LIST
> tcp        0      0 0.0.0.0:988                 0.0.0.0:*                   LISTEN
> 
> # ilc6 /root > cat /etc/hosts | grep ilc7
> 172.16.101.7     ilc7-lnet0   ilc7-eth2
> 172.16.102.7     ilc7-lnet1   ilc7-eth3
> 
> # ilc6 /root > lctl ping [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]

When you lctl ping a node at any one of its NIDs, the ping reply
contains a list of all NIDs of the node. As can be seen from the reply
above, [EMAIL PROTECTED] has two NIDs: [EMAIL PROTECTED] and [EMAIL PROTECTED].

So when you tried 'lctl ping [EMAIL PROTECTED]', the ping request
could reach 172.16.102.7, but it was rejected since [EMAIL PROTECTED]
was not one of the node's NIDs.
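
To make the distinction between "reachable IP" and "valid NID" concrete,
here is a toy Python model. It is only a sketch and not Lustre code: the
Node class, the PingRejected exception, and the 192.0.2.x / 198.51.100.x
addresses are all hypothetical placeholders. It assumes a node whose
acceptor listens on two IPs but which owns a single NID.

```python
# Toy model only: this is not Lustre source code.
# All names and addresses below are hypothetical placeholders.
from dataclasses import dataclass


class PingRejected(Exception):
    """Raised when a ping is addressed to a NID the node does not own."""


@dataclass
class Node:
    nids: frozenset        # NIDs that lnet owns on this node
    listen_ips: frozenset  # IPs on which port 988 is reachable

    def handle_ping(self, dest_ip, dest_nid):
        """Return the node's NID list, or raise if the request is refused."""
        if dest_ip not in self.listen_ips:
            raise ConnectionError(f"cannot reach {dest_ip}")
        if dest_nid not in self.nids:
            # The request arrived, but it names a NID this node does not own.
            raise PingRejected(f"{dest_nid} is not a NID of this node")
        return sorted(self.nids)


peer = Node(
    nids=frozenset({"192.0.2.7@tcp"}),
    listen_ips=frozenset({"192.0.2.7", "198.51.100.7"}),
)

# Addressed to the node's NID: the reply lists all of its NIDs.
print(peer.handle_ping("192.0.2.7", "192.0.2.7@tcp"))

# The second IP is reachable, but no NID is built on it, so it is refused.
try:
    peer.handle_ping("198.51.100.7", "198.51.100.7@tcp")
except PingRejected as err:
    print("rejected:", err)
```

In this model the second call fails even though the IP is reachable, which
is the same situation as the failed lctl ping quoted in this thread.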

The socklnd does interface bonding transparently from lnet's
perspective. It exchanges with its peers a list of the IPs of all NICs
under an lnet NID, and creates connections to all IPs of a peer, thus
aggregating bandwidth. Lnet has no knowledge of this: all it sees is
just one NID, i.e. [EMAIL PROTECTED].
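
As a rough illustration of that layering, the following Python sketch
models one NID that hides several IPs. Again this is not socklnd code: the
Peer class, connections_for(), and the addresses are hypothetical, and the
model ignores how socklnd actually schedules traffic over the connections.

```python
# Toy model only: this is not socklnd source code.
# All names and addresses below are hypothetical placeholders.
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    nid: str    # the only identifier lnet sees for this peer
    ips: tuple  # IPs of all NICs under that NID, as exchanged by socklnd


def connections_for(peer):
    """Model one connection per peer IP, all filed under the single NID."""
    return [(peer.nid, ip) for ip in peer.ips]


peer = Peer(nid="192.0.2.7@tcp", ips=("192.0.2.7", "198.51.100.7"))
conns = connections_for(peer)

# lnet's view: the distinct NIDs behind all of those connections.
lnet_view = {nid for nid, _ip in conns}

print(len(conns), "connections,", len(lnet_view), "NID visible to lnet")
```

Two connections exist at the socket level, yet the set of NIDs has a single
entry, so nothing above socklnd can address the second interface directly.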

Isaac

> # ilc6 /root > lctl ping [EMAIL PROTECTED]
> failed to ping [EMAIL PROTECTED]: Input/output error
> 
> # ilc6 /root > ping -c 1 172.16.101.7
> PING 172.16.101.7 (172.16.101.7) 56(84) bytes of data.
> 64 bytes from 172.16.101.7: icmp_seq=1 ttl=64 time=0.143 ms
> 
> --- 172.16.101.7 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.143/0.143/0.143/0.000 ms
> # ilc6 /root > ping -c 1 172.16.102.7
> PING 172.16.102.7 (172.16.102.7) 56(84) bytes of data.
> 64 bytes from 172.16.102.7: icmp_seq=1 ttl=64 time=0.094 ms
> 
> --- 172.16.102.7 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.094/0.094/0.094/0.000 ms
> 
> 
> Lundgren, Andrew wrote:
> > So the only way to use two NICs at once is to bond?  I am looking for
> > redundancy rather than increased throughput.
> > 
> >> -----Original Message-----
> >> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> >> Sent: Wednesday, January 16, 2008 6:34 AM
> >> To: Lundgren, Andrew
> >> Cc: '[email protected]'
> >> Subject: Re: [Lustre-discuss] How do you make an MGS/OSS listen on 2 NICs?
> >>
> >> On Tue, Jan 15, 2008 at 10:28:33AM -0700, Lundgren, Andrew wrote:
> >>>    I am running on a CentOS 5 distribution without any updates from
> >>>    CentOS. I am using the lustre 1.6.4.1 kernel and software.
> >>>
> >>>    I have two NICs that run through different switches.
> >>>
> >>>    I have the lustre options in my modprobe.conf set like this:
> >>>
> >>>    options lnet networks=tcp0(eth1,eth0)
> >>>
> >> This way of interface bonding is now a deprecated lnet
> >> feature. Please refer to:
> >> http://manual.lustre.org/manual/LustreManual16_HTML/DynamicHTML-13-1.html
> >>
> >> Isaac
> >>
> > 
> > _______________________________________________
> > Lustre-discuss mailing list
> > [email protected]
> > https://mail.clusterfs.com/mailman/listinfo/lustre-discuss