Re: [Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Joe Landman
On 10/21/2010 09:37 AM, Brock Palen wrote:
 We recently added a new OSS. It has one 1Gb interface and one 10Gb
 interface.

 The 10Gb interface is eth4 (10.164.0.166). The 1Gb interface is eth0
 (10.164.0.10).

They look like they are on the same subnet if you are using /24 ...


 In modprobe.conf I have:

 options lnet networks=tcp0(eth4)

 lctl list_nids 10.164.0@tcp

 From a host I run:

 lctl which_nid oss4 10.164.0@tcp

 But I still see traffic over eth0, the 1Gb management network,
 much higher than I would expect (up to 100MB/s). The management
 interface is oss4-gb, so if I do from a client:

 lctl which_nid oss4-gb 10.164.0...@tcp

 Why, if I have networks=tcp0(eth4) and list_nids shows only the
 10Gb interface, do I have so much traffic over the 1Gb interface?
 There is some traffic on the 10Gb interface, but I would like to tell
 Lustre 'don't use the 1Gb interface'.

If they are on the same subnet, it's possible that the 1GbE sees the ARP 
response first.  And then it's pretty much guaranteed to have the traffic 
go out that port.

If your subnets are different, this shouldn't be the issue.


 Thanks!

 Brock Palen www.umich.edu/~brockp Center for Advanced Computing
 bro...@umich.edu (734)936-1985



 ___ Lustre-discuss
 mailing list Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss


-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: land...@scalableinformatics.com
web  : http://scalableinformatics.com
http://scalableinformatics.com/jackrabbit
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Brock Palen
On Oct 21, 2010, at 9:48 AM, Joe Landman wrote:

 On 10/21/2010 09:37 AM, Brock Palen wrote:
 We recently added a new oss, it has 1 1Gb interface and 1 10Gb
 interface,
 
 The 10Gb interface is eth4 10.164.0.166 The 1Gb   interface is eth0
 10.164.0.10
 
 They look like they are on the same subnet if you are using /24 ...

You are correct 

Both interfaces are on the same subnet:

[r...@oss4-gb ~]# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.164.0.0      *               255.255.248.0   U     0      0        0 eth0
10.164.0.0      *               255.255.248.0   U     0      0        0 eth4
169.254.0.0     *               255.255.0.0     U     0      0        0 eth4
default         10.164.0.1      0.0.0.0         UG    0      0        0 eth0

Is there no way to mask the Lustre service away from the 1Gb interface?
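
A quick way to double-check the same-subnet conclusion is to mask both addresses with the netmask from the routing table and compare the results. The following is only a sketch in plain POSIX sh; the function names ip_to_int and same_subnet are made up for illustration, and it assumes well-formed dotted-quad input.

```sh
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit status 0) when both addresses are in the same network
# for the given netmask. Arguments: address1 address2 netmask
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ $(( a & m )) -eq $(( b & m )) ]
}
```

With the values from this thread, `same_subnet 10.164.0.166 10.164.0.10 255.255.248.0` succeeds, which is the situation the kernel routing table above reflects: two interfaces with identical connected routes.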

 
 


Re: [Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Bob Ball
Why do you need both active?  If one is a backup to the other, then bond 
them as a primary/backup pair, meaning only one will be active at a 
time, i.e., your designated primary (unless it goes down).

bob



Re: [Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Charles Taylor

On Oct 21, 2010, at 9:51 AM, Brock Palen wrote:

 There is no way to mask the lustre service away from the 1Gb  
 interface?

We struggle with this as well but have not found a way to enforce 
it.  You would think that Lustre would honor the NID for incoming 
*and* outgoing traffic, but apparently the standard Linux routing table 
determines the outbound path and LNET is out of the picture.  Thus, 
you end up having to assign separate subnets, shut down your eth0 (in 
this case) interface, or use static routes to fine-tune the routing 
decisions (where possible).
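
To see which interface the kernel routing table would actually pick for a given peer, `ip route get` reports the chosen device. The helper below is a rough sketch that pulls the device name out of that output; the function name route_dev and the client address 10.164.0.20 are hypothetical.

```sh
#!/bin/sh
# Print the interface named after the "dev" keyword in `ip route get`
# output read from stdin. Prints nothing if no "dev" token is found.
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Example, run on the OSS (10.164.0.20 stands in for a client address):
#   ip route get 10.164.0.20 | route_dev
```

If this prints eth0 for a Lustre client, outbound traffic to that client leaves on the 1Gb interface regardless of which interface LNET was told to use.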

We wish that the outgoing decision could be made on the basis of the  
*NID* but that might be too intrusive with regard to the linux  
kernel's network stack so I can understand, somewhat, why it is not  
that way.   Still, it is somewhat counter-intuitive to go through all  
the trouble of having the LNET layer and assigning NIDs only to have  
them disregarded for outbound traffic.

Perhaps there is a way around this that we don't know about.

Regards,

Charlie Taylor
UF HPC Center



Re: [Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Brock Palen


 Why do you need both active?  If one is a backup to the other, then bond 
 them as a primary/backup pair, meaning only one will be active at a 
 time, i.e., your designated primary (unless it goes down).

We could do this, but the 10Gb drivers have been such a pain for us that we 
wanted to have a 'back door' management network to get to the box should we 
have issues with the 10Gb driver.

Oddly I ran:

ifconfig eth0 down 

and I could no longer ping the box over the eth4 interface; I had to power cycle 
it from management.  Very odd.

 


Re: [Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Wojciech Turek
Maybe I am missing the point here, but can you explain why you would need
two NICs in one host on the same subnet?
If you need an additional access route to your host, why not configure eth0
on a different subnet?



Re: [Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Brian J. Murrell
On Thu, 2010-10-21 at 10:29 -0400, Brock Palen wrote: 
 
 We could do this, the 10Gb drivers have been such a pain for us we wanted to 
 have a 'back door' management network to get to the box should we have issues 
 with the 10Gb driver.

If you really do want two separate networks, one for Lustre and one for
management, then why not configure them as separate networks with
different subnets?  Anything else is just going to confuse the routing
engine.

I think at best two interfaces on the same subnet is going to cause
indeterminate behaviour.

b.





Re: [Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Brock Palen
On Oct 21, 2010, at 10:35 AM, Brian J. Murrell wrote:

 On Thu, 2010-10-21 at 10:29 -0400, Brock Palen wrote: 
 
 We could do this, the 10Gb drivers have been such a pain for us we wanted to 
 have a 'back door' management network to get to the box should we have 
 issues with the 10Gb driver.
 
 If you really do want two separate networks, one for Lustre and one for
 management, then why not configure them as separate networks with
 different subnets?  Anything else is just going to confuse the routing
 engine.
 
 I think at best two interfaces on the same subnet is going to cause
 indeterminate behaviour.

We settled on disabling the eth0 interface and hope the 10Gb driver will not 
give us any more trouble.
We don't currently have a dedicated management network; setting one up for 
just a single host was passed over.



 


Re: [Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Bob Ball
OK, a quick start on bonding, as we use it for our OSS here.

We have 2 NICs we bond (SL5.5, an RHEL variant), eth1 at 1Gb and eth2 at 
10Gb using Myricom hardware.  10.10.1.2 is the network gateway, a 
convenient arp target that should always be up.

[r...@umdist04 network-scripts]# cat ifcfg-bond0
DEVICE=bond0
IPADDR=10.10.2.24
NETMASK=255.255.252.0
BOOTPROTO=static
ONBOOT=yes
VLAN=no
MTU=1500

[r...@umdist04 network-scripts]# cat ifcfg-eth1
DEVICE=eth1
ONBOOT=no
BOOTPROTO=none
MTU=1500
MASTER=bond0
SLAVE=yes

[r...@umdist04 network-scripts]# cat ifcfg-eth2
DEVICE=eth2
BOOTPROTO=none
ONBOOT=no
MTU=1500
MASTER=bond0
SLAVE=yes

[r...@umdist04 etc]# cat modprobe.conf
...
alias eth1 bnx2
alias eth2 myri10ge
...
alias bond0 bonding
options bond0 mode=1 arp_interval=250 arp_ip_target=10.10.1.2 primary=eth2
options lnet networks=tcp0(bond0)
...

You can check /proc/net/bonding/bond0 afterwards for information.
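
For example, the active member of the bond can be read from that status file. This is a small sketch, not part of the configuration above; the function name active_slave is made up, and it relies on the "Currently Active Slave:" line that the bonding driver prints in active-backup mode.

```sh
#!/bin/sh
# Print the currently active slave of a bond by reading the bonding
# status file. Defaults to bond0; another file may be passed as $1.
active_slave() {
    awk -F': ' '/^Currently Active Slave:/ { print $2; exit }' \
        "${1:-/proc/net/bonding/bond0}"
}
```

With primary=eth2 as configured above, `active_slave` would be expected to print eth2 while the 10Gb link is healthy, and eth1 after a failover.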

bob




Re: [Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Joe Landman
On 10/21/2010 10:29 AM, Brock Palen wrote:


 Why do you need both active?  If one is a backup to the other, then
 bond them as a primary/backup pair, meaning only one will be active
 at a time, i.e., your designated primary (unless it goes down).

 We could do this, the 10Gb drivers have been such a pain for us we
 wanted to have a 'back door' management network to get to the box
 should we have issues with the 10Gb driver.

 Oddly I ran:

 ifconfig eth0 down

 and I could no longer ping the box over the eth4 interface, I had to
 power cycle it from management.  Very odd.


Hmmm ... what 1GbE and 10GbE NICs?  Which kernel?  We maintain kernel 
RPMs and tarballs for our customers, and if one of ours will work for 
you, you are welcome to it.

When we set up clusters and/or storage clusters, we typically 
(completely) isolate the (management and storage fabric) nets from each 
other.  We don't recommend putting interfaces on the same subnet unless 
there is a clear intention to channel bond.

You may be able to tell the box to ignore ARP requests on the eth0 net, and 
then hand-edit the ARP table (arp -s ...) to force a connection.  However, 
this is somewhat convoluted and a management pain.
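
One possible reading of the ARP part, as a sketch only: the kernel's per-interface arp_ignore and arp_announce sysctls can restrict which ARP requests an interface answers. The values 1 and 2 below are assumptions commonly used against this kind of ARP confusion, not settings given in this thread, and the function name arp_sysctl_lines is hypothetical.

```sh
#!/bin/sh
# Print sysctl.conf lines that restrict ARP behaviour on one interface.
# Assumed values (not from the thread):
#   arp_ignore = 1    reply only if the target IP is configured on the
#                     interface that received the request
#   arp_announce = 2  always use the best local address for the target
arp_sysctl_lines() {
    iface=$1
    printf 'net.ipv4.conf.%s.arp_ignore = 1\n' "$iface"
    printf 'net.ipv4.conf.%s.arp_announce = 2\n' "$iface"
}
```

The output of `arp_sysctl_lines eth0` could be appended to /etc/sysctl.conf and loaded with `sysctl -p`. The static entries themselves would still have to be added by hand with `arp -s`, which is the management pain mentioned above.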

For out-of-band work, a KVM over IP could be helpful.  Does the box 
support KVM over IP from IPMI?  If not, you could get a drop-in unit 
that does this (we use these for older, less capable nodes when needed).





-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: land...@scalableinformatics.com
web  : http://scalableinformatics.com
http://scalableinformatics.com/jackrabbit
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615


Re: [Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Lundgren, Andrew
Just as an FYI, you can set most of the bonding options in the ifcfg-bond0 file.

For example:

BONDING_OPTS="arp_ip_target=10.248.58.254 arp_interval=500 mode=active-backup primary=eth0"

Then your modprobe.conf only needs:

alias bond0 bonding



Re: [Lustre-discuss] controlling which eth interface lustre uses

2010-10-21 Thread Christopher J.Walker
Charles Taylor wrote:
 We struggle with this as well but have not found a way to enforce  
 it.   You would think that lustre would honor the NID for incoming  
 *and* outgoing traffic but apparently the standard linux routing table  
 determines the outbound path and lnet is out of the picture. Thus,  
 you end up having to assign separate subnets, shut down your eth0 (in  
 this case) interface, or use static routes to fine tune the routing  
 decisions (where possible).
 
 We wish that the outgoing decision could be made on the basis of the  
 *NID* but that might be too intrusive with regard to the linux  
 kernel's network stack so I can understand, somewhat, why it is not  
 that way.   Still, it is somewhat counter-intuitive to go through all  
 the trouble of having the LNET layer and assigning NIDs only to have  
 them disregarded for outbound traffic.
 
 Perhaps there is a way around this that we don't know about.

Source based routing. You need both to make sure that each interface 
ignores arp requests to the other IP, and that traffic from the 10Gig IP 
is routed out of that card.

This is the way I solved the problem:


#!/bin/sh
# Script to use policy based routing to ensure lustre traffic goes in
# and out from eth2.
# First make sure that eth0 and eth2 only respond to arp requests for
# their own ip
echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore

# Now add a source based route - if the route is from the ip address of
# eth2, then send traffic via it
ip route add 10.1.0.0/16 dev eth2 table 2
ip rule add from $(ifconfig eth2 | awk 'BEGIN {FS="[ :]+"};/inet addr/{print $4}') table 2 priority 600
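
The script above parses the "inet addr:" line that older ifconfig versions print; newer ifconfig output uses a different layout. As a hedged alternative sketch, the address can be taken from `ip -o -4 addr show` instead. The function name first_ipv4 and the sample address 10.1.0.5 are made up for illustration.

```sh
#!/bin/sh
# Print the first IPv4 address found in `ip -o -4 addr show dev IFACE`
# output read from stdin. Each line of that output looks like:
#   4: eth2    inet 10.1.0.5/16 brd 10.1.255.255 scope global eth2
first_ipv4() {
    awk '$3 == "inet" { split($4, a, "/"); print a[1]; exit }'
}

# Example use in the rule from the script above:
#   ip rule add from "$(ip -o -4 addr show dev eth2 | first_ipv4)" table 2 priority 600
```

After adding the rule, `ip rule show` and `ip route show table 2` list what was installed, which helps confirm that traffic sourced from the eth2 address is matched by table 2.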


Having said this, I don't think it's what I'd set up now.
I'd use IPMI to get a serial console on the machine as my back door 
and/or use LACP bonding (can't remember which mode). If you do this, and 
IPMI shares the same physical port as eth0, then it is probably best to 
use eth1 as the failover link[1].


Chris
[1] We had a brief try with IPMI with eth0 and eth1 bonded - DHCP 
packets got out, but the replies didn't get back. Presumably the switch 
is sending the reply to eth1 rather than eth0 (swapping the physical 
cables around was suggested, but we didn't try this).