Teng,

In our case, the bonded NICs are on the server. The issue is that a task handling traffic through a bonded interface can't tell which physical interface the data is going through, and can't be guaranteed to run on the NUMA node that handles that interface. This can cause more traffic between NUMA nodes (sockets) than desired, which drives up latency. Take a look at this ticket [1], in particular the slides Liang posted [2,3]. This issue applies both to network interfaces and to I/O controllers such as HBAs and RAID controllers.
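As an illustration of the point above (a sketch added for clarity, not code from the thread), the NUMA placement of each NIC in a bond can be read from Linux sysfs. The helper names and the `sysfs_root` parameter are hypothetical; the device name "bond0" is an assumption, and the `numa_node` file is only present for PCI-backed interfaces.

```python
from pathlib import Path


def bond_slaves(bond, sysfs_root="/sys"):
    """Return the slave interface names of a Linux bonding device."""
    path = Path(sysfs_root) / "class/net" / bond / "bonding/slaves"
    return path.read_text().split()


def numa_node(iface, sysfs_root="/sys"):
    """Return the NUMA node of the device behind iface, or None if unknown."""
    path = Path(sysfs_root) / "class/net" / iface / "device/numa_node"
    try:
        node = int(path.read_text().strip())
    except (OSError, ValueError):
        return None
    # The kernel reports -1 when it has no affinity information.
    return node if node >= 0 else None


def bond_numa_layout(bond, sysfs_root="/sys"):
    """Map each slave of the bond to its NUMA node."""
    return {s: numa_node(s, sysfs_root) for s in bond_slaves(bond, sysfs_root)}


def spans_multiple_nodes(layout):
    """True if the slaves with known placement sit on more than one NUMA node."""
    nodes = {n for n in layout.values() if n is not None}
    return len(nodes) > 1
```

Calling `bond_numa_layout("bond0")` on a server and passing the result to `spans_multiple_nodes` would show whether the bond straddles sockets, which is the situation where cross-node traffic is most likely.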
--Rick

[1] https://jira.hpdd.intel.com/browse/LU-6228
[2] Lustre 2.0 and NUMIOA architectures <http://cdn.opensfs.org/wp-content/uploads/2012/12/900-930_Diego_Moreno_LUG_Bull_2011.pdf>
[3] High Performance I/O with NUMA Systems in Linux <http://events.linuxfoundation.org/sites/events/files/eeus13_shelton.pdf>

________________________________
From: teng wang [[email protected]]
Sent: Monday, February 23, 2015 8:38 AM
To: Rick Wagner
Cc: Andrus, Brian Contractor; [email protected]
Subject: Re: [Lustre-discuss] Questions about the LNET routing

Rick,

Thanks for your answer. Could you explain more about the NUMA architecture? Are the two NICs attached to the same CPU that you mentioned on the client side or the server side? How is performance affected by the NUMA architecture, given that the client can balance traffic across the interfaces?

Thanks,
Teng

On Fri, Feb 20, 2015 at 5:03 PM, Rick Wagner <[email protected]> wrote:

Teng,

It is the mode of your LACP that determines which physical interface the packets travel over, and it can be configured to hash on client IP and port. Each client will open 3 TCP sockets for Lustre traffic to each server, and given a reasonable number of clients these will balance over the interfaces in the link aggregation group. If you have bonded interfaces on the clients, a similar thing will happen as they connect to multiple servers.

The caveat is that performance can be impacted by the NUMA architecture of your server. Basically, it's better to have both NICs attached to the same CPU.

--Rick

________________________________
From: [email protected] [[email protected]] on behalf of teng wang [[email protected]]
Sent: Friday, February 20, 2015 1:40 PM
To: Andrus, Brian Contractor; [email protected]
Subject: Re: [Lustre-discuss] Questions about the LNET routing

Hi Andrus,

Thanks for your answer.
Without bonding, does LNET have any preference when routing over the two interfaces? Even when we bond the two interfaces together, I think LNET should still choose between the different interfaces, although they share the same address. Is there any preference in this situation?

Thanks,
Teng

On Thu, Feb 19, 2015 at 3:55 PM, Andrus, Brian Contractor <[email protected]> wrote:

Teng,

I believe it would depend on how you have your interfaces configured. It seems that you have them both on the same subnet and being accessed by the same client. Is this the case?

If they are on the same subnet, I would expect you would bond them (bond0) rather than have two separate IPs for them. Then you get to control how/where the data flows at the networking level. You may want to check on the nodes to see what they see. (lctl peer_list)

If they are on different subnets or networks, you can set that in the options for the lustre module for lnet. For example, we have both ib and tcp. I give ib the priority for the best performance, but if ib is unavailable, it falls back to tcp. That could just as well be two Ethernet cards on two networks.

Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238

From: [email protected] [mailto:[email protected]] On Behalf Of teng wang
Sent: Thursday, February 19, 2015 1:28 PM
To: [email protected]
Subject: [Lustre-discuss] Questions about the LNET routing

I have a basic question about LNET. Will data belonging to the same object be routed through the same interface? For example, suppose a node has multiple network interfaces and two processes running on that node are writing to the same shared file, striped across 1 OST.
Process 1 writes like:
write chunk1
write chunk2

Process 2 writes like:
write chunk3
write chunk4

If Process 1 and Process 2 are pinned to two different network interfaces, say Eth0 and Eth1, then from the OSC side, will these chunks be routed to the OST through the same interface (e.g. all four chunks through Eth0)? If so, what if they write different objects that land on the same OST (e.g. Process 1 writes File1 and Process 2 writes File2, with File1 and File2 striped over the same OST)?

Thanks,
Teng
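To illustrate the LACP point made earlier in the thread (hashing on client IP and port, with three TCP sockets per client per server), here is a minimal sketch of how flows might spread over a link aggregation group. The hash is loosely modeled on the layer3+4 formula given in older Linux bonding documentation; real bonding drivers and switches hash differently depending on version and vendor. The function names, IP addresses, and port numbers (988 on the server, 1021-1023 on the client) are assumptions chosen for illustration.

```python
import ipaddress
from collections import Counter


def pick_link(src_ip, dst_ip, src_port, dst_port, n_links):
    """Map one TCP flow to a link index with a simplified layer3+4-style hash."""
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    ip_bits = (src ^ dst) & 0xFFFF
    return ((src_port ^ dst_port) ^ ip_bits) % n_links


def flows_per_link(client_ips, server_ip, n_links,
                   client_ports=(1021, 1022, 1023), server_port=988):
    """Count how many client-to-server flows land on each link of the group."""
    counts = Counter({link: 0 for link in range(n_links)})
    for ip in client_ips:
        for port in client_ports:  # one flow per assumed Lustre socket
            counts[pick_link(ip, server_ip, port, server_port, n_links)] += 1
    return counts


# 16 hypothetical clients talking to one hypothetical server over a 2-link group.
clients = [f"10.0.0.{i}" for i in range(1, 17)]
print(dict(flows_per_link(clients, "10.0.1.1", n_links=2)))
```

In this model a given socket always stays on one link, because its addresses and ports never change, so which process issued a write does not affect the path. A single client's three sockets split unevenly over two links, while the totals even out as the number of clients grows.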
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
