Hello,

I am trying this list because the bonding-devel list appears to be
dead. I am seeing an issue on a few servers, one of which is part of a
RHEL cluster, where the NICs in an 802.3ad LAG do not all appear to be
carrying the same amount of traffic. A few of the servers have two
bonds with two NICs in each bond, and I have two NFS servers that each
have one bond with three NICs. All are RHEL5 x64, kernel 2.6.18.

Here's one of my bonds:
[root@server ~]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.4.0-1 (October 7, 2008)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 80
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Active Aggregator Info:
Aggregator ID: 3
Number of ports: 3
Actor Key: 17
Partner Key: 10010
Partner Mac Address: 00:1b:ed:80:17:c0

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: 00:1d:09:71:97:3f
Aggregator ID: 3

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 3
Permanent HW addr: 00:1d:09:71:97:41
Aggregator ID: 3

Slave Interface: eth3
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 2
Permanent HW addr: 00:15:17:6c:e4:89
Aggregator ID: 3


Here's a snippet of sar:
[root@server ~]# sar -n DEV 1 5

Average:     IFACE   rxpck/s   txpck/s      rxbyt/s      txbyt/s  rxcmp/s  txcmp/s  rxmcst/s
Average:        lo      0.40      0.40        33.67        33.67     0.00     0.00      0.00
Average:      eth0   1924.25      0.00   1574977.35         0.00     0.00     0.00      0.00
Average:      eth1   7034.87    524.25  41138060.92   3758143.29     0.00     0.00      0.00
Average:      eth2   4608.82   3578.56  11813732.26  17426152.71     0.00     0.00      0.00
Average:      eth3    352.91   7568.94    174972.34  31082267.33     0.00     0.00      0.00
Average:      eth4   4613.83   3667.94  12243683.77  17352368.34     0.00     0.00      0.00
Average:      eth5   4533.87   3526.25  11913877.35  17389009.22     0.00     0.00      0.00
Average:      sit0      0.00      0.00         0.00         0.00     0.00     0.00      0.00
Average:     bond0   9312.02   8093.19  42888010.62  34840410.62     0.00     0.00      0.00

I couldn't read that without having a stroke, so here it is broken
down into MB/s by interface:
        rx MB/s  tx MB/s
eth0        1.5      0.0
eth1       39.2      3.6
eth3        0.2     29.6
bond0      40.9     33.2
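
That breakdown is just the rxbyt/s and txbyt/s columns above divided
by 2^20. For anyone who wants to reproduce it, here is a rough Python
sketch; the field positions are assumed from the capture above, and
you would run it as sar -n DEV 1 5 | python sar_mb.py (sar_mb.py is
just a name I picked):

#!/usr/bin/env python
# Print the "Average:" lines of `sar -n DEV` output in MB/s.
# Field positions assumed from the capture above:
#   Average:  IFACE  rxpck/s  txpck/s  rxbyt/s  txbyt/s  ...
import sys

for line in sys.stdin:
    fields = line.split()
    if len(fields) < 6 or fields[0] != "Average:" or fields[1] == "IFACE":
        continue
    rx_mb = float(fields[4]) / 2 ** 20
    tx_mb = float(fields[5]) / 2 ** 20
    print("%-6s  rx %6.1f MB/s  tx %6.1f MB/s" % (fields[1], rx_mb, tx_mb))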

I think the reason one interface dominates RX and another dominates
TX is the xmit_hash_policy on the TX side (and presumably the
switch's equivalent hash on the RX side), but there are three hosts
that use this particular server for network traffic. That's three
different physical MAC addresses, so I would have thought the layer2
algorithm would spread the load fine in that situation. What am I
missing? Would I just be better off with balance-rr?
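
For reference, bonding.txt documents the layer2 policy as (source MAC
XOR destination MAC) modulo slave count, computed over the last byte
of each address. Here is a quick Python sketch of that math, with
three made-up client MACs standing in for my hosts:

#!/usr/bin/env python
# Layer2 xmit hash per Documentation/networking/bonding.txt:
#   (source MAC XOR destination MAC) modulo slave count
# computed over the last byte of each address. The client MACs are
# hypothetical stand-ins for my three hosts.
SLAVES = 3
BOND_MAC = 0x3F  # last byte of bond0's MAC (eth0's 00:1d:09:71:97:3f)
clients = {"hostA": 0x10, "hostB": 0x11, "hostC": 0x13}  # made up

for name in sorted(clients):
    print("%s -> slave %d" % (name, (BOND_MAC ^ clients[name]) % SLAVES))

With those example addresses, hostA and hostC both hash to slave 2,
so all TX to two of the three peers would share one NIC, which looks
a lot like the skew above.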

Thanks!
