Dear Ben:
I didn't know that the snapshot could not be displayed in the email, so I have
written this again. Please ignore the previous message. If anything I describe
is not clear enough, please let me know. Thank you very much.


When I added a Linux balance-alb bond to an OVS bridge, I ran into a problem:
some machines can ping the bridge's IP, but others cannot.
My setup looks like this:
 ---------------------   ---------------------   --------   ---------------------
| machine-1, ip: ip-1 | | machine-2, ip: ip-2 | |        | | machine-n, ip: ip-n |
| arp:                | | arp:                | | ...... | | arp:                |
|   ip-x  mac-x1      | |   ip-x  mac-x1      | |        | |   ip-x  mac-x2      |
 ---------------------   ---------------------   --------   ---------------------


 ------------------------------------------------------------------------
|  ------------------------------------------------------------------    |
| |  ------------------------------------------------------------    |   |
| | |  --------------------------    --------------------------  |   |   |
| | | | eth1, mac: mac-x1        |  | eth2, mac: mac-x2        | |   |   |
| | |  --------------------------    --------------------------  |   |   |
| | | ovs port1: bond0, mac: mac-x1                              |   |   |
| |  ------------------------------------------------------------    |   |
| | ovs bridge: ovsbr0, ip: ip-x, mac: mac-x1                        |   |
|  ------------------------------------------------------------------    |
| machine-x                                                              |
 ------------------------------------------------------------------------


1. Configure "bond0" on "machine-x" in balance-alb mode.
$ ip link
3: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT qlen 1000
    link/ether ac:1f:6b:12:4c:1e brd ff:ff:ff:ff:ff:ff
5: eth2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT qlen 1000
    link/ether ac:1f:6b:12:4c:1f brd ff:ff:ff:ff:ff:ff
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master ovs-system state UP mode DEFAULT qlen 1000
    link/ether ac:1f:6b:12:4c:1e brd ff:ff:ff:ff:ff:ff


2. Add an OVS bridge "ovsbr0" on "machine-x", and add "bond0" (the Linux
balance-alb bond) to it as a port.
$ ovs-vsctl add-br ovsbr0
$ ovs-vsctl add-port ovsbr0 bond0
$ ip addr flush dev bond0
$ ip addr add ip-x/24 dev ovsbr0
$ ip link set ovsbr0 up


3. Many machines ping the bridge's IP at the same time.
Some machines can ping the bridge's IP, but others cannot.


I tried to analyze the cause of the problem. I found that any machine with 
HWaddress "mac-x1" in the ARP table could ping "machine-x", while others with 
HWaddress "mac-x2" could not.
The bonding documentation
(https://www.kernel.org/doc/Documentation/networking/bonding.txt) describes
balance-alb as follows:
balance-alb or 6

                Adaptive load balancing: includes balance-tlb plus
                receive load balancing (rlb) for IPV4 traffic, and
                does not require any special switch support.  The
                receive load balancing is achieved by ARP negotiation.
                The bonding driver intercepts the ARP Replies sent by
                the local system on their way out and overwrites the
                source hardware address with the unique hardware
                address of one of the slaves in the bond such that
                different peers use different hardware addresses for
                the server.

                Receive traffic from connections created by the server
                is also balanced.  When the local system sends an ARP
                Request the bonding driver copies and saves the peer's
                IP information from the ARP packet.  When the ARP
                Reply arrives from the peer, its hardware address is
                retrieved and the bonding driver initiates an ARP
                reply to this peer assigning it to one of the slaves
                in the bond.  A problematic outcome of using ARP
                negotiation for balancing is that each time that an
                ARP request is broadcast it uses the hardware address
                of the bond.  Hence, peers learn the hardware address
                of the bond and the balancing of receive traffic
                collapses to the current slave.  This is handled by
                sending updates (ARP Replies) to all the peers with
                their individually assigned hardware address such that
                the traffic is redistributed.  Receive traffic is also
                redistributed when a new slave is added to the bond
                and when an inactive slave is re-activated.  The
                receive load is distributed sequentially (round robin)
                among the group of highest speed slaves in the bond.

                When a link is reconnected or a new slave joins the
                bond the receive traffic is redistributed among all
                active slaves in the bond by initiating ARP Replies
                with the selected MAC address to each of the
                clients. The updelay parameter (detailed below) must
                be set to a value equal or greater than the switch's
                forwarding delay so that the ARP Replies sent to the
                peers will not be blocked by the switch.

                Prerequisites:

                1. Ethtool support in the base drivers for retrieving
                the speed of each slave.

                2. Base driver support for setting the hardware
                address of a device while it is open.  This is
                required so that there will always be one slave in the
                team using the bond hardware address (the
                curr_active_slave) while having a unique hardware
                address for each slave in the bond.  If the
                curr_active_slave fails its hardware address is
                swapped with the new curr_active_slave that was
                chosen.
As the documentation says, different peers use different hardware addresses
for the server. So I wonder whether the destination MAC address (mac-x2) of
the packets sent from "machine-n" is missing from the MAC learning (CAM) table
of the OVS bridge, so that the bridge discards them.
If so, does this mean that an OVS bridge does not support the receive load
balancing function of a balance-alb bond? And is there any way to solve this
problem?
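
The suspicion above can be illustrated with a toy model. This is only a sketch
of the hypothesis, not OVS or kernel code: it assumes that a frame reaches the
IP stack only when its destination MAC matches the MAC of the receiving
interface (here the bridge, mac-x1), and that nothing rewrites the destination
MAC on the way. The function name accepted_by_host is hypothetical; the MAC
values are taken from the "ip link" output and the peer ARP entries from the
diagram.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def accepted_by_host(dst_mac, iface_mac):
    """Toy L2 check: keep broadcast frames and unicast frames sent to iface_mac.

    Multicast is ignored here to keep the sketch small.
    """
    dst = dst_mac.lower()
    return dst == BROADCAST or dst == iface_mac.lower()


# From the "ip link" output: eth1 and bond0 use mac-x1, eth2 uses mac-x2.
# The diagram gives the bridge the same MAC as bond0 (mac-x1).
MAC_X1 = "ac:1f:6b:12:4c:1e"
MAC_X2 = "ac:1f:6b:12:4c:1f"

# MAC that each peer holds for ip-x in its ARP table, as in the diagram.
peer_arp = {"machine-1": MAC_X1, "machine-2": MAC_X1, "machine-n": MAC_X2}

# Assumption: the destination MAC is not rewritten before the bridge sees it.
reachable = {peer: accepted_by_host(mac, MAC_X1) for peer, mac in peer_arp.items()}
print(reachable)
```

Under these assumptions only the peers that learned mac-x1 are reachable, which
matches the symptom described above. Whether OVS really handles the frames
this way is exactly the open question.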


I also tested a Linux bridge, which does not have this problem. I found that
the destination MAC address of the packets received on eth2 was mac-x2, but
after the packets were passed up to bond0, the destination MAC address had
been changed to mac-x1. Is this the reason why the Linux bridge doesn't have
this ping problem? If so, what can I do to make the OVS bridge work?


I used "tcpdump" to capture packets on "eth1", "eth2", "bond0", and the Linux
bridge "br0", as shown below:
tcpdump -i br0 -n -e -p src 13.10.12.102 or dst 13.10.12.102
11:49:32.843719 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 (0x0800), 
length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3357, 
length 64
11:49:32.843744 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), 
length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3357, 
length 64
11:49:33.843731 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 (0x0800), 
length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3358, 
length 64
11:49:33.843754 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), 
length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3358, 
length 64
11:49:34.843745 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 (0x0800), 
length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3359, 
length 64
11:49:34.843768 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), 
length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3359, 
length 64
11:49:35.843841 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 (0x0800), 
length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3360, 
length 64
11:49:35.843869 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), 
length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3360, 
length 64


tcpdump -i bond0 -n -e -p src 13.10.12.102 or dst 13.10.12.102
11:49:32.843713 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 (0x0800), 
length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3357, 
length 64
11:49:32.843747 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), 
length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3357, 
length 64
11:49:33.843724 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 (0x0800), 
length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3358, 
length 64
11:49:33.843757 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), 
length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3358, 
length 64
11:49:34.843738 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 (0x0800), 
length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3359, 
length 64
11:49:34.843771 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), 
length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3359, 
length 64
11:49:35.843834 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 (0x0800), 
length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3360, 
length 64
11:49:35.843873 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), 
length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3360, 
length 64


tcpdump -i eth1 -n -e -p src 13.10.12.102 or dst 13.10.12.102
11:49:32.843752 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), 
length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3357, 
length 64
11:49:33.843762 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), 
length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3358, 
length 64
11:49:34.843776 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), 
length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3359, 
length 64
11:49:35.843878 ac:1f:6b:12:4c:1e > 0c:c4:7a:c1:64:3a, ethertype IPv4 (0x0800), 
length 98: 13.10.12.26 > 13.10.12.102: ICMP echo reply, id 4921, seq 3360, 
length 64


tcpdump -i eth2 -n -e -p src 13.10.12.102 or dst 13.10.12.102
11:49:32.843703 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1f, ethertype IPv4 (0x0800), 
length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3357, 
length 64
11:49:33.843717 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1f, ethertype IPv4 (0x0800), 
length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3358, 
length 64
11:49:34.843730 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1f, ethertype IPv4 (0x0800), 
length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3359, 
length 64
11:49:35.843828 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1f, ethertype IPv4 (0x0800), 
length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, id 4921, seq 3360, 
length 64
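
The comparison between the captures can also be done mechanically. The
following is a minimal sketch, assuming each tcpdump record has first been
joined back onto a single line (the records above are wrapped by the mail
client). The helper name parse and the regular expression are illustrative
only and cover just the "tcpdump -n -e" ICMP echo lines shown here. The two
sample records are the seq 3357 echo request as seen on eth2 and on bond0.

```python
import re

MAC = r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}"
LINE_RE = re.compile(
    r"^(?P<ts>\S+) (?P<src>" + MAC + r") > (?P<dst>" + MAC + r"), .*?"
    r"ICMP echo (?P<kind>request|reply), id (?P<id>\d+), seq (?P<seq>\d+)"
)


def parse(line):
    """Return the Ethernet and ICMP fields of one record, or None if no match."""
    m = LINE_RE.search(" ".join(line.split()))  # collapse any wrapped whitespace
    return m.groupdict() if m else None


# The same echo request (id 4921, seq 3357) as captured on eth2 and on bond0.
ETH2_LINE = (
    "11:49:32.843703 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1f, ethertype IPv4 "
    "(0x0800), length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, "
    "id 4921, seq 3357, length 64"
)
BOND0_LINE = (
    "11:49:32.843713 0c:c4:7a:c1:64:3a > ac:1f:6b:12:4c:1e, ethertype IPv4 "
    "(0x0800), length 98: 13.10.12.102 > 13.10.12.26: ICMP echo request, "
    "id 4921, seq 3357, length 64"
)

eth2 = parse(ETH2_LINE)
bond0 = parse(BOND0_LINE)

# Same packet if the ICMP id and sequence number agree on both interfaces.
same_packet = (eth2["id"], eth2["seq"]) == (bond0["id"], bond0["seq"])
rewritten = same_packet and eth2["dst"] != bond0["dst"]
print(eth2["dst"], "->", bond0["dst"], "rewritten:", rewritten)
```

For this pair the destination MAC differs between eth2 (mac-x2) and bond0
(mac-x1), which is the rewrite described above for the Linux bridge case.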



At 2018-04-29 03:01:53, "Ben Pfaff" <[email protected]> wrote:
>On Fri, Apr 27, 2018 at 11:44:46AM +0800, netsurfed wrote:
>> There is a "balance-alb" bond0 in my linux host, like this:
>> 
>> 
>> Can I add this as a port to ovs bridge? like this:
>> ovs-vsctl add-br ovsbr0
>> ovs-vsctl add-port ovsbr0 bond0
>> 
>> 
>> I know ovs can create bond using "ovs-vsctl add-bond BRIDGE PORT IFACE...". 
>> However, that requires removing the original bond first. I don't want to do 
>> that.
>
>Usually it works fine to add a Linux bond to an OVS bridge.
_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
