Francois-Xavier,
    Thank you for your feedback.

I have seen the issue on two systems where STP is turned off. Here is the /etc/network/interfaces entry for the particular bridge where I've seen it the most; note the last line:

iface br0 inet dhcp
        bridge_ports eth0
        bridge_fd 9
        bridge_hello 2
        bridge_maxage 12
        bridge_stp off

(As you can see, on this server I was using DHCP for the bridge. That is uncommon, but not unheard of. In this case we centrally manage all fixed IP addresses using an /etc/ethers file on the DHCP server.)

I submit that the symptom is not related to STP, but instead is related to the ARP cache (and network topology) of the equipment you are connecting through. With my Linux laptop hooked up through two GigE switches (and no STP), I see the host's network freeze. I've seen it on Ubuntu 10.04 and 11.04.
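
To make the trigger condition concrete, here is a minimal sketch of the selection rule described in my original report (quoted below): the bridge takes the numerically lowest MAC address among its ports, so a veth with a lower address than eth0 changes the bridge's address. The function names and the eth0 address are hypothetical, made up for illustration; the veth address is the one from my ifconfig output. It assumes nobody has set the bridge's MAC address explicitly.

```python
def parse_mac(text):
    """Convert 'aa:bb:cc:dd:ee:ff' into six bytes so addresses compare numerically."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError(f"expected six octets, got {text!r}")
    return bytes(int(part, 16) for part in parts)


def bridge_mac(port_macs):
    """Return the address the bridge would adopt: the lowest MAC among its ports."""
    return min(port_macs, key=parse_mac)


def attach_changes_bridge_mac(current_ports, new_port):
    """True if attaching new_port would change the bridge's own MAC address."""
    return bridge_mac(current_ports + [new_port]) != bridge_mac(current_ports)


eth0 = "52:54:00:12:34:56"       # hypothetical address for the physical NIC
low_veth = "4e:34:7c:dc:92:e8"   # veth address from the ifconfig output
high_veth = "fe:34:7c:dc:92:e8"  # same address with a high first octet

print(attach_changes_bridge_mac([eth0], low_veth))   # True: bridge MAC would change
print(attach_changes_bridge_mac([eth0], high_veth))  # False: bridge keeps eth0's MAC
```

This only models the address comparison; it says nothing about how long the neighbouring equipment takes to relearn the new address.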

F> If you are using a bridge in a controlled environment, you really don't need STP anyway.

If you are using colocation or managed hardware from a data center provider, you may not have a choice regarding STP.

It is worth noting that the KVM/libvirt folks found the issue serious enough to fix.
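
As a rough illustration of the kind of fix I have in mind for LXC, here is a sketch of generating a veth address with a high first octet. The 0xfe value and the function name are my assumptions for illustration, not something taken from the libvirt or LXC source; the real change would have to be made in C inside LXC.

```python
import random

# Assumed "high" first octet. 0xfe has the locally-administered bit set and
# the multicast bit clear, so the result is a valid unicast address.
HIGH_PREFIX = 0xFE


def high_prefix_mac(rng=random):
    """Return a MAC string with a fixed high first octet and a random tail."""
    tail = [rng.randrange(256) for _ in range(5)]
    return ":".join(f"{octet:02x}" for octet in [HIGH_PREFIX] + tail)


print(high_prefix_mac())
```

Because the first octet is fixed at 0xfe, such an address sorts above any physical NIC address whose first octet is lower, so the bridge would have no reason to adopt it.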


Thank You,
Derek Simkowiak

On 10/24/2011 11:41 AM, Francois-Xavier Bourlet wrote:
Hi,

Here we are using LXC intensively with bridges. Since we don't use STP, the downtime for each MAC address change is unnoticeable. In fact, we only discovered the issue when reading this mailing list. After some tests I can confirm that most of the time when we spawn or destroy a container, the bridge's MAC address changes, but there is no loss of connectivity, since ARP tables are refreshed instantly.

So an easy workaround for the moment is to disable STP on the bridge (brctl stp br0 off). If you are using a bridge in a controlled environment, you really don't need STP anyway.

My 2 cents,

On Mon, Oct 24, 2011 at 11:09 AM, Derek Simkowiak <de...@simkowiak.net> wrote:

        Hello,
        Just following up re: this bug.  I think it's a pretty serious issue.

        I am looking to work on this, but I am seeking some feedback and
    direction from one of the core LXC devs.

    - Do you agree with my analysis?
    - Has anyone else worked on this already?
    etc.


    Thanks,
    Derek

    On 10/18/2011 04:31 PM, Derek Simkowiak wrote:
    >       There is a behavior in the Linux kernel which can cause a bridge
    > device to change MAC address, thus causing a network blackout of several
    > seconds (while everybody ARPs the new MAC address and flushes the old
    > one).  This happens when bridging an enslaved interface, like we do
    > with LXC.
    >
    >       The symptom is that the LXC host will black out for several
    > seconds when starting or stopping an LXC container.  Your SSH terminal
    > on the host will freeze and become unresponsive.  (It is a random
    > symptom, because the blackout only happens if the randomly-assigned MAC
    > address of the virtual device is lower than that of the physical eth0
    > device).
    >
    >       This behavior was first observed by the libvirt folks when
    > creating virtual machines.  You can read more details about it (and how
    > they fixed it) here:
    >
    > https://www.redhat.com/archives/libvir-list/2010-July/msg00450.html
    > https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/584048
    >
    >       I have observed the symptom under LXC, and the workaround for it
    > has been independently confirmed for LXC in this bug report (ID:
    > 3411497):
    >
    > http://sourceforge.net/tracker/index.php?func=detail&aid=3411497&group_id=163076&atid=826303
    >
    >
    >       The workaround for the bug is to give the virtual device a high
    > MAC address, thus discouraging the bridge device from adopting that MAC
    > address as its own.
    >
    >       I have mentioned this bug on the list before; however, I was
    > confused about which MAC address was causing the problem.  This is NOT
    > the MAC address specified in lxc.conf, like this:
    >
    > lxc.network.hwaddr = fe:16:3e:fd:5a:5b
    >
    >       That MAC address has nothing to do with the bug; the host's
    > bridge device (br0) will never assume a configured LXC MAC address as
    > its own.  Instead, the MAC address in question is the one of the virtual
    > vethXXXX device, as shown with "ifconfig" on the host:
    >
    > veth0IEDlk Link encap:Ethernet  HWaddr 4e:34:7c:dc:92:e8
    > [...snip...]
    >
    >       That HWaddr should be given a high prefix to avoid the network
    > blackouts, just like they've done for libvirt.  That does not exist in
    > any config file anywhere; it must be fixed in the LXC source code.
    >
    >       I looked in network.c for the LXC source code and I think the
    > fix should go in lxc_bridge_attach() near line 991.  The fix would put a
    > manually-generated MAC address -- one with a high prefix -- into
    > ifr.ifr_hwaddr.sa_data and thus replace the random one assigned by the
    > kernel.
    >
    >       However, I'm new to the LXC source and would like some input and
    > analysis from a more seasoned contributor.  I would be happy to test
    > and maybe even contribute a patch, but I'd like some feedback first.
    >
    >
    > Thank You,
    > Derek Simkowiak
    >
    >
    >
    > _______________________________________________
    > Lxc-users mailing list
    > Lxc-users@lists.sourceforge.net
    > https://lists.sourceforge.net/lists/listinfo/lxc-users


    




--
François-Xavier Bourlet

