The overnight flood ping through LACP didn't lose a single packet.
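
For reference, a minimal sketch of the rule that came out of this thread: per Ryan's reading of bond.c, balance-tcp needs LACP to be negotiated, so a bond should not be created in that mode with LACP off. This is not the l23network code. The helper name and the bridge, bond and interface names are hypothetical, and I am assuming the installed ovs-vsctl accepts column=value settings after the interface list of add-bond. The helper only assembles the command line; it does not run anything.

```python
# Bond modes and LACP settings as used in the OVS Port table.
BOND_MODES = {"active-backup", "balance-slb", "balance-tcp"}
LACP_MODES = {"active", "passive", "off"}


def build_add_bond_cmd(bridge, bond, ifaces, bond_mode, lacp="off"):
    """Return an `ovs-vsctl add-bond` argv list (hypothetical helper).

    Raises ValueError for combinations this thread found not to work.
    """
    if bond_mode not in BOND_MODES:
        raise ValueError(f"unknown bond_mode: {bond_mode}")
    if lacp not in LACP_MODES:
        raise ValueError(f"unknown lacp setting: {lacp}")
    if len(ifaces) < 2:
        raise ValueError("a bond needs at least two interfaces")
    # Assumption taken from the thread: balance-tcp passes traffic only
    # after successful LACP negotiation, so refuse it when LACP is off.
    if bond_mode == "balance-tcp" and lacp == "off":
        raise ValueError("balance-tcp requires lacp=active or lacp=passive")

    cmd = ["ovs-vsctl", "add-bond", bridge, bond, *ifaces,
           f"bond_mode={bond_mode}"]
    if lacp != "off":
        cmd.append(f"lacp={lacp}")
    return cmd


if __name__ == "__main__":
    # Hypothetical names; print the command instead of executing it.
    print(" ".join(build_add_bond_cmd(
        "br-ovs-bond0", "ovs-bond0", ["eth2", "eth3"],
        "balance-tcp", lacp="active")))
```

Rejecting the combination up front seemed better than letting a deployment finish with a bond that drops all traffic. The switch side still has to be configured for LACP separately.
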

On Wed, Feb 26, 2014 at 12:31 AM, Vladimir Kuklin <[email protected]> wrote:

> Guys, the suggested fix https://review.openstack.org/76345 works OK,
> though it makes it impossible to understand patch names :-) So we are
> waiting for Sergey to provide a more human-readable workaround. But we
> can continue testing with this patch applied to ensure that the 1.9.3
> downgrade does not introduce any regressions.
>
> On Wed, Feb 26, 2014 at 12:11 AM, Vladimir Kuklin <[email protected]> wrote:
>
>> Guys, we are testing OVS 1.9.3 on Ubuntu right now. It seems we have
>> some problems with the l23network module:
>> https://bugs.launchpad.net/fuel/+bug/1284801
>> We are going to apply a workaround for it. If everything else goes
>> fine, we are going to move to 1.9.3 as it is the OVS LTS version for
>> both CentOS and Ubuntu.
>>
>> On Tue, Feb 25, 2014 at 11:27 PM, Mike Scherbakov <[email protected]> wrote:
>>
>>> Great news!!!
>>> Andrey, thanks for staying late and waking up early these days in
>>> order to resolve this. You deserve a good rest. Przemek - thanks for
>>> the help! Documentation will really be needed, otherwise users will
>>> keep coming back to us and complaining that something doesn't work.
>>>
>>> On Tue, Feb 25, 2014 at 11:04 PM, Andrey Danin <[email protected]> wrote:
>>>
>>>> Okay. I have finally learned to set up LACP between OVS and a
>>>> Procurve 2510G. It works fine, just as balance-slb does. I'll leave
>>>> my flood ping running overnight and will tell you the results
>>>> tomorrow. It seems we can go to production with the current
>>>> versions of openvswitch. But here in Moscow we are still trying to
>>>> build a fully OVS-1.9.3 ISO and test it.
>>>>
>>>> Of course we need to document all the issues properly. As far as I
>>>> know, Przemek wants to publish well-written examples of OVS, Cisco,
>>>> Juniper and Arista configs for enabling LACP.
>>>>
>>>> On Tue, Feb 25, 2014 at 3:12 PM, Mike Scherbakov <[email protected]> wrote:
>>>>
>>>>> Good news.
>>>>> Thanks Andrey, keep going!
>>>>>
>>>>> On Tue, Feb 25, 2014 at 2:28 PM, Andrey Danin <[email protected]> wrote:
>>>>>
>>>>>> After 14 hours of flood ping, the hardware lab lost a few packets
>>>>>> and the virtual env lost hundreds of packets. Mode: balance-slb.
>>>>>>
>>>>>> I'm going to test LACP behaviour today.
>>>>>>
>>>>>> On Tue, Feb 25, 2014 at 3:50 AM, Andrey Danin <[email protected]> wrote:
>>>>>>
>>>>>>> Fine. They wrote about that in the documentation too:
>>>>>>> http://openvswitch.org/ovs-vswitchd.conf.db.5.pdf page 14
>>>>>>> It was introduced two years ago, in version 1.5.0. One problem
>>>>>>> less!
>>>>>>>
>>>>>>> On Tue, Feb 25, 2014 at 3:37 AM, Ryan Moe <[email protected]> wrote:
>>>>>>>
>>>>>>>> Andrey is correct. It appears that balance-tcp requires
>>>>>>>> successful LACP negotiation. See here:
>>>>>>>> https://github.com/osrg/openvswitch/blob/master/lib/bond.c#L610
>>>>>>>> and here:
>>>>>>>> https://github.com/osrg/openvswitch/blob/master/lib/bond.c#L1438
>>>>>>>> This also means that when we create bonds with balance-tcp we
>>>>>>>> need to configure lacp as well.
>>>>>>>>
>>>>>>>> On Mon, Feb 24, 2014 at 3:14 PM, Andrey Danin <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> And yes, the bug https://bugs.launchpad.net/fuel/+bug/1272842
>>>>>>>>> and the current problem may be unrelated, but they have
>>>>>>>>> similar error messages in the OVS logs.
>>>>>>>>>
>>>>>>>>> On Tue, Feb 25, 2014 at 2:55 AM, Andrey Danin <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Guys, I set up hardware (2 nodes) and software (3 nodes) labs
>>>>>>>>>> today with ISO #181 to test bonding. Unfortunately,
>>>>>>>>>> balance-tcp mode is totally broken. When I use it during
>>>>>>>>>> deployment or switch to it in a working cluster, all traffic
>>>>>>>>>> stops. Playing with the rebalance interval doesn't help.
>>>>>>>>>> In contrast, balance-slb works fine. Both Ubuntu (hardware
>>>>>>>>>> nodes) and CentOS (virtual env) work without any traffic
>>>>>>>>>> loss. I'm running a flood ping between virtual instances
>>>>>>>>>> inside the clouds overnight and will check the number of lost
>>>>>>>>>> packets. Also I want to play with iperf.
>>>>>>>>>>
>>>>>>>>>> Next things we can do:
>>>>>>>>>> * Build an ISO with the stable (1.9.3) or newest (2.0.x)
>>>>>>>>>>   version of OVS and play with them. Yesterday we decided to
>>>>>>>>>>   build Ubuntu 12.04 with the Debian Sid 1.9.3 version of
>>>>>>>>>>   OVS. There is a ticket about that:
>>>>>>>>>>   https://mirantis.jira.com/browse/OSCI-1089
>>>>>>>>>>   Also Igor built his own version of an ISO with the Sid
>>>>>>>>>>   package.
>>>>>>>>>> * Dump openflow rules in balance-tcp mode and try to fix
>>>>>>>>>>   them. It's hard to do that because Aliens developed their
>>>>>>>>>>   syntax.
>>>>>>>>>> * Run Igor's tests again and again until balance-slb starts
>>>>>>>>>>   blocking traffic. Then dig into the openflow rules.
>>>>>>>>>> * Play with LACP on real hardware. Maybe balance-tcp can be
>>>>>>>>>>   used only with lacp=active.
>>>>>>>>>> * Ask the openvswitch community about our problems.
>>>>>>>>>>
>>>>>>>>>> Andrew, yes, the PXE network is still nailed to an interface.
>>>>>>>>>> I hope we will fix it in 5.0.
>>>>>>>>>>
>>>>>>>>>> On Tue, Feb 25, 2014 at 12:20 AM, Igor Shishkin <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello, Dmitry.
>>>>>>>>>>>
>>>>>>>>>>> It’s 100% reproducible in a virtual environment when we’re
>>>>>>>>>>> trying to deploy bonding in balance-tcp or balance-slb mode.
>>>>>>>>>>> The tests are related as a way to reproduce the problem, and
>>>>>>>>>>> as a warning about why these tests will fail once they are
>>>>>>>>>>> merged.
>>>>>>>>>>>
>>>>>>>>>>> As we can see, the problem is in the rebalance procedure
>>>>>>>>>>> that openvswitch tries to run once it has started the bonded
>>>>>>>>>>> interface. During that time the bonded interface stops
>>>>>>>>>>> accepting ARPs.
>>>>>>>>>>>
>>>>>>>>>>> I just built openvswitch=1.9.3, which is LTS, and want to
>>>>>>>>>>> try it in the same case, and also try to decrease
>>>>>>>>>>> bond-rebalance-interval to 0 (as Andrey K. suggested). If
>>>>>>>>>>> either of these helps, it could be the solution (but I'm
>>>>>>>>>>> really not sure bond-rebalance-interval=0 is a good way).
>>>>>>>>>>> --
>>>>>>>>>>> Igor Shishkin
>>>>>>>>>>> QA Engineer
>>>>>>>>>>>
>>>>>>>>>>> On 24 Feb 2014, at 23:59, Dmitry Borodaenko <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> > Mike, Igor,
>>>>>>>>>>> >
>>>>>>>>>>> > Can you provide more details on how the integration test
>>>>>>>>>>> > in review #75161 helps to reproduce bug #1272842?
>>>>>>>>>>> >
>>>>>>>>>>> > As far as I understand, the bug is a highly intermittent
>>>>>>>>>>> > problem with ARP that was only showing up after an
>>>>>>>>>>> > environment with LACP bonding was operational for at least
>>>>>>>>>>> > a few hours.
>>>>>>>>>>> >
>>>>>>>>>>> > On the other hand, the problem Igor is reporting based on
>>>>>>>>>>> > the integration test sounds like something 100%
>>>>>>>>>>> > reproducible that doesn't require real hardware or LACP
>>>>>>>>>>> > and is not necessarily related to ARP.
>>>>>>>>>>> >
>>>>>>>>>>> > Are you sure you're not confusing two unrelated problems?
>>>>>>>>>>> >
>>>>>>>>>>> > Thanks,
>>>>>>>>>>> > -DmitryB
>>>>>>>>>>> >
>>>>>>>>>>> > On Mon, Feb 24, 2014 at 9:18 AM, Mike Scherbakov
>>>>>>>>>>> > <[email protected]> wrote:
>>>>>>>>>>> >> The issue is here:
>>>>>>>>>>> >> https://bugs.launchpad.net/fuel/+bug/1272842.
>>>>>>>>>>> >> Those who know what can be wrong with our
>>>>>>>>>>> >> openvswitch/kernel, please provide your input.
>>>>>>>>>>> >>
>>>>>>>>>>> >> On Mon, Feb 24, 2014 at 9:04 PM, Igor Shishkin
>>>>>>>>>>> >> <[email protected]> wrote:
>>>>>>>>>>> >>>
>>>>>>>>>>> >>> Hello,
>>>>>>>>>>> >>>
>>>>>>>>>>> >>> Currently we have this review
>>>>>>>>>>> >>> https://review.openstack.org/#/c/75161 with test cases
>>>>>>>>>>> >>> for our brand new shiny bonding feature, but
>>>>>>>>>>> >>> balance-tcp/balance-slb modes are not working for now.
>>>>>>>>>>> >>>
>>>>>>>>>>> >>> Steps to reproduce are very simple:
>>>>>>>>>>> >>> Create a cluster with simple or HA configuration, select
>>>>>>>>>>> >>> balance-tcp or balance-slb bonding mode and start
>>>>>>>>>>> >>> deployment.
>>>>>>>>>>> >>>
>>>>>>>>>>> >>> Deployment will not finish successfully because of
>>>>>>>>>>> >>> rebalance procedure problems.
>>>>>>>>>>> >>
>>>>>>>>>>> >> --
>>>>>>>>>>> >> Mike Scherbakov
>>>>>>>>>>> >> #mihgen
>>>>>>>>>>> >>
>>>>>>>>>>> >> --
>>>>>>>>>>> >> Mailing list: https://launchpad.net/~fuel-dev
>>>>>>>>>>> >> Post to     : [email protected]
>>>>>>>>>>> >> Unsubscribe : https://launchpad.net/~fuel-dev
>>>>>>>>>>> >> More help   : https://help.launchpad.net/ListHelp
>>>>>>>>>>> >
>>>>>>>>>>> > --
>>>>>>>>>>> > Dmitry Borodaenko
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Andrey Danin
>>>>>>>>>> [email protected]
>>>>>>>>>> skype: gcon.monolake
>>
>> --
>> Yours Faithfully,
>> Vladimir Kuklin,
>> Senior Deployment Engineer,
>> Mirantis, Inc.
>> +7 (495) 640-49-04
>> +7 (926) 702-39-68
>> Skype kuklinvv
>> 45bk3, Vorontsovskaya Str.
>> Moscow, Russia,
>> www.mirantis.com <http://www.mirantis.ru/>
>> www.mirantis.ru
>> [email protected]

--
Andrey Danin
[email protected]
skype: gcon.monolake

--
Mailing list: https://launchpad.net/~fuel-dev
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~fuel-dev
More help   : https://help.launchpad.net/ListHelp

