+ openflowplugin-dev

I don't see a reason not to revert this patch [1] (merge [2]) and re-introduce it once we figure out what it breaks. netvirt CSIT clearly shows the patch is problematic, and has broken Boron for the past week.
Abhijit/Anil/Jon, could we have your view on this? More details in the thread below.

I also want to remind you guys that we added the "test-openflowplugin-netvirt" keyword, and encourage you to at least trigger it before any patch is merged.

[1] https://git.opendaylight.org/gerrit/#/c/50153
[2] https://git.opendaylight.org/gerrit/#/c/51814

--alon

From: N Vivekanandan [mailto:[email protected]]
Sent: Tuesday, 14 February 2017 05:12
To: Vishal Thapar <[email protected]>; Sam Hague <[email protected]>; Jamo Luhrsen <[email protected]>
Cc: Kochba, Alon <[email protected]>; odl netvirt dev <[email protected]>
Subject: RE: [netvirt-dev] Steps to resolve latest CSIT regressions

Hi Sam,

We couldn’t get back to you yesterday as we haven’t been able to pin the openflowplugin review here:
https://git.opendaylight.org/gerrit/#/c/50153

“
[1] https://jenkins.opendaylight.org/sandbox/job/netvirt-csit-1node-openstack-newton-nodl-v2-upstream-stateful-boron-shague/5/
    https://logs.opendaylight.org/sandbox/jenkins091/netvirt-csit-1node-openstack-newton-nodl-v2-upstream-stateful-boron-shague/6/
    - 2 csit runs on the reverted openflowplugin patch [2] distro - passes no errors
[2] https://git.opendaylight.org/gerrit/51814 - the reverted openflowplugin patch
“

I agree with you that we can request a revert of this patch from Boron.

I see that you have already placed a -1 on the equivalent unmerged Master patch here:
https://git.opendaylight.org/gerrit/#/c/51589

--
Thanks,
Vivek

From: Vishal Thapar
Sent: Tuesday, February 14, 2017 7:53 AM
To: Sam Hague <[email protected]>; Jamo Luhrsen <[email protected]>
Cc: Kochba, Alon <[email protected]>; N Vivekanandan <[email protected]>; odl netvirt dev <[email protected]>
Subject: RE: [netvirt-dev] Steps to resolve latest CSIT regressions

I’d vote for reverting the patch.
We have enough information to pin it on this one and should get OFPlugin folks to take a look at it. At this point inputs have to come from OFPlugin on what change in netvirt is causing this, if it is.

I’ll send across a mail to OFPlugin if others agree on this.

Regards,
Vishal.

From: Sam Hague [mailto:[email protected]]
Sent: 14 February 2017 07:44
To: Jamo Luhrsen <[email protected]>
Cc: Vishal Thapar <[email protected]>; Kochba, Alon <[email protected]>; N Vivekanandan <[email protected]>; odl netvirt dev <[email protected]>
Subject: Re: [netvirt-dev] Steps to resolve latest CSIT regressions

Ok, I think we are close to saying the openflowplugin patch is causing problems. Bunch of results below. Quick run down is the openflowplugin patch on boron and on master produces the same 52 errors. The revert of the patch on boron passes 100%.

Now the question is how to proceed? Do we push to revert that patch or work towards why it is causing problems for netvirt? Seems like we should revert since the master patch is not merged yet.

Thanks, Sam

[1] https://jenkins.opendaylight.org/sandbox/job/netvirt-csit-1node-openstack-newton-nodl-v2-upstream-stateful-boron-shague/5/
    https://logs.opendaylight.org/sandbox/jenkins091/netvirt-csit-1node-openstack-newton-nodl-v2-upstream-stateful-boron-shague/6/
    - 2 csit runs on the reverted openflowplugin patch [2] distro - passes no errors
[2] https://git.opendaylight.org/gerrit/51814 - the reverted openflowplugin patch
[3] https://git.opendaylight.org/gerrit/50153 - the openflowplugin patch before revert. this patch is already merged.
    https://logs.opendaylight.org/sandbox/jenkins091/netvirt-csit-1node-openstack-newton-nodl-v2-upstream-stateful-boron-shague/2/ - fails csit with the 52 errors
[4] https://git.opendaylight.org/gerrit/51589 - the openflowplugin patch on master - has the same 52 errors
[5] https://jenkins.opendaylight.org/releng/job/netvirt-csit-1node-openstack-newton-nodl-v2-gate-stateful-carbon/59/ - gate job against the openflowplugin patch on master - it also hits the 52 errors

On Mon, Feb 13, 2017 at 6:06 PM, Sam Hague <[email protected]> wrote:

Hi all,

I reverted the openflowplugin patch [2] and ran csit against it at [1]. That passed 100%. I am running it again as job 6 since we do have random results sometimes. [3] is the revert patch.

I also started the gate against the master branch of [2] to see if that fails.

Thanks, Sam

[1] https://jenkins.opendaylight.org/sandbox/job/netvirt-csit-1node-openstack-newton-nodl-v2-upstream-stateful-boron-shague/5/
[2] https://git.opendaylight.org/gerrit/50153
[3] https://git.opendaylight.org/gerrit/51814
[4] https://git.opendaylight.org/gerrit/51589

On Mon, Feb 13, 2017 at 4:46 PM, Jamo Luhrsen <[email protected]> wrote:

I'm trying to set up a local environment to run tempest against and will try to manually reproduce the ovsdb inactivity timeout problem. It is showing up in the first debug collection which happens after the tempest.api.network set of tests.

The fact that the timeout didn't matter (5s vs 30s) makes me think something on the controller side has gone for a toss and not coming back. Of course, looking at the karaf logs was fruitless.

I have an OPNFV apex virtual setup running. Now, just trying to make tempest work. Once I have that, I can swap in boron distros and debug.

JamO

On 02/13/2017 10:09 AM, Vishal Thapar wrote:
> HI Sam,
>
> Not yet. I think we should bring it up with OFPlugin folks. Looking at logs
> of Alon’s patch in [10], no response even after
> 30 seconds.
> One thing we can probably try is disable inactivity probe
> altogether, should give an idea if it becomes
> responsive or something much worse has gone wrong. Disable it by setting
> probe timeout to 0.
>
> This looks like one of those issues where chasing guilty patch may not help.
> Need to figure out what is going wrong.
>
> Regards,
>
> Vishal.
>
> *From:* Sam Hague [mailto:[email protected]]
> *Sent:* 13 February 2017 23:35
> *To:* Kochba, Alon <[email protected]>; N Vivekanandan <[email protected]>; Vishal Thapar <[email protected]>; odl netvirt dev <[email protected]>
> *Subject:* Steps to resolve latest CSIT regressions
>
> Vivek, Vishal,
>
> did you find anything else related to the regressions?
>
> From Vishal's debugging he saw ovsdb inactivity timeouts happening at the
> default 5s, so we suspected the openflowplugin
> patch [1]. We ran test-patch [2] on it, but it also had the same 52 errors so
> that doesn't look like the culprit. Also ran
> the test-patch against the two patches before [1] and they also blew up.
>
> [5] is the first patch on boron netvirt around when things went south so I am
> running csit on it with [6]. I tried some other
> jobs around then but the distros are being deleted.
>
> [7] is the job against the openflowplugin job again but using the
> openflowplugin distribution.
>
> [10] is the job Alon pushed to check the inactivity-timeout using the patch
> [11]. Same 52 errors so increasing the
> inactivity-timeout to 30s didn't seem to help.
>
> Thanks, Sam
>
> [1] https://git.opendaylight.org/gerrit/#/c/50153/
>
> [2] https://jenkins.opendaylight.org/releng/job/netvirt-csit-1node-openstack-newton-nodl-v2-gate-stateful-boron/17/
> [3] https://jenkins.opendaylight.org/releng/view/netvirt/job/netvirt-csit-1node-openstack-newton-nodl-v2-gate-stateful-boron/18/
> [4] https://jenkins.opendaylight.org/releng/view/netvirt/job/netvirt-csit-1node-openstack-newton-nodl-v2-gate-stateful-boron/19/
>
> [5] https://git.opendaylight.org/gerrit/51456
>     Bug 7714 <https://bugs.opendaylight.org/show_bug.cgi?id=7714> - Vpn Interface not deleted from oper DS
> [6] https://jenkins.opendaylight.org/sandbox/job/netvirt-csit-1node-openstack-newton-nodl-v2-upstream-stateful-boron-shague/1/
>
> [6] https://jenkins.opendaylight.org/sandbox/job/netvirt-csit-1node-openstack-newton-nodl-v2-upstream-stateful-boron-shague/2/
>     - using openflowplugin distro from: [1] https://git.opendaylight.org/gerrit/#/c/50153/
>
> [10] https://jenkins.opendaylight.org/sandbox/job/netvirt-csit-1node-openstack-newton-nodl-v2-upstream-stateful-alonko-boron/1/
> [11] https://git.opendaylight.org/gerrit/#/c/51763/
>
> _______________________________________________
> netvirt-dev mailing list
> [email protected]
> https://lists.opendaylight.org/mailman/listinfo/netvirt-dev
