The TSC best practices [1] suggest giving the upstream project 24 hours to analyze a regression and provide a fix or a revert. Given that you notified them yesterday at 2:15 PM PST, I would expect either a fix or a revert no later than this afternoon.

[1] https://wiki.opendaylight.org/view/TSC:Main#Best_practices_for_preventing.2Fhandling_cross-project_breakages

> On Mar 22, 2017, at 9:27 AM, Jamo Luhrsen <[email protected]> wrote:
>
> +release
>
> what are we doing here? I think this needs to be resolved asap, as I know
> netvirt 3node jobs can get in a bad state and be stuck for the full 6 hour
> timeout. this is surely affecting our jenkins queue.
>
> https://jenkins.opendaylight.org/releng/view/netvirt-csit/job/netvirt-csit-3node-openstack-newton-nodl-v2-upstream-transparent-carbon/buildTimeTrend
>
> can we merge the revert patch?
>
> or do we need to disable the 3node jobs for now?
>
> we can file a bug, but that is just overhead if we can get this resolved soon.
>
> Thanks,
> JamO
>
> On 03/21/2017 10:15 PM, Luis Gomez wrote:
>> Hi Jamo, I can confirm the controller patch introduced the regression,
>>
>> after building the revert:
>>
>> https://git.opendaylight.org/gerrit/#/c/53643/
>>
>> things go back to normal in cluster test:
>>
>> https://logs.opendaylight.org/sandbox/jenkins091/openflowplugin-csit-3node-clustering-only-carbon/4/archives/log.html.gz
>>
>> BR/Luis
>>
>>
>>> On Mar 21, 2017, at 3:22 PM, Luis Gomez <[email protected]> wrote:
>>>
>>> Right, something really broke the ofp cluster in carbon between Mar 19th
>>> 7:22AM UTC and Mar 20th 10:53AM UTC. The patch you point out is in that
>>> interval.
>>>
>>> It seems the controller cluster test in carbon is far from stable, so it
>>> is difficult to tell when the regression was introduced by looking at it:
>>>
>>> https://jenkins.opendaylight.org/releng/view/CSIT-3node/job/controller-csit-3node-clustering-only-carbon/
>>>
>>> Finally, how do the controller people verify patches? I do not see any
>>> patch test job like we have in other projects.
>>>
>>> BR/Luis
>>>
>>>> On Mar 21, 2017, at 2:15 PM, Jamo Luhrsen <[email protected]> wrote:
>>>>
>>>> +openflowplugin and controller teams
>>>>
>>>> TL;DR
>>>>
>>>> I think this controller patch caused some breakages in our 3node CSIT.
>>>>
>>>> https://git.opendaylight.org/gerrit/#/c/49265/
>>>>
>>>> both functionality of the controller as well as giving us a ton more
>>>> logs which creates other problems.
>>>>
>>>> I think 3node ofp csit is broken too:
>>>>
>>>> https://jenkins.opendaylight.org/releng/view/openflowplugin/job/openflowplugin-csit-3node-clustering-only-carbon/
>>>>
>>>> I ran some csit tests in the sandbox (jobs 1-4) here:
>>>>
>>>> https://jenkins.opendaylight.org/sandbox/job/netvirt-csit-3node-openstack-newton-nodl-v2-jamo-upstream-transparent-carbon/
>>>>
>>>> you can see job 1 is yellow, and the rest are 100% pass. They are using
>>>> distros from nexus as they were published from *4500.zip down to *4997.zip
>>>>
>>>> the only difference between 4500 and 4499 is that controller patch above.
>>>>
>>>> Of course something in our env/csit could have changed too, but the karaf
>>>> logs are definitely bigger in netvirt csit. We collect just exceptions in
>>>> a single file and it's ~30x more in a failed job.
>>>>
>>>> Thanks,
>>>> JamO
>>>>
>>>> On 03/21/2017 01:49 PM, Jamo Luhrsen wrote:
>>>>> current theory is our karaf.log is getting a lot more messages now. I
>>>>> found one job that didn't get aborted. It did run for 5h33m though:
>>>>>
>>>>> https://jenkins.opendaylight.org/releng/view/netvirt-csit/job/netvirt-csit-3node-openstack-newton-nodl-v2-upstream-transparent-carbon/376/
>>>>>
>>>>> the robot logs didn't get created because the generated output.xml was
>>>>> too big, so the tool that makes the .html reports failed or quit.
>>>>> Locally, I could create the .html with that output.xml
>>>>>
>>>>> We have had this trouble before, where all of a sudden lots more
>>>>> logging comes in and it breaks our jobs.
>>>>>
>>>>> still getting to the bottom of it...
>>>>>
>>>>> JamO
>>>>>
>>>>> On 03/21/2017 10:39 AM, Jamo Luhrsen wrote:
>>>>>> Netvirt, Integration,
>>>>>>
>>>>>> we need to figure out and fix what's wrong with the netvirt 3node carbon
>>>>>> csit.
>>>>>>
>>>>>> the jobs are timing out at our jenkins 6h limit. that means we don't
>>>>>> get any logs either.
>>>>>>
>>>>>> This will likely cause a large backlog in our jenkins queue.
>>>>>>
>>>>>> If anyone has cycles at the moment to help, catch me on IRC.
>>>>>>
>>>>>> Initially, with Alon's help, we know that this job [0] was not seeing
>>>>>> this trouble. This job [1] is.
>>>>>>
>>>>>> the difference in ODL patches between the two distros that were used
>>>>>> includes some controller patches that seem cluster related. here are all
>>>>>> the patches that came in between the two:
>>>>>>
>>>>>> controller https://git.opendaylight.org/gerrit/49265
>>>>>>     BUG-5280: add frontend state lifecycle
>>>>>> controller https://git.opendaylight.org/gerrit/49738
>>>>>>     BUG-2138: Use correct actor context in shard lookup.
>>>>>> controller https://git.opendaylight.org/gerrit/49663
>>>>>>     BUG-2138: Fix shard registration with ProxyProducers.
>>>>>>
>>>>>> From the looks of the console log (all we have) it seems that each
>>>>>> test case is just taking a long time. I don't know more than that
>>>>>> at the moment.
>>>>>>
>>>>>> JamO
>>>>>>
>>>>>> [0] https://jenkins.opendaylight.org/releng/view/netvirt-csit/job/netvirt-csit-3node-openstack-newton-nodl-v2-upstream-transparent-carbon/373/
>>>>>> [1] https://jenkins.opendaylight.org/releng/view/netvirt-csit/job/netvirt-csit-3node-openstack-newton-nodl-v2-upstream-transparent-carbon/374/
>>>>>>
>>>> _______________________________________________
>>>> dev mailing list
>>>> [email protected]
>>>> https://lists.opendaylight.org/mailman/listinfo/dev
>>>
>> _______________________________________________
openflowplugin-dev mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
