Here [1], it is suggested to give the upstream project 24 hours to analyze a 
regression and provide a fix or revert. So given you were notified yesterday at 
2:15 PM PST, I would expect either a fix or a revert no later than this afternoon.

[1] 
https://wiki.opendaylight.org/view/TSC:Main#Best_practices_for_preventing.2Fhandling_cross-project_breakages


> On Mar 22, 2017, at 9:27 AM, Jamo Luhrsen <[email protected]> wrote:
> 
> +release
> 
> What are we doing here? I think this needs to be resolved ASAP, as I know 
> netvirt 3node jobs can get in a bad state and be stuck for the full 6-hour 
> timeout. This is surely affecting our jenkins queue.
> 
> https://jenkins.opendaylight.org/releng/view/netvirt-csit/job/netvirt-csit-3node-openstack-newton-nodl-v2-upstream-transparent-carbon/buildTimeTrend
> 
> Can we merge the revert patch?
> 
> Or do we need to disable the 3node jobs for now?
> 
> We can file a bug, but that is just overhead if we can get this resolved soon.
> 
> Thanks,
> JamO
> 
> On 03/21/2017 10:15 PM, Luis Gomez wrote:
>> Hi Jamo, I can confirm the controller patch introduced the regression.
>> 
>> After building the revert:
>> 
>> https://git.opendaylight.org/gerrit/#/c/53643/
>> 
>> things go back to normal in the cluster test:
>> 
>> https://logs.opendaylight.org/sandbox/jenkins091/openflowplugin-csit-3node-clustering-only-carbon/4/archives/log.html.gz
>> 
>> BR/Luis
>> 
>> 
>>> On Mar 21, 2017, at 3:22 PM, Luis Gomez <[email protected]> wrote:
>>> 
>>> Right, something really broke the ofp cluster in carbon between Mar 19th 
>>> 7:22 AM UTC and Mar 20th 10:53 AM UTC. The patch you point out falls in 
>>> that interval.
>>> 
>>> It seems the controller cluster test in carbon is far from stable, so it is 
>>> difficult to tell when the regression was introduced just by looking at it:
>>> 
>>> https://jenkins.opendaylight.org/releng/view/CSIT-3node/job/controller-csit-3node-clustering-only-carbon/
>>> 
>>> Finally, how do the controller people verify patches? I do not see any 
>>> patch test job like we have in other projects.
>>> 
>>> BR/Luis
>>> 
>>>> On Mar 21, 2017, at 2:15 PM, Jamo Luhrsen <[email protected]> wrote:
>>>> 
>>>> +openflowplugin and controller teams
>>>> 
>>>> TL;DR
>>>> 
>>>> I think this controller patch caused some breakages in our 3node CSIT.
>>>> 
>>>> https://git.opendaylight.org/gerrit/#/c/49265/
>>>> 
>>>> 
>>>> It seems to affect both the functionality of the controller and the amount 
>>>> of logging, giving us a ton more logs, which creates other problems.
>>>> 
>>>> I think 3node ofp csit is broken too:
>>>> 
>>>> https://jenkins.opendaylight.org/releng/view/openflowplugin/job/openflowplugin-csit-3node-clustering-only-carbon/
>>>> 
>>>> I ran some csit tests in the sandbox (jobs 1-4), here:
>>>> 
>>>> https://jenkins.opendaylight.org/sandbox/job/netvirt-csit-3node-openstack-newton-nodl-v2-jamo-upstream-transparent-carbon/
>>>> 
>>>> 
>>>> You can see job 1 is yellow, and the rest are 100% pass. They are using 
>>>> distros from nexus as they were published, from *4500.zip down to *4497.zip.
>>>> 
>>>> The only difference between 4500 and 4499 is that controller patch above.
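>>>> 
>>>> (To spell out the reasoning, a trivial sketch in Python; the results dict
>>>> just restates jobs 1-4 above, with yellow treated as UNSTABLE:)
>>>> 
>>>>     # sandbox runs: distro build number -> CSIT outcome
>>>>     results = {4497: "PASS", 4498: "PASS", 4499: "PASS", 4500: "UNSTABLE"}
>>>> 
>>>>     # the regression enters with the earliest non-passing build
>>>>     first_bad = min(b for b, r in results.items() if r != "PASS")
>>>>     print(first_bad)  # -> 4500; its only delta from 4499 is that patch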
>>>> 
>>>> Of course something in our env/csit could have changed too, but the karaf
>>>> logs are definitely bigger in netvirt csit. We collect just the exceptions 
>>>> in a single file, and it is ~30x larger in a failed job.
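>>>> 
>>>> (A minimal sketch of that kind of filter, assuming a local karaf.log; the
>>>> file names and regex here are placeholders, not the actual CSIT script:)
>>>> 
>>>>     import re
>>>> 
>>>>     # match things like NullPointerException or java.io.IOException
>>>>     exc = re.compile(r"(?:\w+\.)*\w*(?:Exception|Error)\b")
>>>> 
>>>>     with open("karaf.log") as src, open("exceptions.txt", "w") as dst:
>>>>         for line in src:
>>>>             # keep exception lines plus their stack-trace frames
>>>>             if exc.search(line) or line.lstrip().startswith("at "):
>>>>                 dst.write(line)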
>>>> 
>>>> Thanks,
>>>> JamO
>>>> 
>>>> On 03/21/2017 01:49 PM, Jamo Luhrsen wrote:
>>>>> The current theory is that our karaf.log is getting a lot more messages 
>>>>> now. I found one job that didn't get aborted. It did run for 5h33m though:
>>>>> 
>>>>> https://jenkins.opendaylight.org/releng/view/netvirt-csit/job/netvirt-csit-3node-openstack-newton-nodl-v2-upstream-transparent-carbon/376/
>>>>> 
>>>>> The robot logs didn't get created because the generated output.xml was so 
>>>>> big that the tool that makes the .html reports failed or quit. Locally, I 
>>>>> could create the .html from that output.xml.
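>>>>> 
>>>>> (In case anyone else hits this: a minimal sketch of rebuilding the reports
>>>>> locally with Robot Framework's rebot API. File names are placeholders, and
>>>>> trimming passed keywords is just one way to shrink a huge output.xml:)
>>>>> 
>>>>>     from robot import rebot
>>>>> 
>>>>>     # regenerate log.html/report.html from the job's output.xml;
>>>>>     # removekeywords="passed" drops keyword details for passing tests,
>>>>>     # which keeps the generated .html manageable
>>>>>     rebot("output.xml", log="log.html", report="report.html",
>>>>>           removekeywords="passed")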
>>>>> 
>>>>> We have had this trouble before, where all of a sudden a lot more logging 
>>>>> comes in and breaks our jobs.
>>>>> 
>>>>> Still getting to the bottom of it...
>>>>> 
>>>>> JamO
>>>>> 
>>>>> On 03/21/2017 10:39 AM, Jamo Luhrsen wrote:
>>>>>> Netvirt, Integration,
>>>>>> 
>>>>>> We need to figure out and fix what's wrong with the netvirt 3node carbon 
>>>>>> csit.
>>>>>> 
>>>>>> The jobs are timing out at our jenkins 6h limit, which means we don't
>>>>>> get any logs either.
>>>>>> 
>>>>>> This will likely cause a large backlog in our jenkins queue.
>>>>>> 
>>>>>> If anyone has cycles at the moment to help, catch me on IRC.
>>>>>> 
>>>>>> Initially, with Alon's help, we know that this job [0] was not seeing
>>>>>> this trouble, while this job [1] was.
>>>>>> 
>>>>>> The diff in ODL patches between the two distros that were used includes
>>>>>> some controller patches that seem cluster-related. Here are all the
>>>>>> patches that came in between the two:
>>>>>> 
>>>>>> controller   https://git.opendaylight.org/gerrit/49265   BUG-5280: add 
>>>>>> frontend state lifecycle
>>>>>> controller   https://git.opendaylight.org/gerrit/49738   BUG-2138: Use 
>>>>>> correct actor context in shard lookup.
>>>>>> controller   https://git.opendaylight.org/gerrit/49663   BUG-2138: Fix 
>>>>>> shard registration with ProxyProducers.
>>>>>> 
>>>>>> From the looks of the console log (all we have), it seems that each
>>>>>> test case is just taking a long time. I don't know more than that
>>>>>> at the moment.
>>>>>> 
>>>>>> JamO
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> [0]
>>>>>> https://jenkins.opendaylight.org/releng/view/netvirt-csit/job/netvirt-csit-3node-openstack-newton-nodl-v2-upstream-transparent-carbon/373/
>>>>>> [1]
>>>>>> https://jenkins.opendaylight.org/releng/view/netvirt-csit/job/netvirt-csit-3node-openstack-newton-nodl-v2-upstream-transparent-carbon/374/
>>>>>> 
>>>> _______________________________________________
>>>> dev mailing list
>>>> [email protected]
>>>> https://lists.opendaylight.org/mailman/listinfo/dev
>>> 
>> 

_______________________________________________
openflowplugin-dev mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
