+openflowplugin and controller teams

TL;DR

I think this controller patch caused some breakages in our 3node CSIT.

https://git.opendaylight.org/gerrit/#/c/49265/


It seems to affect both the functionality of the controller and the amount
of logging: we now get a ton more logs, which creates other problems.

I think 3node ofp csit is broken too:

https://jenkins.opendaylight.org/releng/view/openflowplugin/job/openflowplugin-csit-3node-clustering-only-carbon/

I ran some csit tests in the sandbox (jobs 1-4), here:

https://jenkins.opendaylight.org/sandbox/job/netvirt-csit-3node-openstack-newton-nodl-v2-jamo-upstream-transparent-carbon/


You can see job 1 is yellow, and the rest are 100% pass. They are using
distros from nexus as they were published, from *4500.zip down to *4497.zip.

The only difference between 4500 and 4499 is that controller patch above.

Of course something in our env/csit could have changed too, but the karaf
logs are definitely bigger in netvirt csit. We collect just the exceptions
into a single file, and that file is ~30x bigger in a failed job.
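
For reference, that collection step amounts to grepping exception lines out
of karaf.log into one file. A simplified Python sketch of the idea
(illustrative only, not the actual CSIT script; file names are made up):

    # Pull lines that look like Java exceptions out of karaf.log into a
    # single file, so the volume in a passing vs. failed job is easy to
    # compare. Illustrative sketch, not the real CSIT collection script.
    import re

    EXCEPTION_RE = re.compile(r'\b(?:\w+\.)*\w*(?:Exception|Error)\b')

    def collect_exceptions(log_path='karaf.log', out_path='exceptions.txt'):
        with open(log_path, errors='replace') as log, \
                open(out_path, 'w') as out:
            for line in log:
                if EXCEPTION_RE.search(line):
                    out.write(line)

    collect_exceptions()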

Thanks,
JamO

On 03/21/2017 01:49 PM, Jamo Luhrsen wrote:
> Current theory is that our karaf.log is getting a lot more messages now.
> I found one job that didn't get aborted. It did run for 5h33m, though:
> 
> https://jenkins.opendaylight.org/releng/view/netvirt-csit/job/netvirt-csit-3node-openstack-newton-nodl-v2-upstream-transparent-carbon/376/
> 
> The robot logs didn't get created: the generated output.xml was too big,
> and the tool that makes the .html reports failed or quit. Locally, I could
> create the .html with that output.xml.
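> 
> For reference, this is roughly what I ran locally (a minimal sketch using
> Robot Framework's rebot Python API; the file names are just examples):
> 
>     # regenerate log.html/report.html from the oversized output.xml
>     from robot import rebot
> 
>     rebot('output.xml', log='log.html', report='report.html')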
> 
> We have had this trouble before, where all of a sudden lots more logging
> comes in and breaks our jobs.
> 
> Still getting to the bottom of it...
> 
> JamO
> 
> On 03/21/2017 10:39 AM, Jamo Luhrsen wrote:
>> Netvirt, Integration,
>>
>> We need to figure out and fix what's wrong with the netvirt 3node carbon
>> csit.
>>
>> The jobs are timing out at our jenkins 6h limit. That means we don't
>> get any logs either.
>>
>> This will likely cause a large backlog in our jenkins queue.
>>
>> If anyone has cycles at the moment to help, catch me on IRC.
>>
>> Initially, with Alon's help, we determined that this job [0] was not
>> seeing this trouble, but this job [1] was.
>>
>> The diff in ODL patches between the two distros that were used includes
>> some controller patches that seem cluster-related. Here are all the
>> patches that came in between the two:
>>
>> controller   https://git.opendaylight.org/gerrit/49265       BUG-5280: add frontend state lifecycle
>> controller   https://git.opendaylight.org/gerrit/49738       BUG-2138: Use correct actor context in shard lookup.
>> controller   https://git.opendaylight.org/gerrit/49663       BUG-2138: Fix shard registration with ProxyProducers.
>>
>> From the looks of the console log (all we have), it seems that each
>> test case is just taking a long time. I don't know more than that
>> at the moment.
>>
>> JamO
>>
>>
>>
>> [0]
>> https://jenkins.opendaylight.org/releng/view/netvirt-csit/job/netvirt-csit-3node-openstack-newton-nodl-v2-upstream-transparent-carbon/373/
>> [1]
>> https://jenkins.opendaylight.org/releng/view/netvirt-csit/job/netvirt-csit-3node-openstack-newton-nodl-v2-upstream-transparent-carbon/374/
>>
