+openflowplugin and controller teams

TL;DR
I think this controller patch caused some breakage in our 3node CSIT:
https://git.opendaylight.org/gerrit/#/c/49265/
It affects both the functionality of the controller and gives us a ton more
logs, which creates other problems.

I think 3node ofp csit is broken too:
https://jenkins.opendaylight.org/releng/view/openflowplugin/job/openflowplugin-csit-3node-clustering-only-carbon/

I ran some csit tests in the sandbox (jobs 1-4) here:
https://jenkins.opendaylight.org/sandbox/job/netvirt-csit-3node-openstack-newton-nodl-v2-jamo-upstream-transparent-carbon/
You can see that job 1 is yellow and the rest are 100% pass. They are using
distros from nexus as they were published, from *4500.zip down to *4497.zip.
The only difference between 4500 and 4499 is that controller patch above.

Of course, something in our env/csit could have changed too, but the karaf
logs are definitely bigger in netvirt csit. We collect just the exceptions in
a single file, and it's ~30x larger in a failed job.

Thanks,
JamO

On 03/21/2017 01:49 PM, Jamo Luhrsen wrote:
> current theory is our karaf.log is getting a lot more messages now. I found
> one job that didn't get aborted. It did run for 5h33m, though:
>
> https://jenkins.opendaylight.org/releng/view/netvirt-csit/job/netvirt-csit-3node-openstack-newton-nodl-v2-upstream-transparent-carbon/376/
>
> the robot logs didn't get created because the generated output.xml was too
> big; the tool that makes the .html reports failed or quit. Locally, I could
> create the .html with that output.xml.
>
> We have had this trouble before, where all of a sudden lots more logging
> comes in and it breaks our jobs.
>
> still getting to the bottom of it...
>
> JamO
>
> On 03/21/2017 10:39 AM, Jamo Luhrsen wrote:
>> Netvirt, Integration,
>>
>> we need to figure out and fix what's wrong with the netvirt 3node carbon
>> csit.
>>
>> the jobs are timing out at our jenkins 6h limit. that means we don't
>> get any logs either.
>>
>> This will likely cause a large backlog in our jenkins queue.
>>
>> If anyone has cycles at the moment to help, catch me on IRC.
>>
>> Initially, with Alon's help, we know that this job [0] was not seeing
>> this trouble. This job [1] was.
>>
>> the difference in ODL patches between the two distros that were used
>> includes some controller patches that seem cluster related. here are all
>> the patches that came in between the two:
>>
>> controller https://git.opendaylight.org/gerrit/49265 BUG-5280: add
>> frontend state lifecycle
>> controller https://git.opendaylight.org/gerrit/49738 BUG-2138: Use
>> correct actor context in shard lookup.
>> controller https://git.opendaylight.org/gerrit/49663 BUG-2138: Fix
>> shard registration with ProxyProducers.
>>
>> From the looks of the console log (all we have), it seems that each
>> test case is just taking a long time. I don't know more than that
>> at the moment.
>>
>> JamO
>>
>>
>> [0] https://jenkins.opendaylight.org/releng/view/netvirt-csit/job/netvirt-csit-3node-openstack-newton-nodl-v2-upstream-transparent-carbon/373/
>> [1] https://jenkins.opendaylight.org/releng/view/netvirt-csit/job/netvirt-csit-3node-openstack-newton-nodl-v2-upstream-transparent-carbon/374/
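P.S. In case anyone else wants to reproduce the local report generation
mentioned above, here is a minimal sketch, assuming Robot Framework is
installed locally and the job's output.xml has been downloaded (the file
names are just examples):

    # Rebuild log.html/report.html from an existing output.xml without
    # re-running the tests, using Robot Framework's rebot API.
    from robot import rebot

    rebot("output.xml", log="log.html", report="report.html")

The same thing can be done from the command line with the "rebot" tool that
ships with Robot Framework.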
