Re: [openstack-dev] Top Gate Bugs
On Wednesday, December 04, 2013 7:22:23 AM, Joe Gordon wrote:
> TL;DR: Gate is failing 23% of the time due to bugs in nova, neutron and tempest. We need help fixing these bugs.
> [rest of quoted message trimmed; the full text appears in the original post in this thread]

Let's add bug 1257644 [1] to the list.
I'm pretty sure this is due to some recent code [2][3] in the nova libvirt driver that automatically disables the host when the libvirt connection drops. Joe said there was a known issue with libvirt connection failures, so this could be duped against that, but I'm not sure where/what that one is - maybe bug 1254872 [4]?

Unless I just don't understand the code, there is some funny logic going on in the libvirt driver when it's automatically disabling a host, which I've documented in bug 1257644. It would help to have some libvirt-minded people looking at that, or the authors/approvers of those patches.

Also, does anyone know if libvirt will pass a 'reason' string to the _close_callback function? I was digging through the libvirt code this morning but couldn't figure out where the callback is actually called and with what parameters. The code in nova seemed to just be based on the patch that danpb had in libvirt [5].

This bug is going to raise a bigger long-term question
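On the 'reason' question: if I'm reading the libvirt API right, the registered close callback is invoked with an integer reason code rather than a string, so nova would have to translate it itself. A sketch of that shape (the mapping and the "disable host" logic below are illustrative, not nova's or libvirt-python's actual code):

```python
# If I'm reading the libvirt API right, the close callback receives an
# integer reason code, not a string; the values below mirror libvirt's
# VIR_CONNECT_CLOSE_REASON_* enum. The disable logic is illustrative,
# not nova's actual code.
CLOSE_REASONS = {
    0: "error",      # VIR_CONNECT_CLOSE_REASON_ERROR: I/O error
    1: "eof",        # VIR_CONNECT_CLOSE_REASON_EOF: end of stream
    2: "keepalive",  # VIR_CONNECT_CLOSE_REASON_KEEPALIVE: keepalive timeout
    3: "client",     # VIR_CONNECT_CLOSE_REASON_CLIENT: client-side close
}

disabled_hosts = []

def close_callback(conn, reason, opaque):
    """Shape of a _close_callback handler: map the int code to a
    human-readable reason and record the host as disabled (simulated)."""
    disabled_hosts.append((opaque, CLOSE_REASONS.get(reason, "unknown")))

# Simulate libvirtd dropping the connection with reason EOF (1).
close_callback(None, 1, "compute-1")
```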
Re: [openstack-dev] Top Gate Bugs
Joe,

Looks like we may be a bit more stable now?

Short URL: http://bit.ly/18qq4q2

Long URL: http://graphite.openstack.org/graphlot/?from=-120hour&until=-0hour&target=color(alias(movingAverage(asPercent(stats.zuul.pipeline.gate.job.gate-tempest-dsvm-full.SUCCESS,sum(stats.zuul.pipeline.gate.job.gate-tempest-dsvm-full.{SUCCESS,FAILURE})),'6hours'),%20'gate-tempest-dsvm-postgres-full'),'ED9121')&target=color(alias(movingAverage(asPercent(stats.zuul.pipeline.gate.job.gate-tempest-dsvm-postgres-full.SUCCESS,sum(stats.zuul.pipeline.gate.job.gate-tempest-dsvm-postgres-full.{SUCCESS,FAILURE})),'6hours'),%20'gate-tempest-dsvm-neutron-large-ops'),'00F0F0')&target=color(alias(movingAverage(asPercent(stats.zuul.pipeline.gate.job.gate-tempest-dsvm-neutron.SUCCESS,sum(stats.zuul.pipeline.gate.job.gate-tempest-dsvm-neutron.{SUCCESS,FAILURE})),'6hours'),%20'gate-tempest-dsvm-neutron'),'00FF00')&target=color(alias(movingAverage(asPercent(stats.zuul.pipeline.gate.job.gate-tempest-dsvm-neutron-large-ops.SUCCESS,sum(stats.zuul.pipeline.gate.job.gate-tempest-dsvm-neutron-large-ops.{SUCCESS,FAILURE})),'6hours'),%20'gate-tempest-dsvm-neutron-large-ops'),'00c868')&target=color(alias(movingAverage(asPercent(stats.zuul.pipeline.check.job.check-grenade-dsvm.SUCCESS,sum(stats.zuul.pipeline.check.job.check-grenade-dsvm.{SUCCESS,FAILURE})),'6hours'),%20'check-grenade-dsvm'),'800080')&target=color(alias(movingAverage(asPercent(stats.zuul.pipeline.gate.job.gate-tempest-dsvm-large-ops.SUCCESS,sum(stats.zuul.pipeline.gate.job.gate-tempest-dsvm-large-ops.{SUCCESS,FAILURE})),'6hours'),%20'gate-tempest-dsvm-neutron-large-ops'),'E080FF')

--
dims

On Fri, Dec 6, 2013 at 11:28 AM, Matt Riedemann mrie...@linux.vnet.ibm.com wrote:
> On Wednesday, December 04, 2013 7:22:23 AM, Joe Gordon wrote:
>> TL;DR: Gate is failing 23% of the time due to bugs in nova, neutron and tempest. We need help fixing these bugs.
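For anyone who wants to build a similar graph, the long URL above boils down to one target expression per job: the 6-hour moving average of each job's success percentage. A sketch of how those targets compose (the helper function is mine, not a graphite API; only the target syntax comes from the URL above):

```python
# Sketch: compose graphite "target" expressions like the ones in the
# long URL above. success_rate_target is an illustrative helper, not
# part of graphite; the nested function syntax is graphite's own.
def success_rate_target(pipeline, job, label, color):
    metric = f"stats.zuul.pipeline.{pipeline}.job.{job}"
    # Percentage of runs that succeeded, out of all completed runs.
    pct = f"asPercent({metric}.SUCCESS,sum({metric}.{{SUCCESS,FAILURE}}))"
    # Smooth over 6 hours, label the series, and pick a line color.
    return f"color(alias(movingAverage({pct},'6hours'),'{label}'),'{color}')"

base = "http://graphite.openstack.org/graphlot/?from=-120hour&until=-0hour"
targets = [
    success_rate_target("gate", "gate-tempest-dsvm-full",
                        "gate-tempest-dsvm-full", "ED9121"),
    success_rate_target("gate", "gate-tempest-dsvm-neutron",
                        "gate-tempest-dsvm-neutron", "00FF00"),
]
url = base + "".join("&target=" + t for t in targets)
```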
Re: [openstack-dev] Top Gate Bugs
I had the labels wrong - here's a slightly better link - http://bit.ly/1gdxYeg

On Fri, Dec 6, 2013 at 4:31 PM, Davanum Srinivas dava...@gmail.com wrote:
> [quoted message trimmed; the full text appears earlier in this thread]
[openstack-dev] Top Gate Bugs
TL;DR: Gate is failing 23% of the time due to bugs in nova, neutron and tempest. We need help fixing these bugs.

Hi All,

Before going any further: we have a bug that is affecting both gate and stable, so it's getting top priority here. elastic-recheck currently doesn't track unit tests because we don't expect them to fail very often. It turns out that assessment was wrong; we now have a nova py27 unit test bug in both the gate and the stable gate.

https://bugs.launchpad.net/nova/+bug/1216851
Title: nova unit tests occasionally fail migration tests for mysql and postgres
Hits FAILURE: 74

The failures appear multiple times for a single job, and some of those are due to bad patches in the check queue. But this is being seen in the stable and trunk gates, so something is definitely wrong.

===

It's time for another edition of 'Top Gate Bugs.' I am sending this out now because, in addition to our usual gate bugs, a few new ones have cropped up recently, and as we saw a few weeks ago it doesn't take very many new bugs to wedge the gate. Currently the gate has a failure rate of at least 23%! [0]

Note: this email was generated with http://status.openstack.org/elastic-recheck/ and 'elastic-recheck-success' [1]

1) https://bugs.launchpad.net/bugs/1253896
Title: test_minimum_basic_scenario fails with SSHException: Error reading SSH protocol banner
Projects: neutron, nova, tempest
Hits FAILURE: 324
This one has been around for several weeks now, and although we have made some attempts at fixing it, we aren't any closer to resolving it than we were a few weeks ago.

2) https://bugs.launchpad.net/bugs/1251448
Title: BadRequest: Multiple possible networks found, use a Network ID to be more specific.
Project: neutron
Hits FAILURE: 141

3) https://bugs.launchpad.net/bugs/1249065
Title: Tempest failure: tempest/scenario/test_snapshot_pattern.py
Project: nova
Hits FAILURE: 112
This is a bug in nova's neutron code.

4) https://bugs.launchpad.net/bugs/1250168
Title: gate-tempest-devstack-vm-neutron-large-ops is failing
Projects: neutron, nova
Hits FAILURE: 94
This is an old bug that was fixed but came back on December 3rd, so it is a recent regression. This may be an infra issue.

5) https://bugs.launchpad.net/bugs/1210483
Title: ServerAddressesTestXML.test_list_server_addresses FAIL
Projects: neutron, nova
Hits FAILURE: 73
This has had some attempts made at fixing it, but it's still around.

In addition to the existing bugs, we have some new bugs on the rise:

1) https://bugs.launchpad.net/bugs/1257626
Title: Timeout while waiting on RPC response - topic: network, RPC method: allocate_for_instance info: unknown
Project: nova
Hits FAILURE: 52
A large-ops-only bug. This has been around for at least two weeks, but we have seen it in higher numbers starting around December 3rd. This may be an infrastructure issue, as neutron-large-ops started failing more around the same time.

2) https://bugs.launchpad.net/bugs/1257641
Title: Quota exceeded for instances: Requested 1, but already used 10 of 10 instances
Projects: nova, tempest
Hits FAILURE: 41
Like the previous bug, this has been around for at least two weeks but appears to be on the rise.

Raw Data: http://paste.openstack.org/show/54419/

best,
Joe

[0] failure rate = 1 - (success rate gate-tempest-dsvm-neutron) * (success rate ...) * ...
gate-tempest-dsvm-neutron = 0.00
gate-tempest-dsvm-neutron-large-ops = 11.11
gate-tempest-dsvm-full = 11.11
gate-tempest-dsvm-large-ops = 4.55
gate-tempest-dsvm-postgres-full = 10.00
gate-grenade-dsvm = 0.00
(I hope I got the math right here)
[1] http://git.openstack.org/cgit/openstack-infra/elastic-recheck/tree/elastic_recheck/cmd/check_success.py

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
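As a sanity check on footnote [0], the arithmetic can be sketched in a few lines. Treating the listed per-job numbers as failure percentages (my reading of the footnote; Joe notes he may not have the math exactly right), the combined rate comes out around 32%, which squares with "at least 23%":

```python
# Sketch of footnote [0]: a change fails the gate if any one job
# fails, so the combined failure rate is one minus the product of the
# per-job success rates. The numbers are from the email; treating
# them as failure percentages is my assumption.
job_failure_pct = {
    "gate-tempest-dsvm-neutron": 0.00,
    "gate-tempest-dsvm-neutron-large-ops": 11.11,
    "gate-tempest-dsvm-full": 11.11,
    "gate-tempest-dsvm-large-ops": 4.55,
    "gate-tempest-dsvm-postgres-full": 10.00,
    "gate-grenade-dsvm": 0.00,
}

def combined_failure_rate(failure_pct):
    product = 1.0
    for pct in failure_pct.values():
        product *= 1.0 - pct / 100.0  # per-job success rate
    return 1.0 - product

rate = combined_failure_rate(job_failure_pct)
print(f"{rate:.1%}")  # prints 32.1% for these inputs
```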
Re: [openstack-dev] Top Gate Bugs
Hi Clark,

2013/11/21 Clark Boylan clark.boy...@gmail.com:
> Joe seemed to be on the same track with https://review.openstack.org/#/q/status:open+project:openstack/tempest+branch:master+topic:57578,n,z but went far enough to revert the change that introduced that test. A couple people were going to keep hitting those changes to run them through more tests and see if 1251920 goes away.

Thanks for updating my patch and pushing to approve it. Now 1251920 has gone away from gerrit :-)

> I don't quite understand why this test is problematic (Joe indicated it went in at about the time 1251920 became a problem). I would be very interested in finding out why this caused a problem.

test_create_backup deletes two server snapshot images at the end, and I guess the deletion runs in parallel with the next test (test_get_console_output). As a result, there is a heavy workload during test_get_console_output, and it becomes difficult to get the console log. The current fix is a workaround; I think we could solve this properly by waiting for the end of the image delete in each test. I will dig into this problem more next week.

> You can see frequencies for bugs with known signatures at http://status.openstack.org/elastic-recheck/

Thank you for the info, that is interesting.

Thanks
Ken'ichi Ohmichi
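Ken'ichi's proposed direction - wait for the image delete to finish before the test ends, so the cleanup doesn't overlap the next test - could look roughly like this sketch (the client method names and exception type here are hypothetical, not the actual tempest client API):

```python
import time

class NotFound(Exception):
    """Stand-in for the client's 404 exception (hypothetical)."""

def delete_image_and_wait(client, image_id, timeout=10, interval=0.1):
    """Delete an image and poll until the server reports it gone;
    returns False on timeout. delete_image/show_image are
    illustrative names, not the real tempest client API."""
    client.delete_image(image_id)
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            client.show_image(image_id)  # still present: keep polling
        except NotFound:
            return True
        time.sleep(interval)
    return False

# Minimal fake client so the sketch runs standalone.
class FakeImageClient:
    def __init__(self):
        self._deleted = False
    def delete_image(self, image_id):
        self._deleted = True
    def show_image(self, image_id):
        if self._deleted:
            raise NotFound(image_id)

deleted = delete_image_and_wait(FakeImageClient(), "snapshot-1", timeout=1)
```

Blocking on the delete makes each test's cleanup synchronous, at the cost of slightly longer test runtime.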
Re: [openstack-dev] Top Gate Bugs
On Wednesday, November 20, 2013 11:53:45 PM, Clark Boylan wrote:

On Wed, Nov 20, 2013 at 9:43 PM, Ken'ichi Ohmichi ken1ohmi...@gmail.com wrote:

Hi Joe,

2013/11/20 Joe Gordon joe.gord...@gmail.com:

Hi All,

As many of you have noticed the gate has been in very bad shape over the past few days. Here is a list of some of the top open bugs (without pending patches, and many recent hits) that we are hitting. Gate won't be stable, and it will be hard to get your code merged, until we fix these bugs.

1) https://bugs.launchpad.net/bugs/1251920
nova
468 Hits

Can we know the frequency of each failure? I'm trying 1251920 and have put up an investigation tempest patch:
https://review.openstack.org/#/c/57193/
The patch can avoid this problem 4 times, but I am not sure whether this is worthwhile or not.

Thanks
Ken'ichi Ohmichi

---
2) https://bugs.launchpad.net/bugs/1251784
neutron, Nova
328 Hits

3) https://bugs.launchpad.net/bugs/1249065
neutron
122 hits

4) https://bugs.launchpad.net/bugs/1251448
neutron
65 Hits

Raw Data:
Note: If a bug has any hits for anything besides failure, it means the fingerprint isn't perfect.
Elastic recheck known issues

Bug: https://bugs.launchpad.net/bugs/1251920 = message:assertionerror: console output was empty AND filename:console.html
Title: Tempest failures due to failure to return console logs from an instance
Project: Status
  nova: Confirmed
Hits
  FAILURE: 468

Bug: https://bugs.launchpad.net/bugs/1251784 = message:Connection to neutron failed: Maximum attempts reached AND filename:logs/screen-n-cpu.txt
Title: nova+neutron scheduling error: Connection to neutron failed: Maximum attempts reached
Project: Status
  neutron: New
  nova: New
Hits
  FAILURE: 328
  UNSTABLE: 13
  SUCCESS: 275

Bug: https://bugs.launchpad.net/bugs/1240256 = message: 503 AND filename:logs/syslog.txt AND syslog_program:proxy-server
Title: swift proxy-server returning 503 during tempest run
Project: Status
  openstack-ci: Incomplete
  swift: New
  tempest: New
Hits
  FAILURE: 136
  SUCCESS: 83
Pending Patch

Bug: https://bugs.launchpad.net/bugs/1249065 = message:No nw_info cache associated with instance AND filename:logs/screen-n-api.txt
Title: Tempest failure: tempest/scenario/test_snapshot_pattern.py
Project: Status
  neutron: New
  nova: Confirmed
Hits
  FAILURE: 122

Bug: https://bugs.launchpad.net/bugs/1252514 = message:Got error from Swift: put_object AND filename:logs/screen-g-api.txt
Title: glance doesn't recover if Swift returns an error
Project: Status
  devstack: New
  glance: New
  swift: New
Hits
  FAILURE: 95
Pending Patch

Bug: https://bugs.launchpad.net/bugs/1244255 = message:NovaException: Unexpected vif_type=binding_failed AND filename:logs/screen-n-cpu.txt
Title: binding_failed because of l2 agent assumed down
Project: Status
  neutron: Fix Committed
Hits
  FAILURE: 92
  SUCCESS: 29

Bug: https://bugs.launchpad.net/bugs/1251448 = message: possible networks found, use a Network ID to be more specific. (HTTP 400) AND filename:console.html
Title: BadRequest: Multiple possible networks found, use a Network ID to be more specific.
Project: Status
  neutron: New
Hits
  FAILURE: 65

Bug: https://bugs.launchpad.net/bugs/1239856 = message:tempest/services AND message:/images_client.py AND message:wait_for_image_status AND filename:console.html
Title: TimeoutException: Request timed out on tempest.api.compute.images.test_list_image_filters.ListImageFiltersTestXML
Project: Status
  glance: New
Hits
  FAILURE: 62

Bug: https://bugs.launchpad.net/bugs/1235435 = message:One or more ports have an IP allocation from this subnet AND message: SubnetInUse: Unable to complete operation on subnet AND filename:logs/screen-q-svc.txt
Title: 'SubnetInUse: Unable to complete operation on subnet UUID. One or more ports have an IP allocation from this subnet.'
Project: Status
  neutron: Incomplete
  nova: Fix Committed
  tempest: New
Hits
  FAILURE: 48

Bug: https://bugs.launchpad.net/bugs/1224001 = message:tempest.scenario.test_network_basic_ops AssertionError: Timed out waiting for AND filename:console.html
Title: test_network_basic_ops fails waiting for network to become available
Project: Status
  neutron: In Progress
  swift: Invalid
  tempest: Invalid
Hits
  FAILURE: 42

Bug: https://bugs.launchpad.net/bugs/1218391 = message:Cannot 'createImage' AND filename:console.html
Title: tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestXML.test_delete_image_that_is_not_yet_active spurious failure
Project: Status
  nova: Confirmed
  swift: Confirmed
  tempest: Confirmed
Hits
  FAILURE: 25

best,
Joe Gordon

Joe seemed to be on the same track with
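Each fingerprint above pairs a log message with the file it appears in, as a Lucene-style query. A simplified stand-in for how such a fingerprint classifies failures (the real elastic-recheck runs these queries against the logstash Elasticsearch cluster; the in-memory matcher below is only an illustration, and the helper names are mine):

```python
# Simplified stand-in for an elastic-recheck fingerprint: pair a log
# message with a filename, Lucene-style. Not the real tool, which
# queries Elasticsearch; this just matches in-memory records.
def fingerprint_query(message, filename):
    return f'message:"{message}" AND filename:"{filename}"'

def matches(record, message, filename):
    return (message in record.get("message", "")
            and record.get("filename") == filename)

# A couple of records shaped like the log lines the fingerprints target.
sample_logs = [
    {"filename": "console.html",
     "message": "assertionerror: console output was empty"},
    {"filename": "logs/screen-n-cpu.txt",
     "message": "Connection to neutron failed: Maximum attempts reached"},
]

msg = "assertionerror: console output was empty"
query = fingerprint_query(msg, "console.html")
hits = [r for r in sample_logs if matches(r, msg, "console.html")]
```

A fingerprint that also matches SUCCESS or UNSTABLE runs (as several above do) is "imperfect" in exactly this sense: the query is broader than the failure it is meant to identify.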
Re: [openstack-dev] Top Gate Bugs
On Fri, Nov 22, 2013 at 2:28 AM, Matt Riedemann mrie...@linux.vnet.ibm.com wrote:
> [quoted message trimmed; the full text appears earlier in this thread]
Re: [openstack-dev] Top Gate Bugs
Thanks for posting this, Joe. It really helps to create focus so we can address these bugs.

We are chatting in #openstack-neutron about 1251784, 1249065, and 1251448.

We are looking for someone to work on 1251784 - I had mentioned it at Monday's Neutron team meeting and am trying to shop it around in -neutron now. We need someone other than Salvatore, Aaron or Maru to work on this since they each have at least one very important bug they are working on. Please join us in #openstack-neutron and lend a hand - all of OpenStack needs your help.

Bug 1249065 is assigned to Aaron Rosen, who isn't in the channel at the moment, so I don't have an update on his progress or any blockers he is facing. Hopefully (if you are reading this, Aaron) he will join us in channel soon and I can hear from him about his status.

Bug 1251448 is assigned to Maru Newby, who I am talking with now in -neutron. He is addressing the bug. I will share what information I have regarding this one when I have some.

We are all looking forward to a more stable gate and this information really helps.

Thanks again, Joe,
Anita.

On 11/20/2013 01:09 AM, Joe Gordon wrote:
> Hi All,
>
> As many of you have noticed the gate has been in very bad shape over the past few days. Here is a list of some of the top open bugs (without pending patches, and many recent hits) that we are hitting. Gate won't be stable, and it will be hard to get your code merged, until we fix these bugs.
>
> 1) https://bugs.launchpad.net/bugs/1251920
> nova
> 468 Hits
> 2) https://bugs.launchpad.net/bugs/1251784
> neutron, Nova
> 328 Hits
> 3) https://bugs.launchpad.net/bugs/1249065
> neutron
> 122 hits
> 4) https://bugs.launchpad.net/bugs/1251448
> neutron
> 65 Hits
Re: [openstack-dev] Top Gate Bugs
On 20/11/13 14:21, Anita Kuno wrote: Thanks for posting this, Joe. It really helps to create focus so we can address these bugs. We are chatting in #openstack-neutron about 1251784, 1249065, and 1251448. We are looking for someone to work on 1251784 ... Please join us in #openstack-neutron and lend a hand - all of OpenStack needs your help.

I've been hitting this in tripleo intermittently for the last few days (or at least it looks to be the same bug); this morning, while trying to debug the problem, I noticed http request/responses happening out of order. I've added details to the bug: https://bugs.launchpad.net/tripleo/+bug/1251784

...
Re: [openstack-dev] Top Gate Bugs
Hey all, I think I found a serious bug in our usage of eventlet thread local storage. Please check out this snippet [1]. This is how we use eventlet TLS in Nova and common Oslo code [2]. This could explain how [3] actually breaks the TripleO devtest story and our gates. Am I right? Or am I missing something and should get some sleep? :)

Thanks, Roman

[1] http://paste.openstack.org/show/53686/
[2] https://github.com/openstack/nova/blob/master/nova/openstack/common/local.py#L48
[3] https://github.com/openstack/nova/commit/85332012dede96fa6729026c2a90594ea0502ac5

On Wed, Nov 20, 2013 at 5:55 PM, Derek Higgins der...@redhat.com wrote: I've been hitting this in tripleo intermittently for the last few days (or it at least looks to be the same bug), this morning while trying to debug the problem I noticed http request/responses happening out of order. I've added details to the bug. https://bugs.launchpad.net/tripleo/+bug/1251784 ...
Re: [openstack-dev] Top Gate Bugs
Nope, you're totally right, corolocal.local is a class, whose instances are the actual coroutine local storage.

Alex

On Wed, Nov 20, 2013 at 9:11 AM, Roman Podoliaka rpodoly...@mirantis.com wrote: Hey all, I think I found a serious bug in our usage of eventlet thread local storage. Please check out this snippet [1]. This is how we use eventlet TLS in Nova and common Oslo code [2]. This could explain how [3] actually breaks TripleO devtest story and our gates. Am I right? Or I am missing something and should get some sleep? :) Thanks, Roman [1] http://paste.openstack.org/show/53686/ [2] https://github.com/openstack/nova/blob/master/nova/openstack/common/local.py#L48 [3] https://github.com/openstack/nova/commit/85332012dede96fa6729026c2a90594ea0502ac5 ...
Re: [openstack-dev] Top Gate Bugs
On 11/20/2013 12:21 PM, Alex Gaynor wrote: Nope, you're totally right, corolocal.local is a class, whose instances are the actual coroutine local storage. But I don't think his example is what is being used. Here is an example using the openstack.common.local module, which is what nova uses for this. This produces the expected output. http://paste.openstack.org/show/53687/ https://git.openstack.org/cgit/openstack/nova/tree/nova/openstack/common/local.py For reference, original example from OP: http://paste.openstack.org/show/53686/ -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
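The distinction being discussed - attributes set on the `local` class itself versus on an instance of it - can be reproduced with the standard library's `threading.local`, which eventlet's `corolocal.local` mirrors for greenthreads. This is a minimal illustration of the failure mode, not the nova code:

```python
import threading

# A Python-level subclass stands in for eventlet's corolocal.local, which
# is a plain Python class (the C-implemented _thread._local type refuses
# new class attributes, so we subclass it to demonstrate the misuse).
class Local(threading.local):
    pass

# Misuse: setting an attribute on the *class* creates an ordinary class
# attribute, visible to every thread -- nothing thread-local about it.
Local.value = "set in main"

# Correct use: each *instance* gives every thread its own namespace.
store = Local()

results = {}

def worker():
    store.value = "worker"                  # private to this thread
    results["class_attr"] = Local.value     # sees the shared class attribute
    results["instance_attr"] = store.value  # sees this thread's own value

t = threading.Thread(target=worker)
t.start()
t.join()

print(results["class_attr"])    # "set in main" -- shared across threads
print(results["instance_attr"]) # "worker" -- genuinely thread-local
```

With eventlet monkey-patching, the same experiment with greenthreads shows the identical split, which is why a client cached as an attribute on the class leaks between concurrent requests instead of staying per-greenthread.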
Re: [openstack-dev] Top Gate Bugs
We settled on 1251920. https://review.openstack.org/57509 is the fix for that bug. Note that Oslo was fixed on Jun 28th; nova hasn't synced since then. If we were using oslo as a library we would have had the fix as soon as oslo did a release. These are the references to strong_store - and thus broken in nova trunk (and if any references exist in H, in H too):

./nova/network/neutronv2/__init__.py:58: if not hasattr(local.strong_store, 'neutron_client'):
./nova/network/neutronv2/__init__.py:59: local.strong_store.neutron_client = _get_client(token=None)
./nova/network/neutronv2/__init__.py:60: return local.strong_store.neutron_client
./nova/openstack/common/rpc/__init__.py:102: if ((hasattr(local.strong_store, 'locks_held')
./nova/openstack/common/rpc/__init__.py:103: and local.strong_store.locks_held)):
./nova/openstack/common/rpc/__init__.py:108: {'locks': local.strong_store.locks_held,
./nova/openstack/common/local.py:47: strong_store = threading.local()
./nova/openstack/common/lockutils.py:173: if not hasattr(local.strong_store, 'locks_held'):
./nova/openstack/common/lockutils.py:174: local.strong_store.locks_held = []
./nova/openstack/common/lockutils.py:175: local.strong_store.locks_held.append(name)
./nova/openstack/common/lockutils.py:217: local.strong_store.locks_held.remove(name)
./nova/tests/network/test_neutronv2.py:1837: local.strong_store.neutron_client = None

So we can expect lockutils to be broken, and rpc to be broken. Clearly they are being impacted more subtly than the neutron client usage.
-Rob

On 21 November 2013 07:44, Robert Collins robe...@robertcollins.net wrote: Which of these bugs would be appropriate to use for the fix to strong_store? It affects lockutils and rpc, both of which are going to create havoc :) -Rob

On 21 November 2013 07:19, Salvatore Orlando sorla...@nicira.com wrote: I've noticed that https://github.com/openstack/nova/commit/85332012dede96fa6729026c2a90594ea0502ac5 stores the network client in local.strong_store, which is a reference to corolocal.local (the class, not the instance). In Russell's example, the code instead accesses local.store, which is an instance of WeakLocal (inheriting from corolocal.local). Perhaps then Roman's findings apply to the issue being observed on the gate. Regards, Salvatore

--
Robert Collins rbtcoll...@hp.com
Distinguished Technologist
HP Converged Cloud
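The WeakLocal behaviour Russell and Salvatore contrast with strong_store can be sketched roughly as follows. This is a from-memory approximation of the WeakLocal pattern in openstack.common.local, not the exact oslo code (which subclasses eventlet's corolocal.local when available): values are stored per thread only as weak references, so they vanish as soon as the last strong reference is dropped - the reason the commit reached for strong_store for the cached neutron client in the first place.

```python
import threading
import weakref

class WeakLocal(threading.local):
    """Thread-local storage holding only weak references to its values.

    Sketch of the WeakLocal pattern from openstack.common.local; assumes
    stored values are weak-referenceable objects (plain class instances).
    """
    def __getattribute__(self, attr):
        rval = super().__getattribute__(attr)
        if rval:
            # Stored values are weakrefs; dereference on the way out.
            # Returns None once the referent has been collected.
            rval = rval()
        return rval

    def __setattr__(self, attr, value):
        # Keep only a weak reference, so this store never keeps alive
        # what is put into it.
        super().__setattr__(attr, weakref.ref(value))

class Context:
    pass

store = WeakLocal()
ctx = Context()
store.context = ctx
still_alive = store.context is ctx   # the strong reference 'ctx' keeps it alive
del ctx                              # drop the last strong reference
collected = store.context is None    # on CPython, refcounting frees it at once
```

A client cached through such a store would be dropped as soon as the request that created it finished, which is consistent with why the caching code wanted a "strong" store instead.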
Re: [openstack-dev] Top Gate Bugs
Hi Joe,

2013/11/20 Joe Gordon joe.gord...@gmail.com: Hi All, As many of you have noticed the gate has been in very bad shape over the past few days. Here is a list of some of the top open bugs (without pending patches, and many recent hits) that we are hitting. Gate won't be stable, and it will be hard to get your code merged, until we fix these bugs. 1) https://bugs.launchpad.net/bugs/1251920 nova 468 Hits

Can we know the frequency of each failure? I'm looking into 1251920 and have put up an investigation patch to tempest: https://review.openstack.org/#/c/57193/ The patch has avoided this problem 4 times, but I am not sure whether this approach is worthwhile or not.

Thanks
Ken'ichi Ohmichi

---

2) https://bugs.launchpad.net/bugs/1251784 neutron, Nova 328 Hits
3) https://bugs.launchpad.net/bugs/1249065 neutron 122 hits
4) https://bugs.launchpad.net/bugs/1251448 neutron 65 Hits

...
Re: [openstack-dev] Top Gate Bugs
On Wed, Nov 20, 2013 at 9:43 PM, Ken'ichi Ohmichi ken1ohmi...@gmail.com wrote: Can we know the frequency of each failure? I'm trying 1251920 and putting the investigation tempest patch. https://review.openstack.org/#/c/57193/ The patch can avoid this problem 4 times, but I am not sure this is worth or not. Thanks Ken'ichi Ohmichi ...

Joe seemed to be on the same track with
[openstack-dev] Top Gate Bugs
Hi All,

As many of you have noticed, the gate has been in very bad shape over the past few days. Here is a list of some of the top open bugs (without pending patches, and with many recent hits) that we are hitting. The gate won't be stable, and it will be hard to get your code merged, until we fix these bugs.

1) https://bugs.launchpad.net/bugs/1251920 nova 468 Hits
2) https://bugs.launchpad.net/bugs/1251784 neutron, nova 328 Hits
3) https://bugs.launchpad.net/bugs/1249065 neutron 122 Hits
4) https://bugs.launchpad.net/bugs/1251448 neutron 65 Hits

Raw Data: Note: If a bug has any hits for anything besides failure, it means the fingerprint isn't perfect. ...

best,
Joe Gordon