On 09/26/2013 09:41 PM, Joe Gordon wrote:
Hi All,
As many of you may have suspected, the gate has gotten less stable in the
past few days. It turns out we have the numbers to prove it, too!
http://graphite.openstack.org/graphlot/?width=586&from=00%3A00_20130919&_salt=1380244287.508&height=308&target=summarize(stats_counts.zuul.pipeline.gate.job.gate-tempest-devstack-vm-neutron.FAILURE%2C%2224h%22)&target=summarize(stats_counts.zuul.pipeline.gate.job.gate-tempest-devstack-vm-neutron.SUCCESS%2C%2224h%22)&until=23%3A59_20130926&lineMode=staircase
So tempest started failing more right around the 24th, even though we
are in FeatureFreeze.
"FF ensures that sufficient share of the ReleaseCycle
<https://wiki.openstack.org/wiki/ReleaseCycle> is dedicated to QA, until
we produce the first release candidates. Limiting the changes that
affect the behavior of the software allow for consistent testing and
efficient bugfixing."
https://wiki.openstack.org/wiki/FeatureFreeze
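
The Graphite link earlier in this message plots two `summarize(...)` targets, one for FAILURE and one for SUCCESS counts of the neutron tempest job, in 24h buckets. As a minimal sketch of how those targets are put together and read back out of such a URL, the Python below rebuilds them from the job name. The host `graphite.example.org` and all helper names are hypothetical, and `failure_rate` is only an illustration of what the two series can be combined into, not something the graph itself computes.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Metric naming as it appears in the Graphite link above.
METRIC = "stats_counts.zuul.pipeline.gate.job.{job}.{result}"


def summarize_target(job, result, interval="24h"):
    """Build a Graphite summarize() target for one job result."""
    metric = METRIC.format(job=job, result=result)
    return 'summarize({0},"{1}")'.format(metric, interval)


def graphite_targets(url):
    """Return the URL-decoded `target` expressions from a Graphite URL."""
    return parse_qs(urlsplit(url).query).get("target", [])


def failure_rate(failures, successes):
    """Fraction of runs that failed; 0.0 when there were no runs."""
    total = failures + successes
    return failures / total if total else 0.0


JOB = "gate-tempest-devstack-vm-neutron"
targets = [summarize_target(JOB, result) for result in ("FAILURE", "SUCCESS")]

# Hypothetical host; the query parameters mirror the link in the message.
url = "http://graphite.example.org/graphlot/?" + urlencode(
    {"target": targets, "from": "00:00_20130919", "until": "23:59_20130926"},
    doseq=True,
)

for target in graphite_targets(url):
    print(target)
```

Feeding the per-day FAILURE and SUCCESS counts into `failure_rate` would give a single number per day, which may be easier to compare across days than two staircase lines.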
Thanks to the work we have been doing with logstash and elastic-recheck,
we have very good numbers on the top offenders and when they began. The
good news is that two bugs account for most of the hits, so the top
offenders list has just two entries. But there are still other unknown
bugs and lower-priority ones out there too!
https://bugs.launchpad.net/tempest/+bug/1226337 -- Launchpad bug 1226337
in tempest
"tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern flake
failure" [High,Triaged]
Started on 9-23, with 408 hits in the last 24 hours alone!
http://logstash.openstack.org/#eyJzZWFyY2giOiJAbWVzc2FnZTpcIk5vdmFFeGNlcHRpb246IGlTQ1NJIGRldmljZSBub3QgZm91bmQgYXRcIiBBTkQgQGZpZWxkcy5idWlsZF9zdGF0dXM6XCJGQUlMVVJFXCIgQU5EIEBmaWVsZHMuZmlsZW5hbWU6XCJsb2dzL3NjcmVlbi1uLWNwdS50eHRcIiIsImZpZWxkcyI6W10sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNjA0ODAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7InVzZXJfaW50ZXJ2YWwiOjB9LCJzdGFtcCI6MTM4MDI0NDY2ODQ5Nn0=
https://bugs.launchpad.net/tempest/+bug/1230407 -- Launchpad bug
1230407 in neutron "State change timeout exceeded" [Undecided,Confirmed]
Started on 9-25 with 66 hits in the last 24 hours alone
http://logstash.openstack.org/#eyJzZWFyY2giOiIgQG1lc3NhZ2U6XCJBc3NlcnRpb25FcnJvcjogU3RhdGUgY2hhbmdlIHRpbWVvdXQgZXhjZWVkZWQhXCIgQU5EIEBmaWVsZHMuYnVpbGRfc3RhdHVzOlwiRkFJTFVSRVwiIEFORCBAZmllbGRzLmZpbGVuYW1lOlwiY29uc29sZS5odG1sXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjEzODAyNDQ0MzM2NzZ9
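
The long logstash links above are hard to read because the saved query appears to be carried in the URL fragment as base64-encoded JSON (it includes a `search` key, among others). A minimal sketch for inspecting such a link without opening it is shown here. The function names and the `logstash.example.org` host are hypothetical, and the sample query is made up for illustration rather than copied from either bug.

```python
import base64
import json
from urllib.parse import urlsplit


def decode_logstash_fragment(url):
    """Decode the base64 JSON query stored after '#' in a logstash link."""
    fragment = urlsplit(url).fragment
    # Restore any '=' padding that was dropped from the fragment.
    fragment += "=" * (-len(fragment) % 4)
    return json.loads(base64.b64decode(fragment).decode("utf-8"))


def encode_logstash_fragment(query):
    """Inverse of decode_logstash_fragment: dict -> base64 fragment."""
    raw = json.dumps(query, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


# Made-up query for illustration only (hypothetical host and message).
sample = {
    "search": '@message:"example error" AND @fields.build_status:"FAILURE"',
    "timeframe": "604800",
}
sample_url = "http://logstash.example.org/#" + encode_logstash_fragment(sample)

print(decode_logstash_fragment(sample_url)["search"])
```

Passing either of the logstash URLs above to `decode_logstash_fragment` should, under the same assumption about the encoding, show the exact search string behind each hit count.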
This second one looks like an issue with the Neutron DB layer,
which seems to deadlock itself on agent updates -
http://logs.openstack.org/87/47487/4/check/gate-tempest-devstack-vm-neutron/4128a28/logs/screen-q-svc.txt.gz?level=TRACE
So DB assistance would be good.
I've set that bug to Critical and RC1 for Neutron, because right now
it's bouncing at least 50% of the changes out of the gate (and as such
we're starving out the check queue for devstack nodes, so no changes
have made progress for 12 hrs over there).
-Sean
--
Sean Dague
http://dague.net