[openstack-dev] [all] OpenStack races piling up in the gate - please stop approving patches unless they are fixing a race condition

Sean Dague Thu, 05 Jun 2014 05:14:28 -0700

You may all have noticed things are really backed up in the gate right
now, and you would be correct. (Top of gate is about 30 hrs, but if you
do the math on ingress / egress rates, the real transit time through the
gate is probably double that right now.)


We've hit another threshold where there are so many really small races
in the gate that they are compounding, to the point where a patch that
fixes one race often fails because a different race kills its job. This
whole situation was exacerbated by the fact that while the transition
from HP cloud 1.0 -> 1.1 was happening and we were under capacity, the
check queue grew to 500 with lots of stuff being approved.
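
As a rough illustration of why many small races compound, here is a
minimal sketch. It is not a measurement of the gate: the number of races
and the per-race failure rate are made-up values, the function names are
hypothetical, and the races are assumed to fail independently of each
other.

```python
from math import prod


def pass_probability(failure_rates):
    """Chance that one job run hits none of the given races.

    Assumes each race fails independently of the others.
    """
    return prod(1.0 - p for p in failure_rates)


def expected_attempts(failure_rates):
    """Expected number of runs until one passes (geometric distribution)."""
    return 1.0 / pass_probability(failure_rates)


if __name__ == "__main__":
    # Hypothetical input: 20 distinct races, each failing 1% of runs.
    rates = [0.01] * 20
    print(f"pass probability per run: {pass_probability(rates):.3f}")
    print(f"expected runs until a pass: {expected_attempts(rates):.2f}")
```

Under these assumptions, removing a single race barely moves the overall
pass rate, so a patch that fixes one race is still likely to be knocked
out by one of the others.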

That backlog then hit the gate all at once. But it also means that those
jobs passed check under very specific timing conditions, which are
different on the new HP cloud nodes. And the normal statistical
distribution of jobs, some on RAX and some on HP, which shakes out
different races, didn't happen.

At this point we could really use help focusing only on recheck bugs.
The current list of bugs is here:
http://status.openstack.org/elastic-recheck/

Also, our categorization rate is only 75%, so there are probably at
least 2 critical bugs hiding in the failures that we don't even know
about yet. Help categorizing them here -
http://status.openstack.org/elastic-recheck/data/uncategorized.html
would be handy.

We're coordinating changes via an etherpad here -
https://etherpad.openstack.org/p/gatetriage-june2014

If you want to help, jumping into #openstack-infra would be the place to go.

        -Sean

-- 
Sean Dague
http://dague.net


