On Wed, Jan 20, 2016 at 07:45:16AM -0500, Sean Dague wrote:
> The large-ops jobs jumped to a 50% fail in check, 25% fail in gate in
> the last 24 hours.
>
> http://tinyurl.com/j5u4nf5
>
> There isn't an obvious culprit at this point. I spent some time this

There is a very obvious culprit: pip 8 was released last night. [1][2]
Every dsvm job that was failing between the release and when the fixes
[3][4][5] landed will have a spike like this. That graph has a 12 hour
rolling average and the fixes landed less than 12 hours ago.

> morning digging into it a bit. Possibly each individual instance build
> got slower, possibly some other timeout is getting hit.
>
> The large-ops jobs were largely maintained by Joe Gordon, who dug into
> them when there were issues. He's not part of the community any more,
> and I don't think there is currently a point person.

I think you're conflating adding the jobs with maintaining them. Joe did
initially add the jobs, but he wasn't as active a maintainer as you're
implying here. Well, no more so than he was for any other dsvm failure.
Not having him around to help with failures anymore is an issue for all
jobs, not just the ones he added.

>
> With no current maintainer, I'd suggest we make the jobs non voting -
> https://review.openstack.org/#/c/270141/

I'm -1 on this. We really don't want to remove jobs like this until we
have equivalent coverage set up somewhere. Frankly, there should just be
a nova functional test that does similar load testing with the fake virt
driver. But until that's done, I think it's premature to make these
non-voting.

>
> I also suggest their time has probably come and gone. There is no one
> active on them, and the Rally team is.
>
> A pre-gating test job is only useful if someone is actively addressing
> systematic fails. This job class no longer has it. We should thus retire it.

While I agree with the sentiment, I don't think this actually applies in
practice; the idea of a formal maintainer for a job is kinda a pipe
dream. Look at:

http://status.openstack.org/elastic-recheck/data/uncategorized.html

and identify the maintainers for all the jobs listed there and ask why
they have uncategorized failures. Are you saying we should retire all
those jobs because there isn't anyone signed up (in the non-existent
registry of job maintainers) to watch the failures?

-Matt Treinish

[1] http://lists.openstack.org/pipermail/openstack-dev/2016-January/084475.html
[2] https://github.com/pypa/pip/issues/3384
[3] https://review.openstack.org/269954
[4] https://review.openstack.org/269970
[5] https://review.openstack.org/269969
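
To make the rolling-average point concrete, here is a rough sketch of why a
12 hour trailing average keeps showing a spike for hours after a breakage is
fixed. The numbers are made up for illustration (a 5% baseline failure rate
and a 6 hour window where everything fails); they are not taken from the
graph linked in this thread, and the helper name is hypothetical.

```python
def rolling_mean(values, window):
    """Trailing mean over the last `window` samples (fewer at the start)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Assumed hourly failure rates, for illustration only:
# 12 normal hours, 6 hours where every job fails, then 18 normal hours.
BASELINE, BROKEN = 0.05, 1.0
hourly = [BASELINE] * 12 + [BROKEN] * 6 + [BASELINE] * 18

smoothed = rolling_mean(hourly, window=12)

fix_hour = 18  # index of the first hour after the fixes land
for offset in (0, 5, 10, 11):
    rate = smoothed[fix_hour + offset]
    print(f"{offset:2d}h after fix: rolling failure rate {rate:.3f}")
```

In this sketch the raw hourly rate drops back to baseline the moment the fix
lands, but the smoothed value only returns to baseline once the whole window
is clear of the broken hours.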
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
