Re: [openstack-dev] [openstack-qa] Post job failures
What about using graphite + logstash to power a post-job/nightly-job/post-merge-periodic (the new thing we talked about in Germany) dashboard?

There are a few different use cases for a dashboard for jobs that don't report on gerrit changes.

* Track the success and failure rates over time
* If I am maintaining a job that doesn't vote anywhere, I will check this daily
* If I am part of the core team of a project where one feature is tested post-merge, I want to periodically check this to see if that feature is being maintained.
* Provide links to logs for failed jobs so the cause of the failure can be investigated

We can do all this with graphite and logstash: graphite for tracking the trends (something like http://jogo.github.io/gate/) and logstash to find the logs for failed jobs. (We can get around the 10 day logstash window by saving the results instead of overwriting them every time we regenerate the list of log links.)

And if we really want some sort of alerts, there are a lot of graphite tools (http://graphite.readthedocs.org/en/latest/tools.html) that can give us alerts on metrics (alert me if the last X runs of job-foo-bar failed).

On Wed, Oct 1, 2014 at 9:46 AM, Jeremy Stanley wrote:
> On 2014-10-01 10:39:40 -0400 (-0400), Matthew Treinish wrote:
> [...]
> > So I actually think as a first pass this would be the best way to
> > handle it. You can leave comments on a closed gerrit changes,
> [...]
>
> Not so easy as it sounds. Jobs in post are running on an arbitrary
> Git commit (more often than not, a merge commit), and mapping that
> back to a change in Gerrit is nontrivial.
> --
> Jeremy Stanley
>
> ___
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
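
The dashboard ideas in this message (success and failure rates over time, an alert when the last X runs of a job failed, and saving log links instead of overwriting them) could be sketched roughly as below. This is only an illustration under stated assumptions: the JobRun record, its field names, and the three helper functions are hypothetical, and nothing here talks to graphite or logstash. It only shows the logic on in-memory data.

```python
from collections import namedtuple

# Hypothetical record of one post-job run; the field names are assumptions.
JobRun = namedtuple("JobRun", ["job", "timestamp", "succeeded", "log_url"])


def failure_rate(runs):
    """Return the fraction of runs that failed (0.0 for an empty list)."""
    if not runs:
        return 0.0
    failed = sum(1 for run in runs if not run.succeeded)
    return failed / len(runs)


def last_n_failed(runs, n):
    """Return True if there are at least n runs and the n most recent all failed."""
    if n <= 0 or len(runs) < n:
        return False
    recent = sorted(runs, key=lambda run: run.timestamp)[-n:]
    return all(not run.succeeded for run in recent)


def merge_failed_logs(saved, runs):
    """Add log links for failed runs to the saved links without dropping old ones.

    Keeping earlier entries is what lets the list outlive a limited
    retention window in whatever system the links were looked up in.
    """
    merged = dict(saved)
    for run in runs:
        if not run.succeeded:
            merged.setdefault((run.job, run.timestamp), run.log_url)
    return merged
```

A caller would feed in the runs of one job, for example last_n_failed(runs, 3) to decide whether to raise an alert for job-foo-bar, and would pass the previously saved dictionary to merge_failed_logs each time the list of log links is regenerated.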
Re: [openstack-dev] [openstack-qa] Post job failures
On 2014-10-01 10:39:40 -0400 (-0400), Matthew Treinish wrote:
[...]
> So I actually think as a first pass this would be the best way to
> handle it. You can leave comments on a closed gerrit changes,
[...]

Not so easy as it sounds. Jobs in post are running on an arbitrary
Git commit (more often than not, a merge commit), and mapping that
back to a change in Gerrit is nontrivial.
--
Jeremy Stanley
Re: [openstack-dev] [openstack-qa] Post job failures
Hi Josh,

Just a heads up that you shouldn't use this list for any discussion. We've moved all of the discussion off this list into openstack-dev. The only reason we haven't removed the openstack-qa list is so we have a separate address for the periodic job results (which honestly hasn't been the most effective approach for handling those jobs).

On Wed, Oct 01, 2014 at 07:39:44PM +1000, Joshua Hesketh wrote:
> Hello QA team,
>
> When a job fails in the post queue (which have jobs that are triggered on a
> change being merged) no warning or failure message is sent anywhere so it
> does so silently. This has caused an issue in the past[0] and there are
> likely more cases we don't know about.
>
> We should report failures somewhere but since post jobs don't come from
> gerrit they can't be reported back to gerrit trivially. And even if we could
> it would be a comment on a closed change.

So I actually think as a first pass this would be the best way to handle it. You can leave comments on closed gerrit changes, and it would still generate the same notifications for people who have that enabled. It also would be picked up in the ci results table on the top, which I think might be convenient.

Long term I'm thinking we might need to make a separate dashboard view for all of these jobs so we can track the results over time. I don't think instantaneous reporting is actually important for post or periodic jobs, because if it were they'd be running in check or gate. Back in the days when there was a single jenkins, the jenkins dashboard could be used for this to a certain extent, which was useful.

> My feeling is an easy solution is to email somewhere when a post job fails.
> However I'm not sure where might be an appropriate location for that. Would
> this mailing list, for example, be a good place to start and then see how we
> go?

I really don't think this is the right approach. The issue is that most of these things are project-specific failures, and you'd either be spamming everyone that it failed or a small set of people who aren't interested. I also feel that we run the post jobs far too frequently to have it be sent to any ML.

> I've set up the change here: https://review.openstack.org/#/c/125298/
>
> Cheers,
> Josh
>
> [0]
> http://lists.openstack.org/pipermail/openstack-dev/2014-September/046481.html

-Matt Treinish