Re: [openstack-dev] [nova][docker][containers][qa] nova-docker CI failing a lot on unrelated nova patches
On 12/09/2014 06:18 PM, Eric Windisch wrote: While gating on nova-docker will prevent patches that cause nova-docker to break 100% to land, it won't do a lot to prevent transient failures. To fix those we need people dedicated to making sure nova-docker is working. What would be helpful for me is a way to know that our tests are breaking without manually checking Kibana, such as an email. I know that periodic jobs can do this kind of notification; if you ask about it in #openstack-infra there might be a solution there. However, having a job in infra on Nova comes with an expectation that someone is staying engaged on the infra and Nova sides to ensure that it's running correctly, and to debug it when it's wrong. It's not set-it-and-forget-it. It's already past the 2 weeks politeness boundary before it's considered fair game to just delete it. Creating the job is 10% of the work. Long term maintenance is important. I'm still not getting the feeling that there is really a long term owner on this job. I'd love that not to be the case, but simple things like the fact that the directory structure was all out of whack make it clear no one was regularly looking at it. -Sean -- Sean Dague http://dague.net ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][docker][containers][qa] nova-docker CI failing a lot on unrelated nova patches
On 2014-12-10 06:37:02 -0500 (-0500), Sean Dague wrote: I know that periodic jobs can do this kind of notification, if you ask about it in #openstack-infra there might be a solution there. [...] E-mail reporting in Zuul is currently implemented per pipeline, so the nova-docker tests would need to be in their own job in a dedicated pipeline with reporting set to the relevant contact address. This may be an excessive level of overhead, so we should have a separate infra discussion on whether that's a realistic solution, or whether it's worth looking at new Zuul functionality to tack E-mail reporting addresses onto specific jobs in arbitrary pipelines. -- Jeremy Stanley
Re: [openstack-dev] [nova][docker][containers][qa] nova-docker CI failing a lot on unrelated nova patches
Sean, fyi, got it stable now for the moment. http://logstash.openstack.org/#eyJzZWFyY2giOiIgYnVpbGRfbmFtZTpcImNoZWNrLXRlbXBlc3QtZHN2bS1kb2NrZXJcIiBBTkQgbWVzc2FnZTpcIkZpbmlzaGVkOlwiIEFORCBidWlsZF9zdGF0dXM6XCJGQUlMVVJFXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjE3MjgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwibW9kZSI6IiIsImFuYWx5emVfZmllbGQiOiIiLCJzdGFtcCI6MTQxODIyMzEwMjcyOX0= with https://review.openstack.org/#/c/138714/ thanks, dims On Wed, Dec 10, 2014 at 6:37 AM, Sean Dague s...@dague.net wrote: On 12/09/2014 06:18 PM, Eric Windisch wrote: While gating on nova-docker will prevent patches that cause nova-docker to break 100% to land, it won't do a lot to prevent transient failures. To fix those we need people dedicated to making sure nova-docker is working. What would be helpful for me is a way to know that our tests are breaking without manually checking Kibana, such as an email. I know that periodic jobs can do this kind of notification, if you ask about it in #openstack-infra there might be a solution there. However, having a job in infra on Nova is a thing that comes with an expectation that someone is staying engaged on the infra and Nova sides to ensure that it's running correctly, and debug it when it's wrong. It's not a set it and forget it. It's already past the 2 weeks politeness boundary before it's considered fair game to just delete it. Creating the job is 10% of the work. Long term maintenance is important. I'm still not getting the feeling that there is really a long term owner on this job. I'd love that not to be the case, but simple things like the fact that the directory structure was all out of whack make it clear no one was regularly looking at it. 
-Sean -- Sean Dague http://dague.net -- Davanum Srinivas :: https://twitter.com/dims
Re: [openstack-dev] [nova][docker][containers][qa] nova-docker CI failing a lot on unrelated nova patches
On Fri, Dec 5, 2014 at 11:43 AM, Ian Main im...@redhat.com wrote: Sean Dague wrote: On 12/04/2014 05:38 PM, Matt Riedemann wrote: On 12/4/2014 4:06 PM, Michael Still wrote: +Eric and Ian On Fri, Dec 5, 2014 at 8:31 AM, Matt Riedemann mrie...@linux.vnet.ibm.com wrote: This came up in the nova meeting today, I've opened a bug [1] for it. Since this isn't maintained by infra we don't have log indexing so I can't use logstash to see how pervasive it is, but multiple people are reporting the same thing in IRC. Who is maintaining the nova-docker CI and can look at this? It also looks like the log format for the nova-docker CI is a bit weird, can that be cleaned up to be more consistent with other CI log results? [1] https://bugs.launchpad.net/nova-docker/+bug/1399443 -- Thanks, Matt Riedemann Also, according to the 3rd party CI requirements [1] I should see nova-docker CI in the third party wiki page [2] so I can get details on who to contact when this fails but that's not done. [1] http://ci.openstack.org/third_party.html#requirements [2] https://wiki.openstack.org/wiki/ThirdPartySystems It's not the 3rd party CI job we are talking about, it's the one in the check queue which is run by infra. But, more importantly, jobs in those queues need shepherds that will fix them. Otherwise they will get deleted. Clarkb provided the fix for the log structure right now - https://review.openstack.org/#/c/139237/1 so at least it will look vaguely sane on failures -Sean This is one of the reasons we might like to have this in nova core. Otherwise we will just keep addressing issues as they come up. We would likely be involved doing this if it were part of nova core anyway. While gating on nova-docker will prevent patches that cause nova-docker to break 100% to land, it won't do a lot to prevent transient failures.
To fix those we need people dedicated to making sure nova-docker is working. Ian -- Sean Dague http://dague.net
Re: [openstack-dev] [nova][docker][containers][qa] nova-docker CI failing a lot on unrelated nova patches
While gating on nova-docker will prevent patches that cause nova-docker to break 100% to land, it won't do a lot to prevent transient failures. To fix those we need people dedicated to making sure nova-docker is working. What would be helpful for me is a way to know that our tests are breaking without manually checking Kibana, such as an email. Regards, Eric Windisch
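A minimal sketch of the kind of notification asked for above, assuming the success and failure counts for a job have already been collected from somewhere (Kibana, logstash, or anything else). The function name, the threshold, and the example.org addresses are hypothetical, and nothing here is part of Zuul or the infra tooling. It only builds the message; actually sending it is left out.

```python
from email.message import EmailMessage


def build_failure_alert(job_name, successes, failures, threshold_pct, to_addr):
    """Return an EmailMessage when the failure rate exceeds threshold_pct.

    Returns None when there were no runs or the rate is at or below the
    threshold. All names and addresses here are illustrative assumptions.
    """
    total = successes + failures
    if total == 0:
        return None  # no data, nothing to report
    rate = 100.0 * failures / total
    if rate <= threshold_pct:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"{job_name}: {rate:.1f}% failures over {total} runs"
    msg["From"] = "ci-alerts@example.org"  # hypothetical sender
    msg["To"] = to_addr
    msg.set_content(
        f"{failures} of {total} recent runs of {job_name} failed, "
        f"which is above the {threshold_pct:.1f}% threshold."
    )
    return msg


alert = build_failure_alert(
    "check-tempest-dsvm-docker", successes=6, failures=4,
    threshold_pct=25.0, to_addr="job-owner@example.org",
)
```

A periodic script could call this once per interval and hand any non-None result to `smtplib`. Because check jobs also run on broken patches, the threshold would need to sit well above the background failure rate.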
Re: [openstack-dev] [nova][docker][containers][qa] nova-docker CI failing a lot on unrelated nova patches
On Tue, Dec 9, 2014 at 3:18 PM, Eric Windisch e...@windisch.us wrote: While gating on nova-docker will prevent patches that cause nova-docker to break 100% to land, it won't do a lot to prevent transient failures. To fix those we need people dedicated to making sure nova-docker is working. What would be helpful for me is a way to know that our tests are breaking without manually checking Kibana, such as an email. There is also graphite [0], but since the docker-job is running on the check queue the data we are producing is very dirty, since check jobs often run on broken patches. [0] http://graphite.openstack.org/render/?from=-10daysheight=500until=nowwidth=1200bgcolor=fffgcolor=00yMax=100yMin=0target=color(alias(movingAverage(asPercent(stats.zuul.pipeline.check.job.check-tempest-dsvm-docker.FAILURE,sum(stats.zuul.pipeline.check.job.check-tempest-dsvm-docker.{SUCCESS,FAILURE})),%2736hours%27),%20%27check-tempest-dsvm-docker%27),%27orange%27)title=Docker%20Failure%20Rates%20(10%20days)_t=0.3702208176255226 Regards, Eric Windisch
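The graphite target in [0] plots FAILURE as a percentage of SUCCESS plus FAILURE for the job, smoothed with a moving average. A rough sketch of that arithmetic on plain lists of per-interval counts follows; the function names are made up for illustration, and the averaging is a simple trailing window over sample counts, which only approximates Graphite's own time-based `movingAverage`.

```python
def failure_percent(failures, successes):
    """Per-interval failure rate in percent; None where no jobs ran."""
    rates = []
    for f, s in zip(failures, successes):
        total = f + s
        rates.append(100.0 * f / total if total else None)
    return rates


def moving_average(values, window):
    """Trailing average over the last `window` samples, skipping None."""
    smoothed = []
    for i in range(len(values)):
        chunk = [v for v in values[max(0, i - window + 1): i + 1]
                 if v is not None]
        smoothed.append(sum(chunk) / len(chunk) if chunk else None)
    return smoothed


# Hypothetical counts for three intervals; the middle one had no runs.
rates = failure_percent(failures=[1, 0, 2], successes=[3, 0, 2])
smoothed = moving_average(rates, window=2)
```

Since the counts come from the check queue, failures caused by broken patches are mixed in, so the smoothed curve is more useful for spotting changes in trend than as an absolute failure rate.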
Re: [openstack-dev] [nova][docker][containers][qa] nova-docker CI failing a lot on unrelated nova patches
Sean Dague wrote: On 12/04/2014 05:38 PM, Matt Riedemann wrote: On 12/4/2014 4:06 PM, Michael Still wrote: +Eric and Ian On Fri, Dec 5, 2014 at 8:31 AM, Matt Riedemann mrie...@linux.vnet.ibm.com wrote: This came up in the nova meeting today, I've opened a bug [1] for it. Since this isn't maintained by infra we don't have log indexing so I can't use logstash to see how pervasive it is, but multiple people are reporting the same thing in IRC. Who is maintaining the nova-docker CI and can look at this? It also looks like the log format for the nova-docker CI is a bit weird, can that be cleaned up to be more consistent with other CI log results? [1] https://bugs.launchpad.net/nova-docker/+bug/1399443 -- Thanks, Matt Riedemann Also, according to the 3rd party CI requirements [1] I should see nova-docker CI in the third party wiki page [2] so I can get details on who to contact when this fails but that's not done. [1] http://ci.openstack.org/third_party.html#requirements [2] https://wiki.openstack.org/wiki/ThirdPartySystems It's not the 3rd party CI job we are talking about, it's the one in the check queue which is run by infra. But, more importantly, jobs in those queues need shepherds that will fix them. Otherwise they will get deleted. Clarkb provided the fix for the log structure right now - https://review.openstack.org/#/c/139237/1 so at least it will look vaguely sane on failures -Sean This is one of the reasons we might like to have this in nova core. Otherwise we will just keep addressing issues as they come up. We would likely be involved doing this if it were part of nova core anyway.
Ian -- Sean Dague http://dague.net
Re: [openstack-dev] [nova][docker][containers][qa] nova-docker CI failing a lot on unrelated nova patches
+Eric and Ian On Fri, Dec 5, 2014 at 8:31 AM, Matt Riedemann mrie...@linux.vnet.ibm.com wrote: This came up in the nova meeting today, I've opened a bug [1] for it. Since this isn't maintained by infra we don't have log indexing so I can't use logstash to see how pervasive it is, but multiple people are reporting the same thing in IRC. Who is maintaining the nova-docker CI and can look at this? It also looks like the log format for the nova-docker CI is a bit weird, can that be cleaned up to be more consistent with other CI log results? [1] https://bugs.launchpad.net/nova-docker/+bug/1399443 -- Thanks, Matt Riedemann -- Rackspace Australia
Re: [openstack-dev] [nova][docker][containers][qa] nova-docker CI failing a lot on unrelated nova patches
On 12/4/2014 4:06 PM, Michael Still wrote: +Eric and Ian On Fri, Dec 5, 2014 at 8:31 AM, Matt Riedemann mrie...@linux.vnet.ibm.com wrote: This came up in the nova meeting today, I've opened a bug [1] for it. Since this isn't maintained by infra we don't have log indexing so I can't use logstash to see how pervasive it is, but multiple people are reporting the same thing in IRC. Who is maintaining the nova-docker CI and can look at this? It also looks like the log format for the nova-docker CI is a bit weird, can that be cleaned up to be more consistent with other CI log results? [1] https://bugs.launchpad.net/nova-docker/+bug/1399443 -- Thanks, Matt Riedemann Also, according to the 3rd party CI requirements [1] I should see nova-docker CI in the third party wiki page [2] so I can get details on who to contact when this fails but that's not done. [1] http://ci.openstack.org/third_party.html#requirements [2] https://wiki.openstack.org/wiki/ThirdPartySystems -- Thanks, Matt Riedemann