Re: [OpenStack-Infra] [third-party] CI Monitoring Tool

Joshua Hesketh Thu, 13 Nov 2014 05:26:01 -0800

Hi,

Sorry for the slow reply, I'm currently on vacation.


I think we should include the infra mailing list in this discussion, so I've 
cc'd them here. If it's off topic we can take this off list again; however, I 
feel like we may be duplicating efforts at the moment.

Regarding people not using Zuul: the idea the infra team brainstormed during 
the summit was to have a generic REST endpoint that accepts results (and then 
produces stats, graphs, etc.). Zuul would post to this endpoint as a reporter, 
but nothing would stop others from implementing their own result posts.
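
As a rough illustration of such an endpoint, here is a minimal sketch using 
only the Python standard library. The paths (/results, /stats) and the payload 
fields (ci_name, change, patchset, result) are assumptions made for this 
example; nothing here reflects an agreed design.

```python
import io
import json
from collections import defaultdict

# In-memory store: CI system name -> list of reported results.
# Field names below are hypothetical, chosen only for this sketch.
RESULTS = defaultdict(list)
REQUIRED = ("ci_name", "change", "patchset", "result")


def _respond(start_response, status, body):
    data = json.dumps(body).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(data)))])
    return [data]


def app(environ, start_response):
    """WSGI app: POST /results stores a result, GET /stats returns pass rates."""
    method = environ["REQUEST_METHOD"]
    path = environ.get("PATH_INFO", "")
    if method == "POST" and path == "/results":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        try:
            payload = json.loads(environ["wsgi.input"].read(length))
        except ValueError:
            return _respond(start_response, "400 Bad Request",
                            {"error": "invalid JSON"})
        if not isinstance(payload, dict) or any(k not in payload for k in REQUIRED):
            return _respond(start_response, "400 Bad Request",
                            {"error": "missing fields"})
        RESULTS[payload["ci_name"]].append(payload)
        return _respond(start_response, "201 Created", {"stored": True})
    if method == "GET" and path == "/stats":
        stats = {}
        for name, runs in RESULTS.items():
            passed = sum(1 for r in runs if r["result"] == "SUCCESS")
            stats[name] = {"runs": len(runs), "pass_rate": passed / len(runs)}
        return _respond(start_response, "200 OK", stats)
    return _respond(start_response, "404 Not Found", {"error": "not found"})


def call(method, path, body=None):
    """Invoke the app directly, without a network socket (handy for testing)."""
    raw = json.dumps(body).encode("utf-8") if body is not None else b""
    environ = {"REQUEST_METHOD": method, "PATH_INFO": path,
               "CONTENT_LENGTH": str(len(raw)), "wsgi.input": io.BytesIO(raw)}
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    chunks = app(environ, start_response)
    return captured["status"], json.loads(b"".join(chunks))
```

Any reporter, Zuul or otherwise, would only need to POST a small JSON document. 
To serve it for real, wsgiref.simple_server.make_server("", 8000, app) from the 
standard library would be enough for experimenting.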

Anyway, it looks like there is good discussion on the etherpad.

Cheers,
Josh



________________________________
From: Steve Weston [[email protected]]
Sent: Sunday, November 09, 2014 7:00 AM
To: Duncan Thomas; Joshua Hesketh
Cc: [email protected]; [email protected]; Kurt Taylor; Anita Kuno
Subject: Re: [third-party] CI Monitoring Tool

The etherpad has been created:
https://etherpad.openstack.org/p/Third-Party-CI-Dashboard-InitialPlanning

I have included my input on introducing a calibration service which the CI 
systems would use before running a patchset.  The idea is this:  each project 
would define one or more jobs which the CI system would run to make sure it is 
working correctly, and in synchronization with Jenkins, before reporting an 
errant result.

I believe that this would greatly improve the stability of CI and allow 
problems to be fixed before the CI system runs the patch.
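
A minimal sketch of how such a calibration step might look follows. The 
function names, the job names, and the idea of passing the expected (Jenkins) 
result per job as a dictionary are all assumptions for illustration only.

```python
from typing import Callable, Dict, List


def calibrate(jobs: Dict[str, str], run_job: Callable[[str], str]) -> List[str]:
    """Run each calibration job and return the names of jobs whose result
    differs from the expected (Jenkins) result.

    `jobs` maps a hypothetical job name to the result Jenkins reported.
    `run_job` is a caller-supplied hook that runs one job on this CI system.
    """
    mismatches = []
    for name, expected in jobs.items():
        try:
            actual = run_job(name)
        except Exception:
            # A crash in the CI system itself counts as a calibration failure.
            actual = "ERROR"
        if actual != expected:
            mismatches.append(name)
    return mismatches


def should_report(jobs: Dict[str, str], run_job: Callable[[str], str]) -> bool:
    """Only report on a patchset if every calibration job matched."""
    return not calibrate(jobs, run_job)
```

For example, should_report({"calibration-smoke": "SUCCESS"}, my_runner) would 
return False if my_runner fails or raises, so the CI system could hold its 
vote and alert its operator instead of posting a bad result.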

Thoughts, comments, and input are welcome!

Thanks,
Steve

On 11/7/14 7:58 PM, Steve Weston wrote:
I have already begun work on the code for this project, and yesterday I wrote 
a small bit of code that implements a REST API using the Django REST 
framework. Although my plan was to expose the data collected by the dashboard 
to other services, this API can also be modified to act as a sort of check-in 
service, as Josh described below.

Tomorrow I will create an etherpad so that folks may start listing out their 
ideas for how this dashboard will work. I will send out a link once I have it.

Thanks,
Steve

On 11/7/14 7:53 PM, Steve Weston wrote:
+ Anita

On 11/7/14 5:34 PM, Duncan Thomas wrote:

It is worth noting that not every third-party CI is using Zuul. I think 
scraping Gerrit (even into a DB to run queries against) is a better way 
forward than adding something else to the CI requirements.
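
A minimal sketch of the scraping approach, assuming the input is one JSON event 
per line, as produced by Gerrit's stream-events command. The table layout and 
function names are hypothetical, and the event fields used here (type, author, 
change, patchSet, approvals) should be checked against the Gerrit version in 
use.

```python
import json
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) a small SQLite database of CI votes."""
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS votes (
        reviewer TEXT, project TEXT, change_number TEXT,
        patchset TEXT, value INTEGER)""")
    return db


def store_event(db, line):
    """Store the Verified votes from one JSON event line; return rows stored."""
    try:
        event = json.loads(line)
    except ValueError:
        return 0  # Skip lines that are not valid JSON.
    if not isinstance(event, dict) or event.get("type") != "comment-added":
        return 0  # Only comment events carry CI votes in this sketch.
    rows = [
        (event.get("author", {}).get("username"),
         event.get("change", {}).get("project"),
         str(event.get("change", {}).get("number")),
         str(event.get("patchSet", {}).get("number")),
         int(approval.get("value", 0)))
        for approval in event.get("approvals", [])
        if approval.get("type") == "Verified"
    ]
    db.executemany("INSERT INTO votes VALUES (?, ?, ?, ?, ?)", rows)
    db.commit()
    return len(rows)


def vote_counts(db, reviewer):
    """Return (passes, failures) recorded for one CI account."""
    cur = db.execute(
        "SELECT SUM(value > 0), SUM(value < 0) FROM votes WHERE reviewer = ?",
        (reviewer,))
    passes, failures = cur.fetchone()
    return passes or 0, failures or 0
```

With the events in a database, per-CI pass and failure counts become a single 
query, and nothing new is required of the CI systems themselves.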

Duncan Thomas

On Nov 7, 2014 4:41 PM, "Joshua Hesketh" <[email protected]> wrote:
Hi Kurt,

Thanks for kicking this conversation off. I wonder if the -infra list would be 
a good place to include more.

So I believe, although we're still brainstorming etc, the vague infra plan is 
to have a dashboard service with API endpoints that a zuul reporter can talk 
to. Then all 1st + 3rd parties would report to that and therefore have a 
dashboard populated and statistics generated etc.

So that's kind of the long term plan that will give us some more useful data we 
can dive into. However, for the moment I think having a simple 
gerrit-bot-status dashboard (as you have described) will at least help in terms 
of assessing the health of the systems.

I don't think anybody in particular is working on radar so we could probably 
consume that repository. We should get Michael Still's okay first though (since 
he's the original author).

Cheers,
Josh
________________________________________
From: Kurt Taylor [[email protected]]
Sent: Saturday, November 08, 2014 1:06 AM
To: [email protected]; Joshua Hesketh; [email protected]; [email protected]; 
[email protected]
Subject: [third-party] CI Monitoring Tool

In the third-party summit session, we discussed the need for CI
systems to have a status dashboard [1]. However, it seems that
multiple people are writing a CI monitoring tool, so let's level-set:

- Josh has written a Gerrit event gatherer [2]
- Duncan has too
- Steve has too (I have not yet talked to Steve)
- Radar has a command-line scraper; we can remove it and just use Radar
gauges with one of the API backends above, which is fairly simple [3]
- Nova also discussed CI monitoring and status reporting [4]. Matt
owns (?) a requirement for Nova to implement CI monitoring (I have not
yet talked to Matt)

[1] https://etherpad.openstack.org/p/kilo-third-party-items
[2] https://github.com/stackforge/turbo-hipster/blob/master/tools/zuul_enqueue.py
[3] https://github.com/rcbau/radar/blob/master/report.py
[4] https://etherpad.openstack.org/p/nova-ci-status-checkpoint-kilo

From conversations with Josh and Duncan, we believe that a good
initial plan is to compare a CI system's result for a patch with what
Jenkins reported. If the CI run failed and differs from Jenkins, collect
5 (or 3?) such failures, then re-queue a last known successful patch
run. If that also fails, the CI system is not working properly. I
believe that covers 95%, maybe more, of scenarios.
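
A minimal sketch of that heuristic follows. The class name, the result strings, 
and the rerun_known_good hook are hypothetical. It also assumes the failures 
are counted cumulatively rather than consecutively, which the plan above does 
not specify.

```python
from typing import Callable


class HealthMonitor:
    """Track disagreements between one CI system and Jenkins."""

    def __init__(self, rerun_known_good: Callable[[], bool], threshold: int = 5):
        # rerun_known_good re-queues a last known successful patch run and
        # returns True if the CI system passes it again.
        self.rerun_known_good = rerun_known_good
        self.threshold = threshold
        self.disagreements = 0

    def record(self, ci_result: str, jenkins_result: str) -> str:
        """Record one patch result; return "ok" or "broken"."""
        # Only count runs where the CI failed but Jenkins did not.
        if ci_result == "FAILURE" and jenkins_result != "FAILURE":
            self.disagreements += 1
        if self.disagreements < self.threshold:
            return "ok"
        # Enough suspicious failures collected: run the known-good check.
        self.disagreements = 0
        return "ok" if self.rerun_known_good() else "broken"
```

A dashboard could call record() for each pair of results it gathers and flag 
the CI system whenever "broken" comes back.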

I like Josh's idea to just have a browser page refresh kick off a
sample collection and report via Radar gauges. Start simple, then we
could ask infra to have cron fire off gathering once every 20 minutes
or so, then maybe push this data to a database, and so on.

So, the question is, do we create a new GitHub repo for a new tool, or
reuse the Radar repo? Let's get skeleton code somewhere (no preference)
and then we can get more involvement and figure out where this should
live. We should create a spec in openstack-infra. If we agree, I'll
be happy to shepherd that.

Comments?

Kurt Taylor (krtaylor)

_______________________________________________
OpenStack-Infra mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
