Re: [OpenStack-Infra] Public numbers about the scale of the infrastructure/CI ?
The talk was this week and it's up on YouTube [1].

During the talk, which was basically one long live demo, we:
- Sent a patch to fix a typo in the talk [2]
- Fixed a Zuul job through speculative testing [3]
- Updated the openstack-infra IRC meeting chair [4]

Oh, and we also added an item to the next meeting's agenda to talk about
this talk [5].

It was fun.

[1]: https://youtu.be/6gTsL7E7U7Q
[2]: https://review.openstack.org/#/c/556738/
[3]: https://review.openstack.org/#/c/556615/
[4]: https://review.openstack.org/#/c/557095/
[5]: https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting

David Moreau Simard
Senior Software Engineer | OpenStack RDO

dmsimard = [irc, github, twitter]

On Sat, Mar 24, 2018 at 9:28 PM, David Moreau Simard wrote:
> Hi -infra,
>
> I'll be presenting a talk at a local OpenStack meetup next week [1]
> that will highlight some examples about how people can help and
> contribute to the infrastructure project.
> The talk will be recorded and should hopefully serve as a form of
> informal documentation.
>
> I'd like to disclose some semi-official numbers (as I'd personally
> pull them up) to let people have an idea of the scale our contributors
> are maintaining.
> I suppose this data is already somewhat public if you know where to
> look but I don't think it's been written down in a digestible format
> in recent history.
>
> Unless there's any objection, I'd have a slide with up-to-date numbers such
> as:
> - # of projects hosted (as per git.openstack.org)
> - # of servers (in aggregate of all our regions)
> -- (Maybe some big highlights like the size of logstash, logs.o.o, Zuul)
> - Nodepool capacity (number of clouds, aggregate capacity)
> - # of jobs and Ansible playbooks per month run by Zuul
> - Approximate number of maintained and hosted services (irc,
> gerritbot, meetbot, gerrit, git, mailing lists, wiki, ask.openstack,
> storyboard, codesearch, etc.)
> - Probably some high level numbers from Stackalytics
> - Maybe something else I haven't thought about
>
> The idea of the talk is not to brag about all the stuff we're doing
> but rather, "hey, you don't need to be a pro in OpenStack to
> contribute, we got all these different things you can help with".
>
> I realize it's a bit last minute but please let me know if you see
> anything wrong with this !
>
> [1]: https://www.meetup.com/Montreal-OpenStack/events/248344351/
>
> David Moreau Simard
> Senior Software Engineer | OpenStack RDO
>
> dmsimard = [irc, github, twitter]

___
OpenStack-Infra mailing list
OpenStack-Infra@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
Re: [OpenStack-Infra] Public numbers about the scale of the infrastructure/CI ?
Good point. I'll work with that instead.

David Moreau Simard
Senior Software Engineer | OpenStack RDO

dmsimard = [irc, github, twitter]

On Mon, Mar 26, 2018 at 4:30 PM, James E. Blair wrote:
> David Moreau Simard writes:
>
>> On Mon, Mar 26, 2018 at 10:20 AM, James E. Blair wrote:
>>>> - # of jobs and Ansible playbooks per month run by Zuul
>>>
>>> I'm curious about this one -- how were you planning on defining these
>>> values and obtaining them?
>>>
>>
>> I've needed to pull statistics out of Zuul in the past for RDO (e.g.,
>> justifying budget for CI resources)
>> and I use the sql reporter data to do it.
>> It looks like this:
>>
>> $range = "'2018-02-01 00:00:00' AND '2018-02-28 23:59:59'"
>> SELECT job_name,
>>        result,
>>        start_time,
>>        end_time,
>>        TIMEDIFF(end_time, start_time) as duration
>> FROM zuul_build
>> WHERE
>>     start_time BETWEEN $range
>>
>> This gets me the number of monthly *jobs* and I can extrapolate (over
>> N playbooks..)
>> by estimating a number knowing that:
>> - base and post playbooks are fairly consistently X playbooks
>> - there is at least one "run" playbook
>>
>> So pretending that 1000 jobs ran, I can say something like:
>> 1000 jobs and over [1000 * (X+1)] playbooks
>>
>> It's not a perfect number but we know we run more playbooks than that.
>>
>> What I have also been thinking about is, if I want to get a more
>> accurate number, I could do a sum of all the executor playbook results
>> (which are in graphite) but the history for those doesn't go too far
>> back.
>> Ex: stats.zuul.executor.ze*_openstack_org.phase.*.*
>
> The SQL query gets the number of completed jobs which are *reported*.
> It doesn't get you two other numbers, which are the jobs *launched*
> (many of which may have been aborted before completion), or the jobs
> *completed* (the results of many of which may have been discarded due to
> changes in the environment). In reality, the system is likely to be
> significantly busier than the number of jobs reported will indicate.
>
> Both of the other values can be obtained from graphite or by parsing
> logs. I think for this purpose, graphite might be sufficient. (The
> only time I'd recommend going to logs is when we need to find
> project-specific resource usage information.)
>
> stats_counts.zuul.executor.*.builds should be all jobs launched.
> stats_counts.zuul.tenant.*.pipeline.*.all_jobs should be all jobs completed.
>
> -Jim
Re: [OpenStack-Infra] Public numbers about the scale of the infrastructure/CI ?
David Moreau Simard writes:

> On Mon, Mar 26, 2018 at 10:20 AM, James E. Blair wrote:
>>> - # of jobs and Ansible playbooks per month run by Zuul
>>
>> I'm curious about this one -- how were you planning on defining these
>> values and obtaining them?
>>
>
> I've needed to pull statistics out of Zuul in the past for RDO (e.g.,
> justifying budget for CI resources)
> and I use the sql reporter data to do it.
> It looks like this:
>
> $range = "'2018-02-01 00:00:00' AND '2018-02-28 23:59:59'"
> SELECT job_name,
>        result,
>        start_time,
>        end_time,
>        TIMEDIFF(end_time, start_time) as duration
> FROM zuul_build
> WHERE
>     start_time BETWEEN $range
>
> This gets me the number of monthly *jobs* and I can extrapolate (over
> N playbooks..)
> by estimating a number knowing that:
> - base and post playbooks are fairly consistently X playbooks
> - there is at least one "run" playbook
>
> So pretending that 1000 jobs ran, I can say something like:
> 1000 jobs and over [1000 * (X+1)] playbooks
>
> It's not a perfect number but we know we run more playbooks than that.
>
> What I have also been thinking about is, if I want to get a more
> accurate number, I could do a sum of all the executor playbook results
> (which are in graphite) but the history for those doesn't go too far
> back.
> Ex: stats.zuul.executor.ze*_openstack_org.phase.*.*

The SQL query gets the number of completed jobs which are *reported*.
It doesn't get you two other numbers, which are the jobs *launched*
(many of which may have been aborted before completion), or the jobs
*completed* (the results of many of which may have been discarded due to
changes in the environment). In reality, the system is likely to be
significantly busier than the number of jobs reported will indicate.

Both of the other values can be obtained from graphite or by parsing
logs. I think for this purpose, graphite might be sufficient. (The
only time I'd recommend going to logs is when we need to find
project-specific resource usage information.)

stats_counts.zuul.executor.*.builds should be all jobs launched.
stats_counts.zuul.tenant.*.pipeline.*.all_jobs should be all jobs completed.

-Jim
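As a rough illustration of pulling the two counters named above out of
graphite, here is a minimal Python sketch. It only assumes Graphite's
standard render endpoint (`/render` with `target`, `from`, `until` and
`format=json`) and its `sumSeries` function. The host name is a placeholder,
the helper names are made up for this example, and none of it has been
checked against the OpenStack graphite instance.

```python
from urllib.parse import urlencode

# Hypothetical host: substitute the real graphite server.
GRAPHITE_BASE = "https://graphite.example.org"

# Metric paths quoted from the message above.
LAUNCHED = "stats_counts.zuul.executor.*.builds"
COMPLETED = "stats_counts.zuul.tenant.*.pipeline.*.all_jobs"


def render_url(metric, start, until, base=GRAPHITE_BASE):
    """Build a render URL that sums every series matching the wildcard path."""
    query = urlencode({
        "target": f"sumSeries({metric})",
        "from": start,     # e.g. "20180201" (YYYYMMDD)
        "until": until,    # e.g. "20180301"
        "format": "json",
    })
    return f"{base}/render?{query}"


def total_from_response(payload):
    """Add up the non-null datapoints of a decoded Graphite JSON response.

    Graphite returns a list of series, each with "datapoints" given as
    [value, timestamp] pairs; gaps are reported as null (None in Python).
    """
    return sum(
        value
        for series in payload
        for value, _timestamp in series["datapoints"]
        if value is not None
    )
```

Fetching `render_url(LAUNCHED, "20180201", "20180301")` and passing the
decoded JSON to `total_from_response` would give an approximate monthly
total. How accurate that sum is depends on how the server aggregates older
datapoints, so treat it as an estimate.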
Re: [OpenStack-Infra] Public numbers about the scale of the infrastructure/CI ?
On Mon, Mar 26, 2018 at 10:20 AM, James E. Blair wrote:
>> - # of jobs and Ansible playbooks per month run by Zuul
>
> I'm curious about this one -- how were you planning on defining these
> values and obtaining them?
>

I've needed to pull statistics out of Zuul in the past for RDO (e.g.,
justifying budget for CI resources)
and I use the sql reporter data to do it.
It looks like this:

$range = "'2018-02-01 00:00:00' AND '2018-02-28 23:59:59'"
SELECT job_name,
       result,
       start_time,
       end_time,
       TIMEDIFF(end_time, start_time) as duration
FROM zuul_build
WHERE
    start_time BETWEEN $range

This gets me the number of monthly *jobs* and I can extrapolate (over
N playbooks..)
by estimating a number knowing that:
- base and post playbooks are fairly consistently X playbooks
- there is at least one "run" playbook

So pretending that 1000 jobs ran, I can say something like:
1000 jobs and over [1000 * (X+1)] playbooks

It's not a perfect number but we know we run more playbooks than that.

What I have also been thinking about is, if I want to get a more
accurate number, I could do a sum of all the executor playbook results
(which are in graphite) but the history for those doesn't go too far
back.
Ex: stats.zuul.executor.ze*_openstack_org.phase.*.*

David Moreau Simard
Senior Software Engineer | OpenStack RDO

dmsimard = [irc, github, twitter]
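To make the counting and extrapolation described in the message above
concrete, here is a minimal Python sketch. It is not the author's tooling:
an in-memory SQLite table stands in for the MySQL database written by Zuul's
SQL reporter, only the four columns named in the query are modelled, and the
helper names, the sample rows and the value X = 4 are all invented for
illustration.

```python
import sqlite3


def count_jobs(conn, start, end):
    """Count reported builds whose start_time falls within [start, end]."""
    row = conn.execute(
        "SELECT COUNT(*) FROM zuul_build WHERE start_time BETWEEN ? AND ?",
        (start, end),
    ).fetchone()
    return row[0]


def playbook_lower_bound(job_count, base_post_playbooks):
    """Lower bound on playbooks run: jobs * (X + 1).

    X is the roughly constant number of base and post playbooks per job;
    the extra 1 is the single "run" playbook every job has at minimum.
    """
    return job_count * (base_post_playbooks + 1)


def demo():
    conn = sqlite3.connect(":memory:")
    # Stand-in schema: only the columns the query in the message refers to.
    conn.execute(
        "CREATE TABLE zuul_build ("
        "job_name TEXT, result TEXT, start_time TEXT, end_time TEXT)"
    )
    # Made-up sample rows: two builds in February 2018, one in March.
    conn.executemany(
        "INSERT INTO zuul_build VALUES (?, ?, ?, ?)",
        [
            ("job-a", "SUCCESS", "2018-02-03 10:00:00", "2018-02-03 10:12:30"),
            ("job-b", "FAILURE", "2018-02-27 23:50:00", "2018-02-28 00:01:00"),
            ("job-a", "SUCCESS", "2018-03-01 00:00:05", "2018-03-01 00:09:00"),
        ],
    )
    jobs = count_jobs(conn, "2018-02-01 00:00:00", "2018-02-28 23:59:59")
    conn.close()
    # X = 4 is an arbitrary illustrative value, not a measured one.
    return jobs, playbook_lower_bound(jobs, 4)


if __name__ == "__main__":
    print(demo())
```

With the 1000 jobs used as an example in the message, and supposing X were 3,
`playbook_lower_bound(1000, 3)` gives 4000. The real figure would be higher,
since jobs can have more than one "run" playbook.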
Re: [OpenStack-Infra] Public numbers about the scale of the infrastructure/CI ?
David Moreau Simard writes:

> Unless there's any objection, I'd have a slide with up-to-date numbers such
> as:

I don't have any objection to making them public (I believe nearly all,
if not all, of these are public already). But I would like them to be
as accurate as possible :).

> - # of projects hosted (as per git.openstack.org)
> - # of servers (in aggregate of all our regions)
> -- (Maybe some big highlights like the size of logstash, logs.o.o, Zuul)
> - Nodepool capacity (number of clouds, aggregate capacity)
> - # of jobs and Ansible playbooks per month run by Zuul

I'm curious about this one -- how were you planning on defining these
values and obtaining them?

> - Approximate number of maintained and hosted services (irc,
> gerritbot, meetbot, gerrit, git, mailing lists, wiki, ask.openstack,
> storyboard, codesearch, etc.)
> - Probably some high level numbers from Stackalytics
> - Maybe something else I haven't thought about

-Jim
[OpenStack-Infra] Public numbers about the scale of the infrastructure/CI ?
Hi -infra,

I'll be presenting a talk at a local OpenStack meetup next week [1]
that will highlight some examples about how people can help and
contribute to the infrastructure project.
The talk will be recorded and should hopefully serve as a form of
informal documentation.

I'd like to disclose some semi-official numbers (as I'd personally
pull them up) to let people have an idea of the scale our contributors
are maintaining.
I suppose this data is already somewhat public if you know where to
look but I don't think it's been written down in a digestible format
in recent history.

Unless there's any objection, I'd have a slide with up-to-date numbers such
as:
- # of projects hosted (as per git.openstack.org)
- # of servers (in aggregate of all our regions)
-- (Maybe some big highlights like the size of logstash, logs.o.o, Zuul)
- Nodepool capacity (number of clouds, aggregate capacity)
- # of jobs and Ansible playbooks per month run by Zuul
- Approximate number of maintained and hosted services (irc,
gerritbot, meetbot, gerrit, git, mailing lists, wiki, ask.openstack,
storyboard, codesearch, etc.)
- Probably some high level numbers from Stackalytics
- Maybe something else I haven't thought about

The idea of the talk is not to brag about all the stuff we're doing
but rather, "hey, you don't need to be a pro in OpenStack to
contribute, we got all these different things you can help with".

I realize it's a bit last minute but please let me know if you see
anything wrong with this !

[1]: https://www.meetup.com/Montreal-OpenStack/events/248344351/

David Moreau Simard
Senior Software Engineer | OpenStack RDO

dmsimard = [irc, github, twitter]