Re: [openstack-dev] [stackalytics] [metrics] Review metrics: average numbers
Hi, Mike,

I'm not sure what you are looking for exactly, but maybe you can have a look at the quarterly reports. AFAIK, currently there is none specific to Fuel, but for example for Nova, you have: http://activity.openstack.org/dash/reports/2015-q3/pdf/projects/nova.pdf

On page 6, you have "time waiting for reviewer" (from the moment a new patchset is produced to the time a conclusive review vote is found in Gerrit), and "time waiting for developer" (from the conclusive review vote to the next patchset).

We're now working on a visualization for that kind of information. For now, we only have values for complete changesets; check it out if you're interested: http://blog.bitergia.com/2015/10/22/understanding-the-code-review-process-in-openstack/

Saludos,

Jesus.

On Wed, 2015-11-11 at 21:45, Mike Scherbakov wrote:
> Hi stackers,
> I have a question about Stackalytics.
> I'm trying to get some more data from code review stats. For Fuel, for instance,
> http://stackalytics.com/report/reviews/fuel-group/open
> shows some useful stats. Do I understand right that the average numbers
> here are calculated from open reviews, not from the total number of reviews?
>
> The most important number I'm trying to get is the average time
> change requests wait for reviewers since the last vote or mark, across
> all requests (not only those which remain in the open state, as it is
> now, I believe).
>
> How hard would it be to get Stackalytics extended to do that?
>
> Thanks!
> --
> Mike Scherbakov
> #mihgen

--
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah
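For a rough approximation of the number Mike asks about, a minimal sketch against Gerrit's REST API could look like the following. It uses each open change's "updated" timestamp as a proxy for the last vote or mark, and filters on the openstack/fuel-web project; both are illustrative assumptions, since the exact metric would need the per-message vote history.

    # A rough cut at "average time waiting since last activity" for open
    # reviews, via Gerrit's REST API. "updated" is used as a proxy for the
    # last vote or mark -- an assumption; the exact metric would need the
    # per-message vote data.
    import json
    from datetime import datetime, timezone

    import requests

    resp = requests.get(
        "https://review.openstack.org/changes/",
        params={"q": "project:openstack/fuel-web status:open", "n": 100},
    )
    # Gerrit prefixes JSON responses with ")]}'" to prevent XSSI; skip that line.
    changes = json.loads(resp.text.split("\n", 1)[1])

    now = datetime.now(timezone.utc)
    waits = []
    for change in changes:
        # Gerrit timestamps look like "2015-11-11 21:45:00.000000000" (UTC).
        updated = datetime.strptime(change["updated"][:19], "%Y-%m-%d %H:%M:%S")
        waits.append((now - updated.replace(tzinfo=timezone.utc)).total_seconds())

    print("average days since last activity:", sum(waits) / len(waits) / 86400)

A real version would page through all matching changes rather than taking the first 100, and would inspect each change's message history to find the actual last vote.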
[openstack-dev] [metrics] Summary of the BOF session on software engineering metrics
Hi all,

This morning we had the BOF session on software engineering metrics for OpenStack. Since we agreed on continuing the conversation on this mailing list, here is a summary of the session, based on the notes taken by Dani (in CC).

First of all, the link to the slides we used to frame the session: http://bit.ly/openstack-bof-15

Now, some comments about what we talked about. The general idea of the BOF can be found in slide #3: OpenStack is a project leading in open development metrics, and the idea was to talk about current experiences with this, and about what we would like to have in the future.

Current dashboards, collections of metrics, etc. In addition to the ones presented in the slides (Stackalytics, Activity Dashboard, Kibana-based dashboards, Russell Bryant's stats, Bitergia reports, Jenkins Logstash dashboard), some others, below, were mentioned. If you know of any other, please comment about it in this thread.

- Somebody has been looking at patterns of how people contribute over time.
- There is some in-house (non-public) machinery that companies use to check the total time to deploy for their customers.
- A tool reporting on unit test coverage was mentioned.

We discussed how each of these efforts is targeted at different uses, at answering different questions. Some of them are focused on the contributions by different actors, such as people or companies (Stackalytics, for example), some provide insight into how development processes are happening and performing (such as the Activity Board, Russell's stats, or the Jenkins Logstash dashboard), and some are summaries of interest about certain aspects (such as the reports). In most cases, there is some overlap between them.

It was also mentioned how OpenStack is one of the projects that has advanced furthest in the idea of open development analytics, thanks in part to all the previous efforts, and to a certain state of mind that makes participants in the OpenStack community more aware of the importance of metrics than in other projects.

Then we discussed use cases for metrics within OpenStack: cases that are happening now, cases that are happening in other communities and could be translated, and others not yet happening that could be interesting. In addition to those mentioned in slide #11, some others were:

- QA: being able to look at code test coverage, duplication of testing, test time cycle.
- Hotspots: places in the code that more people are touching (often more complex areas); a sketch of this one appears after this message.
- Bugs: which files are having more bugs? How complex are they?
- Complexity: which changes are touching more complex areas than others?
- Operators: stats about backporting fixes to stable branches.
- Metrics on stable branches, such as failures on stable branches.
- Dependencies on feature requests: a new feature is tested internally and later submitted upstream, only to find that it breaks, even though it previously worked against master.
- Frequency of rebase requirements.
- How fast fixes for critical vulnerabilities are merged into master.
- External example: Wikimedia's Gerrit clean-up days: lists of changesets, times, etc., to track the impact of the day.
- External example: the Xen community checking the current status of their time to merge, to see if their perception was in line with the real metrics.

Some other issues that were discussed include:

- Should SonarQube be used, and the results made public to the people when releasing? Every day?
- It may be hard to compare metrics in Stackalytics with other projects, given that it is very specific to OpenStack.
  On the other hand, this specificity makes it very well adapted to OpenStack.
- It would be interesting to track contributions coming from operators.
- Having public performance metrics could be of interest, although that may be a bit beyond the current discussion on software development metrics.

As a summary, we had time to agree on two lines (see slide #13):

* Open development analytics is a core value of the OpenStack community.
* Let's foster more lively discussion about metrics in OpenStack, using the openstack-dev mailing list.

Please, comment about any other conclusion that you may propose (whether or not you were in the BOF) by answering in this thread.

Thanks to all of you who made this BOF possible! Please, those of you who attended, comment on anything that may be missing in these notes.

Saludos,

Jesus.

--
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah
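As a concrete illustration of the "hotspots" use case from the list above, here is a minimal sketch that ranks files by how many distinct authors have touched them, computed straight from git history. It is purely illustrative; none of the dashboards mentioned necessarily compute hotspots this way.

    # Rank files by the number of distinct authors touching them ("hotspots").
    # Run inside a git checkout; purely illustrative.
    import subprocess
    from collections import defaultdict

    log = subprocess.run(
        ["git", "log", "--no-merges", "--pretty=format:AUTHOR:%ae", "--name-only"],
        capture_output=True, text=True, check=True,
    ).stdout

    authors_per_file = defaultdict(set)
    current_author = None
    for line in log.splitlines():
        if line.startswith("AUTHOR:"):
            current_author = line[len("AUTHOR:"):]
        elif line and current_author is not None:
            # Non-blank lines after the author line are file paths.
            authors_per_file[line].add(current_author)

    # Print the ten files touched by the most distinct authors.
    for path, authors in sorted(authors_per_file.items(), key=lambda kv: -len(kv[1]))[:10]:
        print(len(authors), path)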
Re: [openstack-dev] How to get openstack bugs data for research?
You can download a database dump with (hopefully) information for all Launchpad tickets corresponding to OpenStack, already organized and ready to be queried: http://activity.openstack.org/dash/browser/data/db/tickets.mysql.7z (linked from http://activity.openstack.org/dash/browser/data_sources.html )

It is a 7z-zipped file which includes a MySQL dump of the actual database, produced by Bicho (one of the tools in MetricsGrimoire). It is updated daily. You can just feed it to MySQL and start running your queries.

There is some information about Bicho and the database schema used at:

https://github.com/MetricsGrimoire/Bicho/wiki
https://github.com/MetricsGrimoire/Bicho/wiki/Database-Schema

Please, let me know if you need further info.

Saludos,

Jesus.

On Wed, 2014-12-03 at 20:20 +0800, zfx0...@gmail.com wrote:

Hi, all,

I am a graduate student at Peking University; our lab does some research on open source projects. This is our introduction: https://passion-lab.org/

Now we need OpenStack issues data for research. I found the issues list: https://bugs.launchpad.net/openstack/

I want to download the OpenStack issues data. Could anyone tell me how to download the data? Or is there some link or API for downloading the data? And I found 9464 bugs at https://bugs.launchpad.net/openstack/ . Is this all? Why so few?

Many thanks!

Best regards,
Feixue Zhang

zfx0...@gmail.com

--
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah
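To make the "feed it to MySQL and start querying" step concrete, here is a minimal sketch, assuming the dump has already been extracted (7z x tickets.mysql.7z) and loaded into a local database, and assuming the generic Bicho schema with an "issues" table that has a "status" column; check the Database-Schema wiki page above for the real layout.

    # Count tickets per status in a locally restored Bicho dump.
    # Assumes the dump was loaded into a database named "bicho" and that it
    # follows the generic Bicho schema (an issues table with a status column).
    # Adjust host/user/password to your local MySQL setup.
    import pymysql

    conn = pymysql.connect(host="localhost", user="root", password="", database="bicho")
    with conn.cursor() as cur:
        cur.execute(
            "SELECT status, COUNT(*) AS n FROM issues GROUP BY status ORDER BY n DESC"
        )
        for status, n in cur.fetchall():
            print(f"{status or 'unknown':20} {n}")
    conn.close()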
Re: [openstack-dev] [TripleO] Review metrics - what do we want to measure?
On Wed, 2014-09-03 at 12:58 +1200, Robert Collins wrote:

On 14 August 2014 11:03, James Polley j...@jamezpolley.com wrote:

In recent history, we've been looking each week at stats from http://russellbryant.net/openstack-stats/tripleo-openreviews.html to get a gauge on how our review pipeline is tracking. The main stat we've been tracking has been the time since the last revision without a -1 or -2. I've included some history at [1], but the summary is that our 3rd quartile has slipped from 13 days to 16 days over the last 4 weeks or so. Our 1st quartile is fairly steady lately, around 1 day (down from 4 a month ago), and the median is unchanged at around 7 days.

There was lots of discussion in our last meeting about what could be causing this [2]. However, the thing we wanted to bring to the list for discussion is: are we tracking the right metric? Should we be looking at something else to tell us how well our pipeline is performing? The meeting logs have quite a few suggestions about ways we could tweak the existing metrics, but if we're measuring the wrong thing, that's not going to help. I think that what we are looking for is a metric that lets us know whether the majority of patches are getting feedback quickly. Maybe there's some other metric that would give us a good indication?

If we review all patches quickly and land none, that's bad too :). For the reviewers specifically, I think we need metrics that:

- don't go bad when submitters go AWOL, don't respond, etc., including when they come back; our stats shouldn't jump hugely because an old review was resurrected
- when they look good, mean that submitters are actually getting feedback
- flag inventory: things we'd be happy to have landed that haven't, including things with a -1 from non-core reviewers (*)

(*) I often see -1's on things core wouldn't -1, due to the learning curve involved in becoming core.

So, as Ben says, I think we need to address the it's-not-a-vote issue as a priority; that has tripped us up in lots of ways. I think we need to discount -workflow patches where that was set by the submitter, which AFAICT we don't do today.

Looking at current stats:

Longest waiting reviews (based on oldest rev without -1 or -2): 54 days, 2 hours, 41 minutes https://review.openstack.org/106167 (Keystone/LDAP integration)

That patch had a -1 on Aug 16 1:23 AM, but it was quickly turned into a +2. So this patch had a -1, and then after discussion it became a +2. And it has evolved multiple times. What should we be saying here? Clearly it has had little review input over its life, so I think it's sadly accurate. I wonder if a big chunk of our sliding quartile is just us not reviewing the oldest reviews.

I've been researching the review process in OpenStack and other projects for a while, and my impression is that at least three timing metrics are relevant:

(1) Total time from submitting a patch to the final closing of the review process (landing that, or a subsequent patch, or finally abandoning). This gives an idea of how the whole process is working.

(2) Time from submitting a patch to that patch being approved (+2 in OpenStack, I guess) or declined (and a new patch is requested). This gives an idea of how quickly reviewers provide definite feedback to patch submitters, and is a metric for each patch cycle.

(3) Time from a patch being reviewed, with a new patch being requested, to a new patch being submitted. This gives an idea of the reaction time of the patch submitter.

Usually, you want to keep (1) low, while (2) and (3) give you an idea of what is happening if (1) gets high.
There is another relevant metric in some cases, which is:

(4) The number of patch cycles per review cycle (that is, how many patches are needed per patch landing in master). In some cases, that may help to explain how (2) and (3) contribute to (1).

And a fifth metric gives you a throughput measure:

(5) BMI (backlog management index): the number of new review processes divided by the number of closed review processes for a certain period. It gives an idea of whether the backlog is going up (BMI > 1) or down (BMI < 1), and is usually very interesting when seen over time.

(1) alone is not enough to assess how well the review process is working, because it could be low while (5) shows an increasing backlog, simply because new review requests come in too quickly (e.g., in periods when developers are submitting a lot of patch proposals after a freeze). (1) could also be high while (5) shows a decreasing backlog, because, for example, reviewers or submitters are overworked or slow to schedule their work, but the project still copes with the backlog. Depending on the relationship between (1) and (5), maybe you need more reviewers, or reviewers scheduling their reviews with more priority with respect to other actions, or something else. Note, for example, that in a project with a low BMI (< 1) for a long period, but with a high total delay in reviews (metric (1)), adding more reviewers usually doesn't reduce that delay.
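To make these definitions concrete, here is a minimal sketch computing (1), (4), and (5) from a hypothetical list of per-review records. The field names are illustrative assumptions, not the schema of any of the tools discussed in this thread; (2) and (3) would additionally need per-patchset timestamps.

    # Sketch of metrics (1), (4) and (5) over hypothetical review records.
    # Field names are illustrative assumptions, not any real Gerrit schema.
    from datetime import datetime
    from statistics import median

    reviews = [
        {"submitted": datetime(2014, 8, 1), "closed": datetime(2014, 8, 9), "patchsets": 3},
        {"submitted": datetime(2014, 8, 5), "closed": datetime(2014, 8, 20), "patchsets": 6},
        {"submitted": datetime(2014, 8, 12), "closed": None, "patchsets": 2},  # still open
    ]

    closed = [r for r in reviews if r["closed"] is not None]

    # (1) total time from submission to final close of the review process.
    print("median total time:", median(r["closed"] - r["submitted"] for r in closed))

    # (4) patch cycles per review cycle.
    print("mean patchsets per closed review:",
          sum(r["patchsets"] for r in closed) / len(closed))

    # (5) BMI: new review processes / closed review processes, for a period.
    start, end = datetime(2014, 8, 1), datetime(2014, 9, 1)
    new = sum(1 for r in reviews if start <= r["submitted"] < end)
    done = sum(1 for r in closed if start <= r["closed"] < end)
    print("BMI:", new / done if done else float("inf"))

Computed over successive periods (weekly, monthly), the BMI series shows whether a backlog trend is building up long before the quartile figures move.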
Re: [openstack-dev] [Metrics] Improving the data about contributor/affiliation/time
On Fri, 2013-10-18 at 08:33 -0400, Sean Dague wrote:
> On 10/17/2013 05:34 PM, Stefano Maffulli wrote:
> > [...] Having four sources of data for this reporting is bad and not sustainable. Since it seems commonly accepted that all developers need to be members of the Foundation, and that Foundation members need to state their affiliation when they join and keep such data current when it changes, I think the Foundation is in a good place to provide the authoritative data for all projects to use.
>
> I'm not sure it is well understood that all developers have to join the Foundation. We don't make that a requirement on someone slinging a patch. It would be nice to know what percentage of ATCs actually are Foundation members at the moment (presumably that number is easy to generate?)

My impression is that we need a data source that covers all contributors as much as possible. As you said, even for developers that is not always the case. If you are also interested in tracking bug reporters or message posters, for example, that is even less the case. Linking affiliation information to Foundation membership could be risky from this point of view.

A different issue is having the Foundation maintain a system for claiming or fixing affiliation information, so that all of us producing metrics can use it. It could be based on the current datasets (the best of them, or maybe a combination of some of them), and could provide some interface for easy and public proposal of changes. It should also provide some interface so that any metrics-collecting system can use it. To be useful, it should also include data for identifying developers (usually, the email addresses they are using in the different OpenStack repositories), since developers not only change organizations, they also tend to change identifiers from time to time.

> The thing is, the Foundation data currently seems to be the least accurate of all the data sets. Also, the lack of affiliation over time is really a problem for this project, especially if one of the driving factors for so much interest in statistics comes from organizations wanting to ensure contributions by their employees get counted. A significant percentage of top contributors to OpenStack have not remained at a single employer over their time contributing to OpenStack, and I expect that to be the norm as the project ages. Also, both gitdm and Stackalytics have active open developer communities (and they are open source all the way down; they don't need non-open components to run), so again, I'm not sure why defaulting to the least open platform makes any sense.

Just for the record, the MetricsGrimoire / vizGrimoire stack that is producing the dashboards at http://activity.openstack.org/dash/ is also completely open source, with an open developer community; see http://metricsgrimoire.github.io and http://vizgrimoire.github.io

All the data is also available, in the form of JSON files and SQL databases; see http://activity.openstack.org/dash/newbrowser/browser/data/db/ (which includes affiliation data).

That said, I'm not claiming that our affiliation datasets are the best ones. We'd be more than happy to collaborate with the rest to produce a common dataset, or to switch to some other one if it proves better maintained. In fact, we have already incorporated affiliation data from gitdm and (partially) from Stackalytics.

> Member affiliation in the Foundation database can also only be fixed by the individual. In the other tools, people in the know can fix it.
> It means we get a Wikipedia effect in getting the data more accurate, as you can fix any issue you see, not just your own.

This is something very important, from my point of view. The ability to change any data you may find inaccurate, along with the use of a review system (just to ensure that we don't include malicious requests for change), would be desired features for any system we use.

> If the Foundation member database was its own thing, had a REST API to bulk fetch, supported temporal associations, and let others propose updates to people's affiliations, then it would be an option. But right now it seems very far from being useful, and is probably the least, not most, accurate version of the world. [...]

From my point of view, having a REST API would be helpful, but not a must. The usual way for us to include bulk data is to retrieve the external data, compare it with the data we currently have, and decide (in part by hand) on the differences one by one, trying to incorporate the most reliable option. If the external data were always more reliable, it would be a matter of just comparing and using the external data whenever a match is found, and that could be done automatically. No REST API is really needed for this.

Support for temporal associations, proposals of updates by anyone, a review system, and support for multiple identities would all be very convenient.

Saludos,

Jesus.
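To illustrate the "temporal associations" being asked for, here is a minimal sketch of resolving a contributor's affiliation at the date of a contribution. The data layout is an assumption for illustration only, not the schema of the Foundation database, gitdm, Stackalytics, or MetricsGrimoire.

    # Resolve "who was this person working for on a given date?" from a table
    # of affiliation periods. The layout is an illustrative assumption only.
    from datetime import date

    # identity -> list of (organization, start, end); end=None means "current".
    affiliations = {
        "dev@example.com": [
            ("Company A", date(2010, 1, 1), date(2012, 6, 30)),
            ("Company B", date(2012, 7, 1), None),
        ],
    }

    def affiliation_at(email: str, when: date) -> str:
        """Return the organization an identity belonged to on a given date."""
        for org, start, end in affiliations.get(email, []):
            if start <= when and (end is None or when <= end):
                return org
        return "Unknown"

    print(affiliation_at("dev@example.com", date(2011, 3, 15)))  # -> Company A
    print(affiliation_at("dev@example.com", date(2014, 1, 1)))   # -> Company B

With periods like these, each commit can be attributed to the employer at commit time, rather than to the contributor's current employer, which is the gap discussed above.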
[openstack-dev] [Metrics] Activity_Board now in infra
Hi all,

The OpenStack Development Dashboard [1] is now in infra [2]. If you want to browse details about how to get the data (JSON and SQL), how to clone and deploy the dashboard elsewhere, or how to reproduce the data retrieval and analysis process, you can refer to the README [3] or the wiki [4].

[1] http://activity.openstack.org/dash/
[2] http://git.openstack.org/cgit/openstack-infra/activity-board/
[3] http://git.openstack.org/cgit/openstack-infra/activity-board/tree/README.md
[4] https://wiki.openstack.org/wiki/Activity_Board

Bug reports and patches are welcome. For reports, please use the OpenStack_Community tracker [5] (tag: activityboard). For patches, please use the usual code review process.

[5] https://launchpad.net/openstack-community

Any feedback is welcome.

Saludos,

Jesus.

--
Bitergia: http://bitergia.com
http://blog.bitergia.com
[openstack-dev] [Metrics][Nova] Using Bicho database to get stats about code review
Hi all,

Bicho [1] now has a Gerrit backend, which has been tested with OpenStack's Gerrit. We have used it to produce the MySQL database dump available at [2] (gerrit.mysql.7z). You can use it to compute the metrics mentioned in the previous threads about code review, and some others.

[1] https://github.com/MetricsGrimoire/Bicho
[2] http://activity.openstack.org/dash/browser/data/db/

The database dump will be updated daily, starting in a few days. For some examples on how to run queries on it, or how to produce the database using Bicho, fresh from OpenStack's Gerrit, have a look at [3].

[3] https://github.com/MetricsGrimoire/Bicho/wiki/Gerrit-backend

At some point, we plan to visualize the data as a part of the development dashboard [4], so any comment on interesting metrics, or bugs, will be welcome. For now, we're taking not of all metrics mentioned in the previous posts about code review stats.

[4] http://activity.openstack.org/dash/browser/

Saludos,

Jesus.

--
Bitergia: http://bitergia.com
http://blog.bitergia.com
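As one example of the kind of query the dump enables, here is a minimal sketch that lists the changes that stayed open longest, measured from submission to the last recorded state change. The table and column names (issues, changes, changed_on) follow the generic Bicho schema; treat them as assumptions and check the wiki in [3] for the actual Gerrit-backend layout.

    # List the ten Gerrit changes that stayed open longest, from a locally
    # restored gerrit.mysql dump. Table/column names assume the generic Bicho
    # schema; verify against the Gerrit-backend wiki page before relying on them.
    import pymysql

    conn = pymysql.connect(host="localhost", user="root", password="", database="gerrit")
    with conn.cursor() as cur:
        cur.execute("""
            SELECT i.issue,
                   TIMESTAMPDIFF(DAY, i.submitted_on, MAX(c.changed_on)) AS days_open
            FROM issues i
            JOIN changes c ON c.issue_id = i.id
            GROUP BY i.id, i.issue, i.submitted_on
            ORDER BY days_open DESC
            LIMIT 10
        """)
        for issue, days_open in cur.fetchall():
            print(days_open, "days", issue)
    conn.close()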
Re: [openstack-dev] [Metrics][Nova] Using Bicho database to get stats about code review
On Wed, 2013-07-03 at 12:30 -0400, Russell Bryant wrote:
> On 07/03/2013 12:14 PM, Jesus M. Gonzalez-Barahona wrote:
> > [...] At some point, we plan to visualize the data as a part of the
> > development dashboard [4], so any comment on interesting metrics, or bugs,
> > will be welcome. For now, we're taking not of all metrics mentioned in the
> > previous posts about code review stats.
>
> Thanks for sharing! I don't think I understand the last sentence, though. Can you clarify?

Oooops. It should read: "For now, we're taking notice of all metrics mentioned in the previous posts about code review stats." That would mean that, in a while, we expect to produce charts on the evolution of some of those metrics over time.

Sorry for the typo.

Saludos,

Jesus.

--
Bitergia: http://bitergia.com
http://blog.bitergia.com