Re: [openstack-dev] [stackalytics] [metrics] Review metrics: average numbers

2015-11-11 Thread Jesus M. Gonzalez-Barahona
Hi, Mike,

I'm not sure what you are looking for exactly, but maybe you can have a
look at the quarterly reports. AFAIK, currently there is none specific
to Fuel, but for example for Nova, you have:

http://activity.openstack.org/dash/reports/2015-q3/pdf/projects/nova.pdf

On page 6, you have "time waiting for reviewer" (from the moment a new
patchset is produced to the time a conclusive review vote is recorded in
Gerrit), and "time waiting for developer" (from the conclusive review
vote to the next patchset).
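For what it's worth, here is a minimal sketch (in Python, with a made-up
event log; this is not the dashboard code) of how those two intervals can
be computed for a single changeset, once you have its patchset and review
events with timestamps:

    from datetime import datetime

    # Hypothetical event log for a single changeset, ordered in time.
    # "patchset" = a new revision uploaded by the developer;
    # "vote" = a conclusive review vote recorded in Gerrit.
    events = [
        {"type": "patchset", "ts": datetime(2015, 9, 1, 10, 0)},
        {"type": "vote",     "ts": datetime(2015, 9, 3, 16, 30)},
        {"type": "patchset", "ts": datetime(2015, 9, 4, 9, 0)},
        {"type": "vote",     "ts": datetime(2015, 9, 4, 18, 45)},
    ]

    waiting_for_reviewer = []   # new patchset -> conclusive review vote
    waiting_for_developer = []  # conclusive review vote -> next patchset

    for prev, curr in zip(events, events[1:]):
        delta = curr["ts"] - prev["ts"]
        if prev["type"] == "patchset" and curr["type"] == "vote":
            waiting_for_reviewer.append(delta)
        elif prev["type"] == "vote" and curr["type"] == "patchset":
            waiting_for_developer.append(delta)

    print("time waiting for reviewer:", waiting_for_reviewer)
    print("time waiting for developer:", waiting_for_developer)

Aggregating those deltas over all changesets gives the kind of
distributions summarized in the report.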

We're now working on a visualization for that kind of information. For
the moment we only have values for complete changesets; have a look if
you're interested:

http://blog.bitergia.com/2015/10/22/understanding-the-code-review-process-in-openstack/

Saludos,

Jesus.

On Wed, 2015-11-11 at 21:45 +, Mike Scherbakov wrote:
> Hi stackers,
> I have a question about Stackalytics.
> I'm trying to get some more data from code review stats. For Fuel,
> for instance,
> http://stackalytics.com/report/reviews/fuel-group/open
> shows some useful stats. Do I understand correctly that the average
> numbers here are calculated from open reviews only, not from the total
> number of reviews?
> 
> The most important number I'm trying to get is the average time change
> requests spend waiting for reviewers since the last vote or mark,
> computed over all requests (not only those which remain in the open
> state, as I believe is the case now).
> 
> How hard would it be to get this, or to extend Stackalytics to provide it?
> 
> Thanks!
> -- 
> Mike Scherbakov
> #mihgen
-- 
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah




[openstack-dev] [metrics] Summary of the BOF session on software engineering metrics

2015-10-28 Thread Jesus M. Gonzalez-Barahona
Hi all,

This morning we had the BOF session on software engineering metrics for
OpenStack. Since we agreed to continue the conversation on this
mailing list, here is a summary of the session, based on the notes
taken by Dani (CC'd).

First of all, the link to the slides we used to frame the session:

http://bit.ly/openstack-bof-15

Now, some comments about what we talked about.

The general idea of the BOF can be found in slide #3: OpenStack is a
leading project in open development metrics, and the idea was to talk
about current experiences with this, and about what we would like to
have in the future.

Current dashboards, collections of metrics, etc.: in addition to the
ones presented in the slides (Stackalytics, Activity Dashboard,
Kibana-based dashboards, Russell Bryant's stats, Bitergia reports,
Jenkins Logstash dashboard), the ones below were mentioned. If you know
of any other, please comment about it in this thread.

- Somebody has been looking at patterns of how people contribute over
time.
- There is some in-house (non-public) machinery to check the total time
to deploy in companies for their customers.
- A tool reporting on unit test coverage was mentioned.

We discussed how each of these efforts is targeted at different uses, at
answering different questions. Some of them are focused on the
contributions by different actors such as persons or companies (e.g.,
Stackalytics), some give insight into how development processes are
happening and performing (e.g., the Activity Board, Russell's stats, or
the Jenkins Logstash dashboard), and some are summaries of certain
aspects of interest (e.g., the reports). In most cases there is some
overlap between them.

It was also mentioned that OpenStack is one of the projects that has
advanced furthest in open development analytics, thanks in part
to all the previous efforts, and to a certain state of mind that makes
participants in the OpenStack community more aware of the importance of
metrics than participants in other projects.

Then we discussed use cases of metrics within OpenStack: cases that are
happening now, cases that are happening in other communities but could
be translated, and others not yet happening that could be interesting.
In addition to those mentioned in slide #11, some others were:

- QA: being able to look at code test coverage, duplication of testing,
test cycle time.
- Hotspots: places in the code that more people are touching (more
complex areas); a rough sketch of this idea follows the list.
- Bugs: which files are getting more bugs? How complex are they?
- Complexity: which changes are touching more complex areas than others?
- Operators: stats about backporting fixes to stable branches.
- Metrics on stable branches, such as failures on stable branches.
- Dependencies on feature requests: a new feature is tested internally
and later uploaded upstream, only to find that it breaks, even though it
previously worked against master.
- Frequency of rebase requirements.
- How fast fixes for critical vulnerabilities are merged into master.
- External example: Wikimedia's Gerrit clean-up days: lists of
changesets, times, etc., to track the impact of the day.
- External example: the Xen community checking the current status of
time to merge, to see if their perception was in line with real metrics.
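
To make the "hotspots" idea above a bit more concrete, here is a rough
sketch (nothing official, just an illustration) that counts how many
distinct authors have touched each file in a git checkout; files touched
by many different people are candidate hotspots:

    import subprocess
    from collections import defaultdict

    # Count distinct author emails per file across the git history; files
    # touched by many different people are candidate "hotspots".
    log = subprocess.run(
        ["git", "log", "--no-merges", "--name-only",
         "--pretty=format:AUTHOR:%ae"],
        capture_output=True, text=True, check=True,
    ).stdout

    authors_per_file = defaultdict(set)
    current_author = None
    for line in log.splitlines():
        if line.startswith("AUTHOR:"):
            current_author = line[len("AUTHOR:"):]
        elif line.strip() and current_author:
            authors_per_file[line.strip()].add(current_author)

    ranking = sorted(authors_per_file.items(),
                     key=lambda kv: len(kv[1]), reverse=True)
    for path, authors in ranking[:10]:
        print(len(authors), path)

Run it from the top of any repository clone; the first lines of output
are the most widely touched files.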

Some other issues that were discussed include:

- Should SonarQube be used, and the results made public when releasing?
Or even every day?
- It may be hard to compare metrics in Stackalytics with other
projects, given that it is very specific to OpenStack. On the other
hand, this specificity makes it very well adapted to OpenStack.
- It would be interesting to track contributions coming from operators.
- Having public performance metrics could be of interest, although that
may be a bit beyond the current discussion on software development
metrics.

As a summary, we had time to agree on two lines of action (see slide #13):

* Open development analytics is a core value of the OpenStack community
* Let's foster livelier discussions about metrics in OpenStack,
using the openstack-dev mailing list

Please comment on any other conclusion you may want to propose (whether
you were in the BOF or not) by replying to this thread.

Thanks to all of you who made this BOF possible! Those of you who
attended, please comment on anything that might be missing from these notes.

Saludos,

Jesus.


-- 
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah




Re: [openstack-dev] How to get openstack bugs data for research?

2014-12-03 Thread Jesus M. Gonzalez-Barahona
You can download a database dump with (hopefully) information for all
Launchpad tickets corresponding to OpenStack, already organized and
ready to be queried:

http://activity.openstack.org/dash/browser/data/db/tickets.mysql.7z

(linked from
http://activity.openstack.org/dash/browser/data_sources.html )

It is a 7z-compressed file which includes a MySQL dump of the actual
database, produced by Bicho (one of the tools in MetricsGrimoire). It is
updated daily.

You can just feed it to MySQL and start running your queries. There is
some information about Bicho and the database schema used at:

https://github.com/MetricsGrimoire/Bicho/wiki

https://github.com/MetricsGrimoire/Bicho/wiki/Database-Schema
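
In case it helps, a quick sketch of how you could load and start
exploring the dump (it assumes a local MySQL server, the 7z and mysql
command-line tools, and the pymysql Python driver; the "issues" table
name is taken from the Bicho schema documentation above, so double-check
it against your copy):

    import pymysql

    # Beforehand, from a shell (assuming 7z and the mysql client are installed):
    #   7z x tickets.mysql.7z                      # extract the SQL dump
    #   mysql -u root -e "CREATE DATABASE openstack_tickets"
    #   mysql -u root openstack_tickets < tickets.mysql
    #
    # Database name and credentials are whatever you chose when loading.
    conn = pymysql.connect(host="localhost", user="root", password="",
                           database="openstack_tickets")
    with conn.cursor() as cur:
        cur.execute("SHOW TABLES")
        print("tables:", [row[0] for row in cur.fetchall()])

        # "issues" is the central table in the Bicho schema (see the wiki
        # above); double-check the exact name against your copy of the dump.
        cur.execute("SELECT COUNT(*) FROM issues")
        print("tickets:", cur.fetchone()[0])
    conn.close()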

Please, let me know if you need further info.

Saludos,

Jesus.

On Wed, 2014-12-03 at 20:20 +0800, zfx0...@gmail.com wrote:
 Hi, all
 
 
 I am a graduate student at Peking University; our lab does some research
 on open source projects. 
 This is our introduction: https://passion-lab.org/
 
 
 Now we need OpenStack issues data for research. I found the issues
 list: https://bugs.launchpad.net/openstack/
 I want to download the OpenStack issues data. Could anyone tell me how
 to download it? Or is there some link or API for downloading the data?
 
 
 And I found 9464 bugs in https://bugs.launchpad.net/openstack/ ; is
 this all? Why so few?
 
 
 Many thanks!
 
 
 Best regards,
 Feixue, Zhang
 
 
 
 zfx0...@gmail.com

-- 
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah




Re: [openstack-dev] [TripleO] Review metrics - what do we want to measure?

2014-09-03 Thread Jesus M. Gonzalez-Barahona
On Wed, 2014-09-03 at 12:58 +1200, Robert Collins wrote:
 On 14 August 2014 11:03, James Polley j...@jamezpolley.com wrote:
  In recent history, we've been looking each week at stats from
  http://russellbryant.net/openstack-stats/tripleo-openreviews.html to get a
  gauge on how our review pipeline is tracking.
 
  The main stats we've been tracking have been the "since the last revision
  without -1 or -2" numbers. I've included some history at [1], but the summary is
  that our 3rd quartile has slipped from 13 days to 16 days over the last 4
  weeks or so. Our 1st quartile is fairly steady lately, around 1 day (down
  from 4 a month ago), and the median is unchanged at around 7 days.
 
  There was lots of discussion in our last meeting about what could be causing
  this[2]. However, the thing we wanted to bring to the list for the
  discussion is:
 
  Are we tracking the right metric? Should we be looking to something else to
  tell us how well our pipeline is performing?
 
  The meeting logs have quite a few suggestions about ways we could tweak the
  existing metrics, but if we're measuring the wrong thing that's not going to
  help.
 
  I think that what we are looking for is a metric that lets us know whether
  the majority of patches are getting feedback quickly. Maybe there's some
  other metric that would give us a good indication?
 
 If we review all patches quickly and land none, that's bad too :).
 
 For the reviewers specifically, I think we need metric(s) that:
  - don't go bad when submitters go AWOL, don't respond, etc.
    - including when they come back: our stats shouldn't jump hugely
      because an old review was resurrected
  - where "good" means submitters will be getting feedback
  - flag inventory: things we'd be happy to have landed that haven't
    - including things with a -1 from non-core reviewers (*)
 
 (*) I often see -1's on things core wouldn't -1 due to the learning
 curve involved in becoming core
 
 So, as Ben says, I think we need to address the its-not-a-vote issue
 as a priority; it has tripped us up in lots of ways.
 
 I think we need to discount -workflow patches where that was set by
 the submitter, which AFAICT we don't do today.
 
 Looking at current stats:
 Longest waiting reviews (based on oldest rev without -1 or -2):
 
 54 days, 2 hours, 41 minutes https://review.openstack.org/106167
 (Keystone/LDAP integration)
 That patch had a -1 on Aug 16 at 1:23 AM, but it was quickly turned into a +2.
 
 So this patch had a -1, then after discussion it became a +2. And it has
 evolved multiple times.
 
 What should we be saying here? Clearly it has had little review input
 over its life, so I think it's sadly accurate.
 
 I wonder if a big chunk of our sliding quartile is just us not
 reviewing the oldest reviews.

I've been researching the review process in OpenStack and other projects
for a while, and my impression is that at least three timing metrics are
relevant:

(1) Total time from submitting a patch to final closing of the review
process (landing that, or a subsequent patch, or finally abandoning).
This gives an idea of how the whole process is working.

(2) Time from submitting a patch to that patch being approved (+2 in
OpenStack, I guess) or declined (and a new patch being requested). This
gives an idea of how quickly reviewers provide definitive feedback to
patch submitters, and is a metric for each patch cycle.

(3) Time from a patch being reviewed, with a new patch being requested,
to a new patch being submitted. This gives an idea of the reaction
time of the patch submitter.

Usually, you want to keep (1) low, while (2) and (3) give you an idea of
what is happening if (1) gets high.

There is another relevant metric in some cases, which is

(4) The number of patch cycles per review cycle (that is, how many
patches are needed per patch landing in master). In some cases, that may
help to explain how (2) and (3) contribute to (1).

And a fifth metric gives you an idea of throughput:

(5) BMI (backlog management index): the number of new review processes
divided by the number of closed review processes for a certain period. It
gives an idea of whether the backlog is going up (ratio above 1) or down
(below 1), and is usually very interesting when seen over time.
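
As a simplified illustration of (5), assuming you already have the open
and close dates of each review process (the dates below are made up),
BMI per month could be computed with something like:

    from collections import Counter
    from datetime import date

    # Made-up review processes: (opened, closed) dates, closed=None if open.
    reviews = [
        (date(2014, 6, 3), date(2014, 6, 20)),
        (date(2014, 6, 10), date(2014, 7, 2)),
        (date(2014, 7, 1), None),
        (date(2014, 7, 5), date(2014, 7, 30)),
    ]

    opened = Counter((d.year, d.month) for d, _ in reviews)
    closed = Counter((d.year, d.month) for _, d in reviews if d is not None)

    # BMI as defined above: new review processes / closed review processes.
    for period in sorted(opened):
        new, done = opened[period], closed.get(period, 0)
        bmi = new / done if done else float("inf")
        trend = "growing" if bmi > 1 else "shrinking or stable"
        print(period, "BMI = %.2f (backlog %s)" % (bmi, trend))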

(1) alone is not enough to assess how well the review process is working:
it could be low, while (5) shows an increasing backlog simply because new
review requests come in too quickly (e.g., in periods when developers are
submitting a lot of patch proposals after a freeze). (1) could also be
high, while (5) shows a decreasing backlog, because for example reviewers
or submitters are overworked or slow to schedule their work, but the
project still copes with the backlog. Depending on the relationship
between (1) and (5), maybe you need more reviewers, or reviewers giving
their reviews more priority with respect to other activities, or something else.

Note for example that in a project with a low BMI (below 1) for a long
period, but with a high total delay in reviews (1), usually putting more
reviewers doesn't reduce 

Re: [openstack-dev] [Metrics] Improving the data about contributor/affiliation/time

2013-10-18 Thread Jesus M. Gonzalez-Barahona
On Fri, 2013-10-18 at 08:33 -0400, Sean Dague wrote:
 On 10/17/2013 05:34 PM, Stefano Maffulli wrote:
  [...]
  Four sources of data for this reporting is bad and not sustainable.
 
  Since it seems commonly accepted that all developers need to be members
  of the Foundation, and that Foundation members need to state their
  affiliation when they join and keep such data current when it changes, I
  think the Foundation is in a good place to provide the authoritative
  data for all projects to use.
 
 I'm not sure it is well understood that all members have to join the 
 foundation. We don't make that a requirement on someone slinging a 
 patch. It would be nice to know what percentage of ATCs actually are 
 foundation members at the moment (presumably that number is easy to 
 generate?)

My impression is that we need a data source that covers all contributors
as much as possible. As you said, even for developers that is not always
the case. If you are also interested in tracking bug reporters or
message posters, for example, it is even less the case. Linking
affiliation information to Foundation membership could be risky from
this point of view.

A different issue would be the Foundation maintaining a system for claiming
or fixing affiliation information, so that all of us producing metrics
could use it. It could be based on the current datasets (the best of them,
or maybe a combination of some of them), and could provide some
interface for easy, public proposals of changes. It should also
provide some interface so that any metrics-collecting system can use it.

To be useful, it should also include data for identifying developers
(usually, the email addresses they are using in the different
OpenStack repositories), since developers not only change organizations,
they also tend to change those identifiers from time to time.

 The thing is, the Foundation data currently seems to be the least 
 accurate of all the data sets. Also, the lack of affiliation over time 
 is really a problem for this project, especially if one of the driving 
 factors for so much interest in statistics comes from organizations 
 wanting to ensure contributions by their employees get counted. A 
 significant percentage of top contributors to OpenStack have not 
 remained at a single employer over the time they have been contributing to 
 OpenStack, and I expect that to be the norm as the project ages.
 
 Also, both gitdm and stackalytics have active open developer communities 
 (and they are open source all the way down, and don't need non-open 
 components to run), so again, I'm not sure why defaulting to the least 
 open platform makes any sense.

Just for the record, the MetricsGrimoire / vizGrimoire stack that is
producing the dashboards at http://activity.openstack.org/dash/ is also
completely open source, with an open developer community; see
http://metricsgrimoire.github.io and http://vizgrimoire.github.io

All the data is also available, in the form of JSON files and SQL
databases, see
http://activity.openstack.org/dash/newbrowser/browser/data/db/
(which includes affiliation data)

That said, I'm not claiming that our affiliation datasets are the best
ones. We'd be more than happy to collaborate with the rest to produce a
common dataset, or to switch to some other one if it proves better
maintained. In fact, we have already incorporated affiliation data from
gitdm and (partially) from stackalytics.

 Member affiliation in the Foundation database can also only be fixed by 
 the individual. In the other tools people in the know can fix it. It 
 means we get a wikipedia effect in getting the data more accurate, as 
 you can fix any issue you see, not just your own.

This is something very important, from my point of view. The ability to
change any data you find inaccurate, along with the use of a review
system (just to ensure that we don't include malicious change requests),
would be desirable features for any system we use.

 If the foundation member database was its own thing, had a REST API to 
 bulk fetch, and supported temporal associations, and let others propose 
 updates to people's affiliation, then it would be an option. But right 
 now it seems very far from being useful, and is probably the least, not 
 most, accurate version of the world.
[...]

From my point of view, having a REST API would be helpful, but not a
must. The usual way for us to include bulk data is to retrieve the
external bulk data, compare it with the current data we have, and decide
(in part by hand) on the differences one by one, trying to incorporate the
most reliable option. If the external data were always more reliable, it
would be a matter of just comparing and using the external data when a
match is found, and that could be done automatically. No REST API is
really needed for this.
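
As a toy illustration of that merge step (the data shapes and values
here are invented for the example; the real datasets are the SQL/JSON
files mentioned above):

    # Current affiliation data and an external dataset, keyed by email
    # address (values below are made up for the example).
    current = {
        "alice@example.com": "CompanyA",
        "bob@example.com": "Unknown",
    }
    external = {
        "bob@example.com": "CompanyB",
        "carol@example.com": "CompanyC",
    }

    merged = dict(current)
    needs_review = []  # conflicting entries, to be decided (in part) by hand

    for email, org in external.items():
        if email not in merged or merged[email] == "Unknown":
            merged[email] = org                  # external data fills a gap
        elif merged[email] != org:
            needs_review.append((email, merged[email], org))

    print(merged)
    print("to decide by hand:", needs_review)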

Support for temporal associations, proposals of updates by anyone, a
review system, and support for multiple identities would be very convenient.

Saludos,

Jesus.


[openstack-dev] [Metrics] Activity_Board now in infra

2013-09-20 Thread Jesus M. Gonzalez-Barahona
Hi all,

The OpenStack Development Dashboard [1] is now in infra [2]. If you want
to browse details about how to get the data (JSON and SQL), how to clone
and deploy the dashboard elsewhere, or how to reproduce the data retrieval
and analysis process, you can refer to the README [3] or the wiki [4].

[1] http://activity.openstack.org/dash/
[2] http://git.openstack.org/cgit/openstack-infra/activity-board/
[3]
http://git.openstack.org/cgit/openstack-infra/activity-board/tree/README.md
[4] https://wiki.openstack.org/wiki/Activity_Board

Bug reports and patches are welcome. For reports, please use the
OpenStack_Community tracker [5] (tag: activityboard). For patches,
please use the usual code review process.

[5] https://launchpad.net/openstack-community

Any feedback is welcome.

Saludos,

Jesus.

-- 
-- 
Bitergia: http://bitergia.com http://blog.bitergia.com




[openstack-dev] [Metrics][Nova] Using Bicho database to get stats about code review

2013-07-03 Thread Jesus M. Gonzalez-Barahona
Hi all,

Bicho [1] now has a Gerrit backend, which has been tested with
OpenStack's Gerrit. We have used it to produce the MySQL database dump
available at [2] ( gerrit.mysql.7z ). You can use it to compute the
metrics mentioned in the previous threads about code review, and some
others.

[1] https://github.com/MetricsGrimoire/Bicho
[2] http://activity.openstack.org/dash/browser/data/db/

The database dump will be updated daily, starting in a few days.

For some examples of how to run queries on it, or how to produce the
database using Bicho, fresh from OpenStack's Gerrit, have a look at [3].

[3] https://github.com/MetricsGrimoire/Bicho/wiki/Gerrit-backend
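
Purely as an illustration (the table and column names below are
placeholders, so check the Gerrit backend wiki page for the actual
schema), the kind of timing metric discussed in the earlier threads can
be computed with a single SQL query over the dump, for example the
average number of days from a review being opened to it being closed:

    import pymysql

    # Table and column names (issues, submitted_on, closed_on) are
    # placeholders; adjust them to the actual Bicho Gerrit schema.
    QUERY = """
        SELECT AVG(DATEDIFF(closed_on, submitted_on)) AS avg_days_open
        FROM issues
        WHERE closed_on IS NOT NULL
    """

    conn = pymysql.connect(host="localhost", user="root", password="",
                           database="openstack_gerrit")
    with conn.cursor() as cur:
        cur.execute(QUERY)
        print("average days from open to close:", cur.fetchone()[0])
    conn.close()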

At some point, we plan to visualize the data as a part of the
development dashboard [4], so any comment on interesting metrics, or
bugs, will be welcome. For now, we're taking not of all metrics
mentioned in the previous posts about code review stats.

[4] http://activity.openstack.org/dash/browser/

Saludos,

Jesus.

-- 
-- 
Bitergia: http://bitergia.com http://blog.bitergia.com




Re: [openstack-dev] [Metrics][Nova] Using Bicho database to get stats about code review

2013-07-03 Thread Jesus M. Gonzalez-Barahona
On Wed, 2013-07-03 at 12:30 -0400, Russell Bryant wrote:
 On 07/03/2013 12:14 PM, Jesus M. Gonzalez-Barahona wrote:
  Hi all,
  
  Bicho [1] now has a Gerrit backend, which has been tested with
  OpenStack's Gerrit. We have used it to produce the MySQL database dump
  available at [2] ( gerrit.mysql.7z ). You can use it to compute the
  metrics mentioned in the previous threads about code review, and some
  others.
  
  [1] https://github.com/MetricsGrimoire/Bicho
  [2] http://activity.openstack.org/dash/browser/data/db/
  
  The database dump will be updated daily, starting in a few days.
  
  For some examples on how to run queries on it, or how to produce the
  database using Bicho, fresh from OpenStack's gerrit, have a look at [3].
  
  [3] https://github.com/MetricsGrimoire/Bicho/wiki/Gerrit-backend
  
  At some point, we plan to visualize the data as a part of the
  development dashboard [4], so any comment on interesting metrics, or
  bugs, will be welcome. For now, we're taking not of all metrics
  mentioned in the previous posts about code review stats.
  
  [4] http://activity.openstack.org/dash/browser/
 
 Thanks for sharing!  I don't think I understand the last sentence,
 though.  Can you clarify?
 

Oooops. It should read "For now, we're taking notice of all metrics
mentioned in the previous posts about code review stats." That would
mean that, in a while, we expect to produce charts on the evolution of
some of those metrics over time.

Sorry for the typo.

Saludos,

Jesus.

-- 
-- 
Bitergia: http://bitergia.com http://blog.bitergia.com

