Re: [Openstack] Metrics around code

2011-10-19 Thread Thierry Carrez
Stefano Maffulli wrote:
 You'll find there also the implementation details to answer the
 question:
  
 Who committed to an OpenStack repo, how many times in the past 30
 days?
 
 and a demo report built with Pentaho Reporting representing the
 total number of commits per repository in past 30 days
 http://wiki.openstack.org/CommunityMetrics/Code?action=AttachFile&do=get&target=2011-11-commits30daysallrepo-obfuscated.pdf
 [note: the email addresses are hidden on purpose]

Can an HTML report be produced and posted instead? It feels like that
sort of information should be pullable rather than pushed, from a
well-known website, and PDF adds an extra step to access for no real
value (is anybody going to print this?)

 First of all: do the numbers seem correct to you? In other words, does
 the SQL query seem correct? Does the demo report look interesting to
 you? What/how would you change?

I can't really answer that question, but it looks strange to me to see
Jenkins up there (I bet he didn't author any patch).

 Then, I would like your feedback to refine the other questions we want
 to see answered regularly, regarding code (we'll move on to bugs, docs,
 etc later).
 
 Are the following reports interesting? Do we want to have them run
 monthly or weekly? 

If the reports are not pushed, they can run more often. Maybe something
like last 30 days (refreshed every week) and then a report generated
per milestone (at the end of every milestone)? I think it would be good
to know who committed code for a given milestone, rather than for a
given arbitrary month.

 Is this too much information or too little? What else
 would you like to see regarding code?
 
 * Total number of commits across all repos aggregated per month
 * Total number of commits per repository aggregated per month

Maybe per-milestone would be more useful, though it's a bit more
difficult to do (especially since not all projects follow the common
milestone plan).

-- 
Thierry Carrez (ttx)
Release Manager, OpenStack

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


Re: [Openstack] Metrics around code

2011-10-19 Thread Rohit Karajgi
I agree with the HTML report; it could be published on the OpenStack
Jenkins so that it is world-readable. We have some similar metrics for bugs.
The following metrics can easily be pulled from Launchpad's bug database
using the python-launchpadlib API.

1. Bug Distribution - By Status 
2. Bug Distribution - By Importance 
3. Bug Distribution - By Milestone 
4. Bug Distribution - By Bug owners 
5. Bug Distribution - By Fixed-by 
6. # of times a file was modified 
7. # of lines modified per file

#6 and #7 specifically can be quite useful to identify those modules that can 
be good targets for unit tests.
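
A minimal sketch of how the "Bug Distribution" counts above might be computed once the bug tasks have been fetched. The helper name `distribution` and the sample records are made up for illustration. With python-launchpadlib the records would presumably come from a project's `searchTasks()` call, whose entries expose `status` and `importance` attributes, but that part is left out so the example runs offline.

```python
from collections import Counter
from types import SimpleNamespace


def distribution(tasks, field):
    """Count how many tasks share each value of the given attribute."""
    return Counter(getattr(task, field) for task in tasks)


# Hypothetical sample data standing in for Launchpad bug tasks.
sample_tasks = [
    SimpleNamespace(status="New", importance="High"),
    SimpleNamespace(status="Fix Committed", importance="Medium"),
    SimpleNamespace(status="New", importance="Medium"),
]

by_status = distribution(sample_tasks, "status")
by_importance = distribution(sample_tasks, "importance")

# Print the distribution, most frequent status first.
for status, count in by_status.most_common():
    print(f"{status}: {count}")
```

The same helper would cover "By Milestone" or "By Bug owners" by passing a different attribute name, assuming the records carry that attribute.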

Cheers,
Rohit



Re: [Openstack] Metrics around code

2011-10-19 Thread Anne Gentle
Hi Stefano -
I would like to see the openstack-manuals commits tracked as well.

I'd also like to see commits to the API repos tracked:
compute-api
image-api
object-api
identity-api
netconn-api

Possibly these could be in additional monthly reports so that code
contributions are highlighted in one report, and doc and api changes in
another (or two others). Monthly seems fine to me for frequency.

Thanks,
Anne

Anne Gentle
a...@openstack.org
my blog: http://justwriteclick.com/ |
my book: http://xmlpress.net/publications/conversation-community/ |
LinkedIn: http://www.linkedin.com/in/annegentle |
Delicious: http://del.icio.us/annegentle |
Twitter: http://twitter.com/annegentle


Re: [Openstack] Metrics around code

2011-10-19 Thread Aaron Lee
I think I missed the larger discussion around why these metrics are getting 
published. 

tl;dr metrics good; these metrics, maybe not so good

My worry about publishing the number of commits by author is that it could make
Gerrit, and our current review process, a mess. I agree that smaller commits are
better, but they do not play well with the Gerrit workflow (in particular, if a
commit that has dependencies within the same branch is rejected, the entire
branch has to be redone). My other fear is pointless bickering (however
good-natured it may be) about who committed what (we've already seen it: "looks
strange to me to see Jenkins up there").

I don't know if something similar would happen with the bug database, but I
would hate to see people touching bugs just to get counted. The problem with
software development metrics is that programmers optimize processes for a
living. When presented with a metric they will often subconsciously optimize
for it.

None of these ideas are original to me; I've heard this argument against
software metrics many times before. The given solution is to pick a metric that
has a tangible meaning to your end users. Taken to the extreme you end up where
the Lean Startup movement is: pick something measurable (time to allocate an
IP), make a hypothesis (your patch), push it to half your users and observe the
results (A/B testing).
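
To make that loop a little more concrete, here is a rough sketch: users are split into two halves by hashing their ID, and the measured metric (time to allocate an IP) is averaged per group. The function names, user IDs, and timings are invented for illustration, and a real experiment would also need a significance test before drawing conclusions.

```python
import hashlib
from statistics import mean


def assign_group(user_id):
    """Deterministically place a user in group 'A' (control) or 'B' (patched)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return "A" if digest[0] % 2 == 0 else "B"


def compare(measurements):
    """Average the metric per group; measurements is a list of (user_id, seconds)."""
    groups = {"A": [], "B": []}
    for user_id, seconds in measurements:
        groups[assign_group(user_id)].append(seconds)
    # Leave out a group that received no measurements at all.
    return {name: mean(values) for name, values in groups.items() if values}


# Hypothetical measurements: (user id, seconds to allocate an IP).
sample = [("user-1", 2.0), ("user-2", 1.5), ("user-3", 2.5)]
print(compare(sample))
```

Hashing the user ID keeps each user in the same group across requests, which is one common way to get a stable split without storing assignments.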

I really appreciate the desire to measure the project, I just don't think this 
is the best way to go about it.

thanks,
Aaron


Re: [Openstack] Metrics around code

2011-10-19 Thread Stefano Maffulli
On Wed, 2011-10-19 at 11:34 +0200, Thierry Carrez wrote:
 Can an HTML report be produced and posted instead ? 

Yes. Actually, the reports can be made self-service from a Pentaho BI
server. We'll get there, some day. Let's get the reports defined first;
we'll think about the distribution later.

 I can't really answer that question, but it looks strange to me to see
 Jenkins up there (I bet he didn't author any patch).

You'll have to tell me why that happens. I pulled the repositories from
GitHub and Jenkins appears as an author in their logs.
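
One possible way to keep automation accounts such as Jenkins out of the per-author counts is to filter them before aggregating. The sketch below assumes each commit has already been reduced to an (author name, is_merge) pair. The `BOT_AUTHORS` set and the sample data are hypothetical, and whether Jenkins shows up through merge commits is an assumption that would need checking against the actual logs.

```python
from collections import Counter

# Hypothetical list of automation accounts to leave out of the report.
BOT_AUTHORS = {"jenkins"}


def human_commit_counts(commits, skip_merges=True):
    """Count commits per author, ignoring bots and (optionally) merge commits."""
    counts = Counter()
    for author, is_merge in commits:
        if author.lower() in BOT_AUTHORS:
            continue
        if skip_merges and is_merge:
            continue
        counts[author] += 1
    return counts


# Made-up commit records: (author name, whether the commit is a merge).
sample = [
    ("Jenkins", True),
    ("alice", False),
    ("alice", False),
    ("bob", True),
    ("bob", False),
]
counts = human_commit_counts(sample)
```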

 If the reports are not pushed, they can run more often. Maybe something
 like last 30 days (refreshed every week) and then generating a report
 per-milestone (at the end of every milestone) ? I think it would be good
 to know who committed code for a given milestone, rather than for a
 given arbitrary month.

Uhm ... I'll need some help to design the query that selects the
milestone timeframe from the database. If you have time, let's have a
meeting: I can show you the database structure and the content of the
tables and you can tell me if it's possible to do this with a query or
if more manual work is needed.
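
As a starting point for the milestone question, the following sketch assigns each commit date to a milestone by looking it up in a sorted list of milestone end dates. The milestone names and dates are placeholders, not the real release schedule, and the same idea could probably be expressed in SQL with a small milestones table joined on a date range.

```python
from bisect import bisect_left
from collections import Counter
from datetime import date

# Placeholder milestones: (name, last day of the milestone), sorted by date.
MILESTONES = [
    ("m1", date(2011, 11, 10)),
    ("m2", date(2011, 12, 15)),
    ("m3", date(2012, 1, 26)),
]
END_DATES = [end for _, end in MILESTONES]


def milestone_for(commit_date):
    """Return the first milestone ending on or after commit_date, else None."""
    index = bisect_left(END_DATES, commit_date)
    return MILESTONES[index][0] if index < len(MILESTONES) else None


def commits_per_milestone(commit_dates):
    """Count commits per milestone, skipping dates after the last milestone."""
    names = (milestone_for(d) for d in commit_dates)
    return Counter(name for name in names if name is not None)
```

Projects that do not follow the common milestone plan would need their own list of end dates, which is the part that likely requires manual input.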

/stef




Re: [Openstack] Metrics around code

2011-10-19 Thread Stefano Maffulli
On Wed, 2011-10-19 at 04:30 -0700, Rohit Karajgi wrote:
 I agree with the HTML report, and this report can be published onto
 OpenStack Jenkins for world readability. We have some similar metrics
 for bugs. 

Indeed. I haven't started looking at the bugs database yet. There are
two interesting tools though, one from Pentaho itself and one from the
same folks that develop CVSanaly. Neither supports Launchpad, so I'll need a
python-launchpadlib expert to proceed. I'll think about it when I'm done
with the code reports.

 6. # of times a file was modified 
 7. # of lines modified per file
 
 #6 and #7 specifically can be quite useful to identify those modules that can 
 be good targets for unit tests.

I have this data in the cvsanaly database and I think I can design a
query for it quite easily.
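
For metrics #6 and #7, the same numbers can also be derived straight from git, independently of the CVSanaly schema. The sketch below parses the output of `git log --numstat --pretty=format:` (tab-separated added, removed, and path per file). The function name and the sample text are illustrative only.

```python
from collections import defaultdict


def file_churn(numstat_text):
    """Return {path: (times_modified, lines_modified)} from numstat output."""
    stats = defaultdict(lambda: [0, 0])
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # blank separator lines between commits
        added, removed, path = parts
        stats[path][0] += 1
        if added != "-":  # binary files are reported as "-"
            stats[path][1] += int(added) + int(removed)
    return {path: tuple(values) for path, values in stats.items()}


# Made-up sample resembling `git log --numstat --pretty=format:` output.
sample = "10\t2\tnova/compute/api.py\n\n3\t1\tnova/compute/api.py\n-\t-\tdoc/logo.png\n"
churn = file_churn(sample)
```

To feed it real data, the text could come from `subprocess.run(["git", "log", "--numstat", "--pretty=format:"], capture_output=True, text=True).stdout` run inside a checkout.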

/stef





Re: [Openstack] Metrics around code

2011-10-19 Thread Stefano Maffulli
Aaron,

I understand your concerns. Measuring irrelevant things may lead to
unintended consequences. We don't have to reveal the names of the
committers and that's why I ran the report here early: to get
feedback :) 

If others feel the same, I can easily remove the "who" from that report
and show only the "Total number of commits per repository in past 30
days" graph.

On Wed, 2011-10-19 at 14:18 +, Aaron Lee wrote:
 I really appreciate the desire to measure the project, I just don't
 think this is the best way to go about it. 

How and what would you measure instead? What would you add/remove from
the list on http://wiki.openstack.org/CommunityMetrics/Code?

thanks
stef




Re: [Openstack] Metrics around code

2011-10-19 Thread Rohit Karajgi
Sure, I can help with python-launchpadlib. I've played around with it for a 
while. Do let me know if needed whenever you get there.

Cheers,
Rohit



[Openstack] Metrics around code

2011-10-18 Thread Stefano Maffulli
Hello folks,

I made more progress using CVSanaly to dig into our git repositories,
build a database from the git logs and extract information from it.

CVSanaly is a tool developed under an EU-sponsored project (FLOSSmetrics)
and currently maintained by a few universities. More about it at
https://projects.libresoft.es/projects/cvsanaly/wiki

For the curious among us, I documented the steps to populate the
CVSanaly database with data from OpenStack git repos on a new wiki page:

http://wiki.openstack.org/CommunityMetrics/Code

You'll find there also the implementation details to answer the
question:
 
Who committed to an OpenStack repo, how many times in the past 30
days?

and a demo report built with Pentaho Reporting representing the
total number of commits per repository in past 30 days
http://wiki.openstack.org/CommunityMetrics/Code?action=AttachFile&do=get&target=2011-11-commits30daysallrepo-obfuscated.pdf
[note: the email addresses are hidden on purpose]
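
For readers who want to experiment with the "who committed, how many times in the past 30 days" question without the full setup, here is a self-contained sketch using an in-memory SQLite database. The `commits` table and its columns are a deliberately simplified stand-in, not the actual CVSanaly schema, and the rows are invented.

```python
import sqlite3
from datetime import date, timedelta


def commits_last_30_days(conn, today):
    """Return (repo, author, commit count) rows for the 30 days before today."""
    since = (today - timedelta(days=30)).isoformat()
    query = """
        SELECT repo, author, COUNT(*) AS n
        FROM commits
        WHERE commit_date >= ?
        GROUP BY repo, author
        ORDER BY repo, n DESC, author
    """
    return conn.execute(query, (since,)).fetchall()


# Simplified, hypothetical schema: one row per commit, ISO-formatted dates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE commits (repo TEXT, author TEXT, commit_date TEXT)")
conn.executemany(
    "INSERT INTO commits VALUES (?, ?, ?)",
    [
        ("nova", "alice", "2011-10-01"),
        ("nova", "alice", "2011-10-10"),
        ("nova", "bob", "2011-10-05"),
        ("glance", "bob", "2011-08-01"),  # older than 30 days, filtered out
    ],
)
rows = commits_last_30_days(conn, date(2011, 10, 18))
```

Storing dates as ISO strings keeps the comparison in the `WHERE` clause a plain string comparison; a real database would more likely use a native date column.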

First of all: do the numbers seem correct to you? In other words, does
the SQL query seem correct? Does the demo report look interesting to
you? What/how would you change?

Then, I would like your feedback to refine the other questions we want
to see answered regularly, regarding code (we'll move on to bugs, docs,
etc later).

Are the following reports interesting? Do we want to have them run
monthly or weekly? Is this too much information or too little? What else
would you like to see regarding code?

* Total number of commits across all repos aggregated per month
* Total number of commits per repository aggregated per month
* Total number of commits per author per repository
* Total number of commits per author per repository in past 30
days
* Average number of Lines of Code changed per commit per
repository
* Average number of Lines of Code changed per commit per
repository per author
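
The monthly and lines-of-code reports in the list above boil down to a group-by over commit records. A rough sketch, assuming each commit is available as a (repository, date, lines changed) tuple; the record layout and the sample values are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def monthly_report(commits):
    """Return {(repo, 'YYYY-MM'): (commit count, average lines changed)}."""
    buckets = defaultdict(list)
    for repo, day, lines_changed in commits:
        buckets[(repo, day.strftime("%Y-%m"))].append(lines_changed)
    return {
        key: (len(values), sum(values) / len(values))
        for key, values in buckets.items()
    }


# Invented commit records: (repository, commit date, lines changed).
sample = [
    ("nova", date(2011, 9, 30), 40),
    ("nova", date(2011, 10, 2), 10),
    ("nova", date(2011, 10, 9), 30),
    ("swift", date(2011, 10, 3), 5),
]
report = monthly_report(sample)
```

The per-author variants would only need the author added to the record and to the grouping key.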

Also, from the list of repositories on https://github.com/openstack/,
which ones should I keep tracking? 

cheers,
stef

