Re: [Openstack] Metrics around code
Stefano Maffulli wrote:
> You'll find there also the implementation details to answer the question: Who committed to an OpenStack repo, how many times in the past 30 days? and a demo report built with Pentaho Reporting representing the total number of commits per repository in past 30 days
> http://wiki.openstack.org/CommunityMetrics/Code?action=AttachFile&do=get&target=2011-11-commits30daysallrepo-obfuscated.pdf
> [note: the email addresses are hidden on purpose]

Can an HTML report be produced and posted instead? It feels like that sort of information should be pullable rather than pushed, from a well-known website, and PDF adds an extra step to access, for no real value (is anybody going to print this?)

> First of all: do the numbers seem correct to you? In other words, does the SQL query seem correct? Does the demo report look interesting to you? What/how would you change?

I can't really answer that question, but it looks strange to me to see Jenkins up there (I bet he didn't author any patch).

> Then, I would like your feedback to refine the other questions we want to see answered regularly, regarding code (we'll move on to bugs, docs, etc later). Are the following reports interesting? Do we want to have them run monthly or weekly?

If the reports are not pushed, they can run more often. Maybe something like last 30 days (refreshed every week) and then generating a report per-milestone (at the end of every milestone)? I think it would be good to know who committed code for a given milestone, rather than for a given arbitrary month.

> Is this too much information or too little? What else would you like to see regarding code?
>
> * Total number of commits across all repos aggregated per month
> * Total number of commits per repository aggregated per month

Maybe per-milestone would be more useful, though it's a bit more difficult to do (especially since not all projects follow the common milestone plan).
--
Thierry Carrez (ttx)
Release Manager, OpenStack

___
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp
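A per-author report could leave automation accounts such as Jenkins out of the totals. The sketch below is only an illustration under stated assumptions: the record layout, the "parents" field and the AUTOMATION_ACCOUNTS set are hypothetical, and nothing in this thread establishes why Jenkins shows up as an author.

```python
from collections import Counter

# Hypothetical commit records; the field names are illustrative only.
commits = [
    {"repo": "nova", "author": "alice", "parents": 1},
    {"repo": "nova", "author": "jenkins", "parents": 2},  # merge commit
    {"repo": "glance", "author": "bob", "parents": 1},
    {"repo": "nova", "author": "alice", "parents": 1},
]

# Assumed list of accounts to ignore; adjust to the real account names.
AUTOMATION_ACCOUNTS = frozenset({"jenkins"})


def commits_per_author(records, skip=AUTOMATION_ACCOUNTS):
    """Count non-merge commits per author, ignoring automation accounts."""
    counts = Counter()
    for record in records:
        if record["parents"] > 1:  # more than one parent: a merge commit
            continue
        if record["author"].lower() in skip:
            continue
        counts[record["author"]] += 1
    return counts


print(dict(commits_per_author(commits)))
```

Skipping merge commits and filtering by account name are two independent choices; either could be dropped depending on what the report is meant to show.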
Re: [Openstack] Metrics around code
I agree with the HTML report, and this report can be published onto OpenStack Jenkins for world readability. We have some similar metrics for bugs. The following metrics can be easily pulled from Launchpad's bug database using the python-launchpadlib API.

1. Bug Distribution - By Status
2. Bug Distribution - By Importance
3. Bug Distribution - By Milestone
4. Bug Distribution - By Bug owners
5. Bug Distribution - By Fixed-by
6. # of times a file was modified
7. # of lines modified per file

#6 and #7 specifically can be quite useful to identify those modules that can be good targets for unit tests.

Cheers,
Rohit

-----Original Message-----
From: openstack-bounces+rohit.karajgi=vertex.co...@lists.launchpad.net [mailto:openstack-bounces+rohit.karajgi=vertex.co...@lists.launchpad.net] On Behalf Of Thierry Carrez
Sent: Wednesday, October 19, 2011 3:04 PM
To: openstack@lists.launchpad.net
Subject: Re: [Openstack] Metrics around code

> Stefano Maffulli wrote:
> > You'll find there also the implementation details to answer the question: Who committed to an OpenStack repo, how many times in the past 30 days? and a demo report built with Pentaho Reporting representing the total number of commits per repository in past 30 days
> > http://wiki.openstack.org/CommunityMetrics/Code?action=AttachFile&do=get&target=2011-11-commits30daysallrepo-obfuscated.pdf
> > [note: the email addresses are hidden on purpose]
>
> Can an HTML report be produced and posted instead? It feels like that sort of information should be pullable rather than pushed, from a well-known website, and PDF adds an extra step to access, for no real value (is anybody going to print this?)
>
> > First of all: do the numbers seem correct to you? In other words, does the SQL query seem correct? Does the demo report look interesting to you? What/how would you change?
>
> I can't really answer that question, but it looks strange to me to see Jenkins up there (I bet he didn't author any patch).
> > Then, I would like your feedback to refine the other questions we want to see answered regularly, regarding code (we'll move on to bugs, docs, etc later). Are the following reports interesting? Do we want to have them run monthly or weekly?
>
> If the reports are not pushed, they can run more often. Maybe something like last 30 days (refreshed every week) and then generating a report per-milestone (at the end of every milestone)? I think it would be good to know who committed code for a given milestone, rather than for a given arbitrary month.
>
> > Is this too much information or too little? What else would you like to see regarding code?
> >
> > * Total number of commits across all repos aggregated per month
> > * Total number of commits per repository aggregated per month
>
> Maybe per-milestone would be more useful, though it's a bit more difficult to do (especially since not all projects follow the common milestone plan).
>
> --
> Thierry Carrez (ttx)
> Release Manager, OpenStack
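The bug distributions Rohit lists (by status, importance, milestone and so on) all reduce to counting records grouped by one field. A minimal sketch follows, assuming the bug tasks have already been fetched into plain dictionaries; the keys and sample values are hypothetical and are not the python-launchpadlib object model.

```python
from collections import Counter

# Hypothetical bug-task records standing in for whatever a Launchpad
# query returns; the keys are illustrative, not the launchpadlib schema.
bug_tasks = [
    {"status": "Fix Released", "importance": "High", "milestone": "m1"},
    {"status": "Confirmed", "importance": "Medium", "milestone": None},
    {"status": "Fix Released", "importance": "Low", "milestone": "m1"},
    {"status": "In Progress", "importance": "High", "milestone": "m2"},
]


def distribution(tasks, field):
    """Return a count of tasks grouped by the given field."""
    # Tasks with no value for the field are grouped under "(none)".
    return Counter(task[field] or "(none)" for task in tasks)


for field in ("status", "importance", "milestone"):
    print(field, dict(distribution(bug_tasks, field)))
```

The same helper would cover the "by owner" and "by fixed-by" distributions once those fields are present in the fetched records.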
Re: [Openstack] Metrics around code
Hi Stefano -

I would like to see the openstack-manuals commits tracked as well. I'd also like to see tracking commits to the api repos:

compute-api
image-api
object-api
identity-api
netconn-api

Possibly these could be in additional monthly reports so that code contributions are highlighted in one report, and doc and api changes in another (or two others).

Monthly seems fine to me for frequency.

Thanks,
Anne

*Anne Gentle*
a...@openstack.org
my blog http://justwriteclick.com/ | my book http://xmlpress.net/publications/conversation-community/ | LinkedIn http://www.linkedin.com/in/annegentle | Delicious http://del.icio.us/annegentle | Twitter http://twitter.com/annegentle

On Tue, Oct 18, 2011 at 4:37 PM, Stefano Maffulli stef...@openstack.org wrote:

> Hello folks,
>
> I made more progress using CVSanaly to dig into our git repositories, build a database from the git logs and extract information from it. CVSanaly is a tool developed under an EU-sponsored project (FLOSSmetrics) and currently maintained by a few universities. More about it on https://projects.libresoft.es/projects/cvsanaly/wiki
>
> For the curious among us, I documented the steps to populate the CVSanaly database with data from OpenStack git repos on a new wiki page: http://wiki.openstack.org/CommunityMetrics/Code
>
> You'll find there also the implementation details to answer the question: Who committed to an OpenStack repo, how many times in the past 30 days? and a demo report built with Pentaho Reporting representing the total number of commits per repository in past 30 days
> http://wiki.openstack.org/CommunityMetrics/Code?action=AttachFile&do=get&target=2011-11-commits30daysallrepo-obfuscated.pdf
> [note: the email addresses are hidden on purpose]
>
> First of all: do the numbers seem correct to you? In other words, does the SQL query seem correct? Does the demo report look interesting to you? What/how would you change?
> Then, I would like your feedback to refine the other questions we want to see answered regularly, regarding code (we'll move on to bugs, docs, etc later). Are the following reports interesting? Do we want to have them run monthly or weekly? Is this too much information or too little? What else would you like to see regarding code?
>
> * Total number of commits across all repos aggregated per month
> * Total number of commits per repository aggregated per month
> * Total number of commits per author per repository
> * Total number of commits per author per repository in past 30 days
> * Average number of Lines of Code changed per commit per repository
> * Average number of Lines of Code changed per commit per repository per author
>
> Also, from the list of repositories on https://github.com/openstack/, which ones should I keep tracking?
>
> cheers,
> stef
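The per-month aggregates in the list above could come from a single grouped query. The sketch below uses an in-memory SQLite database with a deliberately simplified, hypothetical "commits" table; the real CVSanaly schema is not shown in this thread and may differ, and the rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical table; the real CVSanaly schema may differ.
conn.execute("CREATE TABLE commits (repo TEXT, author TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO commits VALUES (?, ?, ?)",
    [
        ("nova", "alice", "2011-09-03"),
        ("nova", "bob", "2011-09-21"),
        ("nova", "alice", "2011-10-02"),
        ("swift", "carol", "2011-10-11"),
    ],
)

# Commits per repository, aggregated per calendar month.
MONTHLY_PER_REPO = """
    SELECT repo, strftime('%Y-%m', date) AS month, COUNT(*) AS commits
    FROM commits
    GROUP BY repo, month
    ORDER BY repo, month
"""

monthly = conn.execute(MONTHLY_PER_REPO).fetchall()
for row in monthly:
    print(row)
```

Dropping "repo" from the SELECT and GROUP BY clauses would give the total across all repositories per month; restricting the rows to documentation and API repositories would give the separate reports Anne asks for.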
Re: [Openstack] Metrics around code
I think I missed the larger discussion around why these metrics are getting published.

tl;dr metrics good; these metrics, maybe not so good

My worry about publishing number of commits by author is that it could make Gerrit, and our current review process, a mess. I agree that smaller commits are better, but they do not play well with the Gerrit workflow (in particular, if a commit that has dependencies within the same branch is rejected, the entire branch has to be redone). My other fear is pointless bickering (however good-natured it may be) about who committed what (we've already seen it: "looks strange to me to see Jenkins up there"). I don't know if something similar would happen with the bug database, but I would hate to see people touching bugs just to get counted.

The problem with software development metrics is that programmers optimize processes for a living. When presented with a metric they will often subconsciously optimize for it. None of these ideas are original to me; I've heard this argument against software metrics many times before. The given solution is to pick a metric that has a tangible meaning to your end users. Taken to the extreme you end up where the Lean Startup movement is: pick something measurable (time to allocate an IP), make a hypothesis (your patch), push it to half your users and observe the results (A/B testing).

I really appreciate the desire to measure the project, I just don't think this is the best way to go about it.

thanks,
Aaron

On Oct 19, 2011, at 6:30 AM, Rohit Karajgi wrote:

> I agree with the HTML report, and this report can be published onto OpenStack Jenkins for world readability. We have some similar metrics for bugs. The following metrics can be easily pulled from Launchpad's bug database using the python-launchpadlib API.
>
> 1. Bug Distribution - By Status
> 2. Bug Distribution - By Importance
> 3. Bug Distribution - By Milestone
> 4. Bug Distribution - By Bug owners
> 5. Bug Distribution - By Fixed-by
> 6. # of times a file was modified
> 7. # of lines modified per file
>
> #6 and #7 specifically can be quite useful to identify those modules that can be good targets for unit tests.
>
> Cheers,
> Rohit
Re: [Openstack] Metrics around code
On Wed, 2011-10-19 at 11:34 +0200, Thierry Carrez wrote:
> Can an HTML report be produced and posted instead?

Yes. Actually, the reports can be made self-service from a Pentaho BI server. We'll get there, some day. Let's get the reports defined first, we'll think about the distribution later.

> I can't really answer that question, but it looks strange to me to see Jenkins up there (I bet he didn't author any patch).

You'll have to tell me why that happens. I pulled the repositories from github and jenkins appears as an author in their logs.

> If the reports are not pushed, they can run more often. Maybe something like last 30 days (refreshed every week) and then generating a report per-milestone (at the end of every milestone)? I think it would be good to know who committed code for a given milestone, rather than for a given arbitrary month.

Uhm ... I'll need some help to design the query that selects the milestone timeframe from the database. If you have time, let's have a meeting: I can show you the database structure and the content of the tables and you can tell me if it's possible to do this with a query or if more manual work is needed.

/stef
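One possible shape for the per-milestone query is a join against a table of milestone date ranges. The sketch below is an assumption, not the CVSanaly schema: both tables, the milestone names and all dates are made up for illustration, and a project that does not follow the common milestone plan would need its own rows in the milestones table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Both tables are hypothetical; milestone names and dates are made up.
conn.execute("CREATE TABLE commits (repo TEXT, author TEXT, date TEXT)")
conn.execute(
    "CREATE TABLE milestones (name TEXT, starts_on TEXT, ends_on TEXT)"
)
conn.executemany("INSERT INTO milestones VALUES (?, ?, ?)", [
    ("m1", "2011-09-01", "2011-10-01"),
    ("m2", "2011-10-01", "2011-11-01"),
])
conn.executemany("INSERT INTO commits VALUES (?, ?, ?)", [
    ("nova", "alice", "2011-09-15"),
    ("nova", "bob", "2011-10-01"),
    ("nova", "alice", "2011-10-20"),
])

# ISO dates compare correctly as text. The window is half-open, so a
# commit on a boundary date is counted in exactly one milestone.
PER_MILESTONE = """
    SELECT m.name, c.author, COUNT(*) AS commits
    FROM commits AS c
    JOIN milestones AS m
      ON c.date >= m.starts_on AND c.date < m.ends_on
    GROUP BY m.name, c.author
    ORDER BY m.name, c.author
"""

per_milestone = conn.execute(PER_MILESTONE).fetchall()
for row in per_milestone:
    print(row)
```

The milestone table is the manual part: someone has to record when each milestone started and ended before the query can do the rest.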
Re: [Openstack] Metrics around code
On Wed, 2011-10-19 at 04:30 -0700, Rohit Karajgi wrote:
> I agree with the HTML report, and this report can be published onto OpenStack Jenkins for world readability. We have some similar metrics for bugs.

Indeed. I haven't started looking at the bugs database yet. There are two interesting tools though, one from Pentaho itself and one from the same folks that develop cvsanaly. Neither supports Launchpad, so I'll need a python-launchpadlib expert to proceed. I'll think about it when I'm done with the code reports.

> 6. # of times a file was modified
> 7. # of lines modified per file
>
> #6 and #7 specifically can be quite useful to identify those modules that can be good targets for unit tests.

I have this data in the cvsanaly database and I think I can design a query for it quite easily.

/stef
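Metrics #6 and #7 can also be illustrated without the cvsanaly database. The sketch below swaps in a different technique from the query Stefano mentions: it parses text in the per-file format that "git log --numstat" prints (added, removed and path, separated by tabs). The sample text and paths are made up, and the function name is hypothetical.

```python
from collections import defaultdict

# Sample text in the per-file shape printed by `git log --numstat`:
# "<added>\t<removed>\t<path>". The paths and numbers are made up.
SAMPLE = """\
10\t2\tnova/compute/api.py
3\t1\tnova/tests/test_api.py

7\t7\tnova/compute/api.py
-\t-\tdoc/logo.png
"""


def churn(numstat_text):
    """Return {path: (times_modified, lines_modified)}."""
    stats = defaultdict(lambda: [0, 0])
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # blank separators or commit header lines
        added, removed, path = parts
        stats[path][0] += 1
        if added != "-":  # git prints "-" for binary files
            stats[path][1] += int(added) + int(removed)
    return {path: tuple(counts) for path, counts in stats.items()}


for path, (times, lines) in sorted(churn(SAMPLE).items()):
    print(path, times, lines)
```

Files that rank high on both counts are the candidates for extra unit tests that Rohit has in mind.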
Re: [Openstack] Metrics around code
Aaron,

I understand your concerns. Measuring irrelevant things may lead to unintended consequences. We don't have to reveal the names of the committers and that's why I ran the report here early: to get feedback :) If others feel the same, I can easily remove the "who" from that report and show only the "Total number of commits per repository in past 30 days" graph.

On Wed, 2011-10-19 at 14:18 +, Aaron Lee wrote:
> I really appreciate the desire to measure the project, I just don't think this is the best way to go about it.

How and what would you measure instead? What would you add/remove from the list on http://wiki.openstack.org/CommunityMetrics/Code?

thanks
stef
Re: [Openstack] Metrics around code
Sure, I can help with python-launchpadlib. I've played around with it for a while. Do let me know if needed whenever you get there.

Cheers,
Rohit

-----Original Message-----
From: openstack-bounces+rohit.karajgi=vertex.co...@lists.launchpad.net [mailto:openstack-bounces+rohit.karajgi=vertex.co...@lists.launchpad.net] On Behalf Of Stefano Maffulli
Sent: Wednesday, October 19, 2011 9:38 PM
To: openstack@lists.launchpad.net
Subject: Re: [Openstack] Metrics around code

> On Wed, 2011-10-19 at 04:30 -0700, Rohit Karajgi wrote:
> > I agree with the HTML report, and this report can be published onto OpenStack Jenkins for world readability. We have some similar metrics for bugs.
>
> Indeed. I haven't started looking at the bugs database yet. There are two interesting tools though, one from Pentaho itself and one from the same folks that develop cvsanaly. None support launchpad, so I'll need a python-launchpadlib expert to proceed. I'll think about it when I'm done with the code reports.
>
> > 6. # of times a file was modified
> > 7. # of lines modified per file
> >
> > #6 and #7 specifically can be quite useful to identify those modules that can be good targets for unit tests.
>
> I have this data in the cvsanaly database and I think I can design a query for it quite easily.
>
> /stef
[Openstack] Metrics around code
Hello folks,

I made more progress using CVSanaly to dig into our git repositories, build a database from the git logs and extract information from it. CVSanaly is a tool developed under an EU-sponsored project (FLOSSmetrics) and currently maintained by a few universities. More about it on https://projects.libresoft.es/projects/cvsanaly/wiki

For the curious among us, I documented the steps to populate the CVSanaly database with data from OpenStack git repos on a new wiki page: http://wiki.openstack.org/CommunityMetrics/Code

You'll find there also the implementation details to answer the question:

Who committed to an OpenStack repo, how many times in the past 30 days?

and a demo report built with Pentaho Reporting representing the total number of commits per repository in past 30 days

http://wiki.openstack.org/CommunityMetrics/Code?action=AttachFile&do=get&target=2011-11-commits30daysallrepo-obfuscated.pdf
[note: the email addresses are hidden on purpose]

First of all: do the numbers seem correct to you? In other words, does the SQL query seem correct? Does the demo report look interesting to you? What/how would you change?

Then, I would like your feedback to refine the other questions we want to see answered regularly, regarding code (we'll move on to bugs, docs, etc later). Are the following reports interesting? Do we want to have them run monthly or weekly? Is this too much information or too little? What else would you like to see regarding code?

* Total number of commits across all repos aggregated per month
* Total number of commits per repository aggregated per month
* Total number of commits per author per repository
* Total number of commits per author per repository in past 30 days
* Average number of Lines of Code changed per commit per repository
* Average number of Lines of Code changed per commit per repository per author

Also, from the list of repositories on https://github.com/openstack/, which ones should I keep tracking?
cheers,
stef
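The question "who committed to a repo, how many times in the past 30 days" and the average-lines-changed reports can be sketched together as one grouped query. This is not the query from the wiki page: the flattened "commits" table below is a hypothetical simplification, the rows are made-up sample data, and the reference date is fixed only so the example gives the same result on every run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table; added/removed are line counts per commit.
conn.execute(
    "CREATE TABLE commits "
    "(repo TEXT, author TEXT, date TEXT, added INTEGER, removed INTEGER)"
)
conn.executemany("INSERT INTO commits VALUES (?, ?, ?, ?, ?)", [
    ("nova", "alice", "2011-10-10", 40, 10),
    ("nova", "alice", "2011-10-15", 5, 5),
    ("nova", "bob", "2011-08-01", 100, 0),  # outside the 30-day window
    ("swift", "carol", "2011-10-01", 20, 0),
])

# Commits and average lines changed per author per repository,
# restricted to the 30 days before the reference date.
REPORT = """
    SELECT repo, author, COUNT(*) AS commits,
           AVG(added + removed) AS avg_lines_changed
    FROM commits
    WHERE date >= date(:today, '-30 days')
    GROUP BY repo, author
    ORDER BY repo, author
"""

# A fixed reference date keeps the example reproducible.
rows = conn.execute(REPORT, {"today": "2011-10-18"}).fetchall()
for row in rows:
    print(row)
```

Removing "author" from the SELECT and GROUP BY clauses gives the per-repository totals without naming anyone.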