Dear All, I would like to thank you for your replies!
>> What is a single issue in your context?

SonarQube uses a set of coding rules <https://docs.sonarqube.org/display/SONAR/Issues> to measure the technical debt (TD). While running an analysis, it raises an issue every time a piece of code breaks a coding rule.

>> I think what is being counted is any issue that SonarQube TD reports, and
>> this is being done on every single commit and summed together. This doesn’t
>> seem like a particularly meaningful statistic since you would inevitably
>> count the same issue N times where N is the number of commits between where
>> an issue was introduced and where it was fixed. It seems like there should
>> really be some attempts to perform de-duplication.

The number refers to unique issues and does not include any duplication. (If an issue was fixed and the same issue later reappeared in the same piece of code, I count it as new.)

>> It also sounds like it doesn’t make any attempts to account for common
>> development practices i.e. New code often develops over a series of commits
>> with developers implementing outlines first and then refining and cleaning
>> up a feature as it matures.

I totally agree with the last sentence. As I said in my previous e-mails, the clean-up rate of your project is the highest among the Apache projects that I analyzed, and I am wondering why that is. What practices do you follow? Is it a coincidence?

>> There are many (, many) minor things and they outweigh the major problems.
>> Calling them all "issues" gives them equal weight. Some are about
>> canonicalization of the code.

I have updated the previously sent spreadsheet (https://docs.google.com/spreadsheets/d/1DloQ_GS9l2KS6ldgdHOQkjsCB1J_rrMyUauHC_Ymgfk/edit?usp=sharing). On the sheet "Jena: Open Issues - October 7, 2017" I have added the Severity and the Type of each issue, and you can filter the issues by these two criteria (they are based on SonarQube's default classification).
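
To make the counting rule for unique issues described earlier in this reply concrete, here is a minimal Python sketch. It is only an illustration, not the script used for the analysis: the function name `count_issue_events` and the idea of representing each analyzed commit as a set of issue keys (for example rule, file and code fragment) are assumptions made for the example.

```python
from typing import Hashable, Iterable, Set, Tuple


def count_issue_events(snapshots: Iterable[Set[Hashable]]) -> Tuple[int, int]:
    """Count introduced and fixed issues over commits in chronological order.

    Each snapshot is the set of issue keys reported for one commit
    (hypothetical representation). An issue that stays open across several
    commits is counted once; an issue that was fixed and later reappears
    is counted as a new issue.
    """
    introduced = 0
    fixed = 0
    previous: Set[Hashable] = set()
    for current in snapshots:
        introduced += len(current - previous)  # keys not open in the previous commit
        fixed += len(previous - current)       # keys that disappeared in this commit
        previous = set(current)
    return introduced, fixed


# Hypothetical history: issue "A" is fixed in the third commit and
# reappears in the fourth, together with a new issue "C".
history = [{"A", "B"}, {"A", "B"}, {"B"}, {"A", "B", "C"}]
introduced, fixed = count_issue_events(history)
naive_total = sum(len(s) for s in history)  # per-commit sum, counts duplicates
```

With this hypothetical history the naive per-commit sum is 8, while the de-duplicated count is 4 introduced issues (A, B, A again, C) and 1 fixed issue.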
>> NB the "issue" word has a specific meaning for JIRA which a lot of Apache
>> projects use. Jena's current total, now, is 1424.

Thank you for the clarification. I should have mentioned in my first e-mail that I refer to SonarQube's issues and not to Jira.

With kind regards,
George Digkas

________________________________
From: Andy Seaborne <[email protected]>
Sent: Thursday, November 16, 2017 12:55 PM
To: Γεώργιος Δίγκας; [email protected]
Subject: Re: Issues fixed in Apache Jena

Do not take git as complete!

Jena started in 2000.
https://lists.w3.org/Archives/Public/www-rdf-interest/2000Aug/0128.html

Jena 2.0 was released 2003-08-28. A whole 40M including dependencies! A 14.7M zip file!
https://sourceforge.net/projects/jena/files/

The whole of SF SVN history was imported by the Apache infrastructure team (a herculean effort) into Apache SVN. I don't know how to get to it from git; it may not be there and only in SVN.

The earliest git root commit is for the move to Apache from SF [4298106f1e], 6 years ago. (There are 4 root commits due to merges.)

---

It's an interesting start, and to make the analysis usefully inform the reader as to the state of the project I suggest treating different kinds of issues differently, not as uniformly important.

There are many (, many) minor things and they outweigh the major problems. Calling them all "issues" gives them equal weight. Some are about canonicalization of the code. Yet reformatting the whole code base (if practical, which is arguable) then greatly decreases the usefulness of git history. That would be a huge loss.

(NB the "issue" word has a specific meaning for JIRA which a lot of Apache projects use. Jena's current total, now, is 1424.)

Andy

> > Thank you in advance!
> >
> > With kind regards,
> > George Digkas
