I think what is being counted is any issue that SonarQube reports as TD, and this is being done on every single commit and summed together. This doesn’t seem like a particularly meaningful statistic, since you would inevitably count the same issue N times, where N is the number of commits between where an issue was introduced and where it was fixed. It seems like there should really be some attempt at de-duplication.
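To make the double counting concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the `Issue` fields, the `naive_total` and `deduplicated_total` names, and the sample data are all hypothetical, and identifying an issue by rule, file and offending source text is just one possible de-duplication key, not how SonarQube itself tracks issues.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    """Hypothetical identity of a reported issue (not SonarQube's data model)."""
    rule: str     # assumed rule identifier
    file: str     # path of the file the issue was reported in
    snippet: str  # offending source text, used instead of a line number,
                  # since line numbers shift between commits


def naive_total(snapshots):
    """Sum the issue counts of every analyzed commit (counts repeats)."""
    return sum(len(issues) for issues in snapshots)


def deduplicated_total(snapshots):
    """Count each distinct issue once, however many commits it survives."""
    seen = set()
    for issues in snapshots:
        seen.update(issues)
    return len(seen)


# Made-up history: issue `a` stays open for three analyzed commits,
# issue `b` appears in one commit and is fixed in the next.
a = Issue("unused-import", "Foo.java", "import java.util.List;")
b = Issue("empty-catch", "Bar.java", "catch (Exception e) {}")
snapshots = [[a], [a, b], [a], []]

print(naive_total(snapshots))         # 4
print(deduplicated_total(snapshots))  # 2
```

With this made-up history the summed count is twice the number of distinct issues, and the gap grows with every commit an issue stays open.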
I am not sure that this analysis is meaningful in any sense. I imagine it is flagging a lot of trivial stylistic issues, of which we have many given how long the code base has existed. It also sounds like it doesn’t make any attempt to account for common development practices, i.e. new code often develops over a series of commits, with developers implementing outlines first and then refining and cleaning up a feature as it matures.

Rob

On 16/11/2017, 10:40, "Lorenz B." <[email protected]> wrote:

I don't get the meaning of this number:

# issues: 405,700

What is a single issue in your context?

On 16.11.2017 11:15, Γεώργιος Δίγκας wrote:
> Dear All,
>
> First of all I would like to thank you very much for your fast response.
> I have created a spreadsheet (https://docs.google.com/spreadsheets/d/1DloQ_GS9l2KS6ldgdHOQkjsCB1J_rrMyUauHC_Ymgfk/edit?usp=sharing) with all the projects that I have analyzed. Below, I describe briefly the first two sheets.
> Projects: It contains that stats (Project Name, # of Java Files, # of commits, # of weeks, # of issues over time, and # of FIXED issues over time Fixing Rate) for all the analyzed projects. As the sheet shows the fixing rate for Jena is 90.2%. Which means 90.2% of the issues that were introduced in the project over time, were fixed.
> Jena: Open Issues - October 7, 2017: On this sheet there all the issues that are still open (42,258) in the last analyzed commit (SHA: 030398cb0f4bef4dad2b7313e8fce171a6179839 and DATE: October 7, 2017).
>
>>> How did you go about making these calculations? What span of time does
> your analysis concern?
> I analyzed the last commit for each week of the selected project over their history.
>>> What are you counting as technical debt (anything
> that SonarQube claims is "technical debt")?
> In this study yes. Everything that SonarQube claims as TD. But, for Apache Jena, the most impressive thing is the fixing rate.
> You have fixed 90% of the issues that SonarQube detects as TD.
>>> Are you comparing Jena to
> other projects with a similar lifespan? Are you comparing Jena to
> projects that have a similar contribution history? etc. etc.
> I would say yes. As the Projects sheet shows there are other projects with the same characteristics.
>>> Where does that number come from?
> The number (405,700) represents the fixed issues over time. On the last analyzed commit (SHA: 030398cb0f4bef4dad2b7313e8fce171a6179839) there were 481,590 lines of java code.
>
> Thank you in advance!
>
>
> With kind regards,
>
> George Digkas
>
>
>
> ________________________________
> From: Andy Seaborne <[email protected]>
> Sent: Wednesday, November 15, 2017 5:42 PM
> To: [email protected]
> Subject: Re: Issues fixed in Apache Jena
>
>>> a tremendous number (405,700) of fixed issues
> Where does that number come from?
>
> There are "only" 481,512 lines of java code in the entire codebase!
>
> (When Jena was imported into Apache, the whole of the SVN history came
> over as well.)
>
> Andy
>
> cloc =>
>
> ------------------------------------------------------------------------
> Language                    files        blank      comment         code
> ------------------------------------------------------------------------
> HTML                         2669          871       112007       525790
> Java                         5637       107373       227331       481512
> XML                           504         1016         1144        52578
> Maven                         293         3684         6706        52355
> JavaScript                     71         4515         5929        26718
> Bourne Shell                   62          982         1219         4459
> Bourne Again Shell             51          579          967         2389
> XSLT                            6          495          225         2010
> Ruby                            7          307          297         1788
> CSS                            20          219          194         1629
> DOS Batch                      46          245           27         1026
> Perl                            6          251          275          651
> Markdown                       12          132            0          287
> Smarty                          7           16            0          218
> DTD                             2           91          147          176
> INI                             2           19            0           65
> AspectJ                         1            8           46           36
> XSD                             1            6           13           34
> Elixir                          1           12           42            9
> YAML                            1            0           95            0
> ------------------------------------------------------------------------
> SUM:                         9399       120821       356664      1153730
> ------------------------------------------------------------------------
>
>
>
> On 15/11/17 15:31, [email protected] wrote:
>> It's not really clear to me how to answer these question without more
>> context.
>>
>> How did you go about making these calculations? What span of time does
>> your analysis concern? What are you counting as technical debt (anything
>> that SonarQube claims is "technical debt")? Are you comparing Jena to
>> other projects with a similar lifespan? Are you comparing Jena to
>> projects that have a similar contribution history? etc. etc.
>>
>>
>>
>> ajs6f
>>
>> Γεώργιος Δίγκας wrote on 11/15/17 10:15 AM:
>>> Dear developers,
>>>
>>> I am a PhD student in the university of Groningen and the topic of my
>>> PhD is the evolution of Technical Debt (TD) in open-source development.
>>> I have analyzed some projects from the Apache Foundation (using
>>> SonarQube) and I realized that your project has a tremendous number
>>> (405,700) of fixed issues, when we compare it to other projects from
>>> Apache.
>>> I would like to ask you the following 3 questions:
>>>
>>> 1. Why had been introduced so many issues of TD into your project?
>>> 2. The fixing of those issues was in purpose or a coincidence?
>>> 3. Do you use SonarQube (or SonarLint) in order to detect and fix
>>> the issues?
>>>
>>> Thank you in advance!
>>>
>>> With kind regards,
>>>
>>> George Digkas
>>>
