One possibility: Jena does not (and I assume never did) enforce a "squash-before-merging" policy.
That is to say, if I write a PR with ten commits, and it is approved, and we merge it, it will normally go in as all ten commits. Some projects demand that such a PR be "squashed" (all ten commits reduced to one containing the sum of their changes) before merging. If that is part of the difference, I suppose it should show up in the same way as a difference between Jena and other projects in the number of commits per time unit in the main branch.

ajs6f

> On Nov 16, 2017, at 7:55 AM, Γεώργιος Δίγκας <[email protected]> wrote:
>
> Dear All,
>
> I would like to thank you for your replies!
>
>>> What is a single issue in your context?
> SonarQube uses a set of coding
> rules <https://docs.sonarqube.org/display/SONAR/Issues> in order to measure
> the TD. While running an analysis, it raises an issue every time a piece of
> code breaks a coding rule.
>>> I think what is being counted is any issue that SonarQube TD reports, and
>>> this is being done on every single commit and summed together. This doesn’t
>>> seem like a particularly meaningful statistic since you would inevitably
>>> count the same issue N times where N is the number of commits between where
>>> an issue was introduced and where it was fixed. It seems like there should
>>> really be some attempts to perform de-duplication.
> The number refers to unique issues and it does not include any duplication.
> (If one issue was fixed and then after some time the same issue appeared in
> the same piece of code I count it as new).
>>> It also sounds like it doesn’t make any attempts to account for common
>>> development practices i.e. New code often develops over a series of commits
>>> with developers implementing outlines first and then refining and cleaning
>>> up a feature as it matures.
> I totally agree with the last sentence.
> As I said in my previous e-mails, the
> cleaning up rate on your project is the highest among the Apache projects
> that I analyzed and I am wondering why that is. What practices do you follow?
> Is it a coincidence?
>>> There are many (, many) minor things and they outweigh the major problems.
>>> Calling them all "issues" gives them equal weight. Some are about
>>> canonicalization of the code.
> I have updated the previously sent spreadsheet
> (https://docs.google.com/spreadsheets/d/1DloQ_GS9l2KS6ldgdHOQkjsCB1J_rrMyUauHC_Ymgfk/edit?usp=sharing).
> Now on the sheet: Jena: Open Issues - October 7, 2017 I have added the
> Severity and the Type of each issue and you can filter them based on these
> two criteria (they are based on SonarQube's default classification).
>>> NB the "issue" word has a specific meaning for JIRA which a lot of Apache
>>> projects use. Jena's current total, now, is 1424.
> Thank you for the clarification. I should have mentioned in my first e-mail that
> I refer to SonarQube's Issues and not to Jira.
>
> With kind regards,
>
> George Digkas
> ________________________________
> From: Andy Seaborne <[email protected]>
> Sent: Thursday, November 16, 2017 12:55 PM
> To: Γεώργιος Δίγκας; [email protected]
> Subject: Re: Issues fixed in Apache Jena
>
> Do not take git as complete!
>
> Jena started in 2000.
> https://lists.w3.org/Archives/Public/www-rdf-interest/2000Aug/0128.html
>
> Jena 2.0 was released 2003-08-28.
> A whole 40M including dependencies! A 14.7M zip file!
> https://sourceforge.net/projects/jena/files/
>
> The whole of SF SVN history was imported by the Apache infrastructure
> team (a herculean effort) into Apache SVN. I don't know how to get to it
> from git, it may not be there and only in SVN.
>
> The earliest git root commit is for the move to Apache from SF
> [4298106f1e], 6 years ago.
> (There are 4 root commits due to merges)
>
> ---
>
> It's an interesting start and to make the analysis usefully inform the
> reader as to the state of the project I suggest treating different kinds
> of issues differently, not as uniformly important.
>
> There are many (, many) minor things and they outweigh the major
> problems. Calling them all "issues" gives them equal weight. Some are
> about canonicalization of the code.
>
> Yet reformatting the whole code base (if practical, which is arguable)
> then greatly decreases the usefulness of git history. That would be a
> huge loss.
>
> (NB the "issue" word has a specific meaning for JIRA which a lot of
> Apache projects use. Jena's current total, now, is 1424.)
>
> Andy
>
>>
>> Thank you in advance!
>>
>>
>> With kind regards,
>>
>> George Digkas
