One possibility: Jena does not (and I assume never did) enforce a 
"squash-before-merging" policy.

That is to say, if I write a PR with ten commits, and it is approved, and we 
merge it, it will normally go in as all ten commits. Some projects demand that 
such a PR be "squashed" (all ten commits be reduced into one with the sum of 
changes present) before merging. 

If that is part of the difference, I suppose it should show up in the same way 
as a difference between Jena and other projects in the number of commits per 
time unit in the main branch.
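
To check that, one could count commits per month on the main branch of each 
project and compare. A rough sketch in Python, assuming the dates come from 
something like `git log --format=%cI <branch>` (one ISO 8601 committer date 
per line); the function name and the sample dates are made up for 
illustration, not taken from Jena's history:

```python
from collections import Counter


def commits_per_month(iso_dates):
    """Count commits per calendar month from ISO 8601 date strings."""
    # The first seven characters of an ISO 8601 date are "YYYY-MM".
    return Counter(d.strip()[:7] for d in iso_dates if d.strip())


# Hypothetical sample of `git log --format=%cI <branch>` output.
sample = """2017-11-16T07:55:00-05:00
2017-11-15T10:00:00+00:00
2017-10-07T09:30:00+00:00
"""

counts = commits_per_month(sample.splitlines())
for month in sorted(counts):
    print(month, counts[month])
```

A project that squashes would contribute one commit per merged PR to these 
counts, while one that does not would contribute every commit in the PR.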

ajs6f

> On Nov 16, 2017, at 7:55 AM, Γεώργιος Δίγκας <[email protected]> wrote:
> 
> Dear All,
> 
> I would like to thank you for your replies!
> 
>>> What is a single issue in your context?
> SonarQube uses a set of coding 
> rules<https://docs.sonarqube.org/display/SONAR/Issues> in order to measure 
> the TD. While running an analysis, it raises an issue every time a piece of 
> code breaks a coding rule.
>>> I think what is being counted is any issue that SonarQube TD reports, and 
>>> this is being done on every single commit and summed together. This doesn’t 
>>> seem like a particularly meaningful statistic since you would inevitably 
>>> count the same issue N times where N is the number of commits between where 
>>> an issue was introduced and where it was fixed. It seems like there should 
>>> really be some attempts to perform de-duplication.
> The number refers to unique issues and does not include any duplicates. 
> (If an issue was fixed and the same issue later reappeared in the same 
> piece of code, I count it as new.)
>>> It also sounds like it doesn’t make any attempts to account for common 
>>> development practices i.e. New code often develops over a series of commits 
>>> with developers implementing outlines first and then refining and cleaning 
>>> up a feature as it matures.
> I totally agree with the last sentence. As I said in my previous e-mails, 
> the clean-up rate of your project is the highest among the Apache projects 
> that I analyzed, and I am wondering why that is. What practices do you 
> follow? Is it a coincidence?
>>> There are many (, many) minor things and they outweigh the major problems. 
>>> Calling them all "issues" gives them equal weight. Some are about 
>>> canonicalization of the code.
> I have updated the previously sent spreadsheet 
> (https://docs.google.com/spreadsheets/d/1DloQ_GS9l2KS6ldgdHOQkjsCB1J_rrMyUauHC_Ymgfk/edit?usp=sharing).
> On the sheet "Jena: Open Issues - October 7, 2017" I have now added the 
> Severity and the Type of each issue, and you can filter them based on these 
> two criteria (they are based on SonarQube's default classification).
>>> NB the "issue" word has a specific meaning for JIRA which a lot of Apache 
>>> projects use. Jena's current total, now, is 1424.
> Thank you for the clarification. I should have mentioned in my first e-mail 
> that I refer to SonarQube's issues and not to Jira's.
> 
> With kind regards,
> 
> George Digkas
> ________________________________
> From: Andy Seaborne <[email protected]>
> Sent: Thursday, November 16, 2017 12:55 PM
> To: Γεώργιος Δίγκας; [email protected]
> Subject: Re: Issues fixed in Apache Jena
> 
> Do not take git as complete!
> 
> Jena started in 2000.
> https://lists.w3.org/Archives/Public/www-rdf-interest/2000Aug/0128.html
> 
> Jena 2.0 was released 2003-08-28.
> A whole 40M including dependencies! A 14.7M zip file!
> https://sourceforge.net/projects/jena/files/
> 
> The whole of SF SVN history was imported by the Apache infrastructure
> team (a herculean effort) into Apache SVN. I don't know how to get to it
> from git, it may not be there and only in SVN.
> 
> The earliest git root commit is for the move to Apache from SF
> [4298106f1e], 6 years ago. (There are 4 root commits due to merges)
> 
> ---
> 
> It's an interesting start, and to make the analysis usefully inform the
> reader as to the state of the project I suggest treating different kinds
> of issues differently, not as uniformly important.
> 
> There are many (, many) minor things and they outweigh the major
> problems. Calling them all "issues" gives them equal weight. Some are
> about canonicalization of the code.
> 
> Yet reformatting the whole code base (if practical, which is arguable)
> then greatly decreases the usefulness of git history. That would be a
> huge loss.
> 
> (NB the "issue" word has a specific meaning for JIRA which a lot of
> Apache projects use. Jena's current total, now, is 1424.)
> 
>     Andy
> 
>> 
>> Thank you in advance!
>> 
>> 
>> With kind regards,
>> 
>> George Digkas
