I'm looking forward to some kind of reply, if you find the time.
On Tue, Nov 11, 2014 at 12:00 PM, Carsten Otto
<[email protected]> wrote:
Hi Marc,
After some trouble understanding your mail, I think I now see what
your main question is.
You are correct in that whatever result JaCoCo returns, the analyzed
code itself stays the same. It is exactly as good or bad as it was
before running the analysis. No change made to JaCoCo in the past,
including the change I propose here, alters that at all.
However, and this is the important part, JaCoCo is not used to
change code. It is used to show the developer where code should be
changed. For example, developers see which code is not tested, which
prompts them to write more tests. These tests are of real value to
the project (you might say tests are not strictly necessary for
perfect code, but I think we both agree that good tests are very
valuable). So JaCoCo helps in identifying necessary changes
("Here's a scary blob of red code!") and in confirming that the
problem is solved after the developer adds more tests ("Hey, look,
it's green now! Go find some other red blob!").
Now we need to talk about false positives. I believe JaCoCo is
correct ("sound") in the sense that no uncovered code is ever marked
as covered/green. However, it is "incorrect" in the sense that
covered code might be reported as uncovered/red. I'd rather call
this problem "incomplete" than "incorrect" or "unsound". Any opcode
marked as uncovered although it in fact was executed is a false
positive. The obvious goal is to produce as few of these false
positives as possible while still being correct/sound as explained
above. JaCoCo is quite good at this, and this (together with the
reporting, EclEmma, and a reasonable execution speed penalty) is the
main reason for its success. I really like JaCoCo and EclEmma!
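To make the effect concrete, here is a small made-up Java example
(the names are mine, not from any real code base) of the kind of
false positive I mean. JaCoCo places probes at the ends of basic
blocks, so when an implicit exception aborts a block, the lines that
did execute before the throw can still be reported as red:

```java
// Hypothetical illustration: the probe at the end of the block in
// work() never fires because the NPE aborts the block, so a
// block-based coverage tool reports the first two lines as
// uncovered even though they executed.
public class ProbeFalsePositive {
    static int executedLines = 0;

    static void work(String s) {
        executedLines++;   // executes, but may be reported uncovered
        executedLines++;   // executes, but may be reported uncovered
        s.length();        // throws NPE for null; block probe never hit
        executedLines++;   // genuinely not executed
    }

    public static void main(String[] args) {
        try {
            work(null);
        } catch (NullPointerException expected) {
            // the first two lines ran, although coverage may show them red
        }
        System.out.println(executedLines); // prints 2
    }
}
```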
However, there still are false positives. Every single false
positive is annoying and confuses the user. As an extreme example, a
virus scanner that detects all infected files but also raises a
false positive every few minutes is useless: you can't write a
letter without being interrupted ten times by it. The same principle
applies to JaCoCo. If it reports too many false positives, the
developer gets annoyed and has to do more manual analysis. Depending
on how demanding the developer is, and on how many false positives
are reported, this can lead to the developer not using or trusting
JaCoCo at all.
The change I am proposing has exactly one advantage: it decreases
the number of false positives. In other words, it does not change
code, and it does not directly add value to the project. However, it
helps JaCoCo users get the results they really need. They spend less
time dealing with false positives and can use that time to work on
the actually uncovered code (leading to better tests, and to more
value). If you look at past comments and questions, some of them
posted to this mailing list, you see that JaCoCo users are confused
by false positives and would like to have this problem solved.
So, I hope this covers the "what does it all mean" part of your
question.
As a rather personal remark, I'd like to add that the percentage
number reported by JaCoCo can also be used to provide value. For
example, in a multi-million-Euro project for a well-known company
with several hundred thousand employees, we used measurements
including code coverage to answer the following questions: Should we
throw away the existing code and start over? Should we just refactor
it? Where exactly are the problems that should be solved soon? Which
parts of the code base are not as bad? How bad is it overall, and
how much do we want that to change? A more precise percentage
reported by JaCoCo leads to better decisions and, thus, to more
value.
I answered your more detailed questions below.
On Mon, Nov 10, 2014 at 9:18 PM, Marc R. Hoffmann
<[email protected]> wrote:
It would be interesting to see some examples of highlighted source
code for both versions of JaCoCo.
Sadly, my reports (generated by Maven) do not let me see highlighted
source code. Do you know how that can be changed easily?
I can provide the HTML reports generated for the projects I
mentioned in my previous report, though. In these you can see (and
directly compare) the percentages and absolute numbers of missed
instructions for individual packages/classes/methods.
What do you think about having a separate version of JaCoCo which
is more precise (and, sadly, slower)?
Speaking for the JaCoCo project we will stay with one version. We
can't handle multiple versions and this will also confuse users as
many users use JaCoCo in different tools.
OK, so we are talking about either including the changes or a fork
(as I really feel the need to get rid of this annoying
incompleteness).
Can you think of a way to speed up JaCoCo while still having
better coverage results?
Not sure whether this can be implemented, but I have been thinking
about installing exception handlers along with the instrumented code
for quite some time.
I also experimented with this approach. Sadly, it is worse, as you
would need a probe for every single opcode that might throw an
exception, in addition to said handler(s).
Best regards,
Carsten
--
Carsten Otto
[email protected]
www.c-otto.de