Re: -cov LOC is inadequate for 1 liner branching; need a metric based on branching
On Monday, 5 February 2018 at 19:32:37 UTC, Timothee Cour wrote: just filed https://issues.dlang.org/show_bug.cgi?id=18377: If I may add something semi-relevant to the current discussion, have a look here: https://clang.llvm.org/docs/SourceBasedCodeCoverage.html It is non-trivial to add llvm-cov support to LDC, but half the work is already done in LDC [*]. The current instrumentation implementation in Clang and LDC for PGO/coverage only instruments branch points, and is therefore faster than DMD's `-cov` (also present in LDC) but less precise: it assumes an exception-less execution flow. It's however quite easy to add precision and deal with exception-flow, at the cost of execution speed of course. It'll be interesting to investigate the cost/benefit! Let me know if you're interested in working on it. A copy from Clang's documentation page, from which some cool ideas/capabilities can be glanced (needs monospaced font): ``` 1| 20|#define BAR(x) ((x) || (x)) ^20 ^2 2|2|template void foo(T x) { 3| 22| for (unsigned I = 0; I < 10; ++I) { BAR(I); } ^22 ^20 ^20^20 4|2|} -- | void foo(int): | 2|1|template void foo(T x) { | 3| 11| for (unsigned I = 0; I < 10; ++I) { BAR(I); } | ^11 ^10 ^10^10 | 4|1|} -- | void foo(int): | 2|1|template void foo(T x) { | 3| 11| for (unsigned I = 0; I < 10; ++I) { BAR(I); } | ^11 ^10 ^10^10 | 4|1|} -- ``` cheers, Johan [*] PGO instrumentation is the same, I can give some guidance to anybody who wants to work on it.
Re: -cov LOC is inadequate for 1 liner branching; need a metric based on branching
On 2/11/2018 3:01 PM, Timothee Cour wrote: It's about `line coverage` vs `branch coverage` (see exact definition in linked article), Ok, I see now, thanks for your patience and the clarification. that difference is very large in practice. Only if one cares about the percentage metric. Pragmatically, I'm much more interested in exactly what was not covered rather than the percentage, and any good QA department should be focused on that rather than the percent metric. BTW, when I wrote the -cov implementation, I threw in the % metric just for fun . What's more, code instrumentation to enable branch coverage is not more complex to implement compared to line coverage (I would even venture it's less complex and less costly). I'm not convinced of that, as you've still got the: if (a) b; problem. Furthermore, in the first link they alluded to the thrown exception problem, with some gobbledy-gook answer on how they dealt with it (they punted). Sequence point coverage doesn't have that problem. Next, in the python linked example, they found it necessary to add a bunch of pragma's to give needed information to the branch counter. I have a hard time seeing anyone using that unless they're forced to by management. The idea with dmd's coverage is you type -cov and you get a coverage report. Done. The coverage report is a simple text file that is both readable by itself and trivially parse-able with a program for those who really want a GUI presentation of the results, or who want to integrate it with github (as has been done for us, yay!). (Back in the olden daze, a coverage analyzer was a separate product that came with a 200 page manual. Who bought it? Managers. Who used it? Nobody. You'd see the manual on programmers' shelves, still in its shrink wrap.) --- I'm actually glad to see people complain about -cov and want to improve it. It shows it is successful - people are using it! As Bjarne Stroustrup once said, there are two kinds of products - ones people complain about, and ones people don't use.
Re: -cov LOC is inadequate for 1 liner branching; need a metric based on branching
> -cov coverage percentage is the line coverage, not the sequence point > coverage [...] > makes it fuzzy in proportion to how many lines contain multiple sequence > points Based on your comment I'm pretty sure you're still not getting my point, so apologies if I was unclear and let me try to explain better: it's not about `line coverage` vs `sequence point coverage`, as that difference is not very large (indeed, just 'fuzzy'). It's about `line coverage` vs `branch coverage` (see exact definition in linked article), that difference is very large in practice. here's my example, but more concretely explained: ``` void main(int a){ if(a>0){ statement1(); // line 3 statement2(); // line 4 ... statement100(); // line 102 } else{ statement101(); // line 104 } } unittest{ fun(1); } ``` * line coverage is around 99%. * sequence point coverage is also 99% (and would be close to that if some lines had multiple statements) * branch coverage is 50%. This is not an artificial example, this is the common case. What's more, code instrumentation to enable branch coverage is not more complex to implement compared to line coverage (I would even venture it's less complex and less costly). On Sun, Feb 11, 2018 at 2:32 PM, Walter Bright via Digitalmars-dwrote: > On 2/11/2018 1:55 PM, Timothee Cour wrote: >> >> I think you're missing my main point: it's explained here >> >> https://www.ncover.com/support/docs/extras/code-coverage/understanding-branch-coverage >> but the gist is that line based coverage is over-counting: >> ``` >> if(A) >>// 100 lines of code >> else >>// 1 line of code >> ``` >> gives a line coverage of ~ 99% vs a branch coverage of ~50% >> (assuming `else` branch never covered in unittests) >> >> What matters as far as bugs are concerned is that 50% of cases are >> covered. Increasing the size of the `if(A)` branch increases line >> coverage (which is irrelevant) but not branch coverage. > > > I understand that point. The -cov coverage percentage is the line coverage, > not the sequence point coverage. (Hence it will never be greater than 100%, > and it will never underestimate the coverage. It would be more accurately > termed an "upper bound" on the coverage.) > > And yes, that makes it fuzzy in proportion to how many lines contain > multiple sequence points. Eliminating that fuzziness does require a vast > increase in the complexity of the -cov implementation.
Re: -cov LOC is inadequate for 1 liner branching; need a metric based on branching
On 2/11/2018 1:55 PM, Timothee Cour wrote: I think you're missing my main point: it's explained here https://www.ncover.com/support/docs/extras/code-coverage/understanding-branch-coverage but the gist is that line based coverage is over-counting: ``` if(A) // 100 lines of code else // 1 line of code ``` gives a line coverage of ~ 99% vs a branch coverage of ~50% (assuming `else` branch never covered in unittests) What matters as far as bugs are concerned is that 50% of cases are covered. Increasing the size of the `if(A)` branch increases line coverage (which is irrelevant) but not branch coverage. I understand that point. The -cov coverage percentage is the line coverage, not the sequence point coverage. (Hence it will never be greater than 100%, and it will never underestimate the coverage. It would be more accurately termed an "upper bound" on the coverage.) And yes, that makes it fuzzy in proportion to how many lines contain multiple sequence points. Eliminating that fuzziness does require a vast increase in the complexity of the -cov implementation.
Re: -cov LOC is inadequate for 1 liner branching; need a metric based on branching
I think you're missing my main point: it's explained here https://www.ncover.com/support/docs/extras/code-coverage/understanding-branch-coverage but the gist is that line based coverage is over-counting: ``` if(A) // 100 lines of code else // 1 line of code ``` gives a line coverage of ~ 99% vs a branch coverage of ~50% (assuming `else` branch never covered in unittests) What matters as far as bugs are concerned is that 50% of cases are covered. Increasing the size of the `if(A)` branch increases line coverage (which is irrelevant) but not branch coverage. On Sun, Feb 11, 2018 at 1:32 PM, Walter Bright via Digitalmars-dwrote: > On 2/5/2018 11:32 AM, Timothee Cour wrote: >> >> just filed https://issues.dlang.org/show_bug.cgi?id=18377: >> >> `dmd -cov -run main.d` shows 100% coverage; this is misleading since a >> branch is not taken: >> >> ``` >> void main(){ >>int a; >>if(false) a+=10; >> } >> ``` > > > Consider how -cov works today: > > 2| x = 3; y = 4; > 1| ++x; > > The first line has a count of 2, because there are two statements and each > contributes one increment to the line. > > 1| x = 3; // reference count > 2| if (true && false) c; > 3| if (true && true) c; > 1| if (false && b) c; > > The sequence points are each counted separately. So, by comparing with the > 'reference count', you can see which sequence points are executed. Also, if > one finds that unattractive, the code can be organized like: > > 1| if (true && > 1| false) > 0| c; > > and the separate counts will be revealed instead of aggregated. > > I agree that this is not ideal, however: > > 1. it works > 2. it is simple and robust > 3. the display to the user is simple > 4. it's easy to aggregate multiple runs together with simple text processing > code > 5. one can 'fix' it with a stylistic change in the formatting of the source > code > > Any further improvement would be a large increase in complexity of the > implementation, and I don't know of reasonable way to present this to the > user in a textual format. > > Is it worth it? I don't think so. Like builtin unittests, the big win with > -cov is it is *trivial* to use, which encourages its adoption. It's a 99% > solution, with 99% of the benefits, with 1% of the implementation effort. We > should be expending effort elsewhere than putting an additional 99% effort > to squeeze out that last 1% of benefit.
Re: -cov LOC is inadequate for 1 liner branching; need a metric based on branching
On 2/5/2018 11:32 AM, Timothee Cour wrote: just filed https://issues.dlang.org/show_bug.cgi?id=18377: `dmd -cov -run main.d` shows 100% coverage; this is misleading since a branch is not taken: ``` void main(){ int a; if(false) a+=10; } ``` Consider how -cov works today: 2| x = 3; y = 4; 1| ++x; The first line has a count of 2, because there are two statements and each contributes one increment to the line. 1| x = 3; // reference count 2| if (true && false) c; 3| if (true && true) c; 1| if (false && b) c; The sequence points are each counted separately. So, by comparing with the 'reference count', you can see which sequence points are executed. Also, if one finds that unattractive, the code can be organized like: 1| if (true && 1| false) 0| c; and the separate counts will be revealed instead of aggregated. I agree that this is not ideal, however: 1. it works 2. it is simple and robust 3. the display to the user is simple 4. it's easy to aggregate multiple runs together with simple text processing code 5. one can 'fix' it with a stylistic change in the formatting of the source code Any further improvement would be a large increase in complexity of the implementation, and I don't know of reasonable way to present this to the user in a textual format. Is it worth it? I don't think so. Like builtin unittests, the big win with -cov is it is *trivial* to use, which encourages its adoption. It's a 99% solution, with 99% of the benefits, with 1% of the implementation effort. We should be expending effort elsewhere than putting an additional 99% effort to squeeze out that last 1% of benefit.
Re: -cov LOC is inadequate for 1 liner branching; need a metric based on branching
On Monday, 5 February 2018 at 19:32:37 UTC, Timothee Cour wrote: just filed https://issues.dlang.org/show_bug.cgi?id=18377: @wilzbach It probably makes sense to have a look at how other languages are dealing with this. Seems to be a common problem some links: * https://www.ncover.com/support/docs/extras/code-coverage/understanding-branch-coverage => "Why do we need the Branch Coverage metric?" * http://coverage.readthedocs.io/en/coverage-4.4.2/branch.html => "coverage run --branch myprog.py"