Re: -cov LOC is inadequate for 1 liner branching; need a metric based on branching

2018-02-12 Thread Johan Engelen via Digitalmars-d

On Monday, 5 February 2018 at 19:32:37 UTC, Timothee Cour wrote:

just filed https://issues.dlang.org/show_bug.cgi?id=18377:


If I may add something semi-relevant to the current discussion, 
have a look here:

https://clang.llvm.org/docs/SourceBasedCodeCoverage.html

It is non-trivial to add llvm-cov support to LDC, but half the 
work is already done in LDC [*].
The current instrumentation implementation in Clang and LDC for 
PGO/coverage only instruments branch points, and is therefore 
faster than DMD's `-cov` (also present in LDC) but less precise: 
it assumes an exception-less execution flow. It's however quite 
easy to add precision and deal with exception-flow, at the cost 
of execution speed of course. It'll be interesting to investigate 
the cost/benefit! Let me know if you're interested in working on 
it.


A copy from Clang's documentation page, from which some cool 
ideas/capabilities can be glanced (needs monospaced font):

```
1|   20|#define BAR(x) ((x) || (x))
   ^20 ^2
2|2|template  void foo(T x) {
3|   22|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
   ^22 ^20  ^20^20
4|2|}
--
| void foo(int):
|  2|1|template  void foo(T x) {
|  3|   11|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
| ^11 ^10  ^10^10
|  4|1|}
--
| void foo(int):
|  2|1|template  void foo(T x) {
|  3|   11|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
| ^11 ^10  ^10^10
|  4|1|}
--
```

cheers,
  Johan

[*] PGO instrumentation is the same, I can give some guidance to 
anybody who wants to work on it.


Re: -cov LOC is inadequate for 1 liner branching; need a metric based on branching

2018-02-11 Thread Walter Bright via Digitalmars-d

On 2/11/2018 3:01 PM, Timothee Cour wrote:

It's about `line coverage` vs `branch coverage` (see exact definition
in linked article),


Ok, I see now, thanks for your patience and the clarification.



that difference is very large in practice.


Only if one cares about the percentage metric. Pragmatically, I'm much more 
interested in exactly what was not covered rather than the percentage, and any 
good QA department should be focused on that rather than the percent metric.


BTW, when I wrote the -cov implementation, I threw in the % metric just for fun 
.



What's more, code instrumentation to enable branch coverage is not
more complex to implement compared to line coverage (I would even
venture it's less complex and less costly).


I'm not convinced of that, as you've still got the:

   if (a) b;

problem. Furthermore, in the first link they alluded to the thrown exception 
problem, with some gobbledy-gook answer on how they dealt with it (they punted). 
Sequence point coverage doesn't have that problem.


Next, in the python linked example, they found it necessary to add a bunch of 
pragma's to give needed information to the branch counter. I have a hard time 
seeing anyone using that unless they're forced to by management. The idea with 
dmd's coverage is you type -cov and you get a coverage report. Done.


The coverage report is a simple text file that is both readable by itself and 
trivially parse-able with a program for those who really want a GUI presentation 
of the results, or who want to integrate it with github (as has been done for 
us, yay!).


(Back in the olden daze, a coverage analyzer was a separate product that came 
with a 200 page manual. Who bought it? Managers. Who used it? Nobody. You'd see 
the manual on programmers' shelves, still in its shrink wrap.)


---

I'm actually glad to see people complain about -cov and want to improve it. It 
shows it is successful - people are using it! As Bjarne Stroustrup once said, 
there are two kinds of products - ones people complain about, and ones people 
don't use.


Re: -cov LOC is inadequate for 1 liner branching; need a metric based on branching

2018-02-11 Thread Timothee Cour via Digitalmars-d
> -cov coverage percentage is the line coverage, not the sequence point 
> coverage [...]
> makes it fuzzy in proportion to how many lines contain multiple sequence 
> points

Based on your comment I'm pretty sure you're still not getting my
point, so apologies if I was unclear and let me try to explain better:

it's not about `line coverage` vs `sequence point coverage`, as that
difference is not very large (indeed, just 'fuzzy').

It's about `line coverage` vs `branch coverage` (see exact definition
in linked article), that difference is very large in practice.

here's my example, but more concretely explained:
```
void main(int a){
if(a>0){
  statement1(); // line 3
  statement2(); // line 4
...
  statement100(); // line 102
} else{
  statement101(); // line 104
}
}
unittest{ fun(1); }
```
* line coverage is around 99%.
* sequence point coverage is also 99% (and would be close to that if
some lines had multiple statements)
* branch coverage is 50%.

This is not an artificial example, this is the common case.

What's more, code instrumentation to enable branch coverage is not
more complex to implement compared to line coverage (I would even
venture it's less complex and less costly).



On Sun, Feb 11, 2018 at 2:32 PM, Walter Bright via Digitalmars-d
 wrote:
> On 2/11/2018 1:55 PM, Timothee Cour wrote:
>>
>> I think you're missing my main point: it's explained here
>>
>> https://www.ncover.com/support/docs/extras/code-coverage/understanding-branch-coverage
>> but the gist is that line based coverage is over-counting:
>> ```
>> if(A)
>>// 100 lines of code
>> else
>>// 1 line of code
>> ```
>> gives a line coverage of ~ 99% vs a branch coverage of ~50%
>> (assuming `else` branch never covered in unittests)
>>
>> What matters as far as bugs are concerned is that 50% of cases are
>> covered. Increasing the size of the `if(A)` branch increases line
>> coverage (which is irrelevant) but not branch coverage.
>
>
> I understand that point. The -cov coverage percentage is the line coverage,
> not the sequence point coverage. (Hence it will never be greater than 100%,
> and it will never underestimate the coverage. It would be more accurately
> termed an "upper bound" on the coverage.)
>
> And yes, that makes it fuzzy in proportion to how many lines contain
> multiple sequence points. Eliminating that fuzziness does require a vast
> increase in the complexity of the -cov implementation.


Re: -cov LOC is inadequate for 1 liner branching; need a metric based on branching

2018-02-11 Thread Walter Bright via Digitalmars-d

On 2/11/2018 1:55 PM, Timothee Cour wrote:

I think you're missing my main point: it's explained here
https://www.ncover.com/support/docs/extras/code-coverage/understanding-branch-coverage
but the gist is that line based coverage is over-counting:
```
if(A)
   // 100 lines of code
else
   // 1 line of code
```
gives a line coverage of ~ 99% vs a branch coverage of ~50%
(assuming `else` branch never covered in unittests)

What matters as far as bugs are concerned is that 50% of cases are
covered. Increasing the size of the `if(A)` branch increases line
coverage (which is irrelevant) but not branch coverage.


I understand that point. The -cov coverage percentage is the line coverage, not 
the sequence point coverage. (Hence it will never be greater than 100%, and it 
will never underestimate the coverage. It would be more accurately termed an 
"upper bound" on the coverage.)


And yes, that makes it fuzzy in proportion to how many lines contain multiple 
sequence points. Eliminating that fuzziness does require a vast increase in the 
complexity of the -cov implementation.


Re: -cov LOC is inadequate for 1 liner branching; need a metric based on branching

2018-02-11 Thread Timothee Cour via Digitalmars-d
I think you're missing my main point: it's explained here
https://www.ncover.com/support/docs/extras/code-coverage/understanding-branch-coverage
but the gist is that line based coverage is over-counting:
```
if(A)
  // 100 lines of code
else
  // 1 line of code
```
gives a line coverage of ~ 99% vs a branch coverage of ~50%
(assuming `else` branch never covered in unittests)

What matters as far as bugs are concerned is that 50% of cases are
covered. Increasing the size of the `if(A)` branch increases line
coverage (which is irrelevant) but not branch coverage.


On Sun, Feb 11, 2018 at 1:32 PM, Walter Bright via Digitalmars-d
 wrote:
> On 2/5/2018 11:32 AM, Timothee Cour wrote:
>>
>> just filed https://issues.dlang.org/show_bug.cgi?id=18377:
>>
>> `dmd -cov -run main.d` shows 100% coverage; this is misleading since a
>> branch is not taken:
>>
>> ```
>> void main(){
>>int a;
>>if(false) a+=10;
>> }
>> ```
>
>
> Consider how -cov works today:
>
>   2|  x = 3; y = 4;
>   1|  ++x;
>
> The first line has a count of 2, because there are two statements and each
> contributes one increment to the line.
>
>   1|  x = 3; // reference count
>   2|  if (true && false) c;
>   3|  if (true && true) c;
>   1|  if (false && b) c;
>
> The sequence points are each counted separately. So, by comparing with the
> 'reference count', you can see which sequence points are executed. Also, if
> one finds that unattractive, the code can be organized like:
>
>   1|  if (true &&
>   1|  false)
>   0| c;
>
> and the separate counts will be revealed instead of aggregated.
>
> I agree that this is not ideal, however:
>
> 1. it works
> 2. it is simple and robust
> 3. the display to the user is simple
> 4. it's easy to aggregate multiple runs together with simple text processing
> code
> 5. one can 'fix' it with a stylistic change in the formatting of the source
> code
>
> Any further improvement would be a large increase in complexity of the
> implementation, and I don't know of reasonable way to present this to the
> user in a textual format.
>
> Is it worth it? I don't think so. Like builtin unittests, the big win with
> -cov is it is *trivial* to use, which encourages its adoption. It's a 99%
> solution, with 99% of the benefits, with 1% of the implementation effort. We
> should be expending effort elsewhere than putting an additional 99% effort
> to squeeze out that last 1% of benefit.


Re: -cov LOC is inadequate for 1 liner branching; need a metric based on branching

2018-02-11 Thread Walter Bright via Digitalmars-d

On 2/5/2018 11:32 AM, Timothee Cour wrote:

just filed https://issues.dlang.org/show_bug.cgi?id=18377:

`dmd -cov -run main.d` shows 100% coverage; this is misleading since a
branch is not taken:

```
void main(){
   int a;
   if(false) a+=10;
}
```


Consider how -cov works today:

  2|  x = 3; y = 4;
  1|  ++x;

The first line has a count of 2, because there are two statements and each 
contributes one increment to the line.


  1|  x = 3; // reference count
  2|  if (true && false) c;
  3|  if (true && true) c;
  1|  if (false && b) c;

The sequence points are each counted separately. So, by comparing with the 
'reference count', you can see which sequence points are executed. Also, if one 
finds that unattractive, the code can be organized like:


  1|  if (true &&
  1|  false)
  0| c;

and the separate counts will be revealed instead of aggregated.

I agree that this is not ideal, however:

1. it works
2. it is simple and robust
3. the display to the user is simple
4. it's easy to aggregate multiple runs together with simple text processing 
code
5. one can 'fix' it with a stylistic change in the formatting of the source code

Any further improvement would be a large increase in complexity of the 
implementation, and I don't know of reasonable way to present this to the user 
in a textual format.


Is it worth it? I don't think so. Like builtin unittests, the big win with -cov 
is it is *trivial* to use, which encourages its adoption. It's a 99% solution, 
with 99% of the benefits, with 1% of the implementation effort. We should be 
expending effort elsewhere than putting an additional 99% effort to squeeze out 
that last 1% of benefit.


Re: -cov LOC is inadequate for 1 liner branching; need a metric based on branching

2018-02-11 Thread timotheecour via Digitalmars-d

On Monday, 5 February 2018 at 19:32:37 UTC, Timothee Cour wrote:

just filed https://issues.dlang.org/show_bug.cgi?id=18377:


@wilzbach
It probably makes sense to have a look at how other languages 
are dealing with this. Seems to be a common problem


some links:
* 
https://www.ncover.com/support/docs/extras/code-coverage/understanding-branch-coverage => "Why do we need the Branch Coverage metric?"
* http://coverage.readthedocs.io/en/coverage-4.4.2/branch.html => 
"coverage run --branch myprog.py"