[ 
https://issues.apache.org/jira/browse/BEAM-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15868375#comment-15868375
 ] 

Stas Levin edited comment on BEAM-1399 at 2/15/17 7:06 PM:
-----------------------------------------------------------

Having spent a while researching this, I can share the following findings.

I believe you are correct in that the coverage is inaccurate: the {{report}} 
goal does not capture coverage contributed by modules external to the tests.
{{report-aggregate}} does take such modules into account, but it seems to 
assume that the external module (e.g. {{DirectRunner}}) is all about 
integration tests, even though it has unit tests of its own which are not 
reported, so the coverage is, yet again, inaccurate.
In addition, as far as I could see, {{report-aggregate}} does not provide a 
solution for the rest of the modules, assuming we would like to obtain 
coverage across the entire Beam project and not just the SDK. This is also the 
assumption throughout the rest of this comment, so if it's wrong and we are 
fine with covering the SDK only, my conclusions below may not apply.

The general problem of obtaining code coverage for an arbitrary-depth, 
multi-module Maven project is heavily discussed ^1-3^ and is believed to be 
NP-complete.
Nonetheless, some heuristics are out there:
# Introduce a new module that is *dependent on all the modules in the 
project*; this allows using {{Jacoco}}'s {{report-aggregate}} goal.
# Hack our way.
## Introduce a new reporter module
## Add {{maven-antrun-plugin}} to the new module and set up an {{Ant}} task to 
gain more fine-grained control ^4^ over coverage report generation.
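
To make heuristic (1) concrete, the aggregator module would amount to roughly 
the following {{pom.xml}} sketch. The artifact ids, version, and execution 
binding below are illustrative, not actual Beam coordinates:

```xml
<!-- Sketch of a coverage-aggregator module for heuristic (1).
     It must declare a dependency on EVERY module whose coverage
     should be merged; artifact ids and version are illustrative. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-coverage-aggregator</artifactId>
  <version>0.6.0-SNAPSHOT</version>

  <dependencies>
    <dependency>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-sdks-java-core</artifactId>
      <version>${project.version}</version>
    </dependency>
    <!-- ...and so on, one <dependency> per module in the project,
         which is exactly the manual labor that does not scale. -->
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.jacoco</groupId>
        <artifactId>jacoco-maven-plugin</artifactId>
        <executions>
          <execution>
            <id>aggregate</id>
            <!-- report-aggregate reads the exec files of this
                 module's dependencies, hence the dependency list -->
            <phase>verify</phase>
            <goals>
              <goal>report-aggregate</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
```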

I do not believe (1) is appropriate for Beam since it requires tons of manual 
labor, and does not scale well. This leaves us with (2).
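
For completeness, the {{Ant}}-based hack (2) could look roughly like the 
snippet below, placed in the new reporter module. The directory layout, report 
title, and JaCoCo version are assumptions on my part; the task itself is the 
{{ReportTask}} from the JaCoCo Ant docs ^4^:

```xml
<!-- Sketch: drive JaCoCo's Ant ReportTask from maven-antrun-plugin
     to merge every *.exec file found under the reactor root.
     Paths, report title and version are illustrative. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-antrun-plugin</artifactId>
  <executions>
    <execution>
      <phase>verify</phase>
      <goals>
        <goal>run</goal>
      </goals>
      <configuration>
        <target>
          <taskdef name="jacoco-report"
                   classname="org.jacoco.ant.ReportTask"
                   classpathref="maven.plugin.classpath"/>
          <jacoco-report>
            <executiondata>
              <!-- coverage data written by each module's agent -->
              <fileset dir="${project.basedir}/.."
                       includes="**/target/*.exec"/>
            </executiondata>
            <structure name="Beam (aggregated)">
              <classfiles>
                <fileset dir="${project.basedir}/.."
                         includes="**/target/classes/**/*.class"/>
              </classfiles>
              <sourcefiles encoding="UTF-8">
                <fileset dir="${project.basedir}/.."
                         includes="**/src/main/java/**/*.java"/>
              </sourcefiles>
            </structure>
            <html destdir="${project.build.directory}/coverage"/>
          </jacoco-report>
        </target>
      </configuration>
    </execution>
  </executions>
  <dependencies>
    <dependency>
      <groupId>org.jacoco</groupId>
      <artifactId>org.jacoco.ant</artifactId>
      <version>0.7.9</version>
    </dependency>
  </dependencies>
</plugin>
```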

To go with (2) we first need to address the following issues:
# We have test classes that essentially act as both unit tests and integration 
tests (from {{maven}}'s perspective). Such classes are characterised by having 
both regular test methods and test methods decorated with {{@NeedsRunner}} or 
{{@RunnableOnService}}, which are run by {{maven-surefire-plugin}} during the 
{{integration-test}} phase (as opposed to the {{test}} phase for regular test 
methods). From my experiments, this scenario is not well accommodated ^5^ in 
{{jacoco-maven-plugin}}.
# Reassigning {{@NeedsRunner}}- and {{@RunnableOnService}}-based tests to be 
triggered during the {{test}} phase does not do the trick, probably because 
they scan dependencies (for tests) which are not yet ready when the {{test}} 
phase executes. I assume this was the reason they were configured to run as 
part of the {{verify}} phase in the first place.
# I believe these issues can be alleviated by moving {{@NeedsRunner}} and 
{{@RunnableOnService}} tests to separate *integration test* classes, which 
will execute during the {{integration-test}} phase and thus avoid interfering 
with the tests that run during the {{test}} phase.
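
To illustrate the split in (3), it could be expressed as two 
{{maven-surefire-plugin}} executions keyed on the JUnit categories. The 
fully-qualified category names below are my assumption of where these 
annotations live and should be verified against the actual packages:

```xml
<!-- Sketch: plain unit tests run in the test phase; the
     @NeedsRunner/@RunnableOnService categories run later, in
     integration-test. Category FQCNs below are assumed. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <executions>
    <execution>
      <!-- overrides the default execution bound to the test phase -->
      <id>default-test</id>
      <phase>test</phase>
      <configuration>
        <excludedGroups>org.apache.beam.sdk.testing.NeedsRunner,org.apache.beam.sdk.testing.RunnableOnService</excludedGroups>
      </configuration>
    </execution>
    <execution>
      <id>runner-backed-tests</id>
      <phase>integration-test</phase>
      <goals>
        <goal>test</goal>
      </goals>
      <configuration>
        <groups>org.apache.beam.sdk.testing.NeedsRunner,org.apache.beam.sdk.testing.RunnableOnService</groups>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With this in place, JaCoCo's {{test}}-phase data and {{integration-test}}-phase 
data no longer come from the same classes in the same execution, which is the 
interference described above.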

My impression is that this is doable, but the benefit-cost ratio seems quite 
questionable.
Perhaps we could look at external tools such as 
[SonarQube|https://www.sonarqube.org], which specialise in providing software 
metrics such as test coverage and the like.
There is also a [Jenkins JaCoCo Plugin|https://wiki.jenkins-ci.org/display/JENKINS/JaCoCo+Plugin], 
which I have not looked into.

1. 
https://groups.google.com/forum/#!searchin/jacoco/%22multi$20module%22$20coverage%7Csort:relevance
2. https://github.com/jacoco/jacoco/wiki/MavenMultiModule
3. 
https://www.google.co.il/search?q=jacoco+multi+module+coverage&oq=jacoco+multi+module+coverage&aqs=chrome.0.0j69i57j69i60l3.7899j0j4&sourceid=chrome&ie=UTF-8
4. http://www.eclemma.org/jacoco/trunk/doc/ant.html
5. http://www.eclemma.org/jacoco/trunk/doc/classids.html



> Code coverage numbers are not accurate
> --------------------------------------
>
>                 Key: BEAM-1399
>                 URL: https://issues.apache.org/jira/browse/BEAM-1399
>             Project: Beam
>          Issue Type: Bug
>          Components: build-system, sdk-java-core, testing
>            Reporter: Daniel Halperin
>              Labels: newbie, starter
>
> We've started adding Java Code Coverage numbers to PRs using the jacoco 
> plugin. However, we are getting very low coverage reported for things like 
> the Java SDK core.
> My belief is that this is happening because we test the bulk of the SDK not 
> in the SDK module, but in fact in the DirectRunner and other similar modules.
> JaCoCo has a {{report:aggregate}} target that might do the trick, but with a 
> few minutes of playing with it I wasn't able to make it work satisfactorily. 
> Basic work in https://github.com/apache/beam/pull/1800
> This is a good "random improvement" issue for anyone to pick up.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
