Hey everyone,

Currently, during the Jenkins build process, we don't generate any code
coverage reports for the Python SDK. This email includes a proposal to
generate Python coverage reports during the pre-commit build, upload them
to codecov.io for analysis and visualization, and automatically post the
resulting stats back to GitHub PRs to help developers decide whether their
tests need revision.

You can view and comment on the proposal here
<https://docs.google.com/document/d/1K6zEJnDNvk7TkqDEPwoIa7NTj5KxtUIdq-TGewUgpmo/edit?usp=sharing>,
or read the full text of the proposal at the end of this email. Let me know
what you think, or if there are any suggestions for improvements. Thanks!

Python Coverage Reports For CI/CD

Author: Saavan Nanavati ([email protected])

Reviewer: Udi Meiri ([email protected])

Overview

This is a proposal for generating code coverage reports for the Python SDK
during Jenkins’ pre-commit phase, and uploading them to codecov.io
<http://www.codecov.io> for analysis, with integration back into GitHub
using the service’s sister app <https://github.com/marketplace/codecov>.

This would extend the pre-commit build time, but it would provide valuable
information for developers to revise and improve their tests before their
PR is merged, rather than after, when it’s less likely that developers will
go back to improve their coverage numbers.

This particular 3rd party service has a litany of awesome benefits:

   - It’s free for open-source projects
   - It seamlessly integrates into GitHub via a comment-bot (example here
     <https://github.com/mockito/mockito/pull/1355#issuecomment-377795325>)
   - It overlays coverage report information directly onto GitHub code using
     Sourcegraph <https://docs.codecov.io/docs/browser-extension>
   - It requires no changes to Jenkins, thereby reducing the risk of breaking
     the live test-infra
   - It’s extensible and can later be used for the Java & Go SDKs if it
     proves to be awesome
   - It has an extremely responsive support team that’s happy to help
     open-source projects


A proof-of-concept can be seen here
<https://codecov.io/gh/saavan-google-intern/beam/tree/dd529c6990c9d0d54ade2f5d65bef8542a3237b4/sdks/python/apache_beam>
and here
<https://codecov.io/gh/saavan-google-intern/beam/src/dd529c6990c9d0d54ade2f5d65bef8542a3237b4/sdks/python/apache_beam/typehints/decorators.py>
.

Goals

   - Provide coverage stats for the Python SDK that update with every
     pre-commit run
   - Integrate these reports into GitHub so developers can take advantage of
     the information
   - Open a discussion for how these coverage results can be utilized in
     code reviews


Non-Goals

   - Calculate coverage statistics using external tests located outside of
     the Python SDK


This would be ideal, but it would require not only merging multiple coverage
reports together but also, more importantly, waiting for those tests to be
triggered in the first place. The main advantage of calculating coverage
during pre-commit is that developers can revise their PRs before merging,
and that feedback loop can't be guaranteed if we wait on external tests.

However, it could be something to explore for the future.

Background

Providing code coverage for the Python SDK has been a problem since at
least 2017 (BEAM-2762 <https://issues.apache.org/jira/browse/BEAM-2762>).
The previous solution was to calculate coverage in post-commit with
coverage.py <https://coverage.readthedocs.io/en/coverage-5.1/> and then
send the report to coveralls.io <https://coveralls.io>, which would post
the results to GitHub. At some point, this solution broke; the Tox
environment used to compute coverage, cover
<https://github.com/apache/beam/blob/6c95955b8bd91d4933bba57ccf1c5114563a2d21/sdks/python/tox.ini#L231>,
was turned off but still remains in the codebase.

There have been four main barriers to re-implementing coverage in the past,
each of which is addressed here.


   1. It’s difficult to unify coverage for some integration tests, especially
      ones that rely on 3rd party dependencies like GCP, since it’s not
      possible to calculate coverage for the dependencies.

      - As stated earlier, this is a non-goal for the proposal.

   2. The test reporter outputs results in the same directory, which
      sometimes causes previous results to be overwritten. This occurs when
      using different parameters for the same test (e.g. running a test with
      Dataflow vs DirectRunner).

      - This was mainly a post-commit problem, but it does require
        exploration since it could be an issue for pre-commit. However, even
        in the worst case, the coverage numbers would still be valuable,
        since you can still see how coverage changed relatively between
        commits even if the absolute numbers are slightly inaccurate.

   3. It’s time-consuming and non-trivial to modify and test changes to
      Jenkins.

      - We don’t need to: this proposal integrates directly with codecov.io
        <http://www.codecov.io>, making Jenkins an irrelevant part of the
        testing infrastructure with regards to code coverage. It’s not just
        easier, it’s better, because it provides many benefits (mentioned in
        the Overview section) that a Jenkins-based solution like Cobertura
        <https://cobertura.github.io/cobertura/> doesn’t have. Lastly, using
        codecov.io <http://www.codecov.io> would place less strain and
        maintenance on Jenkins (fewer jobs to run, no need to store the
        reports, etc.).

   4. coveralls.io isn’t the best to work with (email thread
      <https://lists.apache.org/thread.html/r0887fcfb0df820b624679ce63ac0c6c4ea4240da338c92c1d1ab71f2%40%3Cdev.beam.apache.org%3E>)

      - codecov.io <http://www.codecov.io> ;)

Design Details

To utilize codecov.io, the existing Tox environment cover
<https://github.com/apache/beam/blob/6c95955b8bd91d4933bba57ccf1c5114563a2d21/sdks/python/tox.ini#L231>
needs to be added to the list of toxTasks that Gradle runs in pre-commit.
To reduce strain (and avoid largely duplicate information), coverage only
needs to be calculated for one Python version (say 3.7, with cloud
dependencies) rather than all of them.
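
As a rough sketch of that wiring (the task names below, and the exact
signature of Beam's toxTask helper and the pre-commit aggregate task, are
assumptions for illustration only; the real build files may differ):

// In the Python tox test-suite build.gradle (hypothetical location).
// Register the "cover" tox environment as a Gradle task using the existing
// toxTask helper, then hook it into the pre-commit aggregate task so it
// runs on every pre-commit build.
toxTask "testPy37Coverage", "cover"

preCommitPy37.dependsOn "testPy37Coverage"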

To be compatible with Jenkins and codecov.io, the Tox environment should be
configured as follows:

passenv = GIT_* BUILD_* ghprb* CHANGE_ID BRANCH_NAME JENKINS_* CODECOV_*
deps =
    ...
    codecov
commands =
    ...
    codecov -t {env:CODECOV_TOKEN}

There are two options for what goes into the ellipsis (...). First, we can
use coverage.py <https://coverage.readthedocs.io/en/coverage-5.1/> to
compute coverage (which is how it’s currently set up), but that uses the
old nose test runner. Alternatively, we can switch to computing coverage
with pytest <https://docs.pytest.org/en/stable/> and pytest-cov
<https://pypi.org/project/pytest-cov/>, which would give more accurate
numbers and is the preferred solution.
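
As a sketch of the pytest-based option (the dependency list, pytest flags,
and package path below are illustrative assumptions, not the final
configuration), the cover environment could end up looking roughly like this:

# Rough sketch of the "cover" environment after switching to pytest-cov;
# the deps and pytest flags shown here are assumptions.
[testenv:cover]
passenv = GIT_* BUILD_* ghprb* CHANGE_ID BRANCH_NAME JENKINS_* CODECOV_*
deps =
    pytest
    pytest-cov
    codecov
commands =
    pytest --cov=apache_beam --cov-report=xml apache_beam
    codecov

With pytest-cov producing an XML report, the codecov uploader can pick it up
and read the token from the CODECOV_TOKEN environment variable passed
through by Jenkins.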

Additionally, the CODECOV_TOKEN, which can be retrieved here
<https://codecov.io/gh/apache/beam/>, should be added as an environment
variable in Jenkins.

Finally, a YAML file should be added so codecov.io can resolve file path
errors <https://docs.codecov.io/docs/fixing-paths>. This can also be used
to define success thresholds and more (for example, here
<https://github.com/GoogleChrome/lighthouse/blob/master/.codecov.yml>),
though, ideally, this coverage task should never fail a Jenkins build due
to the risk of false negatives.
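
For illustration, a minimal .codecov.yml along those lines might look like
the following (the path fix and the purely informational statuses are
assumptions about how we would configure it, not settled decisions):

# Hypothetical .codecov.yml sketch.
# Prepend sdks/python/ to paths in the uploaded reports so they match the
# repository layout (see the "fixing paths" docs linked above).
fixes:
  - "::sdks/python/"

# Show coverage on PRs, but never fail a status check because of it.
coverage:
  status:
    project:
      default:
        informational: true
    patch:
      default:
        informational: true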

A proof-of-concept for this design can be seen here
<https://codecov.io/gh/saavan-google-intern/beam/tree/dd529c6990c9d0d54ade2f5d65bef8542a3237b4/sdks/python/apache_beam>
and here
<https://codecov.io/gh/saavan-google-intern/beam/src/dd529c6990c9d0d54ade2f5d65bef8542a3237b4/sdks/python/apache_beam/typehints/decorators.py>
(this is code coverage for my fork).
