[
https://issues.apache.org/jira/browse/SPARK-7721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566727#comment-14566727
]
Josh Rosen commented on SPARK-7721:
-----------------------------------
I played around with {{coverage.py}} a bit this morning and set up a script
which runs the Python unit tests with coverage, combines the coverage data
files, then generates a combined HTML report. You can find my code at
https://gist.github.com/JoshRosen/60d590b1cdc271d332e5; just clone that Gist
and configure the environment variables properly, then run the bash script from
the Gist directory.
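For readers who have not used {{coverage.py}} before, the run / combine / report flow described above maps onto three of its command-line subcommands. The following is only a rough sketch of that sequence and is not the contents of the Gist: the helper name {{coverage_commands}} is hypothetical, and the module names in the usage example are illustrative assumptions.

```python
import shlex


def coverage_commands(test_modules, html_dir="htmlcov"):
    """Return the argv lists for a run / combine / report sequence.

    Hypothetical helper for illustration; it only builds the commands
    and does not execute them.
    """
    # One "coverage run" per test module. --parallel-mode makes each run
    # write its own uniquely named data file instead of overwriting one.
    commands = [
        ["coverage", "run", "--parallel-mode", "-m", module]
        for module in test_modules
    ]
    # Merge the per-run data files into a single data file.
    commands.append(["coverage", "combine"])
    # Render the merged data as an HTML report in html_dir.
    commands.append(["coverage", "html", "-d", html_dir])
    return commands


if __name__ == "__main__":
    # Module names below are assumptions chosen for illustration.
    for argv in coverage_commands(["pyspark.tests", "pyspark.sql.tests"]):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Each printed line can be run from a shell once the environment is set up so that the listed modules are importable.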
One gotcha: I don't think that this is properly capturing coverage metrics for
Python worker processes. This may actually be somewhat complicated because I'm
not sure that our use of {{fork()}} in {{daemon.py}} will play nicely with
{{coverage.py}}'s parallel coverage file support (the feature that writes
each process's coverage data to a separate file). We may have to reach a
bit more deeply into PySpark's internals in order to integrate coverage metrics
for worker-side code, perhaps by adding code to programmatically start the
coverage capturing after the fork. It would be great if someone wants to work
on this, although I imagine that worker-side coverage is a lower priority than
having any form of basic coverage for the driver-side code.
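To make the "start coverage after the fork" idea concrete, here is a minimal sketch of what that could look like in isolation. It is not taken from {{daemon.py}} and has not been tried against PySpark: the function name {{run_in_forked_child}} and its {{measure}} flag are hypothetical, and the only {{coverage.py}} calls assumed are {{coverage.Coverage(data_suffix=True)}}, {{start()}}, {{stop()}} and {{save()}}. It requires a platform that provides {{os.fork()}}.

```python
import os


def run_in_forked_child(work, measure=False):
    """Fork, run work() in the child, and return the child's exit code.

    Hypothetical helper for illustration. When measure is true, the child
    starts its own coverage measurement after the fork.
    """
    pid = os.fork()
    if pid == 0:
        # Child process.
        cov = None
        status = 1
        try:
            if measure:
                import coverage  # third-party; imported only when needed

                # data_suffix=True gives each process its own data file,
                # so the files can be merged later with "coverage combine".
                cov = coverage.Coverage(data_suffix=True)
                cov.start()
            work()
            status = 0
        finally:
            try:
                # os._exit() skips atexit handlers, so the data has to be
                # saved explicitly before the child exits.
                if cov is not None:
                    cov.stop()
                    cov.save()
            finally:
                os._exit(status)
    # Parent process: wait for the child and report how it exited.
    _, raw_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(raw_status) if os.WIFEXITED(raw_status) else -1
```

The explicit {{stop()}} and {{save()}} are the main point of the sketch: a forked child that leaves through {{os._exit()}} never runs exit handlers, so data collected in the child would otherwise not be written out.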
> Generate test coverage report from Python
> -----------------------------------------
>
> Key: SPARK-7721
> URL: https://issues.apache.org/jira/browse/SPARK-7721
> Project: Spark
> Issue Type: Test
> Components: PySpark, Tests
> Reporter: Reynold Xin
>
> It would be great to have a test coverage report for Python. Compared with Scala,
> it is trickier to understand the coverage without coverage reports in Python
> because we employ both docstring tests and unit tests in test files.