[
https://issues.apache.org/jira/browse/SPARK-36504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Haejoon Lee updated SPARK-36504:
--------------------------------
Description:
Much of the code in pandas-on-Spark is not covered by tests, for example:
* (Series|DataFrame).to_clipboard
!Screen Shot 2021-08-13 at 9.56.48 AM.png|width=548,height=125!
* the `value` and `method` arguments of Series.fillna
!Screen Shot 2021-08-13 at 9.59.23 AM.png|width=551,height=34!
A red line in the screen captures above means "this line is not being tested".
The current test coverage of pandas-on-Spark is 89.93% in total, 93.43% for
frame.py (which contains the DataFrame API), 89.04% for indexing.py (which
contains the Index API) and 93.43% for series.py (which contains the Series API).
We do not necessarily need to cover 100% of the code, since some APIs such as
`DataFrame.to_delta` are untestable for now, but we should cover as much of the
code as possible for the health of the project.
You can find more missing tests and coverage percentages in the [code cov
report|https://app.codecov.io/gh/apache/spark].
was:
Much of the code in pandas-on-Spark is not covered by tests, for example:
* (Series|DataFrame).to_clipboard
!Screen Shot 2021-08-13 at 9.56.48 AM.png|width=548,height=125!
* the `value` and `method` arguments of Series.fillna
!Screen Shot 2021-08-13 at 9.59.23 AM.png|width=551,height=34!
A red line in the screen captures above means "this line is not being tested".
The current test coverage of pandas-on-Spark is 89.93% in total, 93.43% for
frame.py (which contains the DataFrame API), 89.04% for indexing.py (which
contains the Index API) and 93.43% for series.py (which contains the Series API).
We do not necessarily need to cover 100% of the code, since some APIs such as
`DataFrame.to_delta` are not easy to test for now, but we should cover as much
of the code as possible for the health of the project.
You can find more missing tests and coverage percentages in the [code cov
report|https://app.codecov.io/gh/apache/spark].
> Improve test coverage for pandas API on Spark
> ---------------------------------------------
>
> Key: SPARK-36504
> URL: https://issues.apache.org/jira/browse/SPARK-36504
> Project: Spark
> Issue Type: Umbrella
> Components: PySpark
> Affects Versions: 3.3.0
> Reporter: Haejoon Lee
> Priority: Major
> Attachments: Screen Shot 2021-08-13 at 9.56.48 AM.png, Screen Shot
> 2021-08-13 at 9.59.23 AM.png
>
>
> Much of the code in pandas-on-Spark is not covered by tests, for example:
> * (Series|DataFrame).to_clipboard
> !Screen Shot 2021-08-13 at 9.56.48 AM.png|width=548,height=125!
>
> * the `value` and `method` arguments of Series.fillna
> !Screen Shot 2021-08-13 at 9.59.23 AM.png|width=551,height=34!
>
> A red line in the screen captures above means "this line is not being tested".
> The current test coverage of pandas-on-Spark is 89.93% in total, 93.43% for
> frame.py (which contains the DataFrame API), 89.04% for indexing.py (which
> contains the Index API) and 93.43% for series.py (which contains the Series API).
> We do not necessarily need to cover 100% of the code, since some APIs such as
> `DataFrame.to_delta` are untestable for now, but we should cover as much of the
> code as possible for the health of the project.
> You can find more missing tests and coverage percentages in the [code cov
> report|https://app.codecov.io/gh/apache/spark].
>
>