Github user jkbradley commented on the pull request:
https://github.com/apache/spark/pull/9069#issuecomment-147467355
@JoshRosen I think it's that PySpark has less coverage than Scala, and our
linear algebra code could use some more coverage too. And PySpark SparseVector
is in the intersection of these problems. (Though 4 of those bug fix patches
may have been one patch plus 3 backports.)
Hypothesis looks cool... future work? :)
@bhargav There is not a specific task for adding more tests, but since
you're interested, it'd be awesome if you could check through some of the
PySpark Vector and Matrix APIs and unit tests and see if you can find missing
coverage. Note also that the tests are split between doc tests (in each .py
file) and unit tests (in tests.py files). If you find missing items, can you
please make one or more JIRAs? If you're unsure about any, I'd recommend
making a single JIRA and listing them there; we can then create subtasks for
each major issue. Thanks!
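
For anyone unfamiliar with the doc test / unit test split mentioned above, here is a rough sketch of the two styles side by side. It does not use PySpark at all: `sparse_dot` and `SparseDotTests` are hypothetical stand-ins made up for illustration, not actual PySpark APIs or existing tests.

```python
import unittest


def sparse_dot(size, indices, values, dense):
    """Dot product of a sparse vector (indices/values) with a dense list.

    Hypothetical helper, not a PySpark function. The example below is a
    doc test: it lives in the docstring, next to the code it documents.

    >>> sparse_dot(4, [1, 3], [1.0, 5.5], [1.0, 2.0, 3.0, 4.0])
    24.0
    """
    if len(dense) != size:
        raise ValueError("dimension mismatch")
    # Only the stored (non-zero) entries contribute to the sum.
    return sum(v * dense[i] for i, v in zip(indices, values))


class SparseDotTests(unittest.TestCase):
    """Unit tests: edge cases that would clutter a docstring."""

    def test_empty_sparse_vector(self):
        self.assertEqual(sparse_dot(3, [], [], [1.0, 2.0, 3.0]), 0)

    def test_dimension_mismatch(self):
        with self.assertRaises(ValueError):
            sparse_dot(3, [0], [1.0], [1.0, 2.0])
```

If this were saved as a file, the doc test could be run with `python -m doctest <file>` and the unit tests with `python -m unittest <file>`; missing coverage could show up in either place.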