GitHub user karlhigley commented on the pull request:
https://github.com/apache/spark/pull/9843#issuecomment-160314300
I understand the issue you're pointing out, but it hasn't been a practical
problem, even with hundreds of thousands of terms. The inclusion of explicit
zeros in the output SparseVectors has been a practical problem, which is what
led me to submit this JIRA and PR.

This is my first contribution to Spark, and I'm trying to adhere to the
contribution guidelines. The guidelines suggest that "simple, targeted" changes
are more likely to be accepted than "big bang" changes. It sounds like you're
telling me this PR won't be accepted without making an additional optimization
which adds ~100 lines of code, requires additional tests, and fixes an issue
that was already present in the code before my changes. Is that what you're
saying?