GitHub user holdenk commented on the pull request:
https://github.com/apache/spark/pull/12404#issuecomment-210617504
So I thought the unified doc string of:
> If true, all non zero results are set to 1. This is useful for discrete
> probabilistic models that model binary events rather than integer counts.
> Default False.
might make sense in the two cases where we are using it (since `minTF` is
described as a filter). If you think this isn't clear enough with the `minTF`
case on `HashingTF`, I'm happy to close this PR for now and revisit if we add
`minTF` to `CountVectorizer`.
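
For illustration, here is a minimal plain-Python sketch of the behaviour that doc string describes. It is not Spark's implementation; `term_counts` is a hypothetical helper, and it only shows how a `binary` flag turns integer counts into presence indicators.

```python
from collections import Counter


def term_counts(tokens, binary=False):
    """Count tokens; hypothetical helper, not part of Spark.

    With binary=True every nonzero count is set to 1, so the result
    records whether a term occurred rather than how often.
    """
    counts = Counter(tokens)
    if binary:
        # Every key in a Counter built from tokens has a nonzero count.
        return {term: 1 for term in counts}
    return dict(counts)


doc = ["spark", "ml", "spark", "tf", "spark"]
print(term_counts(doc))               # integer counts
print(term_counts(doc, binary=True))  # binary presence indicators
```

With `binary=True` the repeated term `spark` contributes 1 instead of 3, which is the input a model of binary events expects.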