GitHub user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r173369004
--- Diff: python/pyspark/ml/feature.py ---
@@ -465,26 +522,26 @@ class CountVectorizer(JavaEstimator, HasInputCol, HasOutputCol, JavaMLReadable,
                   " Default False", typeConverter=TypeConverters.toBoolean)
     @keyword_only
-    def __init__(self, minTF=1.0, minDF=1.0, vocabSize=1 << 18, binary=False, inputCol=None,
-                 outputCol=None):
+    def __init__(self, minTF=1.0, minDF=1.0, maxDF=2 ** 63 - 1, vocabSize=1 << 18, binary=False,
--- End diff --
Thank you very much for the comments. Will make changes.