GitHub user huaxingao commented on a diff in the pull request:
https://github.com/apache/spark/pull/20777#discussion_r174935305
--- Diff: python/pyspark/ml/feature.py ---
@@ -465,26 +473,26 @@ class CountVectorizer(JavaEstimator, HasInputCol,
HasOutputCol, JavaMLReadable,
" Default False", typeConverter=TypeConverters.toBoolean)
@keyword_only
- def __init__(self, minTF=1.0, minDF=1.0, vocabSize=1 << 18,
binary=False, inputCol=None,
- outputCol=None):
+ def __init__(self, minTF=1.0, minDF=1.0, maxDF=sys.maxsize,
vocabSize=1 << 18, binary=False,
--- End diff --
Will make the change now. Thanks!
---