GitHub user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/19753#discussion_r152172693
--- Diff: python/pyspark/ml/feature.py ---
@@ -2490,7 +2490,8 @@ def setParams(self, inputCols=None, outputCol=None):
@inherit_doc
-class VectorIndexer(JavaEstimator, HasInputCol, HasOutputCol, JavaMLReadable, JavaMLWritable):
+class VectorIndexer(JavaEstimator, HasInputCol, HasOutputCol, HasHandleInvalid, JavaMLReadable,
+                    JavaMLWritable):
"""
Class for indexing categorical feature columns in a dataset of
`Vector`.
--- End diff --
The docstring of `VectorIndexer` still contains a TODO: `Add option for
allowing unknown categories.` Since this change adds `HasHandleInvalid`,
I think we can remove it?