Github user yogeshg commented on a diff in the pull request:
https://github.com/apache/spark/pull/20829#discussion_r176228007
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala ---
@@ -49,32 +53,57 @@ class VectorAssembler @Since("1.4.0") (@Since("1.4.0")
override val uid: String)
@Since("1.4.0")
def setOutputCol(value: String): this.type = set(outputCol, value)
+ /** @group setParam */
+ @Since("2.4.0")
+ def setHandleInvalid(value: String): this.type = set(handleInvalid,
value)
+
+ /**
+ * Param for how to handle invalid data (NULL values). Options are
'skip' (filter out rows with
--- End diff --
Behavior of options already included, explanation of column length included
here, run time information included in the VectorAssembler class's
documentation. Thanks for the suggestion, this is super important!
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]