Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22227#discussion_r213519898
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2554,7 +2554,27 @@ object functions {
* @since 1.5.0
*/
def split(str: Column, pattern: String): Column = withExpr {
- StringSplit(str.expr, lit(pattern).expr)
+ StringSplit(str.expr, Literal(pattern), Literal(-1))
+ }
+
+ /**
+ * Splits str around pattern (pattern is a regular expression).
+ *
+ * The limit parameter controls the number of times the pattern is applied and therefore
+ * affects the length of the resulting array. If the limit n is greater than zero then the
+ * pattern will be applied at most n - 1 times, the array's length will be no greater than
+ * n, and the array's last entry will contain all input beyond the last matched delimiter.
+ * If n is non-positive then the pattern will be applied as many times as possible and the
+ * array can have any length. If n is zero then the pattern will be applied as many times as
+ * possible, the array can have any length, and trailing empty strings will be discarded.
--- End diff --
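For context on the wording above: these are the semantics of `java.lang.String#split(String, int)`, which the new overload appears to mirror. A quick runnable illustration of the three limit cases, using plain `String#split`:

```scala
val s = "a,b,c,,"

// limit > 0: pattern applied at most limit - 1 times; the last entry
// keeps everything after the last matched delimiter.
s.split(",", 2).toSeq   // Seq("a", "b,c,,")

// limit < 0: pattern applied as many times as possible;
// trailing empty strings are kept.
s.split(",", -1).toSeq  // Seq("a", "b", "c", "", "")

// limit == 0: pattern applied as many times as possible,
// but trailing empty strings are discarded.
s.split(",", 0).toSeq   // Seq("a", "b", "c")
```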
Can you copy the SQL function's documentation here? You could also describe the
parameters via `@param`.
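For example, something along these lines could work (just a sketch; the wording is adapted from `java.lang.String#split`, and the `@since` tag is a placeholder):

```scala
/**
 * Splits str around matches of the given pattern.
 *
 * @param str a string expression to split
 * @param pattern a string representing a regular expression to split on
 * @param limit an integer controlling the number of times the pattern is applied:
 *              if greater than zero, the pattern is applied at most limit - 1 times,
 *              the array's length is no greater than limit, and the last entry
 *              contains all input beyond the last matched delimiter; if non-positive,
 *              the pattern is applied as many times as possible; if zero, trailing
 *              empty strings are additionally discarded.
 * @since 2.4.0
 */
def split(str: Column, pattern: String, limit: Int): Column = withExpr {
  StringSplit(str.expr, Literal(pattern), Literal(limit))
}
```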
---