Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22227#discussion_r212781563

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
    @@ -2554,7 +2554,27 @@ object functions {
        * @since 1.5.0
        */
       def split(str: Column, pattern: String): Column = withExpr {
    -    StringSplit(str.expr, lit(pattern).expr)
    +    StringSplit(str.expr, lit(pattern).expr, lit(-1).expr)
    +  }
    +
    +  /**
    +   * Splits str around pattern (pattern is a regular expression) up to `limit-1` times.
    +   *
    +   * The limit parameter controls the number of times the pattern is applied and therefore
    +   * affects the length of the resulting array. If the limit n is greater than zero then the
    +   * pattern will be applied at most n - 1 times, the array's length will be no greater than
    +   * n, and the array's last entry will contain all input beyond the last matched delimiter.
    +   * If n is non-positive then the pattern will be applied as many times as possible and the
    +   * array can have any length. If n is zero then the pattern will be applied as many times as
    +   * possible, the array can have any length, and trailing empty strings will be discarded.
    +   *
    +   * @note Pattern is a string representation of the regular expression.
    +   *
    +   * @group string_funcs
    +   * @since 1.5.0
    --- End diff --

    `1.5.0` -> `2.4.0`
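
For readers following the review: the limit semantics quoted in the new scaladoc are the same as those of `java.lang.String.split(regex, limit)` (the wording is lifted almost verbatim from the JDK javadoc), so plain Java can illustrate the three cases without depending on the not-yet-merged Spark overload:

```java
import java.util.Arrays;

public class SplitLimitDemo {
    public static void main(String[] args) {
        String s = "a,b,c,,";

        // limit > 0: pattern applied at most limit-1 times; the last entry
        // keeps everything beyond the last matched delimiter.
        System.out.println(Arrays.toString(s.split(",", 2)));
        // -> [a, b,c,,]

        // limit < 0 (non-positive): pattern applied as many times as
        // possible; trailing empty strings are kept.
        System.out.println(Arrays.toString(s.split(",", -1)));
        // -> [a, b, c, , ]

        // limit == 0: pattern applied as many times as possible, but
        // trailing empty strings are discarded.
        System.out.println(Arrays.toString(s.split(",", 0)));
        // -> [a, b, c]
    }
}
```

The `lit(-1).expr` default added to the existing two-argument `split` above matches the `limit = -1` case: split as many times as possible and keep trailing empty strings, preserving the old behavior.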