Github user phegstrom commented on a diff in the pull request:
https://github.com/apache/spark/pull/22227#discussion_r217000102
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2546,15 +2546,51 @@ object functions {
def soundex(e: Column): Column = withExpr { SoundEx(e.expr) }
/**
- * Splits str around pattern (pattern is a regular expression).
+ * Splits str around matches of the given regex.
*
- * @note Pattern is a string representation of the regular expression.
+ * @param str a string expression to split
+ * @param regex a string representing a regular expression. The regex string should be
+ * a Java regular expression.
*
* @group string_funcs
* @since 1.5.0
*/
- def split(str: Column, pattern: String): Column = withExpr {
- StringSplit(str.expr, lit(pattern).expr)
+ def split(str: Column, regex: String): Column = withExpr {
+ StringSplit(str.expr, Literal(regex), Literal(-1))
+ }
+
+ /**
+ * Splits str around matches of the given regex.
+ *
+ * @param str a string expression to split
+ * @param regex a string representing a regular expression. The regex string should be
+ * a Java regular expression.
+ * @param limit an integer expression which controls the number of times the regex is applied.
+ * <ul>
+ * <li>limit greater than 0
+ * <ul>
+ * <li>
+ * The resulting array's length will not be more than limit,
+ * and the resulting array's last entry will contain all input
+ * beyond the last matched regex.
+ * </li>
+ * </ul>
+ * </li>
+ * <li>limit less than or equal to 0
+ * <ul>
+ * <li>
+ * `regex` will be applied as many times as possible,
+ * and the resulting array can be of any size.
+ * </li>
+ * </ul>
+ * </li>
+ * </ul>
--- End diff --
Oh, I thought you wanted the explanations as sub-bullets. Will make that
change, @HyukjinKwon.
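
For reference, here is a rough sketch of the limit behaviour the doc comment describes, written against plain `java.lang.String.split` rather than Spark itself. `SplitLimitSketch` and `splitWithLimit` are hypothetical names used only for illustration, and mapping every `limit <= 0` to `-1` is an assumption taken from the doc comment's wording ("applied as many times as possible"), not from the actual `StringSplit` implementation.

```scala
// Hypothetical helper, not part of Spark: approximates the documented
// limit semantics using java.lang.String.split(regex, limit).
object SplitLimitSketch {
  def splitWithLimit(str: String, regex: String, limit: Int): List[String] = {
    // Assumption: any limit <= 0 means "no limit". Passing -1 to Java's
    // split also keeps trailing empty strings, unlike a limit of 0.
    val effectiveLimit = if (limit <= 0) -1 else limit
    str.split(regex, effectiveLimit).toList
  }

  def main(args: Array[String]): Unit = {
    // limit > 0: at most `limit` entries, the last one holds the remainder.
    println(splitWithLimit("oneAtwoBthreeC", "[ABC]", 2))
    // limit <= 0: the regex is applied as many times as possible.
    println(splitWithLimit("oneAtwoBthreeC", "[ABC]", -1))
  }
}
```

With a limit of 2 the input is cut once and the rest is left intact in the last entry; with a non-positive limit every match is used, so the trailing `C` produces a final empty string.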