cloud-fan commented on code in PR #56240:
URL: https://github.com/apache/spark/pull/56240#discussion_r3345649955
##########
sql/api/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -5401,6 +5411,16 @@ object functions {
def regexp_replace(e: Column, pattern: Column, replacement: Column): Column =
Column.fn("regexp_replace", e, pattern, replacement)
+ /**
+ * Replace all substrings of the specified string value that match regexp with rep, starting at
+ * `position`.
+ *
+ * @group string_funcs
+ * @since 4.2.0
Review Comment:
Same as the other overload: `@since` should be `4.3.0`, not `4.2.0`.
```suggestion
* @since 4.3.0
```
##########
sql/api/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -5392,6 +5392,16 @@ object functions {
def regexp_replace(e: Column, pattern: String, replacement: String): Column =
regexp_replace(e, lit(pattern), lit(replacement))
+ /**
+ * Replace all substrings of the specified string value that match regexp with rep, starting at
+ * `position`.
+ *
+ * @group string_funcs
+ * @since 4.2.0
Review Comment:
`@since` should be `4.3.0`, not `4.2.0`: 4.2 is already cut (`branch-4.2`
exists) and master is on `5.0.0-SNAPSHOT`, so this new overload first ships in
4.3.0. Same correction as
[SPARK-56820](https://issues.apache.org/jira/browse/SPARK-56820) for
`counter_diff`.
```suggestion
* @since 4.3.0
```
##########
python/pyspark/sql/functions/builtin.py:
##########
@@ -16350,14 +16350,19 @@ def regexp_extract_all(
@_try_remote_functions
def regexp_replace(
- string: "ColumnOrName", pattern: Union[str, Column], replacement: Union[str, Column]
+ string: "ColumnOrName",
+ pattern: Union[str, Column],
+ replacement: Union[str, Column],
+ position: Optional[Union[int, Column]] = None,
) -> Column:
r"""Replace all substrings of the specified string value that match regexp with replacement.
.. versionadded:: 1.5.0
.. versionchanged:: 3.4.0
Supports Spark Connect.
+ .. versionchanged:: 4.2.0
Review Comment:
`versionchanged` should be `4.3.0`: the `position` parameter first appears
in 4.3.0, not in the already-cut 4.2.0.
```suggestion
.. versionchanged:: 4.3.0
```
##########
sql/api/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -5392,6 +5392,16 @@ object functions {
def regexp_replace(e: Column, pattern: String, replacement: String): Column =
regexp_replace(e, lit(pattern), lit(replacement))
+ /**
+ * Replace all substrings of the specified string value that match regexp with rep, starting at
Review Comment:
Nit: the prose says "starting at `position`", but the parameter is named
`pos`. The backticked `position` implies an identifier that doesn't exist in
this signature. Consider matching the prose to the param name, or renaming
`pos` to `position` to align with the PySpark binding.
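To make the "starting at" wording concrete, here is a rough pure-Python sketch of what the doc comment describes. It is only an illustration under stated assumptions: `replace_from` is a hypothetical helper, it uses Python's `re` module rather than Spark's regex engine, and it assumes the start offset is 1-based and that text before it is passed through unchanged.

```python
import re


def replace_from(s: str, pattern: str, rep: str, pos: int = 1) -> str:
    """Hypothetical helper: replace regex matches in `s`, searching from 1-based `pos`."""
    if pos < 1:
        raise ValueError("pos must be >= 1 (assumed 1-based)")
    start = pos - 1
    if start >= len(s):
        # Assumption for this sketch: nothing to search past the end of the string.
        return s
    # Text before the start offset is copied through unchanged;
    # only the remainder is searched and rewritten.
    return s[:start] + re.sub(pattern, rep, s[start:])


print(replace_from("100-200", r"\d+", "num"))     # num-num
print(replace_from("100-200", r"\d+", "num", 5))  # 100-num
```

Slicing before matching is a simplification: anchors and lookbehinds see the slice rather than the full string, so this may differ from an engine that searches a region of the original input.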
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]