drexler-sky commented on code in PR #49348:
URL: https://github.com/apache/spark/pull/49348#discussion_r1902511923
##########
python/pyspark/sql/functions/builtin.py:
##########
@@ -15580,36 +15639,70 @@ def regexp_substr(str: "ColumnOrName", regexp: "ColumnOrName") -> Column:
def regexp_instr(
str: "ColumnOrName", regexp: "ColumnOrName", idx: Optional[Union[int, Column]] = None
) -> Column:
- r"""Extract all strings in the `str` that match the Java regex `regexp`
+ r"""Returns the position of the first substring in the `str` that match the Java regex `regexp`
and corresponding to the regex group index.
.. versionadded:: 3.5.0
Parameters
----------
- str : :class:`~pyspark.sql.Column` or str
+ str : :class:`~pyspark.sql.Column` or column name
target column to work on.
- regexp : :class:`~pyspark.sql.Column` or str
+ regexp : :class:`~pyspark.sql.Column` or column name
regex pattern to apply.
- idx : int, optional
+ idx : :class:`~pyspark.sql.Column` or int, optional
Review Comment:
`idx` doesn't seem to be used in the Scala implementation:
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala#L1153
Since it appears to have no effect on the result, should we remove this parameter from the Python signature and docstring for now?
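
To make the concern concrete, here is a minimal pure-Python sketch of the behavior I'm describing. It is not Spark's implementation: `regexp_instr_sketch` is a hypothetical helper, it uses Python's `re` rather than Java regex, and it assumes the function returns the 1-based position of the first match and 0 when there is no match. The point is only that an `idx` argument that is accepted but never read cannot change the result.

```python
import re
from typing import Optional


def regexp_instr_sketch(s: str, pattern: str, idx: Optional[int] = None) -> int:
    """Hypothetical model of regexp_instr, for illustration only.

    Assumption: the result is the 1-based start position of the first
    match, or 0 if the pattern does not match anywhere in `s`.
    """
    # `idx` is accepted but deliberately ignored, mirroring the observation
    # that the linked Scala code does not appear to use it.
    m = re.search(pattern, s)
    return m.start() + 1 if m else 0


# Same input, different idx values: the result does not change.
print(regexp_instr_sketch("xx 2b", r"\d+(a|b|m)"))     # 4
print(regexp_instr_sketch("xx 2b", r"\d+(a|b|m)", 1))  # 4
print(regexp_instr_sketch("abc", r"\d+"))              # 0
```

If that matches the real behavior, documenting `idx` as "Column or int, optional" suggests it selects a group when it may do nothing.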
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]