[
https://issues.apache.org/jira/browse/SPARK-57436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated SPARK-57436:
-----------------------------------
Labels: pull-request-available (was: )
> Support start and occurrence parameters in instr function
> ---------------------------------------------------------
>
> Key: SPARK-57436
> URL: https://issues.apache.org/jira/browse/SPARK-57436
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.3.0
> Reporter: Junlong Li
> Priority: Minor
> Labels: pull-request-available
>
> Currently, Spark's {{instr}} function only supports two arguments and can
> only find the first occurrence of a substring. Users who need to locate a
> specific occurrence or start searching from a certain position must write
> complex workarounds.
> In contrast, Oracle, Impala, and Db2 all support a four-argument version of
> {{{}INSTR{}}}, allowing users to specify a starting position and which
> occurrence to locate, with support for both forward and backward search.
> Adding the same capability to Spark SQL would:
> * Simplify user queries that currently require complex expressions.
> * Align Spark's behavior with Oracle, Impala, and Db2, reducing migration
> friction.
> *Proposed syntax:*
> instr(str, substr [, start [, occurrence]])
>
> *Proposed behavior:*
> * {{start}} (INT, default 1): 1‑based starting position.
> *
> ** Positive: search forward from that position.
> *
> ** Negative: search backward from that position (distance from end of
> string).
> *
> ** If {{{}start = 0{}}}, returns 0.
> * {{occurrence}} (INT, default 1): which occurrence to find, in the
> direction determined by the sign of {{{}start{}}}.
> *
> ** If {{{}occurrence <= 0{}}}, an error is raised.
> * Returns the 1‑based index of the specified occurrence, or 0 if not found.
> * Any argument NULL ⇒ result NULL.
> * Collation awareness is fully preserved.
> *Examples:*
> {code:java}
> instr('abcabc','b', 1, 2) → 5
> instr('abcabc','b', -1, 1) → 5
> instr('abcabc','b', -1, 2) → 2
> instr('abc','b', 0, 1) → 0
> instr('abc','b', 1, 0) → error{code}
> *Compatibility:*
> The existing two-argument syntax remains unchanged.{*}{*}
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]