loftiest commented on code in PR #56498:
URL: https://github.com/apache/spark/pull/56498#discussion_r3429169327


##########
common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java:
##########
@@ -1246,6 +1246,117 @@ public int indexOf(UTF8String v, int start) {
     return -1;
   }
 
+  /**
+   * Finds the {@code occurrence}-th occurrence of {@code pattern} in this string,
+   * starting the search at the specified position.
+   * When {@code start} is positive, the search proceeds forward from the
+   * {@code start}-th character (1‑based). When {@code start} is negative, the
+   * search proceeds backward from the {@code |start|}-th character from the end.
+   * Overlapping matches are supported (e.g. "aa" in "aaa" returns 0 and 1 for
+   * occurrence 1 and 2 respectively).
+   *
+   * @param pattern    the substring to search for
+   * @param start      1‑based start position; if negative, search direction is reversed
+   * @param occurrence which occurrence to return (must be >= 1)
+   * @return 0‑based character index of the match, or -1 if not found
+   */
+  public int indexOf(UTF8String pattern, int start, int occurrence) {
+    assert occurrence > 0;
+    if (pattern.numBytes() == 0) {
+      return indexOfEmpty(start);

Review Comment:
   I've added tests for an empty substring with a negative start.
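
   For anyone reading along, here is a rough sketch of the non-empty-pattern semantics described in the Javadoc above. It is only an illustration on plain `java.lang.String`, not the `UTF8String` code from this PR. The class and method names (`OccurrenceSearchSketch`, `indexOfNth`) are hypothetical, it counts UTF-16 code units rather than UTF-8 characters, and it assumes that in a backward search a match must begin at or before the start position. It deliberately rejects the empty pattern, since that case is handled by `indexOfEmpty` in the PR and is not modeled here.

```java
class OccurrenceSearchSketch {

  // Hypothetical helper illustrating the documented behavior; NOT the PR's implementation.
  // start is 1-based; a negative start searches backward from the |start|-th char from the end.
  static int indexOfNth(String s, String pattern, int start, int occurrence) {
    if (occurrence < 1 || start == 0 || pattern.isEmpty()) {
      throw new IllegalArgumentException("unsupported arguments in this sketch");
    }
    int found = 0;
    if (start > 0) {
      int from = start - 1; // convert 1-based position to 0-based index
      while (true) {
        int idx = s.indexOf(pattern, from);
        if (idx < 0) {
          return -1;
        }
        if (++found == occurrence) {
          return idx;
        }
        from = idx + 1; // step by one so overlapping matches are counted
      }
    }
    // Backward search. Assumption: the match must begin at or before this index.
    int from = s.length() + start;
    while (from >= 0) {
      int idx = s.lastIndexOf(pattern, from);
      if (idx < 0) {
        return -1;
      }
      if (++found == occurrence) {
        return idx;
      }
      from = idx - 1; // step back by one so overlapping matches are counted
    }
    return -1;
  }

  public static void main(String[] args) {
    System.out.println(indexOfNth("aaa", "aa", 1, 1));  // 0
    System.out.println(indexOfNth("aaa", "aa", 1, 2));  // 1
    System.out.println(indexOfNth("aaa", "aa", -1, 1)); // 1
  }
}
```

   Under these assumptions, "aa" in "aaa" has only two overlapping matches, so asking for a third occurrence yields -1 in either direction.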



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

