andygrove commented on code in PR #4461:
URL: https://github.com/apache/datafusion-comet/pull/4461#discussion_r3319474823


##########
docs/source/contributor-guide/spark_expressions_support.md:
##########
@@ -566,33 +635,88 @@
 - [ ] regexp_extract_all
 - [ ] regexp_instr
 - [x] regexp_replace
+  - Spark 3.4.3 (audited 2026-05-27): identical to 3.5.8.
+  - Spark 3.5.8 (audited 2026-05-27): baseline. `RegExpReplace(subject, regexp, rep, pos)` with foldable `pos > 0`; uses Java `Pattern`. Comet supports only `pos = 1` (other offsets fall back) and injects a `'g'` flag because DataFusion's `regexp_replace` stops at the first match by default.
+  - Spark 4.0.1 (audited 2026-05-27): adds raw-string literal support at the parser level and `nullIntolerant: Boolean = true`; runtime semantics unchanged.
+  - Known limitation: regex semantics differ (Rust `regex` crate vs Java `Pattern`); `RegExp.isSupportedPattern` currently returns `false` for every pattern, so the path always requires `spark.comet.expression.regexp.allowIncompatible=true`.
 - [ ] regexp_substr
 - [x] repeat
+  - Spark 3.4.3 (audited 2026-05-27): identical to 3.5.8.
+  - Spark 3.5.8 (audited 2026-05-27): baseline. `StringRepeat(str, times)` with `nullSafeEval(s, n) = s.repeat(n)`; `UTF8String.repeat` returns the empty string for `n <= 0`. Comet casts `times` to `LongType` and delegates to DataFusion `repeat`.
+  - Spark 4.0.1 (audited 2026-05-27): adds `nullIntolerant: Boolean` field; `dataType` becomes `str.dataType` (collation-tracking). Semantics unchanged for `UTF8_BINARY`.
+  - Known divergence: DataFusion `repeat` throws on negative counts instead of returning the empty string Spark produces. Currently surfaced via `getCompatibleNotes` only (https://github.com/apache/datafusion-comet/issues/4462).

Review Comment:
   This was a hallucination - this issue did exist once, but is already fixed. 
I closed the issue.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

