Amogh-2404 opened a new pull request, #22497:
URL: https://github.com/apache/datafusion/pull/22497

   ## Which issue does this PR close?
   
   - Closes #22253.
   
   ## Rationale for this change
   
   PostgreSQL returns the input unchanged when `replace` is called with an 
empty `from`. DataFusion instead inserted `to` at every character boundary, 
including both ends, so `replace('abc', '', 'x')` returned `xaxbxcx`. This PR 
brings the behaviour in line with PostgreSQL. It is part of the 
PG-compatibility cleanup tracked in #22247.
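
For illustration only, the difference can be sketched with plain Rust strings. Rust's standard `str::replace` happens to produce the same `xaxbxcx` output for an empty pattern; whether the old DataFusion code delegated to it is not stated here. The function names `replace_old` and `replace_pg` are hypothetical and are not part of this PR.

```rust
/// Mirrors the old output using std: an empty pattern matches at every
/// character boundary, including the start and the end of the string.
/// (Hypothetical helper, not DataFusion code.)
fn replace_old(s: &str, from: &str, to: &str) -> String {
    s.replace(from, to)
}

/// PostgreSQL-compatible behaviour: an empty `from` returns the input unchanged.
/// (Hypothetical helper, not DataFusion code.)
fn replace_pg(s: &str, from: &str, to: &str) -> String {
    if from.is_empty() {
        s.to_string()
    } else {
        s.replace(from, to)
    }
}
```

With these helpers, `replace_old("abc", "", "x")` yields `xaxbxcx`, while `replace_pg("abc", "", "x")` yields `abc`. Non-empty patterns behave the same in both.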
   
   ## What changes are included in this PR?
   
   - `datafusion/functions/src/string/replace.rs`: the empty-`from` branch in 
`apply_replace` now writes the input verbatim instead of inserting `to`. Added 
a `LargeUtf8` unit test for the new behaviour.
   - `datafusion/sqllogictest/test_files/string/string_literal.slt`: four new 
SLT asserts covering the `Utf8`, `Dictionary`, `Utf8View`, and `LargeUtf8` 
paths.
   - `datafusion/sqllogictest/test_files/string/string_query.slt.part`: updated 
four expected rows that were asserting the old buggy output.
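
A minimal sketch of what an empty-`from` branch that writes the input verbatim could look like when the result is appended to a reusable buffer. This is an assumption-laden illustration: `apply_replace_sketch` and its signature are hypothetical and do not reproduce the actual `apply_replace` in `replace.rs`.

```rust
/// Appends `input` to `buf` with every occurrence of `from` replaced by `to`.
/// Hypothetical stand-in for the branch described above, not the real function.
fn apply_replace_sketch(buf: &mut String, input: &str, from: &str, to: &str) {
    // Empty `from`: copy the input verbatim instead of inserting `to`.
    if from.is_empty() {
        buf.push_str(input);
        return;
    }

    // Non-empty `from`: copy the text between matches, then the replacement.
    let mut last = 0;
    for (start, matched) in input.match_indices(from) {
        buf.push_str(&input[last..start]);
        buf.push_str(to);
        last = start + matched.len();
    }
    // Copy whatever follows the final match.
    buf.push_str(&input[last..]);
}
```

Handling the empty case up front keeps the matching loop free of special cases and avoids any per-character work for that input.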
   
   ## Are these changes tested?
   
   Yes. The unit test in `replace.rs` covers the `LargeUtf8` path, and the four 
new SLT asserts in `string_literal.slt` cover the remaining Arrow string 
encodings end-to-end. The full SLT suite passes locally.
   
   ## Are there any user-facing changes?
   
   Yes. `replace(str, '', x)` now returns `str` unchanged instead of inserting 
`x` at every character boundary, including both ends. This matches PostgreSQL.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

