devanshu0987 opened a new pull request, #19630:
URL: https://github.com/apache/datafusion/pull/19630

   ## Which issue does this PR close?
   - Closes #.
   
   ## Rationale for this change
   
   In the Postgres implementation of `translate`, duplicate characters in the `from` argument are ignored, and the first occurrence wins.
   
   DuckDB has a similar implementation: if the character already exists in the map, its index/mapping is not updated.
   
https://github.com/duckdb/duckdb/blob/4b7a6b7bd0f8c968bfecab08e801cdc1f0a5cdfd/extension/core_functions/scalar/string/translate.cpp#L45
   
   Before this change, DataFusion returns
   ```
   > SELECT translate('abcabc', 'aa', 'de');
   +-------------------------------------------------+
   | translate(Utf8("abcabc"),Utf8("aa"),Utf8("de")) |
   +-------------------------------------------------+
   | ebcebc                                          |
   +-------------------------------------------------+
   1 row(s) fetched. 
   Elapsed 0.001 seconds.
   ```
   
   While DuckDB returns
   ```
   D SELECT translate('abcabc', 'aa', 'de');
   ┌─────────────────────────────────┐
   │ translate('abcabc', 'aa', 'de') │
   │             varchar             │
   ├─────────────────────────────────┤
   │ dbcdbc                          │
   └─────────────────────────────────┘
   ```
   
   Postgres returns
   ```
   SELECT translate('abcabc', 'aa', 'de')
   
   Output:
   
    translate 
   -----------
    dbcdbc
   (1 row)
   ```
   ## What changes are included in this PR?
   
   - If there are duplicate characters present in `from`, the first occurrence 
wins.
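
The first-occurrence-wins rule can be sketched as follows. This is only a minimal illustration, not the code in this PR: `translate_first_wins` is a hypothetical helper, and it works on `char`s for simplicity, which may differ from how DataFusion segments strings. It also assumes the usual Postgres rule that a `from` character with no counterpart in `to` is deleted.

```rust
use std::collections::HashMap;

/// Hypothetical helper illustrating first-occurrence-wins semantics.
/// Not DataFusion's actual implementation.
fn translate_first_wins(input: &str, from: &str, to: &str) -> String {
    let to_chars: Vec<char> = to.chars().collect();

    // Each `from` character maps to Some(replacement), or to None when
    // `to` is too short, which means the character is deleted.
    let mut mapping: HashMap<char, Option<char>> = HashMap::new();
    for (i, c) in from.chars().enumerate() {
        // `or_insert` leaves an existing entry untouched, so a duplicate
        // character in `from` never overrides the first mapping.
        mapping.entry(c).or_insert(to_chars.get(i).copied());
    }

    input
        .chars()
        .filter_map(|c| match mapping.get(&c) {
            Some(replacement) => *replacement, // mapped: replace or delete
            None => Some(c),                   // unmapped: keep as-is
        })
        .collect()
}
```

With this sketch, `translate_first_wins("abcabc", "aa", "de")` yields `dbcdbc`, matching the Postgres and DuckDB output above. Replacing `entry(c).or_insert(...)` with a plain `insert` would make the last occurrence win and produce `ebcebc` instead.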
   
   ## Are these changes tested?
   
   - New unit tests are added that cover this behaviour.
   
   ## Are there any user-facing changes?
   This is a contract change in some sense. If someone has taken a dependency on the previous behaviour, they will see different results. However, I am unsure and need help to properly categorize this.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

