devanshu0987 commented on PR #3054:
URL: https://github.com/apache/datafusion/pull/3054#issuecomment-3707088778

   Hi @Dandandan or @alamb, could you share the historical reasons why 
`lpad` and `rpad` were not covered in this PR? 
   Should we make this contract change so that they are Postgres compatible, 
or is the current behaviour a deliberate choice?
   
   For the 3-argument form of `lpad`, `(string, length, fill)`, `string` is 
split on grapheme clusters, while `fill` is iterated by code point.
   Postgres, as you mentioned, splits both the `string` and `fill` 
equivalents by code point.
   
   `datafusion/functions/src/unicode/lpad.rs`
   ```
   let graphemes = string.graphemes(true).collect::<Vec<&str>>();
   let fill_chars = fill.chars().collect::<Vec<char>>();
   ```
   
   DataFusion result
   ```
   > select lpad('á', 1);
   +--------------------------+
   | lpad(Utf8("á"),Int64(1)) |
   +--------------------------+
   | á                        |
   +--------------------------+
   1 row(s) fetched. 
   Elapsed 0.075 seconds.
   ```
   
   Postgres result
   ```
   "lpad"
   "a"
   ```
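
To make the difference concrete, here is a minimal sketch of a code-point-based `lpad`, using only the Rust standard library. The helper name `lpad_code_points` is hypothetical; this is neither DataFusion's nor Postgres's implementation. It assumes the `'á'` in the query above is the decomposed form (`a` followed by U+0301), which would be consistent with the two results shown.

```rust
/// Left-pad `string` to `length` code points using `fill`.
/// Both arguments are counted and truncated by Unicode code point (`char`),
/// not by grapheme cluster. Hypothetical helper for illustration only.
fn lpad_code_points(string: &str, length: usize, fill: &str) -> String {
    let chars: Vec<char> = string.chars().collect();
    if length <= chars.len() {
        // Truncate to the first `length` code points; this can cut a
        // grapheme cluster apart (e.g. drop a combining accent).
        return chars[..length].iter().collect();
    }
    let fill_chars: Vec<char> = fill.chars().collect();
    if fill_chars.is_empty() {
        // Nothing to pad with: return the input unchanged.
        return string.to_string();
    }
    let pad_len = length - chars.len();
    let mut out = String::with_capacity(string.len() + pad_len);
    // Repeat the fill code points cyclically until the target length is met.
    out.extend(fill_chars.iter().cycle().take(pad_len));
    out.push_str(string);
    out
}
```

With this sketch, `lpad_code_points("a\u{301}", 1, " ")` keeps only the base letter `a`, because the input is two code points long. A grapheme-based split treats the same input as a single unit and keeps the accent.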
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

