neilconway opened a new pull request, #22171:
URL: https://github.com/apache/datafusion/pull/22171

   ## Which issue does this PR close?
   
   - Closes #22170.
   
   ## Rationale for this change
   
   This PR refactors and optimizes the `translate` UDF. In particular, we 
switch to the new bulk-NULL string builders, avoid per-row NULL 
computation, and avoid an intermediate string copy by using `append_with` / 
`append_byte_map`.
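
To illustrate what avoiding per-row NULL computation can look like, here is a minimal std-only sketch. It assumes a bit-packed validity bitmap in which a set bit means "non-NULL". The `Validity` type and `union_validity` function are hypothetical stand-ins written for this example; they are not the arrow-rs `NullBuffer` API and not the code in this PR.

```rust
/// Bit-packed validity bitmap: bit `i` set means row `i` is non-NULL.
/// Hypothetical stand-in for an Arrow null buffer (assumption, not a real API).
#[derive(Debug, Clone, PartialEq)]
struct Validity {
    words: Vec<u64>,
    len: usize,
}

impl Validity {
    /// Pack one bool per row into 64-bit words.
    fn from_bools(bits: &[bool]) -> Self {
        let mut words = vec![0u64; (bits.len() + 63) / 64];
        for (i, &valid) in bits.iter().enumerate() {
            if valid {
                words[i / 64] |= 1u64 << (i % 64);
            }
        }
        Self { words, len: bits.len() }
    }

    /// Returns true if row `i` is non-NULL.
    fn is_valid(&self, i: usize) -> bool {
        (self.words[i / 64] >> (i % 64)) & 1 == 1
    }
}

/// Combine the validity of several inputs in bulk: an output row is valid
/// only if it is valid in every input. `None` means "no NULLs at all".
fn union_validity(inputs: &[Option<&Validity>]) -> Option<Validity> {
    let mut acc: Option<Validity> = None;
    for v in inputs.iter().copied().flatten() {
        acc = Some(match acc {
            None => v.clone(),
            Some(mut a) => {
                assert_eq!(a.len, v.len, "inputs must have the same row count");
                // One AND per 64 rows instead of one branch per row.
                for (w, other) in a.words.iter_mut().zip(&v.words) {
                    *w &= *other;
                }
                a
            }
        });
    }
    acc
}
```

With three argument columns, the output bitmap is computed once up front, and the per-row loop then only has to produce string bytes.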
   
   ## What changes are included in this PR?
   
   * Switch from using the Rust StringBuilders to our new bulk-NULL string 
builders
   * Switch from per-row NULL checks to computing the NULL bitmaps with 
`NullBuffer::union_many`
   * Use `append_with` and `append_byte_map` rather than `append_value`, which 
avoids an intermediate scratch buffer
   * Refactor lookup table code to use a single `TranslationTable` enum
   * Add a benchmark for the "varying `from`/`to`, Unicode strings" case
   * Add a unit test
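
The lookup-table and direct-append items above can be sketched roughly as follows. This is a std-only illustration under stated assumptions, not the PR's implementation: only the name `TranslationTable` comes from the description, while the variant names, the `new` constructor, and `translate_into` are hypothetical. The sketch also assumes PostgreSQL-style semantics, where characters in `from` without a counterpart in `to` are deleted and the first occurrence of a duplicated `from` character wins.

```rust
use std::collections::HashMap;

/// Hypothetical single lookup-table type; the PR's actual enum may differ.
enum TranslationTable {
    /// ASCII-only `from`/`to`: one slot per byte.
    /// `Some(b)` emits byte `b`, `None` deletes the input byte.
    Ascii([Option<u8>; 128]),
    /// General case: `Some(c)` replaces, `None` deletes,
    /// and characters missing from the map are kept.
    Unicode(HashMap<char, Option<char>>),
}

impl TranslationTable {
    fn new(from: &str, to: &str) -> Self {
        if from.is_ascii() && to.is_ascii() {
            // Start from the identity mapping, then overwrite `from` bytes.
            let mut table: [Option<u8>; 128] = [None; 128];
            for (i, slot) in table.iter_mut().enumerate() {
                *slot = Some(i as u8);
            }
            let to = to.as_bytes();
            let mut seen = [false; 128];
            for (i, &b) in from.as_bytes().iter().enumerate() {
                if !seen[b as usize] {
                    seen[b as usize] = true;
                    // No counterpart in `to` means the byte is deleted.
                    table[b as usize] = to.get(i).copied();
                }
            }
            TranslationTable::Ascii(table)
        } else {
            let mut to_chars = to.chars();
            let mut map = HashMap::new();
            for c in from.chars() {
                // Advance `to` for every `from` char so positions stay aligned.
                let replacement = to_chars.next();
                map.entry(c).or_insert(replacement);
            }
            TranslationTable::Unicode(map)
        }
    }

    /// Append the translated input directly to `out`, with no scratch string.
    fn translate_into(&self, input: &str, out: &mut String) {
        match self {
            TranslationTable::Ascii(table) => {
                for c in input.chars() {
                    if c.is_ascii() {
                        if let Some(b) = table[c as usize] {
                            out.push(b as char);
                        }
                    } else {
                        // Non-ASCII input cannot match an ASCII-only `from`.
                        out.push(c);
                    }
                }
            }
            TranslationTable::Unicode(map) => {
                for c in input.chars() {
                    match map.get(&c) {
                        Some(Some(r)) => out.push(*r),
                        Some(None) => {}
                        None => out.push(c),
                    }
                }
            }
        }
    }
}
```

Writing into a caller-supplied buffer mirrors the idea behind `append_with`: the translated bytes land in the output once, rather than being built in a temporary string and then copied.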
   
   ## Are these changes tested?
   
   Yes; new test added.
   
   ## Are there any user-facing changes?
   
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

