alamb opened a new issue, #6347:
URL: https://github.com/apache/arrow-rs/issues/6347

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   
   DataFusion has an optimized version of `concat(col1, ...)` for strings (I 
believe contributed by @JasonLi-cn ) that avoids:
   1. Copying strings multiple times
   2. Manipulating nulls unnecessarily
   
   To do this today, we added `StringArrayBuilder`, which is similar to, but 
not the same as, `StringBuilder` in arrow
   
https://github.com/apache/datafusion/blob/4838cfbf453f3c21d9c5a84f9577329dd78aa763/datafusion/functions/src/string/common.rs#L354-L417
   
   The major differences are:
   1. You can call `write` to incrementally build up each string and then call 
`append_offset` to complete it. `StringBuilder` requires each value to be 
passed as a single contiguous string in one call to `append_value`: 
https://docs.rs/arrow/latest/arrow/array/type.StringBuilder.html#method.append_value
   2. You can call `finish()` with a caller-specified null buffer (rather than 
building it up incrementally)
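   
   To make the two differences concrete, here is a minimal std-only sketch of 
the pattern. It is not the DataFusion code linked above: the names 
`SketchStringArrayBuilder` and `SketchStringArray` are hypothetical, and a 
`Vec<bool>` stands in for a real null buffer.
   
   ```rust
   /// Simplified, std-only model of an offsets + values string builder.
   /// Hypothetical type for illustration; not the DataFusion or arrow type.
   pub struct SketchStringArrayBuilder {
       values: Vec<u8>,   // all string bytes, stored back to back
       offsets: Vec<i32>, // offsets[i]..offsets[i + 1] delimits string i
   }

   impl SketchStringArrayBuilder {
       pub fn new() -> Self {
           Self { values: Vec::new(), offsets: vec![0] }
       }

       /// Append a fragment to the in-progress string (copied exactly once).
       pub fn write(&mut self, piece: &str) {
           self.values.extend_from_slice(piece.as_bytes());
       }

       /// Close the in-progress string by recording its end offset.
       pub fn append_offset(&mut self) {
           let end = i32::try_from(self.values.len()).expect("offset overflows i32");
           self.offsets.push(end);
       }

       /// Finish with validity supplied by the caller (`true` means non-null),
       /// instead of tracking nulls value by value.
       pub fn finish(self, validity: Option<Vec<bool>>) -> SketchStringArray {
           if let Some(v) = &validity {
               assert_eq!(v.len(), self.offsets.len() - 1, "one validity bit per value");
           }
           SketchStringArray { values: self.values, offsets: self.offsets, validity }
       }
   }

   /// Stand-in for the finished array.
   pub struct SketchStringArray {
       values: Vec<u8>,
       offsets: Vec<i32>,
       validity: Option<Vec<bool>>,
   }

   impl SketchStringArray {
       pub fn len(&self) -> usize {
           self.offsets.len() - 1
       }

       /// Returns `None` for a null slot, otherwise the string at index `i`.
       pub fn value(&self, i: usize) -> Option<&str> {
           if let Some(v) = &self.validity {
               if !v[i] {
                   return None;
               }
           }
           let (start, end) = (self.offsets[i] as usize, self.offsets[i + 1] as usize);
           Some(std::str::from_utf8(&self.values[start..end]).expect("built from &str pieces"))
       }
   }
   ```
   
   In this model, a `concat` kernel would call `write` once per input column 
and `append_offset` once per row, then pass a precomputed validity mask to 
`finish`.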
   
   **Describe the solution you'd like**
   I think it is worth figuring out how to create a similar API for 
`StringBuilder`
   
   ### Incrementally write values
   
   Here is one suggestion for an API to write values that I think would be 
relatively easy to use:
   ```rust
   use std::io::Write;

   let mut builder = StringBuilder::with_capacity(...);
   // scope for lifetime
   {
     // proposed API (does not exist yet): get something that implements std::io::Write
     let mut writeable = builder.writeable();
     write!(writeable, "foo").unwrap(); // append "foo" to the in-progress string
     write!(writeable, "bar").unwrap(); // append "bar" to the in-progress string
   } // scope closes, which finishes the string "foobar"
   ```
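   
   One way the scoped writer could work is a guard that borrows the builder 
mutably and records the end offset when it is dropped. The following is only 
a std-only sketch under that assumption: `SketchBuilder` and `ValueWriter` 
are hypothetical names, and it implements `std::fmt::Write` rather than 
`std::io::Write` so the buffer is guaranteed to stay valid UTF-8.
   
   ```rust
   use std::fmt::Write;

   /// Hypothetical stand-in for a string builder; not the arrow `StringBuilder`.
   pub struct SketchBuilder {
       values: String,      // all strings stored back to back
       offsets: Vec<usize>, // offsets[i]..offsets[i + 1] delimits string i
   }

   impl SketchBuilder {
       pub fn new() -> Self {
           Self { values: String::new(), offsets: vec![0] }
       }

       /// Borrow the builder to write one value; the value ends when the guard drops.
       pub fn writeable(&mut self) -> ValueWriter<'_> {
           ValueWriter { builder: self }
       }

       /// View the finished strings (for inspection in this sketch only).
       pub fn values(&self) -> Vec<&str> {
           self.offsets.windows(2).map(|w| &self.values[w[0]..w[1]]).collect()
       }
   }

   /// Guard returned by `writeable()`; hypothetical.
   pub struct ValueWriter<'a> {
       builder: &'a mut SketchBuilder,
   }

   impl Write for ValueWriter<'_> {
       fn write_str(&mut self, s: &str) -> std::fmt::Result {
           // Append directly to the shared buffer: no intermediate String.
           self.builder.values.push_str(s);
           Ok(())
       }
   }

   impl Drop for ValueWriter<'_> {
       fn drop(&mut self) {
           // Closing the scope finishes the in-progress string.
           let end = self.builder.values.len();
           self.builder.offsets.push(end);
       }
   }
   ```
   
   Because the guard holds a mutable borrow, the builder cannot be used for 
anything else until the value is finished, which is what the inner scope in 
the example above relies on.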
   
   Similarly, adding a function along the lines of `finish_with_nulls(..)` 
that takes a `NullBuffer` would be beneficial when the caller already knows 
which values are null
   
   **Describe alternatives you've considered**
   
   We could choose not to do this at all (or just keep the code downstream in 
DataFusion)
   
   
   **Additional context**
   

