alamb opened a new issue, #6347: URL: https://github.com/apache/arrow-rs/issues/6347
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

DataFusion has an optimized version of `concat(col1, ...)` for strings (I believe contributed by @JasonLi-cn) that avoids:
1. Copying strings multiple times
2. Manipulating nulls unnecessarily

To do this today, we added `StringArrayBuilder`, which is similar to, but not the same as, `StringBuilder` in arrow:

https://github.com/apache/datafusion/blob/4838cfbf453f3c21d9c5a84f9577329dd78aa763/datafusion/functions/src/string/common.rs#L354-L417

The major differences are:
1. You can call `write` to incrementally build up each string and then call `append_offset` to create each string. `StringBuilder` requires each input to be a single contiguous string in order to call https://docs.rs/arrow/latest/arrow/array/type.StringBuilder.html#method.append_value
2. You can call `finish()` with the specified null buffer (rather than building it up incrementally)

**Describe the solution you'd like**

I think it is worth figuring out how to create a similar API for `StringBuilder`

### Incrementally write values

Here is one ideal suggestion of how to write values that I think would be relatively easy to use:

```rust
let mut builder = StringBuilder::with_capacity(...);
// scope for lifetime
{
  // get something that implements std::io::Write
  // (`writeable()` is the proposed method; it does not exist yet)
  let mut writeable = builder.writeable();
  write!(writeable, "foo"); // append "foo" to the in-progress string
  write!(writeable, "bar"); // append "bar" to the in-progress string
} // scope closes, which finishes the string "foobar"
```

Similarly, adding a `finish_with_nulls(..)` style function that takes a `NullBuffer` would be beneficial if the caller already knows which values are null.

**Describe alternatives you've considered**

We could not do this at all (or just keep the code downstream in DataFusion)

**Additional context**
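To make the shape of the proposal concrete, here is a minimal sketch that uses only the standard library and no arrow types. Everything in it is hypothetical and for illustration only: `ToyStringBuilder`, `ToyWriter`, and `build_example` are made-up names, `writeable` and `finish_with_nulls` mirror the names proposed above rather than any existing arrow API, and a `Vec<bool>` stands in for a `NullBuffer`. It also assumes `std::fmt::Write` instead of `std::io::Write`, since the values are UTF-8 text; either trait would work with `write!`.

```rust
use std::fmt::Write;

/// Toy stand-in for a string array builder (NOT arrow's `StringBuilder`).
struct ToyStringBuilder {
    values: String,      // all string data, stored back to back
    offsets: Vec<usize>, // string i occupies values[offsets[i]..offsets[i + 1]]
}

/// Hypothetical handle returned by `writeable()`; dropping it ends the string.
struct ToyWriter<'a> {
    builder: &'a mut ToyStringBuilder,
}

impl ToyStringBuilder {
    fn new() -> Self {
        Self { values: String::new(), offsets: vec![0] }
    }

    /// Start a new in-progress string that can be written piece by piece.
    fn writeable(&mut self) -> ToyWriter<'_> {
        ToyWriter { builder: self }
    }

    /// Finish using caller-supplied validity flags instead of tracking
    /// nulls one value at a time (`valid` stands in for a `NullBuffer`).
    fn finish_with_nulls(self, valid: Vec<bool>) -> Vec<Option<String>> {
        assert_eq!(valid.len(), self.offsets.len() - 1);
        self.offsets
            .windows(2)
            .zip(valid)
            .map(|(w, ok)| ok.then(|| self.values[w[0]..w[1]].to_string()))
            .collect()
    }
}

impl Write for ToyWriter<'_> {
    fn write_str(&mut self, s: &str) -> std::fmt::Result {
        // Append directly to the shared buffer; no intermediate String.
        self.builder.values.push_str(s);
        Ok(())
    }
}

impl Drop for ToyWriter<'_> {
    fn drop(&mut self) {
        // Closing the scope records the end offset, completing the string.
        let end = self.builder.values.len();
        self.builder.offsets.push(end);
    }
}

/// Usage: two values, the second one masked out by the validity flags.
fn build_example() -> Vec<Option<String>> {
    let mut builder = ToyStringBuilder::new();
    {
        let mut w = builder.writeable();
        write!(w, "foo").unwrap();
        write!(w, "bar").unwrap();
    } // `w` dropped here: "foobar" is complete
    {
        let mut w = builder.writeable();
        write!(w, "ignored").unwrap();
    }
    builder.finish_with_nulls(vec![true, false])
}
```

In this sketch the `Drop` impl plays the role that `append_offset` plays in DataFusion's `StringArrayBuilder`: the pieces are copied once into the shared buffer, and the offset is recorded when the scope closes. Whether a real `StringBuilder` API should finish the value on drop or through an explicit call is a design question this sketch does not settle.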
