alamb commented on code in PR #12224:
URL: https://github.com/apache/datafusion/pull/12224#discussion_r1740777076
##########
datafusion/functions/src/string/concat.rs:
##########
@@ -64,13 +66,36 @@ impl ScalarUDFImpl for ConcatFunc {
&self.signature
}
- fn return_type(&self, _arg_types: &[DataType]) -> Result<DataType> {
- Ok(Utf8)
+ fn return_type(&self, arg_types: &[DataType]) -> Result<DataType> {
+ use DataType::*;
+ let mut dt = &Utf8;
+ arg_types.iter().for_each(|data_type| {
+ if data_type == &Utf8View {
+ dt = data_type;
+ }
+ if data_type == &LargeUtf8 && dt != &Utf8View {
Review Comment:
👍
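
For reference, the precedence this hunk appears to implement (`Utf8View` wins over `LargeUtf8`, which wins over `Utf8`) can be sketched in isolation. This is only an illustration under stated assumptions: the hunk above is cut off before the `LargeUtf8` branch body, so the assignment in that branch is inferred, and `StrType` and `widest_string_type` are hypothetical stand-ins rather than arrow's `DataType` or anything in the PR.

```rust
/// Hypothetical stand-in for the three string variants of arrow's `DataType`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StrType {
    Utf8,
    LargeUtf8,
    Utf8View,
}

/// Pick the return type for a concat-like function from its argument types.
/// Assumed precedence: Utf8View > LargeUtf8 > Utf8.
pub fn widest_string_type(arg_types: &[StrType]) -> StrType {
    let mut dt = StrType::Utf8;
    for data_type in arg_types {
        if *data_type == StrType::Utf8View {
            // A view argument always wins.
            dt = *data_type;
        }
        if *data_type == StrType::LargeUtf8 && dt != StrType::Utf8View {
            // Assumed branch body: upgrade to LargeUtf8 unless a view was seen.
            dt = *data_type;
        }
    }
    dt
}
```

With this rule the order of the arguments does not matter: a `Utf8View` argument seen before or after a `LargeUtf8` one still yields `Utf8View`.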
##########
datafusion/functions/src/string/common.rs:
##########
@@ -418,6 +448,148 @@ impl StringArrayBuilder {
}
}
+pub(crate) struct StringViewArrayBuilder {
+ builder: StringViewBuilder,
+ block: String
+}
+
+impl StringViewArrayBuilder {
Review Comment:
I think the original intent of `StringViewBuilder` when @JasonLi-cn added it
was to have a way to append (but not complete) strings to avoid a copy and not
incrementally update the null mask.
Specifically, `StringBuilder` in arrow
https://docs.rs/arrow/latest/arrow/array/type.StringBuilder.html#method.append_value
lets you append an entire string, but it requires the input to be contiguous bytes.
The advantage of `StringArrayBuilder` is that you can push bytes (via `write`)
without copying the data.
The reason I think it feels hacky is that the API for building up single
values in a StringArray or StringViewArray without a copy is not present.
Since we now want to do the same thing with `Utf8View`, I wonder if we should
spend some time making the API easier to use 🤔
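
To make the idea concrete, here is a minimal sketch of the kind of API I have in mind, using only the standard library. Everything in it is an assumption for illustration: `SketchStringBuilder`, `rows` and `concat_columns` are hypothetical names, not the DataFusion or arrow-rs API, and validity (the null mask) is left out entirely on the assumption that the caller would supply it once at the end.

```rust
/// Hypothetical stand-in for the builder idea discussed above; this is not
/// the DataFusion `StringArrayBuilder` or any arrow-rs type.
pub struct SketchStringBuilder {
    /// Bytes of every row, stored back to back in one buffer.
    values: String,
    /// `offsets[i]..offsets[i + 1]` is the byte range of row `i`.
    offsets: Vec<usize>,
}

impl SketchStringBuilder {
    pub fn new() -> Self {
        Self { values: String::new(), offsets: vec![0] }
    }

    /// Append one piece to the row in progress without finishing the row.
    /// The piece is copied once, straight into the shared buffer.
    pub fn write(&mut self, piece: &str) {
        self.values.push_str(piece);
    }

    /// Finish the row in progress by recording where it ends.
    pub fn append_offset(&mut self) {
        self.offsets.push(self.values.len());
    }

    /// Borrow the finished rows as slices of the shared buffer.
    pub fn rows(&self) -> Vec<&str> {
        self.offsets
            .windows(2)
            .map(|w| &self.values[w[0]..w[1]])
            .collect()
    }
}

/// Concatenate two columns row by row, never building a per-row `String`.
pub fn concat_columns(left: &[&str], right: &[&str]) -> Vec<String> {
    let mut builder = SketchStringBuilder::new();
    for (l, r) in left.iter().zip(right.iter()) {
        builder.write(l);
        builder.write(r);
        builder.append_offset();
    }
    builder.rows().into_iter().map(str::to_owned).collect()
}
```

The point of splitting `write` from `append_offset` is that a function like `concat` can push each argument's bytes for a row directly into the output buffer, rather than first assembling a contiguous string just to hand it to `append_value`.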
##########
datafusion/functions/src/string/common.rs:
##########
@@ -418,6 +448,148 @@ impl StringArrayBuilder {
}
}
+pub(crate) struct StringViewArrayBuilder {
+ builder: StringViewBuilder,
+ block: String
+}
+
+impl StringViewArrayBuilder {
Review Comment:
I filed https://github.com/apache/arrow-rs/issues/6347 to discuss adding an
API like this in arrow as well
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.