alamb commented on code in PR #12395:
URL: https://github.com/apache/datafusion/pull/12395#discussion_r1761764700
##########
datafusion/functions/src/string/ltrim.rs:
##########
@@ -81,7 +81,11 @@ impl ScalarUDFImpl for LtrimFunc {
}
fn return_type(&self, arg_types: &[DataType]) -> Result<DataType> {
- utf8_to_str_type(&arg_types[0], "ltrim")
+ if arg_types[0] == DataType::Utf8View {
+ Ok(DataType::Utf8View)
Review Comment:
Can we possibly add a .slt test to cover this (showing that the output type
is now a view and some basic end to end tests (if not already done)?)
Perhaps in
https://github.com/apache/datafusion/blob/main/datafusion/sqllogictest/test_files/string_view.slt
?
##########
datafusion/functions/src/string/btrim.rs:
##########
@@ -82,7 +82,11 @@ impl ScalarUDFImpl for BTrimFunc {
}
fn return_type(&self, arg_types: &[DataType]) -> Result<DataType> {
- utf8_to_str_type(&arg_types[0], "btrim")
+ if arg_types[0] == DataType::Utf8View {
+ Ok(DataType::Utf8View)
Review Comment:
👍
Also eventually it would also be possible to return `Utf8View` when the
input was `Utf8` and save a copy as well
##########
datafusion/functions/src/string/common.rs:
##########
@@ -21,17 +21,39 @@ use std::fmt::{Display, Formatter};
use std::sync::Arc;
use arrow::array::{
- new_null_array, Array, ArrayAccessor, ArrayDataBuilder, ArrayIter,
ArrayRef,
- GenericStringArray, GenericStringBuilder, LargeStringArray,
OffsetSizeTrait,
- StringArray, StringBuilder, StringViewArray, StringViewBuilder,
+ make_view, new_null_array, Array, ArrayAccessor, ArrayDataBuilder,
ArrayIter,
+ ArrayRef, ByteView, GenericStringArray, GenericStringBuilder,
LargeStringArray,
+ OffsetSizeTrait, StringArray, StringBuilder, StringViewArray,
StringViewBuilder,
};
use arrow::buffer::{Buffer, MutableBuffer, NullBuffer};
use arrow::datatypes::DataType;
+use arrow_buffer::{NullBufferBuilder, ScalarBuffer};
use datafusion_common::cast::{as_generic_string_array, as_string_view_array};
use datafusion_common::Result;
use datafusion_common::{exec_err, ScalarValue};
use datafusion_expr::ColumnarValue;
+/// Make a `u128` based on the given substr, start(offset to view.offset), and
+/// push into to the given buffers
+pub(crate) fn make_and_append_view(
Review Comment:
🤔 I wonder if we should (as a follow on PR) propose adding this upstream to
arrow-rs as it seems valuable for any trim related kernels on stringview
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]