This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion.git
The following commit(s) were added to refs/heads/main by this push:
new 9079bbd4f3 Fix `concat_elements_utf8view` capacity initialization. (#18003)
9079bbd4f3 is described below
commit 9079bbd4f337d7c59e2b97700c1e02791c2e7f1d
Author: Samuele Resca <[email protected]>
AuthorDate: Sat Oct 18 08:48:10 2025 +0100
Fix `concat_elements_utf8view` capacity initialization. (#18003)
## Which issue does this PR close?
- Relates to #17857 (See
https://github.com/apache/datafusion/issues/17857#issuecomment-3368519097)
## Rationale for this change
The capacity calculation is replaced with `left.len()` (assuming
`left.len()` and `right.len()` are the same), because the argument to
`with_capacity` is the number of views (strings) to reserve, not the
number of bytes.
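
To illustrate the distinction, here is a minimal std-only sketch. It is not the Arrow API: `concat_elements` is a hypothetical stand-in that uses `Vec<Option<String>>` in place of `StringViewArray`, and it assumes a row is null whenever either input is null. It shows the two points of this change: the length check, and a capacity that counts output elements rather than bytes.

```rust
/// Hypothetical stand-in for the kernel: concatenates two columns row by row.
/// `Vec<Option<String>>` plays the role of the string view array here.
fn concat_elements(
    left: &[Option<&str>],
    right: &[Option<&str>],
) -> Result<Vec<Option<String>>, String> {
    // Mirror the length check added by this commit.
    if left.len() != right.len() {
        return Err(format!(
            "Arrays must have the same length: {} != {}",
            left.len(),
            right.len()
        ));
    }

    // Capacity is the number of output elements (one per row), not the
    // total number of bytes held by the inputs.
    let mut result = Vec::with_capacity(left.len());

    // Scratch buffer reused across rows to limit reallocations.
    let mut buffer = String::new();
    for (l, r) in left.iter().zip(right.iter()) {
        match (l, r) {
            (Some(l), Some(r)) => {
                buffer.clear();
                buffer.push_str(l);
                buffer.push_str(r);
                result.push(Some(buffer.clone()));
            }
            // Assumption: a null on either side yields a null row.
            _ => result.push(None),
        }
    }
    Ok(result)
}
```

Sizing by bytes would still produce correct output, but it would typically reserve far more slots than there are rows.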
## Are these changes tested?
The function is already covered by tests.
## Are there any user-facing changes?
No
---
datafusion/physical-expr/src/expressions/binary/kernels.rs | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/datafusion/physical-expr/src/expressions/binary/kernels.rs b/datafusion/physical-expr/src/expressions/binary/kernels.rs
index 71d1242eea..36ecd1c816 100644
--- a/datafusion/physical-expr/src/expressions/binary/kernels.rs
+++ b/datafusion/physical-expr/src/expressions/binary/kernels.rs
@@ -145,12 +145,14 @@ pub fn concat_elements_utf8view(
     left: &StringViewArray,
     right: &StringViewArray,
 ) -> std::result::Result<StringViewArray, ArrowError> {
-    let capacity = left
-        .data_buffers()
-        .iter()
-        .zip(right.data_buffers().iter())
-        .map(|(b1, b2)| b1.len() + b2.len())
-        .sum();
+    if left.len() != right.len() {
+        return Err(ArrowError::ComputeError(format!(
+            "Arrays must have the same length: {} != {}",
+            left.len(),
+            right.len()
+        )));
+    }
+    let capacity = left.len();
     let mut result = StringViewBuilder::with_capacity(capacity);
     // Avoid reallocations by writing to a reused buffer (note we