lyne7-sc commented on code in PR #22390:
URL: https://github.com/apache/datafusion/pull/22390#discussion_r3293966007


##########
datafusion/functions-nested/src/remove.rs:
##########
@@ -468,6 +533,114 @@ fn general_remove<OffsetSize: OffsetSizeTrait>(
     )?))
 }
 
+/// For each element of `list_array[i]`, remove up to `arr_n[i]` occurrences
+/// of `needle[0]` (scalar element broadcasted).
+///
+/// This is a specialized version of `general_remove` for scalar elements that
+/// uses bulk comparison for better performance.
+fn general_remove_with_scalar<OffsetSize: OffsetSizeTrait>(
+    list_array: &GenericListArray<OffsetSize>,
+    needle: &ArrayRef,
+    arr_n: &[i64],
+) -> Result<ArrayRef> {
+    let list_field = match list_array.data_type() {
+        DataType::List(field) | DataType::LargeList(field) => field,
+        _ => {
+            return exec_err!(
+                "Expected List or LargeList data type, got {:?}",
+                list_array.data_type()
+            );
+        }
+    };
+
+    let list_offsets = list_array.offsets();
+    let first_offset = list_offsets[0].to_usize().unwrap();
+    let last_offset = list_offsets[list_offsets.len() - 1].to_usize().unwrap();
+    let values_range_len = last_offset - first_offset;
+    let values_slice = list_array.values().slice(first_offset, values_range_len);
+    let original_data = values_slice.to_data();
+    let mut offsets = Vec::<OffsetSize>::with_capacity(list_array.len() + 1);
+    offsets.push(OffsetSize::zero());
+
+    let mut mutable = MutableArrayData::with_capacities(

Review Comment:
   I benchmarked both the `take` and `filter` kernel approaches against the 
current `MutableArrayData::extend` path. Overall, `MutableArrayData` performs 
best. On small lists (size=10), `take` is about 10% faster, possibly because it 
avoids the `MutableArrayData` initialization overhead. On medium and large 
lists (size≥100), `MutableArrayData` tends to win decisively: `take` is 
60–170% slower, depending on the type. For variable-length types (strings), 
the gap appears to widen further. 
   
   One possible explanation is that `take` performs per-index random access 
for each element, whereas `MutableArrayData` may instead copy contiguous 
memory regions with memcpy.
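
   To illustrate the hypothesis above, here is a minimal std-only sketch of the 
two copy patterns. It is an analogy on plain `i32` slices, not arrow's actual 
implementation: `remove_by_gather` and `remove_by_ranges` are hypothetical 
names, and mapping them onto `take` and `MutableArrayData::extend` is an 
assumption about how those paths behave.

```rust
/// Gather-style removal (loosely analogous to building an index array and
/// calling a `take`-like kernel): one indexed read per surviving element.
fn remove_by_gather(values: &[i32], needle: i32, max_removals: usize) -> Vec<i32> {
    let mut removed = 0;
    let mut keep = Vec::with_capacity(values.len());
    for (i, v) in values.iter().enumerate() {
        if *v == needle && removed < max_removals {
            removed += 1;
        } else {
            keep.push(i);
        }
    }
    // Second pass: copy element by element through the index list.
    keep.iter().map(|&i| values[i]).collect()
}

/// Range-style removal (loosely analogous to extending an output buffer with
/// `[start, end)` ranges): each run between removed elements is copied with a
/// single slice copy.
fn remove_by_ranges(values: &[i32], needle: i32, max_removals: usize) -> Vec<i32> {
    let mut out = Vec::with_capacity(values.len());
    let mut removed = 0;
    let mut run_start = 0;
    for (i, v) in values.iter().enumerate() {
        if removed == max_removals {
            // Nothing left to remove; the tail is copied in one go below.
            break;
        }
        if *v == needle {
            out.extend_from_slice(&values[run_start..i]);
            removed += 1;
            run_start = i + 1;
        }
    }
    out.extend_from_slice(&values[run_start..]);
    out
}
```

   Both functions return the same result; for example, removing up to two 
occurrences of `2` from `[1, 2, 3, 2, 4, 2]` yields `[1, 3, 4, 2]`. The 
range version issues one bulk copy per run, so its advantage should grow with 
run length, which would be consistent with the size≥100 numbers if the 
assumption holds.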
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

