alamb commented on code in PR #9274:
URL: https://github.com/apache/arrow-rs/pull/9274#discussion_r2747528490
##########
arrow-cast/src/cast/list.rs:
##########
@@ -191,3 +347,151 @@ pub(crate) fn cast_list<I: OffsetSizeTrait, O: OffsetSizeTrait>(
nulls,
)?))
}
+
+/// Casting list view arrays to list.
+pub(crate) fn cast_list_view_to_list<I, O>(
+ array: &dyn Array,
+ to: &FieldRef,
+ cast_options: &CastOptions,
+) -> Result<ArrayRef, ArrowError>
+where
+ I: OffsetSizeTrait,
+ // We need ArrowPrimitiveType here to be able to create indices array for the
+ // take kernel.
+ O: ArrowPrimitiveType,
+ O::Native: OffsetSizeTrait,
+{
+ let list_view = array.as_list_view::<I>();
+ let list_view_offsets = list_view.offsets();
+ let sizes = list_view.sizes();
+
+ let mut take_indices: Vec<O::Native> = Vec::with_capacity(list_view.values().len());
Review Comment:
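I got codex to give me an esoteric bug here -- basically the `with_capacity`
request can be massive for a list view whose values are run-end encoded,
because `values().len()` is the logical length of the child array: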
```rust
#[test]
fn test_cast_list_view_to_list_large_logical_len() {
    // Use a RunArray to get a huge logical length without allocating
    // large buffers. The list view references a single element at offset 0.
    let run_ends = Int64Array::from(vec![i64::MAX - 1, i64::MAX]);
    let values = Int32Array::from(vec![10, 99]);
    let run_array = RunArray::<Int64Type>::try_new(&run_ends, &values).unwrap();
    let item_field: FieldRef =
        Arc::new(Field::new_list_field(run_array.data_type().clone(), true));
    let list_view = LargeListViewArray::try_new(
        Arc::clone(&item_field),
        vec![0i64].into(),
        vec![1i64].into(),
        Arc::new(run_array),
        None,
    )
    .unwrap();
    let target_type = DataType::List(item_field);
    let cast_result = cast(&list_view, &target_type).unwrap();
    let got_list = cast_result.as_list::<i32>();
    assert_eq!(got_list.len(), 1);
    let got_values = got_list.values().as_run::<Int64Type>();
    let typed = got_values.downcast::<Int32Array>().unwrap();
    assert_eq!(typed.value(0), 10);
}
```
Fails like this:
```
thread 'cast::tests::test_cast_list_view_to_list_large_logical_len' (15048336) panicked at arrow-cast/src/cast/list.rs:368:44:
capacity overflow
stack backtrace:
```
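The panic is raised by the allocation request itself, before any index is
written: `Vec::with_capacity` panics with "capacity overflow" when the
requested byte size exceeds `isize::MAX`. A minimal std-only sketch of that
arithmetic, assuming a 64-bit target (the helper `capacity_fits` is a
made-up name for illustration, not part of arrow-rs):

```rust
/// Hypothetical helper (not an arrow-rs API): returns true if a `Vec<T>`
/// holding `len` elements stays within the `isize::MAX` byte limit that
/// `Vec` enforces on its allocations.
fn capacity_fits<T>(len: usize) -> bool {
    match len.checked_mul(std::mem::size_of::<T>()) {
        // The multiplication did not wrap; compare against the Vec limit.
        Some(bytes) => bytes <= isize::MAX as usize,
        // The byte count does not even fit in a usize.
        None => false,
    }
}
```

With the test above, the requested length is `i64::MAX` and the element type
is a 4-byte offset, so `capacity_fits::<i32>(i64::MAX as usize)` is false
even though the list view only references one element.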
I don't think this is a huge issue personally -- I can file a follow-on
ticket if needed.
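
One possible direction for such a follow-up, sketched with plain slices
rather than the arrow-rs types: size the indices buffer from the sum of the
view sizes instead of the child array's length, and reserve fallibly. The
function name and signature below are hypothetical, and this is not what the
PR currently does.

```rust
use std::convert::TryFrom;

/// Hypothetical sketch (not the PR's code): build take indices for list
/// views given as plain offset/size slices. Returns `None` on invalid
/// input or if the needed capacity cannot be reserved.
fn take_indices_for_views(offsets: &[i64], sizes: &[i64]) -> Option<Vec<i64>> {
    if offsets.len() != sizes.len() {
        return None;
    }
    // Count only the elements the views reference, with overflow checks.
    let total = sizes.iter().try_fold(0usize, |acc, &size| {
        acc.checked_add(usize::try_from(size).ok()?)
    })?;
    let mut indices: Vec<i64> = Vec::new();
    // `try_reserve` returns an error instead of panicking on overflow.
    indices.try_reserve(total).ok()?;
    for (&offset, &size) in offsets.iter().zip(sizes) {
        if offset < 0 {
            return None;
        }
        let end = offset.checked_add(size)?;
        // Each view contributes the contiguous range [offset, offset + size).
        indices.extend(offset..end);
    }
    Some(indices)
}
```

For the single view in the test (offset 0, size 1) this reserves room for one
index, regardless of how large the logical length of the values array is.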