Jefffrey commented on code in PR #18424:
URL: https://github.com/apache/datafusion/pull/18424#discussion_r2485221534
##########
datafusion/functions-nested/src/reverse.rs:
##########
@@ -183,6 +195,75 @@ fn general_array_reverse<O: OffsetSizeTrait + TryFrom<i64>>(
)?))
}
+fn list_view_reverse<O: OffsetSizeTrait + TryFrom<i64>>(
+ array: &GenericListViewArray<O>,
+ field: &FieldRef,
+) -> Result<ArrayRef> {
+ let (_, offsets, sizes, values, nulls) = array.clone().into_parts();
+
+ // Construct indices, sizes and offsets for the reversed array by iterating over
+ // the list view array in the logical order, and reversing the order of the elements.
+ // We end up with a list view array where the elements are in order,
+ // even if the original array had elements out of order.
+ let mut indices: Vec<O> = Vec::with_capacity(values.len());
+ let mut new_sizes = Vec::with_capacity(sizes.len());
+ let mut new_offsets: Vec<O> = Vec::with_capacity(offsets.len());
+ let mut new_nulls =
+ Vec::with_capacity(nulls.clone().map(|nulls| nulls.len()).unwrap_or(0));
+ new_offsets.push(O::zero());
+ let has_nulls = nulls.is_some();
+ for (i, offset) in offsets.iter().enumerate().take(offsets.len()) {
+ // If this array is null, we set the new array to null with size 0 and continue
+ if let Some(ref nulls) = nulls {
+ if nulls.is_null(i) {
+ new_nulls.push(false); // null
+ new_sizes.push(O::zero());
+ new_offsets.push(new_offsets[i]);
+ continue;
+ } else {
+ new_nulls.push(true); // valid
+ }
+ }
+
+ // Each array is located at [offset, offset + size), so we collect indices in the reverse order
+ let array_start = offset.as_usize();
+ let array_end = array_start + sizes[i].as_usize();
+ for idx in (array_start..array_end).rev() {
+ indices.push(O::usize_as(idx));
+ }
+ new_sizes.push(sizes[i]);
+ if i < sizes.len() - 1 {
+ new_offsets.push(new_offsets[i] + sizes[i]);
+ }
Review Comment:
Ah, so it pushes the offset for the _next_ list on each iteration? That would
explain why we skip the push in the last iteration, since we're finished.
Maybe it's better to do this via a rolling offset, something like (in
pseudocode):
```
let mut offset = 0
for loop {
offsets.push(offset)
offset += size
}
```
That would eliminate this `if` branch, and I think it also covers an edge case
regarding nulls (see next comment).
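To make the rolling-offset idea concrete, here is a minimal, self-contained sketch. It is not the PR's code: `rolling_offsets` is a hypothetical helper, plain `i32` stands in for the generic `O`, and `None` stands in for a null list, which is assumed to get size 0 as in the diff.

```rust
/// Hypothetical helper: builds ListView-style offsets and sizes for a
/// sequence of lists, where `None` marks a null list.
fn rolling_offsets(sizes: &[Option<i32>]) -> (Vec<i32>, Vec<i32>) {
    let mut new_offsets = Vec::with_capacity(sizes.len());
    let mut new_sizes = Vec::with_capacity(sizes.len());
    // Offset of the list currently being written.
    let mut offset = 0i32;
    for size in sizes {
        // Assumption: a null list contributes no elements, so it gets
        // size 0 and leaves the rolling offset unchanged.
        let size = size.unwrap_or(0);
        new_offsets.push(offset);
        new_sizes.push(size);
        offset += size;
    }
    (new_offsets, new_sizes)
}
```

Because each iteration pushes exactly one offset and one size, the two vectors have the same length by construction, with no special case for the last iteration or for nulls.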
##########
datafusion/functions-nested/src/reverse.rs:
##########
@@ -183,6 +195,75 @@ fn general_array_reverse<O: OffsetSizeTrait + TryFrom<i64>>(
)?))
}
+fn list_view_reverse<O: OffsetSizeTrait + TryFrom<i64>>(
+ array: &GenericListViewArray<O>,
+ field: &FieldRef,
+) -> Result<ArrayRef> {
+ let (_, offsets, sizes, values, nulls) = array.clone().into_parts();
+
+ // Construct indices, sizes and offsets for the reversed array by iterating over
+ // the list view array in the logical order, and reversing the order of the elements.
+ // We end up with a list view array where the elements are in order,
+ // even if the original array had elements out of order.
+ let mut indices: Vec<O> = Vec::with_capacity(values.len());
+ let mut new_sizes = Vec::with_capacity(sizes.len());
+ let mut new_offsets: Vec<O> = Vec::with_capacity(offsets.len());
+ let mut new_nulls =
+ Vec::with_capacity(nulls.clone().map(|nulls| nulls.len()).unwrap_or(0));
+ new_offsets.push(O::zero());
+ let has_nulls = nulls.is_some();
+ for (i, offset) in offsets.iter().enumerate().take(offsets.len()) {
+ // If this array is null, we set the new array to null with size 0 and continue
+ if let Some(ref nulls) = nulls {
+ if nulls.is_null(i) {
+ new_nulls.push(false); // null
+ new_sizes.push(O::zero());
+ new_offsets.push(new_offsets[i]);
+ continue;
+ } else {
+ new_nulls.push(true); // valid
+ }
+ }
+
+ // Each array is located at [offset, offset + size), so we collect indices in the reverse order
+ let array_start = offset.as_usize();
+ let array_end = array_start + sizes[i].as_usize();
+ for idx in (array_start..array_end).rev() {
+ indices.push(O::usize_as(idx));
+ }
Review Comment:
This could work in `O` directly instead of converting through `usize`, similar
to how `general_array_reverse` does it:
https://github.com/apache/datafusion/blob/9238779f45418e07108079aeef3b51de85e9fb8f/datafusion/functions-nested/src/reverse.rs#L160-L164
e.g.
```suggestion
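// `offset` comes from `offsets.iter()`, so dereference it to get an `O`
let array_start = *offset;
let array_end = array_start + sizes[i];
// Walk backwards from the last element to the first, staying in `O`
let mut idx = array_end - O::one();
while idx >= array_start {
    indices.push(idx);
    idx -= O::one();
}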
```
##########
datafusion/functions-nested/src/reverse.rs:
##########
@@ -183,6 +195,72 @@ fn general_array_reverse<O: OffsetSizeTrait + TryFrom<i64>>(
)?))
}
+fn list_view_reverse<O: OffsetSizeTrait + TryFrom<i64>>(
+ array: &GenericListViewArray<O>,
+ field: &FieldRef,
+) -> Result<ArrayRef> {
+ let offsets = array.offsets();
+ let values = array.values();
+ let sizes = array.sizes();
+
+ // Construct indices, sizes and offsets for the reversed array by iterating over
+ // the list view array in the logical order, and reversing the order of the elements.
+ // We end up with a list view array where the elements are in order,
+ // even if the original array had elements out of order.
+ let mut indices: Vec<O> = Vec::with_capacity(values.len());
+ let mut new_sizes = Vec::with_capacity(sizes.len());
+ let mut new_offsets: Vec<O> = Vec::with_capacity(offsets.len());
+ // Add the offset of the first array
+ new_offsets.push(O::zero());
+ for (i, offset) in offsets.iter().enumerate() {
+ // If this array is null, we set the new array to null with size 0 and continue
+ if array.is_null(i) {
+ new_sizes.push(O::zero());
+ new_offsets.push(new_offsets[i]);
+ continue;
+ }
Review Comment:
Do we have a test case where all of the inputs are null? Because we push an
initial offset before the loop, a ListView with a single null element would, I
think, end up with an offsets length of 2 but a sizes length of 1.
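The suspected mismatch can be reproduced with a small standalone model of the loop. This is only a sketch under assumptions: `offsets_with_initial_push` is a hypothetical function, plain `i32` replaces the generic `O`, `None` marks a null list, and the non-null branch is modeled on the `if i < sizes.len() - 1` push from the earlier revision of the diff, which may not match the current one.

```rust
/// Hypothetical model of the loop in the diff: one offset is pushed
/// before the loop, and each iteration may push the offset of the
/// *next* list.
fn offsets_with_initial_push(sizes: &[Option<i32>]) -> (Vec<i32>, Vec<i32>) {
    // Offset of the first list, pushed up front as in the diff.
    let mut new_offsets = vec![0i32];
    let mut new_sizes = Vec::with_capacity(sizes.len());
    for (i, size) in sizes.iter().enumerate() {
        let prev = new_offsets[i];
        match size {
            None => {
                // Null list: size 0, and the previous offset is always
                // repeated, even on the last iteration.
                new_sizes.push(0);
                new_offsets.push(prev);
            }
            Some(size) => {
                new_sizes.push(*size);
                // Non-null list: skip the push on the last iteration.
                if i < sizes.len() - 1 {
                    new_offsets.push(prev + *size);
                }
            }
        }
    }
    (new_offsets, new_sizes)
}
```

In this model, a single null list yields two offsets but one size, and the same off-by-one appears whenever the last list is null, because the null branch pushes unconditionally.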