Jefffrey commented on issue #9227:
URL: https://github.com/apache/arrow-rs/issues/9227#issuecomment-3802949995
Thanks for volunteering @lichuang. My investigation so far suggests
it's specific to null lists (i.e. `[NULL]`), so a test like this is
a minimal reproducer:
```rust
// Imports omitted; this assumes `Arc`, the array types, `Field`, `OffsetBuffer`,
// `RowConverter` and `SortField` are already in scope (e.g. in a test module).
#[test]
fn test_null_list() {
    let null_array = Arc::new(NullArray::new(1));
    // [NULL]
    let list = Arc::new(ListArray::new(
        Field::new_list_field(null_array.data_type().clone(), true).into(),
        OffsetBuffer::from_lengths(vec![1]),
        null_array,
        None,
    )) as ArrayRef;

    let converter =
        RowConverter::new(vec![SortField::new(list.data_type().clone())]).unwrap();
    // Round-trip through the row format; the decoded column should equal the input.
    let rows = converter.convert_columns(&[Arc::clone(&list)]).unwrap();
    let back = converter.convert_rows(&rows).unwrap();
    assert_eq!(&list, &back[0]);
}
```
Specifically, when we encode one here:
https://github.com/apache/arrow-rs/blob/fab8e75eff6d1dd71708b71e1a2a85275394fa80/arrow-row/src/list.rs#L83-L102
we call `encode_one` of the variable encoding on line 96:
https://github.com/apache/arrow-rs/blob/fab8e75eff6d1dd71708b71e1a2a85275394fa80/arrow-row/src/variable.rs#L147-L171
This then falls into the branch on line 150, and a NULL is incorrectly
encoded as an empty list instead of a valid element (that is null).
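
To make the collision concrete, here is a toy sketch of the shape of the problem. It is not the actual arrow-row code: `toy_encode_one`, `toy_encode_list` and the sentinel constants are hypothetical stand-ins, and it assumes that a child row from a `NullArray` contributes zero bytes to the buffer handed to the variable encoder.

```rust
// Illustrative sentinel bytes only; the real values live in arrow-row.
const NULL_SENTINEL: u8 = 0;
const EMPTY_SENTINEL: u8 = 1;
const NON_EMPTY_SENTINEL: u8 = 2;

/// Toy variable-length encoder with three branches: null, empty, non-empty.
fn toy_encode_one(val: Option<&[u8]>) -> Vec<u8> {
    match val {
        None => vec![NULL_SENTINEL],
        // An empty byte slice is treated as "empty value".
        Some([]) => vec![EMPTY_SENTINEL],
        Some(bytes) => {
            let mut out = vec![NON_EMPTY_SENTINEL];
            out.extend_from_slice(bytes);
            out
        }
    }
}

/// Toy list encoder: concatenate the encoded child rows, then pass the
/// resulting buffer to the variable-length encoder.
fn toy_encode_list(child_rows: &[&[u8]]) -> Vec<u8> {
    let mut temporary: Vec<u8> = Vec::new();
    for row in child_rows {
        temporary.extend_from_slice(row);
    }
    // Assumption: a list with one zero-byte child row leaves `temporary`
    // empty, so it is indistinguishable from a list with no children.
    toy_encode_one(Some(temporary.as_slice()))
}
```

Under these assumptions, an empty list (`toy_encode_list(&[])`) and a list with one zero-byte child row (`toy_encode_list(&[no_bytes])` with `let no_bytes: &[u8] = &[];`) both produce `[EMPTY_SENTINEL]`, so the decoder cannot tell `[]` from `[NULL]`.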