alamb commented on code in PR #7366:
URL: https://github.com/apache/arrow-rs/pull/7366#discussion_r2026784665

##########
arrow-array/src/array/struct_array.rs:
##########
@@ -523,6 +547,55 @@ mod tests {
         assert_eq!(0, struct_array.offset());
     }
 
+    #[test]

Review Comment:
   I did verify that these tests fail without the code change in this PR:

   ```
   thread 'array::struct_array::tests::test_struct_array_from_data_with_offset_and_length' panicked at arrow-array/src/array/struct_array.rs:551:9:
   assertion `left == right` failed
     left: StructArray
   -- validity:
   ```


##########
arrow-array/src/array/struct_array.rs:
##########
@@ -523,6 +547,55 @@ mod tests {
         assert_eq!(0, struct_array.offset());
     }
 
+    #[test]
+    fn test_struct_array_from_data_with_offset_and_length() {
+        let int_arr = Int32Array::from(vec![1, 2, 3, 4, 5]);
+        let int_field = Field::new("x", DataType::Int32, false);
+        let struct_nulls = NullBuffer::new(BooleanBuffer::from(vec![true, true, false]));

Review Comment:
   I am not entirely familiar with how struct array nulls are represented. Can you please confirm this is correct?

   Specifically, it seems like the `ArrayData`'s offset does not apply to the `NullBuffer`. In this case the null buffer has only three elements (`true, true, false`), so if the offset were applied to the nulls, there would be only two values for nulls.

   However, as the length of the `ArrayData` is three, this must be as intended.


##########
arrow-array/src/array/struct_array.rs:
##########
@@ -294,10 +294,34 @@ impl StructArray {
 impl From<ArrayData> for StructArray {
     fn from(data: ArrayData) -> Self {
+        let parent_offset = data.offset();
+        let parent_len = data.len();
+
         let fields = data
             .child_data()
             .iter()
-            .map(|cd| make_array(cd.clone()))
+            .map(|cd| {
+                let child_offset = cd.offset();
+                let child_len = cd.len();
+                assert!(
+                    child_len >= parent_len + parent_offset,
+                    "struct array has offset {} and len {} but child array only has {} items",
+                    parent_offset,
+                    parent_len,
+                    child_len
+                );
+                // SAFETY: We have already checked that the child array has enough items and the
+                // only thing we are changing is the offset and length. As long as the child data
+                // was previously valid, then the new child data is also valid.
+                let cd = unsafe {
+                    cd.clone()
+                        .into_builder()
+                        .offset(child_offset + parent_offset)

Review Comment:
   Doesn't this require that `child_offset + parent_offset + parent_len <= child_len`? The check above only verifies that `parent_len + parent_offset <= child_len`.
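
To make the bounds question in the last comment concrete, here is a minimal std-only sketch of the offset arithmetic. `Window` and `apply_parent_window` are hypothetical names for illustration, not arrow-rs APIs. The sketch assumes that an `ArrayData`'s `len` counts elements starting at its `offset`, so the backing buffers hold at least `offset + len` elements. If that assumption holds, the PR's assert already keeps the re-windowed child inside the original child's range. If it does not hold, the sketch does not apply.

```rust
/// Hypothetical stand-in for the (offset, len) pair of an `ArrayData`.
/// Assumption: `len` counts elements starting at `offset`, so the backing
/// buffers hold at least `offset + len` elements.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Window {
    offset: usize,
    len: usize,
}

impl Window {
    /// One past the last buffer index this window may touch.
    fn end(self) -> usize {
        self.offset + self.len
    }
}

/// Re-window a child so it lines up with a parent that has its own offset
/// and length. Returns `None` in the case where the PR's code would panic.
fn apply_parent_window(child: Window, parent: Window) -> Option<Window> {
    // Same condition as the PR's assert: child_len >= parent_len + parent_offset.
    let needed = parent.offset.checked_add(parent.len)?;
    if child.len < needed {
        return None;
    }
    Some(Window {
        // The parent's offset is added on top of the child's own offset.
        offset: child.offset + parent.offset,
        len: parent.len,
    })
}

fn main() {
    let child = Window { offset: 2, len: 5 };
    let parent = Window { offset: 1, len: 3 };

    let sliced = apply_parent_window(child, parent).expect("window fits");
    assert_eq!(sliced, Window { offset: 3, len: 3 });
    // The new window ends no later than the original child window did.
    assert!(sliced.end() <= child.end());

    // A parent window that reaches past the child is rejected.
    assert_eq!(apply_parent_window(child, Window { offset: 3, len: 3 }), None);

    println!("{:?}", sliced);
}
```

Under the stated assumption, `child_offset + parent_offset + parent_len <= child_offset + child_len` reduces to `parent_offset + parent_len <= child_len`, which is the condition the assert checks. Whether the real `ArrayData` satisfies the assumption is for the PR author to confirm.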