sdf-jkl commented on code in PR #9599:
URL: https://github.com/apache/arrow-rs/pull/9599#discussion_r2977726296
##########
parquet-variant-compute/src/variant_to_arrow.rs:
##########
@@ -941,7 +942,14 @@ where
match value {
Variant::List(list) => {
for element in list.iter() {
- self.element_builder.append_value(element)?;
+ match element {
Review Comment:
Append value would miss the `shred_variant` Null-handling semantics without
it.
`VariantToListArrowRowBuilder` builds the Arrow typed array in an
interesting way. It internally stores a `StructArray` with both `value` and
`typed_value` fields, so both are kept.
##########
parquet-variant-compute/src/shred_variant.rs:
##########
@@ -285,26 +320,26 @@ impl<'a> VariantToShreddedArrayVariantRowBuilder<'a> {
// If the variant is an array, value must be null.
match variant {
Variant::List(list) => {
+ self.nulls.append_non_null();
self.value_builder.append_null();
self.typed_value_builder
.append_value(&Variant::List(list))?;
Ok(true)
}
other => {
+ self.nulls.append_non_null();
self.value_builder.append_value(other);
self.typed_value_builder.append_null()?;
Ok(false)
}
}
}
- fn finish(self) -> Result<(BinaryViewArray, ArrayRef, Option<NullBuffer>)>
{
+ fn finish(mut self) -> Result<(BinaryViewArray, ArrayRef,
Option<NullBuffer>)> {
Review Comment:
Yes
##########
parquet-variant-compute/src/shred_variant.rs:
##########
@@ -881,7 +914,9 @@ mod tests {
expected_variant.clone()
);
}
- None => unreachable!(),
+ None => {
+ assert!(fallbacks.0.is_null(idx));
Review Comment:
It's checking that the given `fallbacks` row is empty: `fallbacks.0 == None`
`Variant::Null` would be a valid `Variant` value.
##########
parquet-variant-compute/src/shred_variant.rs:
##########
@@ -1338,13 +1421,7 @@ mod tests {
5,
&[0, 3, 6, 6, 6, 6],
&[Some(3), Some(3), None, None, Some(0)],
- &[
- None,
- None,
- Some(Variant::from("not a list")),
- Some(Variant::Null),
- None,
- ],
+ &[None, None, Some(Variant::from("not a list")), None, None],
Review Comment:
It validates both:
- row level via `assert_list_structure`
- list elements later in the function.
Previously, `append_null` for list arrays was missing top-level rows
handling via `NullBuffer` altogether. We fixed that and now top-level elements
use proper `None` instead of `Variant::Null`. List elements get `Variant::Null`
as they should.
##########
parquet-variant-compute/src/shred_variant.rs:
##########
@@ -271,11 +303,14 @@ impl<'a> VariantToShreddedArrayVariantRowBuilder<'a> {
cast_options,
capacity,
)?,
+ nulls: NullBufferBuilder::new(capacity),
+ null_mode,
})
}
fn append_null(&mut self) -> Result<()> {
- self.value_builder.append_value(Variant::Null);
+ self.null_mode.append_to_struct_nulls(&mut self.nulls);
+ self.null_mode.append_to_value(&mut self.value_builder);
Review Comment:
Yes, we had a mismatch with the table for Array element in `VariantArray`.
We used to always append `Variant::Null` and not do anything with the Null
buffer, as if every case is a lower array element.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]