sdf-jkl commented on code in PR #9599:
URL: https://github.com/apache/arrow-rs/pull/9599#discussion_r2982371128
##########
parquet-variant-compute/src/shred_variant.rs:
##########
@@ -1338,13 +1421,7 @@ mod tests {
5,
&[0, 3, 6, 6, 6, 6],
&[Some(3), Some(3), None, None, Some(0)],
- &[
- None,
- None,
- Some(Variant::from("not a list")),
- Some(Variant::Null),
- None,
- ],
+ &[None, None, Some(Variant::from("not a list")), None, None],
Review Comment:
`VariantToShreddedArrayVariantRowBuilder` is the only builder that was
actually handling Nulls wrong.
It didn't have a `NullBuffer` and was always appending `Variant::Null` even
when it was a top-level builder.
Other builders were correct before.
##########
parquet-variant-compute/src/variant_to_arrow.rs:
##########
@@ -941,7 +942,14 @@ where
match value {
Variant::List(list) => {
for element in list.iter() {
- self.element_builder.append_value(element)?;
+ match element {
Review Comment:
I added a test showing an error with using just `Variant::Null`.
If we use `variant_get(..., List<Int64>)` and cast_options(safe: False) //
strict casting, appended `Variant::Null` can't be safely cast to a List inner
type value causing an Err.
By using `append_null` we avoid this error.
##########
parquet-variant-compute/src/shred_variant.rs:
##########
@@ -102,19 +102,50 @@ pub fn shred_variant(array: &VariantArray, as_type:
&DataType) -> Result<Variant
))
}
+/// Controls how `append_null` is encoded for a shredded `(value,
typed_value)` pair.
+///
+/// | Mode | Struct validity bit | `value` | `typed_value` | Meaning |
+/// | --- | --- | --- | --- | --- |
+/// | `TopLevelVariant` | null | NULL | NULL | SQL NULL at the top-level
variant row |
+/// | `ObjectField` | non-null | NULL | NULL | Missing object field |
+/// | `ArrayElement` | non-null | `Variant::Null` | NULL | Explicit null array
element |
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub(crate) enum AppendNullMode {
+ TopLevelVariant,
+ ObjectField,
+ ArrayElement,
+}
+
+impl AppendNullMode {
+ fn append_to_null_row(
Review Comment:
I like the `null_value` rename. Done in
https://github.com/apache/arrow-rs/pull/9599/commits/037e30fadb61f87f14736ba99efa3d08265923c3
https://github.com/apache/arrow-rs/pull/9599/commits/5d8315b48dcf7858cb61f5808405601b8fa63a27
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]