klion26 commented on code in PR #8638:
URL: https://github.com/apache/arrow-rs/pull/8638#discussion_r2443676460
##########
parquet-variant-compute/src/variant_array.rs:
##########
@@ -1004,6 +1036,19 @@ fn typed_value_to_variant<'a>(
index
)
}
+ DataType::Time64(TimeUnit::Microsecond) => {
Review Comment:
Only support the `Time64(TimeUnit::Microsecond)` here because that [variant
spec](https://github.com/apache/parquet-format/blob/master/VariantShredding.md#shredded-value-types)
said that `Time` is `TimeUnit::Microsecond`.
<img width="770" height="674" alt="Image"
src="https://github.com/user-attachments/assets/fa1af260-6622-49f5-ae77-4a8950dcfe12"
/>
##########
parquet-variant-compute/src/variant_to_arrow.rs:
##########
@@ -545,3 +557,21 @@ impl VariantToBinaryVariantArrowRowBuilder {
Ok(ArrayRef::from(variant_array))
}
}
+
+trait AppendValueTrait {
+ // NullBuilder will always append `()`
+ fn append_value(&mut self, v: ());
+}
+
+impl AppendValueTrait for NullBuilder {
Review Comment:
> NullArray (with DataType::Null) is not the same as an array (of some
arbitrary other type) full of nulls tho?
`new_null_array` with different types are different, the following unit test
will fail
```
let first = new_null_array(&DataType::Int32, 3);
let second =
new_null_array(&DataType::Timestamp(TimeUnit::Nanosecond, None), 3);
assert_eq!(*first, second.clone());
```
Use `NullArray` instead of `new_null_array` in the changed code because the
compiler will panic with `the trait bound Result<Arc<dyn Array>, ArrowError>:
Array is not satisfied` if I use `new_null_array` in `FakeNullBuilder::finish`
##########
parquet-variant-compute/src/variant_array.rs:
##########
@@ -968,6 +964,42 @@ fn typed_value_to_variant<'a>(
DataType::Float64 => {
primitive_conversion_single_value!(Float64Type, typed_value, index)
}
+ DataType::Decimal32(_, s) => {
+ generic_conversion_single_value!(
+ Decimal32Type,
+ as_primitive,
+ |v| VariantDecimal4::try_new(v, *s as u8).unwrap(),
+ typed_value,
+ index
+ )
+ }
+ DataType::Decimal64(_, s) => {
+ generic_conversion_single_value!(
+ Decimal64Type,
+ as_primitive,
+ |v| VariantDecimal8::try_new(v, *s as u8).unwrap(),
+ typed_value,
+ index
+ )
+ }
+ DataType::Decimal128(_, s) => {
+ generic_conversion_single_value!(
+ Decimal128Type,
+ as_primitive,
+ |v| VariantDecimal16::try_new(v, *s as u8).unwrap(),
Review Comment:
Thanks for this suggestion, fixed.
Ah, found that we can pass in _different_ types here(`VariantDecimal16` vs
`Variant::Decimal16`), it is cool.
##########
parquet-variant-compute/src/type_conversion.rs:
##########
@@ -89,6 +90,9 @@ impl_primitive_from_variant!(
as_naive_date,
datatypes::Date32Type::from_naive_date
);
+impl_primitive_from_variant!(datatypes::Time64MicrosecondType, as_time_utc,
|v| {
+ (v.num_seconds_from_midnight() * 1_000_000 + v.nanosecond() / 1_000) as i64
Review Comment:
I think it is ok because `nanosecond()` will return `u32`, and the result
will be truncated to zero.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]