progval commented on code in PR #9646:
URL: https://github.com/apache/arrow-datafusion/pull/9646#discussion_r1530918004
##########
datafusion/core/src/datasource/physical_plan/parquet/statistics.rs:
##########
@@ -109,17 +109,24 @@ macro_rules! get_statistic {
}
}
}
- // type not supported yet
+ // type not fully supported yet
ParquetStatistics::FixedLenByteArray(s) => {
match $target_arrow_type {
- // just support the decimal data type
+ // just support specific logical data types, there are others each
+ // with their own ordering
Some(DataType::Decimal128(precision, scale)) => {
Some(ScalarValue::Decimal128(
Some(from_bytes_to_i128(s.$bytes_func())),
*precision,
*scale,
))
}
+ Some(DataType::FixedSizeBinary(size)) => {
+ Some(ScalarValue::FixedSizeBinary(
Review Comment:
Oh sure, good catch.
The code for strings is:
```
let s = std::str::from_utf8(s.$bytes_func())
.map(|s| s.to_string())
.ok();
Some(ScalarValue::Utf8(s))
```
which looks like invalid UTF-8 is treated as a NULL (`ScalarValue::Utf8(None)`),
rather than being ignored as a statistics value. Should I do the same here?
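
To make the two options concrete, here is a minimal standalone sketch, not the actual macro code. `Scalar` is a hypothetical stand-in for `ScalarValue`, all function names are invented for illustration, and the length check in the fixed-size variant is only my assumption about what "invalid" would mean there:

```rust
/// Hypothetical stand-in for DataFusion's `ScalarValue`; only the two
/// variants discussed in this thread are modelled.
#[derive(Debug, PartialEq)]
enum Scalar {
    Utf8(Option<String>),
    FixedSizeBinary(i32, Option<Vec<u8>>),
}

/// Mirrors the quoted string code: invalid UTF-8 still yields a statistic,
/// but its value is NULL (`Utf8(None)`).
fn utf8_stat_as_null(bytes: &[u8]) -> Option<Scalar> {
    let s = std::str::from_utf8(bytes).map(|s| s.to_string()).ok();
    Some(Scalar::Utf8(s))
}

/// Alternative: invalid UTF-8 yields no statistic at all (`None`).
fn utf8_stat_or_skip(bytes: &[u8]) -> Option<Scalar> {
    let s = std::str::from_utf8(bytes).ok()?;
    Some(Scalar::Utf8(Some(s.to_string())))
}

/// Fixed-size analogue of the "skip" option. Assumption: a value is
/// considered invalid when its length differs from the declared size.
fn fixed_size_stat_or_skip(bytes: &[u8], size: i32) -> Option<Scalar> {
    if size < 0 || bytes.len() != size as usize {
        return None;
    }
    Some(Scalar::FixedSizeBinary(size, Some(bytes.to_vec())))
}
```

With the first function, a bad min/max shows up as a NULL scalar; with the other two, the caller sees no statistic for that column chunk.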