alamb commented on code in PR #10801:
URL: https://github.com/apache/datafusion/pull/10801#discussion_r1628143605


##########
datafusion/core/tests/parquet/mod.rs:
##########
@@ -925,6 +932,71 @@ fn make_dict_batch() -> RecordBatch {
         .unwrap()
 }
 
+fn make_interval_batch(offset: i32) -> RecordBatch {
+    let schema = Schema::new(vec![
+        Field::new(
+            "year_month",
+            DataType::Interval(IntervalUnit::YearMonth),
+            true,
+        ),
+        Field::new("day_time", DataType::Interval(IntervalUnit::DayTime), true),
+        Field::new(
+            "month_day_nano",
+            DataType::Interval(IntervalUnit::MonthDayNano),
+            true,
+        ),
+    ]);
+    let schema = Arc::new(schema);
+
+    let ym_arr = IntervalYearMonthArray::from(vec![
+        Some(IntervalYearMonthType::make_value(1 + offset, 1 + offset)),

Review Comment:
   In general, I suggest changing this so that the values of the two fields are different (so that the test would catch bugs where the fields weren't properly interpreted).
   
   For example, instead of 
   
   ```rust
           Some(IntervalYearMonthType::make_value(1 + offset, 1 + offset)),
   ```
   
   Something like this (use `10 + offset` in the second field so the values are different):
   
   ```rust
           Some(IntervalYearMonthType::make_value(1 + offset, 10 + offset)),
   ```
   
   The same applies to the rest of the values in this code.



##########
datafusion/core/src/datasource/physical_plan/parquet/statistics.rs:
##########
@@ -256,6 +259,13 @@ macro_rules! get_statistic {
             Some(DataType::Float16) => {
                 Some(ScalarValue::Float16(from_bytes_to_f16(s.$bytes_func())))
             }
+            Some(DataType::Interval(unit)) => {
+                match unit {
+                    IntervalUnit::YearMonth => unimplemented!("Interval statistics not yet supported by parquet"),

Review Comment:
   In general, in Rust `unimplemented!()` results in a panic, which is not a great user experience.
   
   I think this code purposely ignores errors (in order to gracefully handle parquet files that might not have the expected statistics or that were created by some other writer).
   
   Thus, I suggest changing these cases from `panic` to `None` (and then adjusting the test appropriately).
   
   If we return `None`, then once statistics are properly stored by the parquet-rs writer, the test will fail on the next upgrade and we can update the test with the correct values.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
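
To illustrate the first comment (on `make_interval_batch`), here is a minimal standalone sketch of why equal field values can hide a bug. It assumes a year-month interval is packed as a total month count, `years * 12 + months`, which is my understanding of what `IntervalYearMonthType::make_value` does in arrow-rs. `make_value_swapped` is a hypothetical buggy variant invented only for this illustration; neither function is the real arrow API.

```rust
/// Stand-in for the year-month packing (assumption: total months,
/// i.e. `years * 12 + months`). Not the real arrow-rs function.
fn make_value(years: i32, months: i32) -> i32 {
    years * 12 + months
}

/// Hypothetical buggy variant that interprets the two fields in the
/// wrong order. Exists only to show what a test should be able to catch.
fn make_value_swapped(years: i32, months: i32) -> i32 {
    make_value(months, years)
}

/// Returns true when the given test inputs would expose the swap bug,
/// that is, when the correct and buggy encodings disagree.
fn inputs_detect_swap(years: i32, months: i32) -> bool {
    make_value(years, months) != make_value_swapped(years, months)
}
```

With `offset = 0`, the inputs `(1, 1)` produce the same encoding either way, so `inputs_detect_swap(1, 1)` is `false`, while `(1, 10)` produces different encodings and `inputs_detect_swap(1, 10)` is `true`.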
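
For the second comment (on `get_statistic!`), the suggested shape could look roughly like the sketch below. The `DataType`, `IntervalUnit`, and `ScalarValue` enums here are simplified hypothetical stand-ins, and `get_statistic` is a plain function rather than the real macro; the only point is that an unsupported interval type maps to `None` instead of calling `unimplemented!()`.

```rust
// Simplified, hypothetical stand-ins for the arrow / DataFusion types.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum IntervalUnit {
    YearMonth,
    DayTime,
    MonthDayNano,
}

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Interval(IntervalUnit),
}

#[derive(Debug, Clone, PartialEq)]
enum ScalarValue {
    Int32(i32),
}

/// Convert a raw statistic into a `ScalarValue` for the requested type.
/// `None` means "no usable statistic", which callers already have to
/// handle for files written without statistics.
fn get_statistic(target: Option<DataType>, raw: i32) -> Option<ScalarValue> {
    match target {
        Some(DataType::Int32) => Some(ScalarValue::Int32(raw)),
        // Interval statistics are not available yet: report "unknown"
        // rather than panicking with `unimplemented!()`.
        Some(DataType::Interval(_)) => None,
        None => None,
    }
}
```

A test written against this behavior would assert `None` for the interval columns today; if the writer later starts storing interval statistics and the match arm is updated, that assertion fails and signals that the expected values need updating.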