deanm0000 commented on issue #34180: URL: https://github.com/apache/arrow/issues/34180#issuecomment-1431321708
My thinking is that some readers and query optimizers use the newer `min_value` and `max_value` stats when they plan queries and filters (hopefully pyarrow.dataset is one of them, or will be). I'd like a way to verify that my Parquet files actually contain those stats. Since the `min` and `max` stats are deprecated, I expect fewer libraries will look at them even when they exist. As for `min_raw` and `max_raw`, I've never heard of them, so I'm not sure how valuable they are.
