wjones127 commented on PR #34112: URL: https://github.com/apache/arrow/pull/34112#issuecomment-1433773456
@westonpace that makes sense.

> When only one of min and max exists, it usually happens when a binary value has an extreme length or a floating value has NaN. In this case, the stats provide little value and make it trickier to use.

It seems like we already handle both of these cases. See Weston's message for NaN handling, and `max_statistics_size` on [WriterProperties](https://arrow.apache.org/docs/cpp/api/formats.html#_CPPv4N7parquet16WriterPropertiesE) for oversized binary values. Based on that, I'd actually prefer we keep the ability to parse just the min or just the max when only one is available.
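
To illustrate why a lone min or max can still be worth parsing, here is a minimal sketch of predicate pruning with optional bounds. It is not the Arrow or Parquet API: `ColumnStats`, `may_contain_greater_than`, and `may_contain_less_than` are hypothetical names made up for this example, and the only assumption is that a missing bound is represented as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnStats:
    """Hypothetical stand-in for a column chunk's statistics (not the Arrow API)."""
    min: Optional[float] = None
    max: Optional[float] = None


def may_contain_greater_than(stats: ColumnStats, threshold: float) -> bool:
    """Return False only when the stats prove no value exceeds `threshold`."""
    if stats.max is not None:
        return stats.max > threshold
    # Without a max we cannot rule the chunk out, so it has to be read.
    return True


def may_contain_less_than(stats: ColumnStats, threshold: float) -> bool:
    """Return False only when the stats prove no value is below `threshold`."""
    if stats.min is not None:
        return stats.min < threshold
    # Without a min we cannot rule the chunk out, so it has to be read.
    return True


# A chunk that only has a max can still be skipped for `x > 10`.
only_max = ColumnStats(max=5.0)
print(may_contain_greater_than(only_max, 10.0))
# The same chunk cannot be skipped for `x < 10`, since the min is unknown.
print(may_contain_less_than(only_max, 10.0))
```

In this sketch each predicate consults only the bound it needs, so a reader that drops the statistics whenever one bound is missing would lose the pruning shown in the first call.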
