sadikovi commented on pull request #33639: URL: https://github.com/apache/spark/pull/33639#issuecomment-941996560
@huaxingao @sunchao Could you elaborate on how this change is tested? I am curious how the code handles esoteric cases such as min and max values for UINT32 and UINT64 columns and for UTF-8 strings; I recall that Parquet statistics had a bug affecting those some time ago. Also, I believe statistics for strings can be truncated (https://github.com/apache/parquet-mr/blob/8ae7f31e36a298804435565e0cae584aac90f6d5/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetOutputFormat.java#L150). How is that handled or taken into account? Thanks.
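
To make the two concerns concrete, here is a minimal, self-contained Java sketch. It does not use Spark or parquet-mr APIs; the class `StatsPitfalls` and its methods are hypothetical names made up for illustration. It assumes unsigned 32-bit values are held in Java `int`s, so values of 2^31 and above appear negative, and it assumes ASCII-only strings so that byte truncation cannot split a character.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not code from Spark or parquet-mr.
public class StatsPitfalls {

    // Max computed with signed comparison: wrong for UINT32 values >= 2^31.
    public static int signedMax(int[] raw) {
        int max = raw[0];
        for (int v : raw) {
            if (v > max) max = v;
        }
        return max;
    }

    // Max computed with unsigned comparison: correct for UINT32 bit patterns.
    public static int unsignedMax(int[] raw) {
        int max = raw[0];
        for (int v : raw) {
            if (Integer.compareUnsigned(v, max) > 0) max = v;
        }
        return max;
    }

    // Truncates a string to at most n bytes. The result is a valid lower
    // bound for the original value, but it may not be a value in the data.
    // Assumes ASCII input so that no multi-byte character is split.
    public static String truncatedLowerBound(String s, int n) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        byte[] prefix = Arrays.copyOf(bytes, Math.min(n, bytes.length));
        return new String(prefix, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 4000000000 does not fit in a signed int; the cast keeps its bit pattern.
        int[] raw = {1, (int) 4000000000L, 7};
        System.out.println("signed max:   " + signedMax(raw));
        System.out.println("unsigned max: " + Integer.toUnsignedLong(unsignedMax(raw)));

        // A truncated minimum is only a bound, not an actual column value.
        System.out.println("truncated min of \"apple\": " + truncatedLowerBound("apple", 3));
    }
}
```

If footer statistics were used to answer `MIN`/`MAX` directly, the first case would return a wrong value under signed comparison, and the second would return a string that never occurs in the column, so a test covering both would be reassuring.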
