sadikovi edited a comment on pull request #33639:
URL: https://github.com/apache/spark/pull/33639#issuecomment-941996560


   @huaxingao @sunchao Would you be able to elaborate on the testing for this 
change? I am curious how the code handles some esoteric cases, such as UINT32 
and UINT64 min and max values and UTF-8 strings; I remember Parquet stats had 
a bug regarding UTF-8 strings some time ago. Also, I believe stats for strings 
can be truncated 
(https://github.com/apache/parquet-mr/blob/8ae7f31e36a298804435565e0cae584aac90f6d5/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetOutputFormat.java#L150).
How is this handled or taken into account? Thanks.
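
To make the two concerns concrete, here is a minimal Python sketch. It is not taken from this PR or from parquet-mr; the helper names (`as_signed64`, `as_unsigned64`, `truncated_bounds`) are hypothetical, and the truncation rule is only an assumed illustration (keep a prefix for the lower bound, bump the last byte for the upper bound). It shows that comparing UINT64 values as signed 64-bit integers can pick the wrong min and max, and that truncated string bounds may not be values that actually occur in the column.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def as_signed64(u: int) -> int:
    """Reinterpret an unsigned 64-bit value as a two's-complement signed one."""
    return struct.unpack("<q", struct.pack("<Q", u))[0]


def as_unsigned64(s: int) -> int:
    """Reinterpret a signed 64-bit value as unsigned."""
    return s & MASK64


# Logical UINT64 column values, and the signed representation a JVM long holds.
values = [1, 2**63, 2**64 - 1]
stored = [as_signed64(v) for v in values]

# Signed comparison: picks 2**63 (stored as a negative number) as the minimum.
naive_min, naive_max = min(stored), max(stored)

# Unsigned comparison on the same stored values gives the logical answer.
correct_min = min(stored, key=as_unsigned64)
correct_max = max(stored, key=as_unsigned64)


def truncated_bounds(lo: bytes, hi: bytes, n: int):
    """Assumed truncation rule, for illustration only.

    The lower bound is a prefix of the real minimum. The upper bound is a
    prefix of the real maximum with its last non-0xFF byte incremented, so it
    still sorts at or above the real maximum.
    """
    lower = lo[:n]
    if len(hi) <= n:
        return lower, hi
    prefix = bytearray(hi[:n])
    while prefix and prefix[-1] == 0xFF:
        prefix.pop()
    if not prefix:
        return lower, hi  # no shorter upper bound exists; keep the full value
    prefix[-1] += 1
    return lower, bytes(prefix)


lower, upper = truncated_bounds(b"apple", b"banana", 3)
```

Under these assumptions, `lower` and `upper` still bracket the real values, but neither is a value stored in the column, so they would be safe for filtering yet incorrect if returned directly as the result of MIN or MAX.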


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]