sadikovi edited a comment on pull request #33639:
URL: https://github.com/apache/spark/pull/33639#issuecomment-941996560


   @huaxingao @sunchao Would you be able to elaborate on testing for this 
change? 
   
   I am curious how the code handles some esoteric cases, such as min and max 
for UINT32 and UINT64 columns and for UTF-8 strings. I remember Parquet stats 
had a bug regarding UTF-8 strings some time ago. 
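   
   To make the concern concrete, here is a minimal sketch in plain Python 
(not Spark or parquet-mr code; the helper names are hypothetical). It assumes 
unsigned values are stored in signed 64-bit slots and shows how a signed 
comparison picks a different max than an unsigned one, both for UINT64 
values and for the bytes of UTF-8 strings. 

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF

def as_signed64(u):
    """Reinterpret an unsigned 64-bit value as the signed long holding the same bits."""
    return struct.unpack("<q", struct.pack("<Q", u))[0]

def signed_bytes(b):
    """View each byte as a signed 8-bit value (the way a Java byte behaves)."""
    return [x - 256 if x > 127 else x for x in b]

# UINT64 values as they would sit in signed 64-bit storage.
values = [1, 2**63, 2**64 - 1]
stored = [as_signed64(v) for v in values]

signed_max = max(stored)                                  # signed ordering
unsigned_max = max(stored, key=lambda s: s & MASK64) & MASK64  # unsigned ordering

# UTF-8 strings: any non-ASCII character has bytes >= 0x80.
strings = ["z".encode("utf-8"), "é".encode("utf-8")]
max_unsigned_bytes = max(strings)                   # Python compares bytes as unsigned
max_signed_bytes = max(strings, key=signed_bytes)   # signed-byte ordering

print(signed_max, unsigned_max)
print(max_signed_bytes, max_unsigned_bytes)
```

   With signed ordering the max of the integers comes out as 1 rather than 
2**64 - 1, and the max of the strings comes out as "z" rather than "é", so a 
test with values above the signed range and with non-ASCII strings would 
exercise both cases. 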
   
   Also, I believe stats for strings can be truncated 
(https://github.com/apache/parquet-mr/blob/8ae7f31e36a298804435565e0cae584aac90f6d5/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetOutputFormat.java#L150).
 How is this handled or taken into account? Would it affect the 
feature? 
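   
   As a rough illustration of why truncation could matter, here is a sketch 
under my own assumptions: `truncate_min` and `truncate_max` are hypothetical 
helpers showing one common way to shorten bounds, and I have not checked that 
parquet-mr does exactly this. The point is only that truncated stats remain 
valid bounds for filtering but may no longer be values that exist in the 
column. 

```python
def truncate_min(value: bytes, length: int) -> bytes:
    """A prefix always sorts <= the original value, so it is a valid lower bound."""
    return value[:length]

def truncate_max(value: bytes, length: int) -> bytes:
    """Shorten the value, then bump the last non-0xFF byte so the result still sorts >= the original."""
    if len(value) <= length:
        return value
    prefix = bytearray(value[:length])
    for i in reversed(range(len(prefix))):
        if prefix[i] != 0xFF:
            prefix[i] += 1
            return bytes(prefix[: i + 1])
    return value  # every byte is 0xFF: keep the full value

column = [b"apple", b"banana", b"cherry pie"]
stat_min = truncate_min(min(column), 3)
stat_max = truncate_max(max(column), 3)

print(stat_min, stat_max)
print(stat_min in column, stat_max in column)
```

   Here both bounds still enclose the data, yet neither appears in the 
column, so returning them directly as MIN or MAX would give a wrong answer. 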
   
   Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


