etseidl commented on issue #6310:
URL: https://github.com/apache/arrow-rs/issues/6310#issuecomment-2311264542

   I think the relevant code is 
https://github.com/apache/arrow-rs/blob/ee2f75a66278dbd3e7aa6b85b5322951c792a58d/parquet/src/column/writer/mod.rs#L752-L779.
 
   
   For the final page (with 30 values), `null_page` should be false, so we 
should wind up at 
https://github.com/apache/arrow-rs/blob/ee2f75a66278dbd3e7aa6b85b5322951c792a58d/parquet/src/column/writer/mod.rs#L811-L816
   
   The chunk statistics look ok (min 1, max 1), so you'd expect the page stats 
to be ok as well. They are created here:
   
https://github.com/apache/arrow-rs/blob/ee2f75a66278dbd3e7aa6b85b5322951c792a58d/parquet/src/column/writer/mod.rs#L889-L902
   
   Again, if the min/max were invalid in the page, then you'd expect garbage in 
the chunk stats.  Perhaps some print statements or breakpoints would help here.
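   
   To illustrate that reasoning, here is a toy model, not the arrow-rs code: 
`PageStats`, `page_stats`, and `chunk_stats` are hypothetical names, and it 
assumes chunk min/max are simply folded from the per-page min/max. Under that 
assumption, a bad page min would leak into the chunk min.

```rust
/// Toy stand-in for per-page statistics (hypothetical, not the arrow-rs type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PageStats {
    pub min: Option<i32>,
    pub max: Option<i32>,
    pub null_count: usize,
}

/// Compute stats for one page; `None` models a null value.
/// An all-null page ends up with `min == None` and `max == None`.
pub fn page_stats(values: &[Option<i32>]) -> PageStats {
    let non_null = values.iter().flatten().copied();
    PageStats {
        min: non_null.clone().min(),
        max: non_null.max(),
        null_count: values.iter().filter(|v| v.is_none()).count(),
    }
}

/// Fold page stats into chunk stats (assumed behaviour for this sketch):
/// pages without a min/max are skipped, everything else is combined.
pub fn chunk_stats(pages: &[PageStats]) -> PageStats {
    PageStats {
        min: pages.iter().filter_map(|p| p.min).min(),
        max: pages.iter().filter_map(|p| p.max).max(),
        null_count: pages.iter().map(|p| p.null_count).sum(),
    }
}
```

   In this model, an all-null page followed by a page of 30 ones gives chunk 
stats of min 1, max 1, while a single page with a garbage min would change the 
chunk min too.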
   
   If the original file isn't sensitive, could you share it here? 
   
   cc @adriangb


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
