mzabaluev commented on PR #9700:
URL: https://github.com/apache/arrow-rs/pull/9700#issuecomment-4242787364

   Good points @etseidl. Our motivation for adding this is that in some cases,
e.g. with high-cardinality data, the Rust Parquet writer produces much larger
encoded output than the Spark workloads we're aiming to replace. So a default
option that enables a heuristic akin to the one hardcoded into parquet-java
would get us on par (or maybe better, because this implementation may choose
to fall back at any page chunk in the dataframe).

