wgtmac commented on issue #38441: URL: https://github.com/apache/arrow/issues/38441#issuecomment-1781205100
Parquet is usually used as a low-level library inside a database or data warehouse, which has better knowledge of the actual data patterns. Such systems can decide the best combination of encoding and compression for each column based on the data already stored. Should we provide a simple tool to evaluate each page of every column in an input Parquet file and decide the best encoding & compression for it?
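
To make the idea concrete, here is a rough sketch of what such a tool could look like (this is not an existing Arrow utility; the helper name `evaluate_codecs` and the codec list are illustrative). pyarrow does not expose per-page rewriting hooks, so this coarser version rewrites each column with several codecs and compares whole-column on-disk sizes:

```python
# Sketch: rewrite each column of an input Parquet file with several
# compression codecs and report which one yields the smallest output.
import io
import pyarrow.parquet as pq

# Illustrative codec list; pyarrow also supports e.g. "BROTLI".
CODECS = ["NONE", "SNAPPY", "GZIP", "ZSTD", "LZ4"]

def evaluate_codecs(path):
    table = pq.read_table(path)
    for name in table.column_names:
        single = table.select([name])  # one-column table
        sizes = {}
        for codec in CODECS:
            buf = io.BytesIO()
            pq.write_table(single, buf, compression=codec)
            sizes[codec] = buf.tell()  # bytes written with this codec
        best = min(sizes, key=sizes.get)
        print(f"{name}: best={best} " +
              " ".join(f"{c}={s}" for c, s in sorted(sizes.items())))

evaluate_codecs("input.parquet")
```

Encodings could presumably be compared the same way by varying `write_table`'s `use_dictionary` (and, in recent pyarrow versions, `column_encoding`) options, though a proper per-page evaluator would likely need to live at the C++ level.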
