gianm commented on PR #1270: URL: https://github.com/apache/parquet-mr/pull/1270#issuecomment-1973533544
> > When decompressing a 64MB block using a 4KB chunk size, this leads to thousands of allocations and deallocations totaling GBs of memory.
>
> Is that a real use case? Usually we don't expect a page to be as large as this.

I encountered this on some Snappy-compressed Parquet files that were written by Spark. I don't have access to the Spark cluster or job info, though, so unfortunately I don't have more details than that.
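The allocation count in the quoted scenario follows directly from the block-to-chunk ratio. A minimal sketch (hypothetical names, not parquet-mr code) of the arithmetic, assuming one fresh buffer is allocated per chunk read:

```java
// Hypothetical illustration of the quoted scenario: decompressing a
// 64 MiB block through 4 KiB chunks, with a fresh buffer allocated
// for each chunk, implies one allocation per chunk.
public class ChunkAllocDemo {
    // Ceiling division: number of chunk-sized buffers needed to cover
    // a block of the given size.
    static long countChunkAllocations(long blockSize, int chunkSize) {
        return (blockSize + chunkSize - 1) / chunkSize;
    }

    public static void main(String[] args) {
        long block = 64L * 1024 * 1024; // 64 MiB decompressed block
        int chunk = 4 * 1024;           // 4 KiB chunk size
        // 64 MiB / 4 KiB = 16384 allocations for a single block.
        System.out.println(countChunkAllocations(block, chunk));
    }
}
```

At 16384 allocations of 4 KiB each per block, repeated across many pages and row groups, the cumulative churn reaches the GB scale described above.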
