Re: [I] [C++][Parquet] Loading big parquet files leads to huge memory consumption [arrow]

via GitHub Tue, 03 Dec 2024 08:56:14 -0800


KuczaRacza commented on issue #44890:
URL: https://github.com/apache/arrow/issues/44890#issuecomment-2515100029


   Here is my minimal example of that behavior:
[example](https://gist.github.com/KuczaRacza/4ddcae7d39aef7bfbe352017b96e15f9).
I was wrong about the 20+ GiB file swelling to 60+ GiB in RAM; other components
in my workflow contributed to that high usage. But it looks like reading in
batches takes roughly as much memory as the size of the file. That is too much
to handle any file bigger than a few GiB.


