KuczaRacza commented on issue #44890: URL: https://github.com/apache/arrow/issues/44890#issuecomment-2515100029
Here is my minimal example of that behavior: [example](https://gist.github.com/KuczaRacza/4ddcae7d39aef7bfbe352017b96e15f9). I was wrong about the 20+ GiB file swelling to 60+ GiB in RAM; other components in my workflow contributed to that high usage. But it looks like reading by batches takes roughly the same amount of memory as the size of the file. That is too much to handle any file bigger than a few GiB.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
