Re: [D] It is possible to reduce peak memory usage when using datasets (to use predicate pushdown) when reading single parquet files [arrow]

via GitHub Wed, 09 Jul 2025 16:19:00 -0700


GitHub user adamreeve added a comment to the discussion: It is possible to 
reduce peak memory usage when using datasets (to use predicate pushdown) when 
reading single parquet files


One other option that comes to mind is reducing `batch_readahead`. I believe it 
is 16 by default, so reducing it to something low like 1 or disabling it 
completely by setting it to 0 should reduce memory use too. 
`fragment_readahead` probably won't have any effect if you are only reading one 
Parquet file.

GitHub link: 
https://github.com/apache/arrow/discussions/47003#discussioncomment-13714353

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]
