jorisvandenbossche commented on a change in pull request #10074: URL: https://github.com/apache/arrow/pull/10074#discussion_r615794180
##########
File path: python/pyarrow/parquet.py
##########
@@ -1674,6 +1686,12 @@ def pieces(self):
     keys and only a hive-style directory structure is supported. When
     setting `use_legacy_dataset` to False, also within-file level filtering
     and different partitioning schemes are supported.
+pre_buffer : bool, default True
+    Coalesce and issue file reads in parallel to improve performance on

Review comment:
> I guess for someone who wants Arrow to be truly single-threaded, this may be confusing, so I'll see if I can reword this.

So for such a case, you would need to set `pre_buffer=False` to have it truly single-threaded? I am thinking about dask, where `use_threads` is set to False, but I am not sure if it would be a problem for them to still do I/O in parallel.