jorisvandenbossche commented on a change in pull request #10074:
URL: https://github.com/apache/arrow/pull/10074#discussion_r625980584
##########
File path: python/pyarrow/parquet.py
##########
@@ -1676,6 +1690,13 @@ def pieces(self):
keys and only a hive-style directory structure is supported. When
setting `use_legacy_dataset` to False, also within-file level filtering
and different partitioning schemes are supported.
+pre_buffer : bool, default True
+ Coalesce and issue file reads in parallel to improve performance on
+ high-latency filesystems (e.g. S3). If True, Arrow will use a
+ background I/O thread pool. This option is only supported for
+ use_legacy_dataset=True. If using a filesystem layer that itself
Review comment:
```suggestion
use_legacy_dataset=False. If using a filesystem layer that itself
```
Same fix here as well: `pre_buffer` is only supported with `use_legacy_dataset=False`.
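
For readers unfamiliar with what "coalesce" means in this docstring, here is a minimal sketch of the idea in plain Python. It is not Arrow's implementation. The function name `coalesce_ranges` and the `hole_size_limit` parameter are hypothetical, chosen only for illustration. The assumption is that nearby byte ranges (for example, column chunks in one file) are merged into fewer, larger reads, so that a high-latency filesystem sees fewer round trips.

```python
def coalesce_ranges(ranges, hole_size_limit=8192):
    """Merge (offset, length) byte ranges separated by small gaps.

    Hypothetical helper for illustration only. Two ranges are merged
    when the gap between them is at most ``hole_size_limit`` bytes.
    """
    merged = []
    for offset, length in sorted(ranges):
        if merged:
            prev_offset, prev_length = merged[-1]
            prev_end = prev_offset + prev_length
            # Small gap (or overlap): extend the previous range instead
            # of issuing a separate read.
            if offset - prev_end <= hole_size_limit:
                new_end = max(prev_end, offset + length)
                merged[-1] = (prev_offset, new_end - prev_offset)
                continue
        merged.append((offset, length))
    return merged


# Three requested reads; the first two are close enough to merge.
requests = [(0, 100), (150, 50), (100_000, 10)]
print(coalesce_ranges(requests, hole_size_limit=64))
```

The merged ranges could then be fetched concurrently, which is roughly the role the docstring assigns to the background I/O thread pool.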