[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10074: ARROW-12428: [Python] Expose pre_buffer in pyarrow.parquet

GitBox Tue, 04 May 2021 10:43:02 -0700


jorisvandenbossche commented on a change in pull request #10074:
URL: https://github.com/apache/arrow/pull/10074#discussion_r625980584




##########
File path: python/pyarrow/parquet.py
##########
@@ -1676,6 +1690,13 @@ def pieces(self):
     keys and only a hive-style directory structure is supported. When
     setting `use_legacy_dataset` to False, also within-file level filtering
     and different partitioning schemes are supported.
+pre_buffer : bool, default True
+    Coalesce and issue file reads in parallel to improve performance on
+    high-latency filesystems (e.g. S3). If True, Arrow will use a
+    background I/O thread pool. This option is only supported for
+    use_legacy_dataset=True. If using a filesystem layer that itself

Review comment:
```suggestion
    use_legacy_dataset=False. If using a filesystem layer that itself
```

Same correction here as well: the option is only supported for `use_legacy_dataset=False`, not `True`.
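
For context, the "coalesce" part of the docstring can be illustrated with a small, library-independent sketch. This is not Arrow's implementation. The function name `coalesce_ranges` and the parameter `hole_size_limit` are hypothetical names chosen for this example. The sketch only shows the idea that nearby byte ranges are merged into fewer, larger reads, which matters on high-latency filesystems such as S3. Issuing the merged reads in parallel is left out.

```python
from typing import List, Tuple

# A byte range expressed as (offset, length). Illustrative type alias only.
Range = Tuple[int, int]


def coalesce_ranges(ranges: List[Range], hole_size_limit: int) -> List[Range]:
    """Merge ranges whose gap is at most `hole_size_limit` bytes.

    Hypothetical helper for illustration; not part of pyarrow.
    """
    merged: List[Range] = []
    for offset, length in sorted(ranges):
        if merged:
            prev_offset, prev_length = merged[-1]
            prev_end = prev_offset + prev_length
            # Small gap (or overlap): extend the previous range instead of
            # issuing a separate read.
            if offset - prev_end <= hole_size_limit:
                new_end = max(prev_end, offset + length)
                merged[-1] = (prev_offset, new_end - prev_offset)
                continue
        merged.append((offset, length))
    return merged


if __name__ == "__main__":
    # Two nearby column chunks and one distant chunk.
    print(coalesce_ranges([(0, 100), (110, 50), (1000, 20)], hole_size_limit=16))
```

With a gap limit of 16 bytes, the first two ranges above become a single read and the distant one stays separate, so three requests turn into two.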




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
