Re: [PR] GH-36765: [Python][Dataset] Change default of pre_buffer to True for reading Parquet files [arrow]

via GitHub Thu, 05 Oct 2023 06:12:54 -0700


jorisvandenbossche commented on code in PR #37854:
URL: https://github.com/apache/arrow/pull/37854#discussion_r1347400272



##########
python/pyarrow/_dataset_parquet.pyx:
##########
@@ -666,7 +666,7 @@ cdef class ParquetFragmentScanOptions(FragmentScanOptions):
         Disabled by default.
     buffer_size : int, default 8192
         Size of buffered stream, if enabled. Default is 8KB.
-    pre_buffer : bool, default False
+    pre_buffer : bool, default True
         If enabled, pre-buffer the raw Parquet data instead of issuing one
         read per column chunk. This can improve performance on high-latency
         filesystems.

Review Comment:
   Yep, will do
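
For readers of the hunk above, the docstring's contrast between "one read per column chunk" and pre-buffering can be illustrated with a small plain-Python sketch. This is not Arrow's implementation and not part of the PR: `coalesce_ranges`, `hole_size_limit`, and the chunk layout are hypothetical names and values chosen for illustration, assuming that pre-buffering works by merging nearby byte ranges into fewer, larger reads.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (offset, length) of one column chunk in the file


def coalesce_ranges(ranges: List[Range], hole_size_limit: int) -> List[Range]:
    """Merge byte ranges whose gaps are at most hole_size_limit bytes.

    Hypothetical helper: a simplified stand-in for read coalescing.
    """
    merged: List[Range] = []
    for offset, length in sorted(ranges):
        if merged:
            prev_offset, prev_length = merged[-1]
            prev_end = prev_offset + prev_length
            if offset - prev_end <= hole_size_limit:
                # Close enough: extend the previous read to cover this chunk.
                new_end = max(prev_end, offset + length)
                merged[-1] = (prev_offset, new_end - prev_offset)
                continue
        merged.append((offset, length))
    return merged


# Hypothetical column-chunk layout: three nearby chunks and one far away.
chunks = [(0, 100), (110, 50), (170, 30), (10_000, 200)]

# Without pre-buffering: one read request per column chunk.
reads_without_pre_buffer = len(chunks)

# With pre-buffering (as modelled here): nearby chunks share one request.
reads_with_pre_buffer = len(coalesce_ranges(chunks, hole_size_limit=64))
```

On a high-latency filesystem each request pays a round trip, so under these assumptions issuing fewer, larger reads is where the performance benefit mentioned in the docstring would come from.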



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
