mapleFU commented on code in PR #36510:
URL: https://github.com/apache/arrow/pull/36510#discussion_r1299903917
##########
cpp/src/parquet/properties.h:
##########
@@ -79,10 +93,45 @@ class PARQUET_EXPORT ReaderProperties {
/// Disable buffered stream reading.
void disable_buffered_stream() { buffered_stream_enabled_ = false; }
- /// Return the size of the buffered stream buffer.
- int64_t buffer_size() const { return buffer_size_; }
- /// Set the size of the buffered stream buffer in bytes.
- void set_buffer_size(int64_t size) { buffer_size_ = size; }
+ /// Return the default size of the buffered stream buffer.
+ int64_t buffer_size() const { return default_column_reader_properties_.buffer_size(); }
+ /// Set the default size of the buffered stream buffer in bytes.
+ void set_buffer_size(int64_t size) {
+ default_column_reader_properties_.set_buffer_size(size);
+ }
+
+ /// Return the size of the buffered stream buffer for a column chunk.
+ int64_t buffer_size(int row_group_index, int column_index) const {
Review Comment:
Personally I think deciding a buffer size per row group and per column
is too hacky... Perhaps the user could define a `CalculateBufferSize(const
ColumnChunkMetaData&)` callback and compute it that way.
That would work, but I guess it's too complex... Why is a consistent
buffer_size for each column not OK? cc @pitrou
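
A rough sketch of the callback-based alternative I have in mind (the names
`BufferSizeFn`, `set_buffer_size_fn`, and `buffer_size_fn_` are purely
illustrative, not an existing API):

```cpp
#include <cstdint>
#include <functional>
#include <utility>

namespace parquet {

class ColumnChunkMetaData;  // defined in parquet/metadata.h

// User-supplied hook that derives a buffer size from column chunk metadata.
using BufferSizeFn = std::function<int64_t(const ColumnChunkMetaData&)>;

class ReaderProperties {
 public:
  // Register a callback that computes the buffered-stream buffer size
  // for each column chunk.
  void set_buffer_size_fn(BufferSizeFn fn) { buffer_size_fn_ = std::move(fn); }

  // Use the callback when set; otherwise fall back to the single default.
  int64_t buffer_size(const ColumnChunkMetaData& metadata) const {
    return buffer_size_fn_ ? buffer_size_fn_(metadata) : default_buffer_size_;
  }

 private:
  BufferSizeFn buffer_size_fn_;
  int64_t default_buffer_size_ = 1 << 20;  // illustrative 1 MiB default
};

}  // namespace parquet
```

That way the properties object stays simple: one default size, plus an
optional hook for users who really want per-chunk control, instead of storing
a `(row_group_index, column_index)` map.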