alamb commented on code in PR #8733:
URL: https://github.com/apache/arrow-rs/pull/8733#discussion_r2510979304


##########
parquet/src/arrow/arrow_reader/mod.rs:
##########
@@ -212,6 +217,14 @@ impl<T> ArrowReaderBuilder<T> {
         }
     }
 
+    /// Configure how row selections should be materialised during execution

Review Comment:
   It would be nice, eventually, to change RowSelection so the callers could 
provide a bitmap as well.  
   
   I will think about how this might look, but it is not something to change in 
this PR
   
   



##########
parquet/src/arrow/arrow_reader/mod.rs:
##########
@@ -43,6 +43,7 @@ use crate::file::reader::{ChunkReader, SerializedPageReader};
 use crate::schema::types::SchemaDescriptor;
 
 use crate::arrow::arrow_reader::metrics::ArrowReaderMetrics;
+// Exposed so integration tests and benchmarks can temporarily override the 
threshold.

Review Comment:
   I think there are other use cases too (aka I am not sure this comment is 
accurate anymore)



##########
parquet/src/arrow/arrow_reader/mod.rs:
##########
@@ -934,12 +950,16 @@ impl<T: ChunkReader + 'static> 
ParquetRecordBatchReaderBuilder<T> {
         let array_reader = ArrayReaderBuilder::new(&reader, &metrics)
             .build_array_reader(fields.as_deref(), &projection)?;
 
-        let read_plan = plan_builder
+        let mut plan_builder = plan_builder
             .limited(reader.num_rows())
             .with_offset(offset)
             .with_limit(limit)
-            .build_limited()
-            .build();
+            .build_limited();
+
+        let preferred_strategy = plan_builder.preferred_selection_strategy();

Review Comment:
   Elsewhere in the PR you used the term "resolved" to reflect the chosen 
strategy -- perhaps we can do the same here 
   
   ```rust
           let resolved_strategy = plan_builder.resolve_selection_strategy();
           plan_builder = 
plan_builder.with_selection_strategy(resolved_strategy);
   ```
   
   I also found it slightly strange that the PlanBuilder has both methods -- 
maybe it would be simpler if build() simply resolved the strategy directly. 
   
   However, I think we can do this as a follow on PR
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to