yeya24 opened a new issue, #8442:
URL: https://github.com/apache/arrow-rs/issues/8442

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   
   With the recent PRs to expose `ReadPlanBuilder` publicly, users can use 
`with_predicate` to evaluate the predicates. This is how it is being used today 
in arrow-rs:
   
   ```
   let array_reader = ArrayReaderBuilder::new(&reader, &metrics)
       .build_array_reader(fields.as_deref(), predicate.projection())?;
   
   plan_builder = plan_builder.with_predicate(array_reader, predicate.as_mut())?;
   ```
   
   `ParquetRecordBatchReader::try_new` is then called to read Parquet record 
batches from that array reader, using the provided batch size.
   
   However, building an array reader from outside the crate runs into several 
problems:
   - The ArrowReader metrics enum is not publicly exposed from its module.
   - `build_array_reader(fields.as_deref(), ...)` needs `fields`, which I cannot 
easily obtain. I tried to get the Arrow field levels via 
`parquet_to_arrow_field_levels`, but the internal fields of the returned levels 
are also private.
   
   **Describe the solution you'd like**
   Instead of having `with_predicate` take an array reader and create a new 
`ParquetRecordBatchReader` internally, have it take a 
`ParquetRecordBatchReader` as the input parameter.
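
To make the shape of this request concrete, here is a minimal, std-only sketch. It is a toy model under stated assumptions, not the parquet crate's API: `Batch`, `Predicate`, `RowFn`, `PlanBuilder`, `with_predicate_reader`, and `demo` are all hypothetical stand-ins (for a record batch, `ArrowPredicate`, and `ReadPlanBuilder`), and the selection is modeled as a plain `Vec<bool>`. The point is only that the builder accepts an already-built batch reader, so the caller never has to construct an array reader.

```rust
// Toy model only: every type here is a hypothetical stand-in, not the parquet crate's API.

/// Stand-in for a record batch: a single column of i64 values.
pub type Batch = Vec<i64>;

/// Stand-in for `ArrowPredicate`: returns one keep/skip flag per row of the batch.
pub trait Predicate {
    fn evaluate(&mut self, batch: &Batch) -> Vec<bool>;
}

/// Adapter that turns a per-row closure into a `Predicate`.
pub struct RowFn<F: FnMut(i64) -> bool>(pub F);

impl<F: FnMut(i64) -> bool> Predicate for RowFn<F> {
    fn evaluate(&mut self, batch: &Batch) -> Vec<bool> {
        batch.iter().map(|v| (self.0)(*v)).collect()
    }
}

/// Stand-in for `ReadPlanBuilder`: tracks which rows are still selected.
#[derive(Debug, Default)]
pub struct PlanBuilder {
    /// `None` means every row is selected.
    selection: Option<Vec<bool>>,
}

impl PlanBuilder {
    pub fn selection(&self) -> Option<&[bool]> {
        self.selection.as_deref()
    }

    /// Proposed shape: take an already-built batch reader instead of an array reader.
    /// The reader is assumed to yield only the rows that are currently selected.
    pub fn with_predicate_reader<R>(
        mut self,
        reader: R,
        predicate: &mut dyn Predicate,
    ) -> Result<Self, String>
    where
        R: Iterator<Item = Result<Batch, String>>,
    {
        // Evaluate the predicate batch by batch and collect one flag per row read.
        let mut flags = Vec::new();
        for batch in reader {
            let batch = batch?;
            let mask = predicate.evaluate(&batch);
            if mask.len() != batch.len() {
                return Err("predicate returned a mask of the wrong length".to_string());
            }
            flags.extend(mask);
        }
        // Merge with the previous selection: rows already skipped stay skipped,
        // and each previously selected row consumes the next predicate flag.
        let refined = match self.selection.take() {
            None => flags,
            Some(prev) => {
                let mut kept = flags.into_iter();
                prev.into_iter()
                    .map(|was_selected| {
                        if was_selected { kept.next().unwrap_or(false) } else { false }
                    })
                    .collect()
            }
        };
        self.selection = Some(refined);
        Ok(self)
    }
}

/// Applies two predicates in sequence over the rows [1, 5, 10, 20].
pub fn demo() -> Result<Vec<bool>, String> {
    let data: Batch = vec![1, 5, 10, 20];

    // First predicate sees every row.
    let plan = PlanBuilder::default()
        .with_predicate_reader(std::iter::once(Ok(data.clone())), &mut RowFn(|v: i64| v > 4))?;

    // The caller builds the second reader so that it yields only the surviving rows.
    let remaining: Batch = data
        .iter()
        .zip(plan.selection().unwrap())
        .filter(|(_, keep)| **keep)
        .map(|(v, _)| *v)
        .collect();
    let plan = plan
        .with_predicate_reader(std::iter::once(Ok(remaining)), &mut RowFn(|v: i64| v % 2 == 0))?;

    Ok(plan.selection().unwrap().to_vec())
}
```

In this sketch the caller owns reader construction, which is the trade-off the request is after: the builder only needs something that yields batches, so none of the array-reader internals have to become public.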
   
   **Describe alternatives you've considered**
   Expose all the required fields and methods as public so that users can 
create an array reader.
   
   Another alternative is to expose a method like the one below:
   
   ```
       pub fn with_predicate_and_level_row_groups(
           mut self,
           levels: &FieldLevels,
           row_groups: &dyn RowGroups,
           predicate: &mut dyn ArrowPredicate,
       ) -> Result<Self> {
           let mut reader = ParquetRecordBatchReader::try_new_with_row_groups(
               levels,
               row_groups,
               self.batch_size,
               self.selection.clone(),
               predicate.projection().clone(),
           )?;
   ```
   
   **Additional context**
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
