alamb opened a new pull request #370:
URL: https://github.com/apache/arrow-datafusion/pull/370


   # Which issue does this PR close?
   
   re https://github.com/apache/arrow-datafusion/issues/363 (leaving as draft 
until #365 is in)
   
   # Rationale for this change
   As explained in #363, the high-level goal is to make the parquet row 
group pruning logic generic over any type of min/max statistics (not just 
parquet metadata).
   
   
   # What changes are included in this PR?
   1. Changes the *output* of `PruningPredicateBuilder` to be a `bool` for each 
of the input statistics
   2. Moves the parquet specific functionality (aka the function signature 
required for the `ParquetFileReader`) into the parquet.rs module
   3. Returns errors from `build_pruning_predicate` rather than silently 
ignoring them (though they are still silently ignored in parquet.rs as before)
   4. Improves some docstrings
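
To illustrate items 1 and 3, here is a minimal, self-contained sketch of the shape described above. It is not the code in this PR: `MinMax`, `prune_gt`, and `row_group_filter` are hypothetical names invented for illustration, and the predicate is hard-coded to `column > threshold` on `i64` values. It assumes that `true` means "this container may hold matching rows, so keep it".

```rust
/// Hypothetical min/max statistics for one container (e.g. one row group).
#[derive(Debug, Clone, Copy)]
struct MinMax {
    min: i64,
    max: i64,
}

/// Hypothetical generic pruning step for the predicate `column > threshold`.
/// Returns one `bool` per input statistic: `true` means the container may
/// contain matching rows and must be kept, `false` means it can be skipped.
/// Problems are returned as errors instead of being silently ignored.
fn prune_gt(stats: &[Option<MinMax>], threshold: i64) -> Result<Vec<bool>, String> {
    stats
        .iter()
        .map(|s| match s {
            // No statistics available: the container cannot be ruled out.
            None => Ok(true),
            // Inconsistent statistics are reported to the caller.
            Some(mm) if mm.min > mm.max => Err(format!(
                "invalid statistics: min {} > max {}",
                mm.min, mm.max
            )),
            // Keep the container only if some value could exceed the threshold.
            Some(mm) => Ok(mm.max > threshold),
        })
        .collect()
}

/// Hypothetical format-specific caller: as before, errors are swallowed
/// here by falling back to "keep every container".
fn row_group_filter(stats: &[Option<MinMax>], threshold: i64) -> Vec<bool> {
    prune_gt(stats, threshold).unwrap_or_else(|_| vec![true; stats.len()])
}
```

The split mirrors the intent of the list above: the generic function knows nothing about parquet and surfaces errors, while the format-specific wrapper decides how to react to them.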
   
   
   # Are there any user-facing changes?
   No change in parquet functionality is intended in this PR
   
   
   
   # Sequence:
   
   My next PR will change the *input* of the `PruningPredicateBuilder` to be 
generic
   
   I am trying to do this in a few small PRs to reduce review burden. Here is 
how I plan for them to fit together:
   
   Planned changes:
   - [x] Refactor code into a new module 
(https://github.com/apache/arrow-datafusion/pull/365)
   - [x] Return bool rather than parquet specific output (this PR)
   - [ ] Add `PruningStatistics` Trait (forthcoming PR)
   
   

