sgrebnov opened a new pull request, #13131:
URL: https://github.com/apache/datafusion/pull/13131

   ## Which issue does this PR close?
   
   With the filter pushdown optimization, a LogicalPlan can have filters defined on both `TableScan` and `Filter` nodes. When unparsing, to avoid one filter overwriting the other, we combine the existing filter with the additional filter.
   
   Example query
   
   ```sql
   select
           c_phone as cntrycode,
           c_acctbal
   from
           customer
   where c_mktsegment = 'BUILDING' and c_acctbal > (
           select
                   avg(c_acctbal)
           from
                   customer);
   ```
   Logical Plan
   ```
   |  Projection: customer.c_phone AS cntrycode, customer.c_acctbal
   |   Filter: CAST(customer.c_acctbal AS Decimal128(38, 6)) > (<subquery>)
   |     Subquery:
   |     ..
   |     TableScan: customer, full_filters=[customer.c_mktsegment = Utf8("BUILDING")]
   ```
   
   Without this change, the plan is unparsed as the following query, which loses the `c_acctbal` subquery filter:
   
   ```sql
   select
           c_phone as cntrycode,
           c_acctbal
   from
           customer
   where c_mktsegment = 'BUILDING'
   ```
   
   
   ## Rationale for this change
   
   
   ## What changes are included in this PR?
   
   Improves QueryBuilder `pub fn selection` to combine filters if it is called multiple times, instead of overwriting the previously set filter.
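
   The combining behavior described above can be sketched as follows. This is a minimal, self-contained illustration and not the code from this PR: `Expr`, `SelectBuilder`, and `to_sql` are hypothetical simplified stand-ins for the real SQL AST and builder types, and joining the two predicates with `AND` is an assumption based on the description.

   ```rust
   // Hypothetical, simplified stand-ins for the SQL AST and the builder.
   // These are NOT the actual DataFusion or sqlparser types.
   #[derive(Debug, Clone, PartialEq)]
   enum Expr {
       /// A predicate kept as raw SQL text, for illustration only.
       Raw(String),
       /// Logical conjunction of two predicates.
       And(Box<Expr>, Box<Expr>),
   }

   impl Expr {
       /// Render the predicate as SQL text.
       fn to_sql(&self) -> String {
           match self {
               Expr::Raw(s) => s.clone(),
               Expr::And(l, r) => format!("{} AND {}", l.to_sql(), r.to_sql()),
           }
       }
   }

   #[derive(Debug, Default)]
   struct SelectBuilder {
       selection: Option<Expr>,
   }

   impl SelectBuilder {
       /// Set the WHERE predicate. If one is already present, combine the
       /// two with AND instead of overwriting the existing one.
       fn selection(&mut self, value: Option<Expr>) -> &mut Self {
           self.selection = match (self.selection.take(), value) {
               // Both a `Filter` node and a `TableScan` contributed a predicate.
               (Some(existing), Some(new)) => {
                   Some(Expr::And(Box::new(existing), Box::new(new)))
               }
               // Nothing new to add: keep whatever was there.
               (existing, None) => existing,
               // First predicate seen.
               (None, new) => new,
           };
           self
       }
   }
   ```

   In this sketch, calling `selection` once with the `c_mktsegment` predicate and once with the `c_acctbal` predicate leaves both in the resulting `WHERE` clause, and passing `None` leaves an existing predicate untouched.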
   
   ## Are these changes tested?
   
   Added a unit test. Also tested as part of TPC-H and TPC-DS query unparsing in https://github.com/spiceai/spiceai (running benchmarks with some filter pushdown optimizations enabled).
   
   ## Are there any user-facing changes?
   
   Fixes some unparsing issues where `WHERE` clauses went missing when running TPC-H and TPC-DS queries with the filter pushdown optimization enabled.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

