aayushmaanjain opened a new pull request #24271: [SPARK-27342][SQL] Optimize 
Limit 0 queries
URL: https://github.com/apache/spark/pull/24271
 
 
   ## What changes were proposed in this pull request?
   This change avoids unnecessary file scans for Limit 0 queries.
   
   I added a case to `PropagateEmptyRelation` that replaces `GlobalLimit 0` 
and `LocalLimit 0` nodes with an empty `LocalRelation`. This prunes the subtree 
under the Limit 0 node and lets the other cases in `PropagateEmptyRelation` 
optimize the logical plan further, while preserving the semantics of the 
Limit 0 query.
   
   For instance:
   **Query:**
   `SELECT * FROM table1 INNER JOIN (SELECT * FROM table2 LIMIT 0) AS table2 ON 
table1.id = table2.id`
   **Optimized Plan without fix:**
   `Join Inner, (id#79 = id#87)
   :- Filter isnotnull(id#79)
   :  +- Relation[id#79,num1#80] parquet
   +- Filter isnotnull(id#87)
      +- GlobalLimit 0
         +- LocalLimit 0
            +- Relation[id#87,num2#88] parquet`
   **Optimized Plan with fix:**
   `LocalRelation <empty>, [id#75, num1#76, id#77, num2#78]`
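
The rewrite shown in the two plans can be modeled with a small, self-contained sketch. This is a toy model in Python, not Spark's Catalyst code: the class names (`Relation`, `EmptyLocalRelation`, `Filter`, `Limit`, `InnerJoin`) and the function `propagate_empty` are hypothetical stand-ins chosen to mirror the plan output, and a single `Limit` node stands in for both `GlobalLimit` and `LocalLimit`.

```python
from dataclasses import dataclass
from typing import Tuple


# Toy logical-plan nodes. These are illustrative only and are NOT
# Spark's Catalyst classes.
@dataclass(frozen=True)
class Relation:
    output: Tuple[str, ...]


@dataclass(frozen=True)
class EmptyLocalRelation:
    output: Tuple[str, ...]


@dataclass(frozen=True)
class Filter:
    condition: str
    child: object

    @property
    def output(self):
        return self.child.output


@dataclass(frozen=True)
class Limit:  # stands in for both GlobalLimit and LocalLimit
    n: int
    child: object

    @property
    def output(self):
        return self.child.output


@dataclass(frozen=True)
class InnerJoin:
    left: object
    right: object
    condition: str

    @property
    def output(self):
        return self.left.output + self.right.output


def propagate_empty(plan):
    """Rewrite the plan bottom-up, collapsing subtrees that cannot produce rows."""
    if isinstance(plan, Limit):
        if plan.n == 0:
            # The new case: a Limit 0 node yields no rows, so the whole
            # subtree under it (including the file scan) can be dropped.
            return EmptyLocalRelation(plan.output)
        return Limit(plan.n, propagate_empty(plan.child))
    if isinstance(plan, Filter):
        child = propagate_empty(plan.child)
        if isinstance(child, EmptyLocalRelation):
            return EmptyLocalRelation(child.output)
        return Filter(plan.condition, child)
    if isinstance(plan, InnerJoin):
        left = propagate_empty(plan.left)
        right = propagate_empty(plan.right)
        # An inner join with an empty side is itself empty.
        if isinstance(left, EmptyLocalRelation) or isinstance(right, EmptyLocalRelation):
            return EmptyLocalRelation(left.output + right.output)
        return InnerJoin(left, right, plan.condition)
    return plan


# Shape of the "without fix" plan from the example above.
plan = InnerJoin(
    Filter("isnotnull(id)", Relation(("id", "num1"))),
    Filter("isnotnull(id)", Limit(0, Limit(0, Relation(("id", "num2"))))),
    "table1.id = table2.id",
)
optimized = propagate_empty(plan)
print(optimized)
```

In this model the Limit 0 case is what unlocks the existing empty-relation cases for `Filter` and the inner join, which is the effect the description attributes to the change.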
   
   ## How was this patch tested?
   Added unit tests to verify Limit 0 optimization for:
   - Simple query containing Limit 0
   - Inner Join, Left Outer Join, Right Outer Join, Full Outer Join queries 
containing Limit 0 as one of their children
   - Nested Inner Joins between 3 tables, one of which has a Limit 0 
clause
   - Intersect query in which one of the subqueries is a Limit 0 query
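
The query shapes listed above do not all behave the same way when one input is empty. The sketch below illustrates only the SQL semantics, using plain Python lists as tables. It is not the PR's test code: the helpers `join` and `intersect` and the sample rows are hypothetical, and it makes no claim about which of these cases the optimizer rewrites.

```python
def join(left, right, how):
    """Join two lists of (id, value) rows on id.

    `how` is one of 'inner', 'left', 'right' or 'full'.
    Unmatched rows on a preserved side are padded with None.
    """
    out = []
    matched_right = set()
    for l in left:
        hits = [(i, r) for i, r in enumerate(right) if r[0] == l[0]]
        for i, r in hits:
            matched_right.add(i)
            out.append(l + r)
        if not hits and how in ("left", "full"):
            out.append(l + (None, None))
    if how in ("right", "full"):
        for i, r in enumerate(right):
            if i not in matched_right:
                out.append((None, None) + r)
    return out


def intersect(left, right):
    """Distinct rows of `left` that also appear in `right`."""
    right_rows = set(right)
    return [row for row in dict.fromkeys(left) if row in right_rows]


table1 = [(1, "a"), (2, "b")]  # hypothetical sample rows
limit0 = []                    # models (SELECT * FROM table2 LIMIT 0)

inner_result = join(table1, limit0, "inner")  # no rows survive
left_result = join(table1, limit0, "left")    # table1 rows padded with None
right_result = join(table1, limit0, "right")  # preserved side is empty
full_result = join(table1, limit0, "full")    # table1 rows padded with None
intersect_result = intersect(table1, limit0)  # no rows survive
```

Only the inner join, the outer join whose preserved side is the Limit 0 input, and the intersect are empty as a whole. The other outer joins still return the rows of the non-empty side, so a rewrite of those queries has to keep them.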

