[ 
https://issues.apache.org/jira/browse/SPARK-27342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808369#comment-16808369
 ] 

Aayushmaan Jain commented on SPARK-27342:
-----------------------------------------

[~Francis47]
Since the right child of Left Outer Join is a Limit 0 subquery, we know that it 
would not be returning any rows. Thus, we can avoid file scans of `table2` and 
project respective columns as null instead.

> Optimize limit 0 queries
> ------------------------
>
>                 Key: SPARK-27342
>                 URL: https://issues.apache.org/jira/browse/SPARK-27342
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Aayushmaan Jain
>            Priority: Major
>
> Limit 0 queries do not require (table) files to be scanned. However, 
> currently, for queries containing a Limit 0 clause, the tables are read 
> entirely. This leads to needless file I/O(s) and network transmission(s).
> For instance:
>  * *Select Query*
> +QUERY:+ 
> {code:java}
> SELECT * FROM table1 LIMIT 0
> {code}
> +OPTIMIZED PLAN:+
> {code:java}
> GlobalLimit 0
> +- LocalLimit 0
>    +- Relation[id#79,num1#80] parquet{code}
>  * *Inner Join Query*
> +QUERY:+
> {code:java}
> SELECT * FROM table1 INNER JOIN (SELECT * FROM table2 LIMIT 0) AS table2 ON 
> table1.id = table2.id{code}
> +OPTIMIZED PLAN:+
> {code:java}
> Join Inner, (id#79 = id#87)
> :- Filter isnotnull(id#79)
> :  +- Relation[id#79,num1#80] parquet
> +- Filter isnotnull(id#87)
>    +- GlobalLimit 0
>       +- LocalLimit 0
>          +- Relation[id#87,num2#88] parquet{code}
>  
>  
> Similarly, other operations such as Left Outer Join, Right Outer Join, 
> Intersect, etc. are also affected.
> As a result, this causes the Spark application to waste execution time and 
> network bandwidth.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to