[ https://issues.apache.org/jira/browse/SPARK-27342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Li resolved SPARK-27342. ----------------------------- Resolution: Fixed Assignee: Aayushmaan Jain Fix Version/s: 3.0.0 > Optimize limit 0 queries > ------------------------ > > Key: SPARK-27342 > URL: https://issues.apache.org/jira/browse/SPARK-27342 > Project: Spark > Issue Type: Improvement > Components: SQL > Affects Versions: 2.4.0 > Reporter: Aayushmaan Jain > Assignee: Aayushmaan Jain > Priority: Major > Fix For: 3.0.0 > > > Limit 0 queries do not require (table) files to be scanned. However, > currently, for queries containing a Limit 0 clause, the tables are read > entirely. This leads to needless file I/O(s) and network transmission(s). > For instance: > * *Select Query* > +QUERY:+ > {code:java} > SELECT * FROM table1 LIMIT 0 > {code} > +OPTIMIZED PLAN:+ > {code:java} > GlobalLimit 0 > +- LocalLimit 0 > +- Relation[id#79,num1#80] parquet{code} > * *Inner Join Query* > +QUERY:+ > {code:java} > SELECT * FROM table1 INNER JOIN (SELECT * FROM table2 LIMIT 0) AS table2 ON > table1.id = table2.id{code} > +OPTIMIZED PLAN:+ > {code:java} > Join Inner, (id#79 = id#87) > :- Filter isnotnull(id#79) > : +- Relation[id#79,num1#80] parquet > +- Filter isnotnull(id#87) > +- GlobalLimit 0 > +- LocalLimit 0 > +- Relation[id#87,num2#88] parquet{code} > > > Similarly, other operations such as Left Outer Join, Right Outer Join, > Intersect, etc. are also affected. > As a result, this causes the Spark application to waste execution time and > network bandwidth. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org