[ https://issues.apache.org/jira/browse/SPARK-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Armbrust resolved SPARK-5614.
-------------------------------------
Resolution: Fixed
Fix Version/s: 1.3.0
Issue resolved by pull request 4394
[https://github.com/apache/spark/pull/4394]
> Predicate pushdown through Generate
> -----------------------------------
>
> Key: SPARK-5614
> URL: https://issues.apache.org/jira/browse/SPARK-5614
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 1.2.0
> Reporter: Lu Yan
> Fix For: 1.3.0
>
>
> Currently, Catalyst's optimizer rules cannot push predicates through "Generate"
> nodes. Furthermore, partition pruning in HiveTableScan cannot be applied to
> queries that involve "Generate", which makes such queries very inefficient.
> For example, consider the query
> {quote}
> select len, bk
> from s_server lateral view explode(len_arr) len_table as len
> where len > 5 and day = '20150102';
> {quote}
> where 'day' is a partition column in the metastore. In the current version of
> Spark SQL its physical plan is:
> {quote}
> Project [len, bk]
>  Filter ((len > "5") && (day = "20150102"))
>   Generate explode(len_arr), true, false
>    HiveTableScan [bk, len_arr, day], (MetastoreRelation default, s_server, None), None
> {quote}
> But ideally the plan should look like this:
> {quote}
> Project [len, bk]
>  Filter (len > "5")
>   Generate explode(len_arr), true, false
>    HiveTableScan [bk, len_arr, day], (MetastoreRelation default, s_server, None), Some(day = "20150102")
> {quote}
> where the partition pruning predicate is pushed down to the HiveTableScan node.
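> For illustration, here is a minimal sketch (in Scala) of the kind of Catalyst
> optimizer rule that could achieve this: split the Filter's conjuncts and push
> down the ones that reference only the Generate child's output. The rule name
> "PushPredicateThroughGenerate", the use of the 'join' flag, and the 'copy' call
> follow the 1.2-era Catalyst API and are assumptions for illustration, not the
> merged implementation:
> {quote}
> import org.apache.spark.sql.catalyst.expressions.{And, PredicateHelper}
> import org.apache.spark.sql.catalyst.plans.logical.{Filter, Generate, LogicalPlan}
> import org.apache.spark.sql.catalyst.rules.Rule
>
> // Sketch: push the conjuncts of a Filter that reference only the Generate
> // child's output below the Generate node, so that later rules (e.g. partition
> // pruning) can see them next to the HiveTableScan.
> object PushPredicateThroughGenerate extends Rule[LogicalPlan] with PredicateHelper {
>   def apply(plan: LogicalPlan): LogicalPlan = plan transform {
>     // Only safe when the generator output is joined with the child's rows
>     // (join = true), i.e. the child's columns pass through the Generate.
>     case filter @ Filter(condition, generate: Generate) if generate.join =>
>       val (pushDown, stayUp) = splitConjunctivePredicates(condition).partition {
>         predicate => predicate.references.subsetOf(generate.child.outputSet)
>       }
>       if (pushDown.isEmpty) {
>         filter
>       } else {
>         // Rebuild the Generate over a filtered child; keep the remaining
>         // conjuncts (those referencing generated columns) above it.
>         val newGenerate =
>           generate.copy(child = Filter(pushDown.reduce(And), generate.child))
>         stayUp.reduceOption(And).map(Filter(_, newGenerate)).getOrElse(newGenerate)
>       }
>   }
> }
> {quote}
> With such a rule in place, the (day = "20150102") conjunct would land directly
> above the HiveTableScan, where the existing partition pruning logic can pick it up.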
> I've developed a solution for this issue. If you do not have a plan for
> this already, I could merge the solution back to master.
> There is also a problem with column pruning for "Generate"; I will file
> a separate issue about that.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]