[ https://issues.apache.org/jira/browse/SPARK-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115904#comment-14115904 ]
Michael Armbrust commented on SPARK-3109:
-----------------------------------------
I don't believe this optimization is valid unless you know that there are no
duplicate values for (d,e) in test.
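
To make the concern concrete, here is a minimal sketch of what happens if the two sub-queries from the quoted description are recombined with a deduplicating union. It runs against an in-memory SQLite database as a stand-in, not against Spark SQL, and the rows in `test` are made up for illustration. The issue does not say how the two results would be merged, so both `UNION` and `UNION ALL` are shown as assumptions.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for Spark SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (a INT, b INT, c INT, d INT, e INT)")

# Hypothetical data: the first two rows are exact duplicates on (d, e).
rows = [
    (1, 1, 1, 30, 7),   # satisfies a = 1 and b = 1 and c = 1 and d > 20
    (1, 1, 1, 30, 7),   # duplicate of the row above
    (0, 0, 0, -5, 9),   # satisfies d < 0
    (1, 1, 1, 10, 3),   # satisfies neither branch
]
conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?, ?)", rows)

branch_1 = "select d, e from test where a = 1 and b = 1 and c = 1 and d > 20"
branch_2 = "select d, e from test where d < 0"

# The query from the issue, evaluated as a single filter.
original = conn.execute(branch_1 + " or d < 0").fetchall()

# Rewrite with a deduplicating UNION: duplicate (d, e) rows collapse.
union_distinct = conn.execute(branch_1 + " union " + branch_2).fetchall()

# Rewrite with UNION ALL: duplicates survive. This matches the original
# here only because d > 20 and d < 0 can never both hold for one row.
union_all = conn.execute(branch_1 + " union all " + branch_2).fetchall()

print("original:      ", sorted(original))
print("union:         ", sorted(union_distinct))
print("union all:     ", sorted(union_all))
```

With this data the single-filter query returns the duplicated `(30, 7)` row twice, while the `UNION` rewrite returns it once, so the two are not equivalent unless `(d, e)` is known to be unique in `test`.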
> Sql query with OR condition should be handled above PhysicalOperation layer
> ---------------------------------------------------------------------------
>
> Key: SPARK-3109
> URL: https://issues.apache.org/jira/browse/SPARK-3109
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 1.0.2
> Reporter: Alex Liu
>
> For query like
> {code}
> select d, e from test where a = 1 and b = 1 and c = 1 and d > 20 or d < 0
> {code}
> Spark SQL pushes the whole query to PhysicalOperation. I haven't checked how
> Spark SQL's internal query plan works, but I think the "OR" condition in the
> above query should be handled above the physical operation. The physical
> operation should receive the following query
> {code}
> select d, e from test where a = 1 and b = 1 and c = 1 and d > 20
> {code}
> OR
> {code}
> select d, e from test where d < 0
> {code}