[ https://issues.apache.org/jira/browse/SPARK-14820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253608#comment-15253608 ]
Takeshi Yamamuro commented on SPARK-14820:
------------------------------------------
It seems that `Optimizer#PushPredicateThroughJoin` already handles this kind of
push-down optimization.
Why can't the current implementation push the filters down in the query
described in your PDF?
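
For reference, here is a minimal sketch of what pushing a filter below a join buys, written in plain Python rather than Spark. It is a toy model under stated assumptions, not Catalyst's implementation: `hash_join`, the `orders` and `customers` tables, and the idea of counting join input rows as "shuffled" rows are all hypothetical names and simplifications chosen for illustration.

```python
# Toy model of filter push-down through an inner join.
# Assumption: every row fed into the join counts as one "shuffled" row.

def hash_join(left, right, key):
    """Inner hash join on `key`. Returns (joined_rows, rows_shuffled)."""
    rows_shuffled = len(left) + len(right)
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = [{**l, **r} for l in left for r in index.get(l[key], [])]
    return joined, rows_shuffled


def is_us(row):
    """Predicate that only references columns of `customers`."""
    return row["country"] == "US"


# Hypothetical input tables.
orders = [{"cust_id": i % 100, "amount": i} for i in range(1000)]
customers = [
    {"cust_id": i, "country": "US" if i % 10 == 0 else "DE"}
    for i in range(100)
]

# Plan A: join first, filter afterwards.
joined_a, shuffled_a = hash_join(orders, customers, "cust_id")
result_a = [row for row in joined_a if is_us(row)]

# Plan B: the filter is pushed below the join, onto `customers` only.
filtered_customers = [row for row in customers if is_us(row)]
result_b, shuffled_b = hash_join(orders, filtered_customers, "cust_id")

print("same result:", result_a == result_b)
print("rows shuffled, filter after join:", shuffled_a)
print("rows shuffled, filter pushed down:", shuffled_b)
```

Both plans return the same rows, but the pushed-down plan feeds fewer rows into the join. The rewrite is only valid here because the predicate references columns from one side of an inner join.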
> Reduce shuffle data by pushing filter toward storage
> ----------------------------------------------------
>
> Key: SPARK-14820
> URL: https://issues.apache.org/jira/browse/SPARK-14820
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 1.6.1
> Reporter: Ali Tootoonchian
> Priority: Trivial
> Attachments: Reduce Shuffle Data by pushing filter toward storage.pdf
>
>
> The SQL query planner could be made smart enough to push filter operations
> down toward the storage layer. If the planner is optimized so that IO to
> storage is reduced at the cost of running additional filters (i.e., extra
> compute), this should be desirable when the system is IO bound.
> An analysis and an example are attached.