Ali Tootoonchian created SPARK-14820:
----------------------------------------
Summary: Reduce shuffle data by pushing filter toward storage
Key: SPARK-14820
URL: https://issues.apache.org/jira/browse/SPARK-14820
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 1.6.1
Reporter: Ali Tootoonchian
Priority: Trivial
The SQL query planner could be smarter about pushing filters down toward the
storage layer. If the planner reduces IO to storage, and with it the amount of
data shuffled, at the cost of evaluating additional filters (i.e., extra
compute), the trade-off should be worthwhile when the system is IO bound. An
example from the TPC-H benchmark that illustrates the point is below:
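
Independently of the TPC-H example mentioned above, the general idea can be
sketched with a toy model. This is not Spark code and not the reporter's
benchmark: the tables, column names (`orderkey`, `quantity`), and helper
functions are all hypothetical, and "shuffled rows" is simply counted as the
number of rows fed into the join. It only illustrates that applying a filter
before the join yields the same result while moving fewer rows.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Join two lists of dicts on `key`.

    Returns the joined rows and the number of input rows that would have
    to be shuffled (here simply every row on both sides of the join).
    """
    by_key = defaultdict(list)
    for row in left:
        by_key[row[key]].append(row)
    joined = [{**l, **r} for r in right for l in by_key.get(r[key], [])]
    return joined, len(left) + len(right)


def filter_after_join(orders, items, predicate):
    # Shuffle everything, join, and only then drop unwanted rows.
    joined, moved = hash_join(orders, items, "orderkey")
    return [row for row in joined if predicate(row)], moved


def filter_before_join(orders, items, predicate):
    # Apply the predicate next to the "storage" side, then shuffle what is left.
    # Valid here because the predicate only reads columns from `items`.
    kept = [row for row in items if predicate(row)]
    return hash_join(orders, kept, "orderkey")


# Hypothetical toy data, not taken from TPC-H.
orders = [{"orderkey": k, "customer": f"c{k % 3}"} for k in range(5)]
items = [{"orderkey": i % 5, "quantity": i} for i in range(20)]


def large_quantity(row):
    return row["quantity"] >= 15


late_rows, late_moved = filter_after_join(orders, items, large_quantity)
early_rows, early_moved = filter_before_join(orders, items, large_quantity)

print("rows shuffled, filter after join: ", late_moved)
print("rows shuffled, filter before join:", early_moved)
```

The extra cost in the second variant is one pass of the predicate over `items`
before the join, which is the compute-for-IO trade described above.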