Re: [Spark-SQL] Reduce Shuffle Data by pushing filter toward storage

2016-04-21 Thread atootoonchian
I create an issue in Spark project: SPARK-14820 -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-SQL-Reduce-Shuffle-Data-by-pushing-filter-toward-storage-tp17297p17306.html Sent from the Apache Spark Developers List mailing list archive at

Re: [Spark-SQL] Reduce Shuffle Data by pushing filter toward storage

2016-04-21 Thread Ted Yu
Interesting analysis. Can you log a JIRA ? > On Apr 21, 2016, at 11:07 AM, atootoonchian wrote: > > SQL query planner can have intelligence to push down filter commands towards > the storage layer. If we optimize the query planner such that the IO to the > storage is reduced

Re: [Spark-SQL] Reduce Shuffle Data by pushing filter toward storage

2016-04-21 Thread atootoonchian
Hi Marcin I attached a pdf format of issue. Reduce_Shuffle_Data_by_pushing_filter_toward_storage.pdf -- View this message in context:

Re: [Spark-SQL] Reduce Shuffle Data by pushing filter toward storage

2016-04-21 Thread Marcin Tustin
I think that's an important result. Could you format your email to split out your parts a little more? It all runs together for me in gmail, so it's hard to follow, and I very much would like to. On Thu, Apr 21, 2016 at 2:07 PM, atootoonchian wrote: > SQL query planner can have