Re: [SPARK-SQL] Window Functions optimization

Yin Huai Mon, 13 Jul 2015 14:24:13 -0700

Your query will be partitioned once. Then, a single Window operator will
evaluate these three functions. As mentioned by Harish, you can take a look
at the plan (sql("your sql...").explain()).
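
For anyone who wants to try this locally, here is a minimal sketch in Scala.
It assumes a Spark 2.x or later build with SparkSession on the classpath
(the thread predates that API). The object name WindowPlanCheck, the view
name t, and the sample rows are hypothetical and only there for
illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object WindowPlanCheck {
  // Builds the query from the original question against a tiny in-memory table.
  // The view name "t" and the sample rows are made up for this sketch.
  def windowQuery(spark: SparkSession): DataFrame = {
    import spark.implicits._

    Seq(("a", 1, 10, 100), ("a", 2, 20, 200), ("b", 3, 30, 300))
      .toDF("key", "value1", "value2", "value3")
      .createOrReplaceTempView("t")

    spark.sql(
      """select key,
        |       max(value1) over (partition by key) as m1,
        |       max(value2) over (partition by key) as m2,
        |       max(value3) over (partition by key) as m3
        |from t""".stripMargin)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("window-plan-check")
      .master("local[*]")
      .getOrCreate()

    val df = windowQuery(spark)

    // Prints the parsed, analyzed, optimized and physical plans.
    df.explain(true)

    // The optimized logical plan on its own.
    println(df.queryExecution.optimizedPlan)

    spark.stop()
  }
}
```

If the behaviour described above holds for your version, the physical plan
should contain one Window operator sitting on top of one exchange on key,
rather than three of each. The exact plan text varies between Spark
releases, so read the output rather than matching on it.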


On Mon, Jul 13, 2015 at 12:26 PM, Harish Butani <rhbutani.sp...@gmail.com>
wrote:

> Just once.
> You can see this by printing the optimized logical plan.
> You will see just one repartition operation.
>
> So do:
> val df = sql("your sql...")
> println(df.queryExecution.optimizedPlan)
>
> On Mon, Jul 13, 2015 at 6:37 AM, Hao Ren <inv...@gmail.com> wrote:
>
>> Hi,
>>
>> I would like to know: has any optimization been done for window
>> functions in Spark SQL?
>>
>> For example:
>>
>> select key,
>> max(value1) over(partition by key) as m1,
>> max(value2) over(partition by key) as m2,
>> max(value3) over(partition by key) as m3
>> from table
>>
>> The query above creates 3 fields based on the same partition rule.
>>
>> The question is:
>> Will spark-sql partition the table 3 times in the same way to get the
>> three max values, or will it partition just once if it finds that the
>> partition rule is the same?
>>
>> It would be nice if someone could point me to the relevant lines of code.
>>
>> Thank you.
>> Hao
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/SPARK-SQL-Window-Functions-optimization-tp23796.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>
