[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1

2016-04-11 Thread JESSE CHEN (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235635#comment-15235635
 ] 

JESSE CHEN commented on SPARK-13307:


Performance back on track on Spark 2.0. Closing this.

> TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
> -
>
> Key: SPARK-13307
> URL: https://issues.apache.org/jira/browse/SPARK-13307
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: JESSE CHEN
>
> Majority of the TPCDS queries ran faster in 1.6.0 than in 1.4.1, average 
> about 9% faster. There are a few degraded, and one that is definitely not 
> within error margin is query 66.
> Query 66 in 1.4.1: 699 seconds
> Query 66 in 1.6.0: 918 seconds
> 30% worse.
> Collected the physical plans from both versions - drastic difference maybe 
> partially from using Tungsten in 1.6, but anything else at play here?
> Please see plans here:
> https://ibm.box.com/spark-sql-q66-debug-160plan
> https://ibm.box.com/spark-sql-q66-debug-141plan



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1

2016-03-01 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174501#comment-15174501
 ] 

Xiao Li commented on SPARK-13307:
-

Really thank you for your response. 

> TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
> -
>
> Key: SPARK-13307
> URL: https://issues.apache.org/jira/browse/SPARK-13307
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: JESSE CHEN
>
> Majority of the TPCDS queries ran faster in 1.6.0 than in 1.4.1, average 
> about 9% faster. There are a few degraded, and one that is definitely not 
> within error margin is query 66.
> Query 66 in 1.4.1: 699 seconds
> Query 66 in 1.6.0: 918 seconds
> 30% worse.
> Collected the physical plans from both versions - drastic difference maybe 
> partially from using Tungsten in 1.6, but anything else at play here?
> Please see plans here:
> https://ibm.box.com/spark-sql-q66-debug-160plan
> https://ibm.box.com/spark-sql-q66-debug-141plan



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1

2016-03-01 Thread Davies Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174473#comment-15174473
 ] 

Davies Liu commented on SPARK-13307:


In the plan, I saw that the column pruning does not work well, it's fixed 
recently in master. I ran it locally with scale factor 10, it took 35 seconds 
(with 1 CPUs), all joins are BroadcastHashJoin.

Could you checkout master, it should be faster than 1.4

It's true that SortMergeJoin could be slower than ShuffleHashJoin, we may 
revisit that later.

> TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
> -
>
> Key: SPARK-13307
> URL: https://issues.apache.org/jira/browse/SPARK-13307
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: JESSE CHEN
>
> Majority of the TPCDS queries ran faster in 1.6.0 than in 1.4.1, average 
> about 9% faster. There are a few degraded, and one that is definitely not 
> within error margin is query 66.
> Query 66 in 1.4.1: 699 seconds
> Query 66 in 1.6.0: 918 seconds
> 30% worse.
> Collected the physical plans from both versions - drastic difference maybe 
> partially from using Tungsten in 1.6, but anything else at play here?
> Please see plans here:
> https://ibm.box.com/spark-sql-q66-debug-160plan
> https://ibm.box.com/spark-sql-q66-debug-141plan



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1

2016-02-23 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160260#comment-15160260
 ] 

Xiao Li commented on SPARK-13307:
-

You need to check the plan and check the join type it is using. I guess the 
threshold 100 MB is still too low. 

BTW, 1.6 build also has a statistics-related issue. Recently, [~davies] just 
delivered a fix to resolve this problem.  

> TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
> -
>
> Key: SPARK-13307
> URL: https://issues.apache.org/jira/browse/SPARK-13307
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: JESSE CHEN
>
> Majority of the TPCDS queries ran faster in 1.6.0 than in 1.4.1, average 
> about 9% faster. There are a few degraded, and one that is definitely not 
> within error margin is query 66.
> Query 66 in 1.4.1: 699 seconds
> Query 66 in 1.6.0: 918 seconds
> 30% worse.
> Collected the physical plans from both versions - drastic difference maybe 
> partially from using Tungsten in 1.6, but anything else at play here?
> Please see plans here:
> https://ibm.box.com/spark-sql-q66-debug-160plan
> https://ibm.box.com/spark-sql-q66-debug-141plan



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1

2016-02-23 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160256#comment-15160256
 ] 

Xiao Li commented on SPARK-13307:
-

First, I am not sure if usage of broadcastjoin makes sense in this query, 
especially when your table size is huge. 

Second, are your queries written in SQL? or DataFrame APIs? Spark SQL does not 
provide broadcast hint for SQL users. If using DataFrame API, you can do 
something like 
{code}
df1.join(broadcast(df2), $"key" === $"key2", "leftsemi")
{code}

Third, I think performance regression is expected for avoiding OOM. 
SortMergeJoin is consuming less memory than ShuffleHashJoin. Thus, it might 
make more sense to choose SortMergeJoin as a default join type. 

> TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
> -
>
> Key: SPARK-13307
> URL: https://issues.apache.org/jira/browse/SPARK-13307
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: JESSE CHEN
>
> Majority of the TPCDS queries ran faster in 1.6.0 than in 1.4.1, average 
> about 9% faster. There are a few degraded, and one that is definitely not 
> within error margin is query 66.
> Query 66 in 1.4.1: 699 seconds
> Query 66 in 1.6.0: 918 seconds
> 30% worse.
> Collected the physical plans from both versions - drastic difference maybe 
> partially from using Tungsten in 1.6, but anything else at play here?
> Please see plans here:
> https://ibm.box.com/spark-sql-q66-debug-160plan
> https://ibm.box.com/spark-sql-q66-debug-141plan



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1

2016-02-23 Thread JESSE CHEN (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159636#comment-15159636
 ] 

JESSE CHEN commented on SPARK-13307:


I tuned up the autoBroadcastJoinThreshold to 100MB, made no difference:
spark.sql.autoBroadcastJoinThreshold 104857600
query time is 913 seconds.

Now, how would you change the query to add the [broadcast] hint? 

I will try it if you provide me the modified query.

Thanks.


> TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
> -
>
> Key: SPARK-13307
> URL: https://issues.apache.org/jira/browse/SPARK-13307
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: JESSE CHEN
>
> Majority of the TPCDS queries ran faster in 1.6.0 than in 1.4.1, average 
> about 9% faster. There are a few degraded, and one that is definitely not 
> within error margin is query 66.
> Query 66 in 1.4.1: 699 seconds
> Query 66 in 1.6.0: 918 seconds
> 30% worse.
> Collected the physical plans from both versions - drastic difference maybe 
> partially from using Tungsten in 1.6, but anything else at play here?
> Please see plans here:
> https://ibm.box.com/spark-sql-q66-debug-160plan
> https://ibm.box.com/spark-sql-q66-debug-141plan



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1

2016-02-14 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146765#comment-15146765
 ] 

Xiao Li commented on SPARK-13307:
-

In the following PR: https://github.com/apache/spark/pull/9645, shuffle hash 
join is removed from Spark SQL. Try to see if broadcast join works in this test 
case. You also can use hint to force the broadcast join. 

Let me CC [~rxin] [~yhuai] [~marmbrus]

> TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
> -
>
> Key: SPARK-13307
> URL: https://issues.apache.org/jira/browse/SPARK-13307
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: JESSE CHEN
>
> Majority of the TPCDS queries ran faster in 1.6.0 than in 1.4.1, average 
> about 9% faster. There are a few degraded, and one that is definitely not 
> within error margin is query 66.
> Query 66 in 1.4.1: 699 seconds
> Query 66 in 1.6.0: 918 seconds
> 30% worse.
> Collected the physical plans from both versions - drastic difference maybe 
> partially from using Tungsten in 1.6, but anything else at play here?
> Please see plans here:
> https://ibm.box.com/spark-sql-q66-debug-160plan
> https://ibm.box.com/spark-sql-q66-debug-141plan



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1

2016-02-14 Thread JESSE CHEN (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146740#comment-15146740
 ] 

JESSE CHEN commented on SPARK-13307:


Uploaded newly collected plans (logical, analyzed, optimized and physical). 

Links are the same:

https://ibm.box.com/spark-sql-q66-debug-160plan
https://ibm.box.com/spark-sql-q66-debug-141plan

Please let me know any additional info you need to collect.
Thanks.


> TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
> -
>
> Key: SPARK-13307
> URL: https://issues.apache.org/jira/browse/SPARK-13307
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: JESSE CHEN
>
> Majority of the TPCDS queries ran faster in 1.6.0 than in 1.4.1, average 
> about 9% faster. There are a few degraded, and one that is definitely not 
> within error margin is query 66.
> Query 66 in 1.4.1: 699 seconds
> Query 66 in 1.6.0: 918 seconds
> 30% worse.
> Collected the physical plans from both versions - drastic difference maybe 
> partially from using Tungsten in 1.6, but anything else at play here?
> Please see plans here:
> https://ibm.box.com/spark-sql-q66-debug-160plan
> https://ibm.box.com/spark-sql-q66-debug-141plan



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1

2016-02-14 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146755#comment-15146755
 ] 

Xiao Li commented on SPARK-13307:
-

Please tune “spark.sql.autoBroadcastJoinThreshold” to enable the broadcast 
Join. Thanks!

> TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
> -
>
> Key: SPARK-13307
> URL: https://issues.apache.org/jira/browse/SPARK-13307
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: JESSE CHEN
>
> Majority of the TPCDS queries ran faster in 1.6.0 than in 1.4.1, average 
> about 9% faster. There are a few degraded, and one that is definitely not 
> within error margin is query 66.
> Query 66 in 1.4.1: 699 seconds
> Query 66 in 1.6.0: 918 seconds
> 30% worse.
> Collected the physical plans from both versions - drastic difference maybe 
> partially from using Tungsten in 1.6, but anything else at play here?
> Please see plans here:
> https://ibm.box.com/spark-sql-q66-debug-160plan
> https://ibm.box.com/spark-sql-q66-debug-141plan



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1

2016-02-14 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146751#comment-15146751
 ] 

Xiao Li commented on SPARK-13307:
-

1.6.1 is using SortMergeJoin, but 1.4.1 is using ShuffleHashJoin. I believe 
this is the major cause of the performance difference. 

> TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
> -
>
> Key: SPARK-13307
> URL: https://issues.apache.org/jira/browse/SPARK-13307
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: JESSE CHEN
>
> Majority of the TPCDS queries ran faster in 1.6.0 than in 1.4.1, average 
> about 9% faster. There are a few degraded, and one that is definitely not 
> within error margin is query 66.
> Query 66 in 1.4.1: 699 seconds
> Query 66 in 1.6.0: 918 seconds
> 30% worse.
> Collected the physical plans from both versions - drastic difference maybe 
> partially from using Tungsten in 1.6, but anything else at play here?
> Please see plans here:
> https://ibm.box.com/spark-sql-q66-debug-160plan
> https://ibm.box.com/spark-sql-q66-debug-141plan



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1

2016-02-13 Thread JESSE CHEN (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146138#comment-15146138
 ] 

JESSE CHEN commented on SPARK-13307:


Have you looked at the plans? I provided at above links.

> TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
> -
>
> Key: SPARK-13307
> URL: https://issues.apache.org/jira/browse/SPARK-13307
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: JESSE CHEN
>
> Majority of the TPCDS queries ran faster in 1.6.0 than in 1.4.1, average 
> about 9% faster. There are a few degraded, and one that is definitely not 
> within error margin is query 66.
> Query 66 in 1.4.1: 699 seconds
> Query 66 in 1.6.0: 918 seconds
> 30% worse.
> Collected the physical plans from both versions - drastic difference maybe 
> partially from using Tungsten in 1.6, but anything else at play here?
> Please see plans here:
> https://ibm.box.com/spark-sql-q66-debug-160plan
> https://ibm.box.com/spark-sql-q66-debug-141plan



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1

2016-02-13 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146140#comment-15146140
 ] 

Xiao Li commented on SPARK-13307:
-

Could you provided logical plans, as suggested above? The attached only 
contains the physical plans. Thanks!

> TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
> -
>
> Key: SPARK-13307
> URL: https://issues.apache.org/jira/browse/SPARK-13307
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: JESSE CHEN
>
> Majority of the TPCDS queries ran faster in 1.6.0 than in 1.4.1, average 
> about 9% faster. There are a few degraded, and one that is definitely not 
> within error margin is query 66.
> Query 66 in 1.4.1: 699 seconds
> Query 66 in 1.6.0: 918 seconds
> 30% worse.
> Collected the physical plans from both versions - drastic difference maybe 
> partially from using Tungsten in 1.6, but anything else at play here?
> Please see plans here:
> https://ibm.box.com/spark-sql-q66-debug-160plan
> https://ibm.box.com/spark-sql-q66-debug-141plan



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13307) TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1

2016-02-12 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15145829#comment-15145829
 ] 

Xiao Li commented on SPARK-13307:
-

Please use explain(true). It will be much easier to analyze when reading the 
logical plans. Thanks!

> TPCDS query 66 degraded by 30% in 1.6.0 compared to 1.4.1
> -
>
> Key: SPARK-13307
> URL: https://issues.apache.org/jira/browse/SPARK-13307
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: JESSE CHEN
>
> Majority of the TPCDS queries ran faster in 1.6.0 than in 1.4.1, average 
> about 9% faster. There are a few degraded, and one that is definitely not 
> within error margin is query 66.
> Query 66 in 1.4.1: 699 seconds
> Query 66 in 1.6.0: 918 seconds
> 30% worse.
> Collected the physical plans from both versions - drastic difference maybe 
> partially from using Tungsten in 1.6, but anything else at play here?
> Please see plans here:
> https://ibm.box.com/spark-sql-q66-debug-160plan
> https://ibm.box.com/spark-sql-q66-debug-141plan



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org