[jira] [Commented] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-25 Thread Takeshi Yamamuro (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453258#comment-16453258
 ] 

Takeshi Yamamuro commented on SPARK-24070:
--

Sure, I checked the numbers; see:
[https://docs.google.com/spreadsheets/d/18EvWZYDqlC_93DI_JaSKSO115OwZ9BfX7Zs5SnU21fA/edit?usp=sharing]
I found no regression, at least in TPC-DS.
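
As a minimal sketch of how per-query timings from two builds could be checked for regressions: the snippet below flags any query whose runtime on the candidate build exceeds the baseline by more than a tolerance. The function name, the 10% tolerance, the query names, and all timings are placeholders invented for illustration; they are not values from the spreadsheet above.

```python
def find_regressions(baseline_ms, candidate_ms, tolerance=0.10):
    """Return {query: slowdown_ratio} for queries slower than baseline by more than `tolerance`."""
    regressions = {}
    for query, base in baseline_ms.items():
        # Skip queries missing from the candidate run or with unusable baselines.
        if query not in candidate_ms or base <= 0:
            continue
        ratio = candidate_ms[query] / base
        if ratio > 1.0 + tolerance:
            regressions[query] = ratio
    return regressions


# Hypothetical timings in milliseconds (NOT measured results).
baseline_ms = {"q1": 1000.0, "q2": 2000.0, "q3": 1500.0}
candidate_ms = {"q1": 1020.0, "q2": 1900.0, "q3": 1800.0}

slower = find_regressions(baseline_ms, candidate_ms)
for query, ratio in sorted(slower.items()):
    print(f"{query}: {ratio:.2f}x of baseline")
```

An empty result would correspond to "no regression" under the chosen tolerance; what tolerance is appropriate depends on run-to-run noise, which this sketch does not estimate.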

> TPC-DS Performance Tests for Parquet 1.10.0 Upgrade
> ---
>
> Key: SPARK-24070
> URL: https://issues.apache.org/jira/browse/SPARK-24070
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Xiao Li
>Assignee: Takeshi Yamamuro
>Priority: Major
>
> TPC-DS performance evaluation of Apache Spark Parquet 1.8.2 and 1.10.0. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-25 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452527#comment-16452527
 ] 

Xiao Li commented on SPARK-24070:
-

[~mswit] Thank you for your suggestions! This is very helpful!






[jira] [Commented] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-25 Thread Michał Świtakowski (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452096#comment-16452096
 ] 

Michał Świtakowski commented on SPARK-24070:


[~maropu] I think you can just use the existing benchmark: 
[https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetReadBenchmark.scala]
 

Just make sure that you have a few GB of free memory. If the files are read 
from the OS buffer cache, we will be able to see any performance differences more clearly.
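
To illustrate the buffer-cache point, here is a rough sketch (not part of ParquetReadBenchmark) that reads a file once as a warm-up, so the OS has a chance to cache it, and only then times repeated reads. The helper names, the 4 MB temporary file, and the run counts are assumptions made for this example; whether the file actually stays cached depends on available free memory, which the sketch does not verify.

```python
import os
import tempfile
import time


def read_all(path, chunk_size=1 << 20):
    """Read the whole file in chunks and return the number of bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def timed_reads(path, warmup=1, runs=3):
    """Do untimed warm-up reads first, then return the durations of `runs` timed reads."""
    for _ in range(warmup):
        read_all(path)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        read_all(path)
        timings.append(time.perf_counter() - start)
    return timings


# Stand-in data file; a real run would point at the benchmark's data files instead.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(4 * 1024 * 1024))
    data_path = tmp.name

try:
    timings = timed_reads(data_path)
    print("best of", len(timings), "warmed-up reads:", min(timings), "seconds")
finally:
    os.remove(data_path)
```

Discarding the warm-up run keeps first-touch disk I/O out of the reported numbers, so the remaining differences are more likely to reflect the reader itself.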




[jira] [Commented] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-24 Thread Takeshi Yamamuro (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451466#comment-16451466
 ] 

Takeshi Yamamuro commented on SPARK-24070:
--

ok







[jira] [Commented] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-24 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451427#comment-16451427
 ] 

Xiao Li commented on SPARK-24070:
-

Yeah, please do it here. Thanks! If you have the bandwidth to write the 
microbenchmark suite, that should go in a separate PR. 







[jira] [Commented] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-24 Thread Takeshi Yamamuro (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451404#comment-16451404
 ] 

Takeshi Yamamuro commented on SPARK-24070:
--

OK, so does this ticket mean we will put the performance results here instead of in the PR?







[jira] [Commented] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-24 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450243#comment-16450243
 ] 

Xiao Li commented on SPARK-24070:
-

cc [~maropu]



