[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-08-24 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r212704387 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala --- @@ -348,30 +350,30 @@ abstract class SparkPlan extends

[GitHub] spark issue #22219: [SPARK-25224][SQL] Improvement of Spark SQL ThriftServer...

2018-08-24 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue: https://github.com/apache/spark/pull/22219 Yes, I verified the results of a variety of queries, as well as memory usage and performance. This patch passed all our query tests, and there was no performance degradation in our test c

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-08-24 Thread Dooyoung-Hwang
GitHub user Dooyoung-Hwang opened a pull request: https://github.com/apache/spark/pull/22219 [SPARK-25224][SQL] Improvement of Spark SQL ThriftServer memory management ## What changes were proposed in this pull request? Spark SQL only has two options for managing

[GitHub] spark issue #22219: [SPARK-25224][SQL] Improvement of Spark SQL ThriftServer...

2018-08-28 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue: https://github.com/apache/spark/pull/22219 Changed the access modifier of collectCountAndIterator to private[sql], and updated the doc of the feature that I define in ThriftServer

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-08-29 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r213597686 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala --- @@ -329,49 +329,52 @@ abstract class SparkPlan extends

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-08-29 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r213597102 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala --- @@ -289,6 +289,14

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-08-27 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r212877431 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3237,6 +3237,20 @@ class Dataset[T] private[sql

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-08-26 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r212855096 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3237,6 +3237,20 @@ class Dataset[T] private[sql

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-08-26 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r212862913 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala --- @@ -289,6 +289,14

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-08-26 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r212866466 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala --- @@ -329,49 +329,52 @@ abstract class SparkPlan extends

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-08-28 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r213200433 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3237,6 +3237,28 @@ class Dataset[T] private[sql

[GitHub] spark issue #22219: [SPARK-25224][SQL] Improvement of Spark SQL ThriftServer...

2018-08-28 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue: https://github.com/apache/spark/pull/22219 Add test cases. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-08-29 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r213682460 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala --- @@ -289,6 +289,14

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-09-04 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r215122865 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3237,6 +3238,28 @@ class Dataset[T] private[sql

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-09-04 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r214908763 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3237,6 +3238,28 @@ class Dataset[T] private[sql

[GitHub] spark pull request #22347: [SPARK-25353][SQL] Refactoring executeTake(n: Int...

2018-09-06 Thread Dooyoung-Hwang
GitHub user Dooyoung-Hwang opened a pull request: https://github.com/apache/spark/pull/22347 [SPARK-25353][SQL] Refactoring executeTake(n: Int) in SparkPlan ## What changes were proposed in this pull request? In some cases, executeTake in SparkPlan could deserialize more than

[GitHub] spark issue #22347: [SPARK-25353][SQL] Refactoring executeTake(n: Int) in Sp...

2018-09-06 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue: https://github.com/apache/spark/pull/22347 cc: @maropu @viirya @kiszk @HyukjinKwon This PR was split out from https://github.com/apache/spark/pull/22219 at @maropu's suggestion

[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-07 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue: https://github.com/apache/spark/pull/22347 @kiszk It is impossible to count decoded rows without modifying SparkPlan, because there is no way to count how many rows were iterated. Instead, I can simulate this patch in a Scala worksheet
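
The kind of worksheet simulation mentioned here might look like the following. This is only a minimal sketch, not the code from the PR: `DecodeCountSketch`, `decode`, `takeEager`, and `takeLazy` are hypothetical names, plain byte arrays stand in for Spark's serialized rows, and nested `Seq`s stand in for partitions. It contrasts decoding everything before truncating with decoding through an iterator, and counts how many rows each approach decodes.

```scala
// Hypothetical worksheet-style simulation; none of this is Spark code.
object DecodeCountSketch {
  // Number of rows decoded so far (reset before each measurement).
  var decoded: Int = 0

  // Stand-in for row deserialization; counts every call.
  def decode(bytes: Array[Byte]): String = {
    decoded += 1
    new String(bytes, "UTF-8")
  }

  // Eager variant: decode every row of every partition, then keep n rows.
  def takeEager(partitions: Seq[Seq[Array[Byte]]], n: Int): Seq[String] =
    partitions.flatMap(_.map(decode)).take(n)

  // Lazy variant: decode through an iterator, so decoding stops
  // as soon as n rows have been produced.
  def takeLazy(partitions: Seq[Seq[Array[Byte]]], n: Int): Seq[String] =
    partitions.iterator.flatMap(_.iterator.map(decode)).take(n).toList
}
```

Resetting `DecodeCountSketch.decoded` to 0 before each call and reading it afterwards shows the difference: the eager variant decodes every input row, while the lazy variant decodes only as many rows as it returns.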

[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-10 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue: https://github.com/apache/spark/pull/22347 I tested on my local PC (3.3 GHz Intel Core i5), selecting 400,000 rows 25 times. I measured the total execution time across the decodeUnsafeRows calls. My test data is skewed, so gathered

[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-09-10 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue: https://github.com/apache/spark/pull/22347 Jenkins, retest this please

[GitHub] spark issue #22219: [SPARK-25224][SQL] Improvement of Spark SQL ThriftServer...

2018-10-18 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue: https://github.com/apache/spark/pull/22219 I refactored the collectionResultAsSeqView function using an implicit class.
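
For readers unfamiliar with the pattern, an implicit class lets a method such as a "result as view" helper be called directly on an existing type. The sketch below illustrates only the general Scala technique; the signature of `collectionResultAsSeqView` in the PR is not shown in this snippet, so `SeqViewSketch`, `EncodedRowsOps`, and `asDecodedView` are hypothetical names, and byte arrays stand in for encoded rows.

```scala
// Hypothetical illustration of the implicit-class pattern; not the PR's code.
object SeqViewSketch {
  // Adds an extension method to any Seq of encoded rows.
  implicit class EncodedRowsOps(rows: Seq[Array[Byte]]) {
    // Returns a lazy view: `decode` runs only when an element is consumed,
    // so no decoded copy of the whole collection is built up front.
    def asDecodedView(decode: Array[Byte] => String): Iterable[String] =
      rows.view.map(decode)
  }
}
```

After `import SeqViewSketch._`, a call such as `rows.asDecodedView(b => new String(b, "UTF-8"))` returns immediately without decoding anything; the rows are decoded only as the result is traversed.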

[GitHub] spark issue #22219: [SPARK-25224][SQL] Improvement of Spark SQL ThriftServer...

2018-10-16 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue: https://github.com/apache/spark/pull/22219 Dear reviewers (cc: @dongjoon-hyun), I updated the following. 1. No behavior changes if the new config is off. So, [PR SPARK-25353](https://github.com/apache/spark/pull/22347

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-10-16 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r225500776 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala --- @@ -120,10 +120,11

[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-02 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue: https://github.com/apache/spark/pull/22347 Thank you for the review. Yes, ThriftServer will use the intermediate "collection view" in this PR. And [Original PR of ThriftServer](https://github.com/apache/spark/

[GitHub] spark pull request #22347: [SPARK-25353][SQL] executeTake in SparkPlan is mo...

2018-10-04 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22347#discussion_r222623133 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala --- @@ -348,30 +349,30 @@ abstract class SparkPlan extends

[GitHub] spark pull request #22347: [SPARK-25353][SQL] executeTake in SparkPlan is mo...

2018-10-04 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22347#discussion_r222622501 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala --- @@ -348,30 +349,30 @@ abstract class SparkPlan extends

[GitHub] spark issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified ...

2018-10-04 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue: https://github.com/apache/spark/pull/22347 I added example code for the problem case to the PR description.

[GitHub] spark issue #22219: [SPARK-25224][SQL] Improvement of Spark SQL ThriftServer...

2018-08-30 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on the issue: https://github.com/apache/spark/pull/22219 @kiszk @viirya @HyukjinKwon @cloud-fan Could you review this patch?

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-09-04 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r214819341 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3237,6 +3238,28 @@ class Dataset[T] private[sql

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-09-04 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r214818164 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala --- @@ -329,17 +337,26 @@ abstract class SparkPlan extends

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-09-04 Thread Dooyoung-Hwang
Github user Dooyoung-Hwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r214818478 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala --- @@ -120,10 +120,11