[GitHub] spark pull request #19803: [SPARK-22596][SQL] set ctx.currentVars in Codegen...

2017-11-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19803#discussion_r152881786 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/BoundAttribute.scala --- @@ -60,20 +60,23 @@ case class

[GitHub] spark pull request #19807: [SPARK-22495] Fix setup of SPARK_HOME variable on...

2017-11-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19807#discussion_r152889444 --- Diff: bin/find-spark-home.cmd --- @@ -0,0 +1,60 @@ +@echo off + +rem +rem Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request #19808: [SPARK-22597][SQL] Add spark-sql cmd script for W...

2017-11-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19808#discussion_r152891405 --- Diff: bin/spark-sql.cmd --- @@ -0,0 +1,25 @@ +@echo off + +rem +rem Licensed to the Apache Software Foundation (ASF) under one or

[GitHub] spark pull request #19808: [SPARK-22597][SQL] Add spark-sql cmd script for W...

2017-11-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19808#discussion_r152891440 --- Diff: bin/find-spark-home.cmd --- @@ -32,7 +32,7 @@ if not "x%PYSPARK_PYTHON%"=="x" ( ) rem If there is python installed, trying to

[GitHub] spark issue #19802: [WIP][SPARK-22594][CORE] Handling spark-submit and maste...

2017-11-23 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19802 Can you please explain more, including how to reproduce this issue? Spark's RPC is not designed to be compatible across versions. --- - To

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152911829 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "array in the

[GitHub] spark pull request #19803: [SPARK-22596][SQL] set ctx.currentVars in Codegen...

2017-11-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19803#discussion_r152881282 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/BoundAttribute.scala --- @@ -60,20 +60,23 @@ case class

[GitHub] spark issue #19800: [SPARK-22591][SQL] GenerateOrdering shouldn't change Cod...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19800 **[Test build #84144 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84144/testReport)** for PR 19800 at commit

[GitHub] spark issue #19808: [SPARK-22597][SQL] Add spark-sql cmd script for Windows ...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19808 **[Test build #84147 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84147/testReport)** for PR 19808 at commit

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152896399 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "array in the

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #84148 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84148/testReport)** for PR 19788 at commit

[GitHub] spark pull request #19803: [SPARK-22596][SQL] set ctx.currentVars in Codegen...

2017-11-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19803#discussion_r152883588 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala --- @@ -355,19 +355,12 @@ case class FileSourceScanExec(

[GitHub] spark issue #19803: [SPARK-22596][SQL] set ctx.currentVars in CodegenSupport...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19803 **[Test build #84146 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84146/testReport)** for PR 19803 at commit

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152888380 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -472,15 +475,66 @@ private[spark] class MapOutputTrackerMaster(

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152888257 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -472,15 +475,66 @@ private[spark] class MapOutputTrackerMaster(

[GitHub] spark issue #19808: [SPARK-22597][SQL] Add spark-sql cmd script for Windows ...

2017-11-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19808 cc @cloud-fan, @felixcheung, @jsnowacki and @srowen, who I think are probably interested in this. --- - To

[GitHub] spark pull request #19808: [SPARK-22597][SQL] Add spark-sql cmd script for W...

2017-11-23 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/19808 [SPARK-22597][SQL] Add spark-sql cmd script for Windows users ## What changes were proposed in this pull request? This PR proposes to add cmd scripts so that Windows users can also run

[GitHub] spark issue #19775: [SPARK-22343][core] Add support for publishing Spark met...

2017-11-23 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19775 Do we have to put this in Spark? Is it a necessary part of k8s? I think if we pull in that PR (https://github.com/apache/spark/pull/11994), then this can stay out of Spark as a package. Even

[GitHub] spark pull request #19810: Partition level pruning 2

2017-11-23 Thread CodingCat
GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/19810 Partition level pruning 2 ## What changes were proposed in this pull request? In the current implementation of Spark, InMemoryTableExec reads all data in a cached table, filter

[GitHub] spark issue #19810: [SPARK-22599][SQL] In-Memory Table Pruning without Extra...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19810 **[Test build #84152 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84152/testReport)** for PR 19810 at commit

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152911325 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "array in the

[GitHub] spark issue #19809: [SPARK-17920][SQL] [FOLLOWUP] Backport PR 19779 to branc...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19809 **[Test build #84150 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84150/testReport)** for PR 19809 at commit

[GitHub] spark issue #19809: [SPARK-17920][SQL] [FOLLOWUP] Backport PR 19779 to branc...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19809 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19809: [SPARK-17920][SQL] [FOLLOWUP] Backport PR 19779 to branc...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19809 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84150/ Test PASSed. ---

[GitHub] spark issue #19803: [SPARK-22596][SQL] set ctx.currentVars in CodegenSupport...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19803 **[Test build #84145 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84145/testReport)** for PR 19803 at commit

[GitHub] spark pull request #19806: [SPARK-22595][SQL] fix flaky test: CastSuite.SPAR...

2017-11-23 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19806#discussion_r152886393 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala --- @@ -829,7 +829,7 @@ class CastSuite extends

[GitHub] spark issue #19809: [SPARK-17920][SQL] [FOLLOWUP] Backport PR 19779 to branc...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19809 **[Test build #84150 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84150/testReport)** for PR 19809 at commit

[GitHub] spark issue #19809: [SPARK-17920][SQL] [FOLLOWUP] Backport PR 19779 to branc...

2017-11-23 Thread vinodkc
Github user vinodkc commented on the issue: https://github.com/apache/spark/pull/19809 ping @cloud-fan

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152906960 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -472,15 +475,66 @@ private[spark] class MapOutputTrackerMaster(

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152907079 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "array in the

[GitHub] spark issue #19518: [SPARK-18016][SQL][CATALYST] Code Generation: Constant P...

2017-11-23 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19518 @mgaido91 Thank you for your questions. 1. I am using `javac` as shown. I am sorry, but I do not understand what you are pointing out. In this benchmark, what are the differences between `javac` and

[GitHub] spark pull request #19788: [SPARK-9853][Core] Optimize shuffle fetch of cont...

2017-11-23 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19788#discussion_r152891792 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -812,10 +812,13 @@ private[spark] object MapOutputTracker extends Logging {

[GitHub] spark pull request #19788: [SPARK-9853][Core] Optimize shuffle fetch of cont...

2017-11-23 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19788#discussion_r152891172 --- Diff: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala --- @@ -196,12 +196,14 @@ private[spark] class

[GitHub] spark pull request #19788: [SPARK-9853][Core] Optimize shuffle fetch of cont...

2017-11-23 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19788#discussion_r152891438 --- Diff: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala --- @@ -196,12 +196,14 @@ private[spark] class

[GitHub] spark pull request #19788: [SPARK-9853][Core] Optimize shuffle fetch of cont...

2017-11-23 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19788#discussion_r152891920 --- Diff: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala --- @@ -196,12 +196,14 @@ private[spark] class

[GitHub] spark pull request #19789: [SPARK-22562][Streaming] CachedKafkaConsumer unsa...

2017-11-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19789#discussion_r152895550 --- Diff: external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaRDD.scala --- @@ -211,8 +211,8 @@ private[spark] class

[GitHub] spark issue #19518: [SPARK-18016][SQL][CATALYST] Code Generation: Constant P...

2017-11-23 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19518 I like the latest @kiszk hybrid idea in terms of performance and readability. Also, this is a corner case, so I don't want to affect most regular small queries. ---

[GitHub] spark issue #19518: [SPARK-18016][SQL][CATALYST] Code Generation: Constant P...

2017-11-23 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19518 I created and ran another synthetic benchmark program for comparing flat global variables, inner global variables, and array. In summary, the following are the performance results (**small number is

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152896879 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -472,15 +475,66 @@ private[spark] class MapOutputTrackerMaster(

[GitHub] spark pull request #19809: [SPARK-17920][SQL] [FOLLOWUP] Backport PR 19779 t...

2017-11-23 Thread vinodkc
GitHub user vinodkc opened a pull request: https://github.com/apache/spark/pull/19809 [SPARK-17920][SQL] [FOLLOWUP] Backport PR 19779 to branch-2.2 ## What changes were proposed in this pull request? A followup of > https://github.com/apache/spark/pull/19795

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152907606 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "array in the

[GitHub] spark issue #19518: [SPARK-18016][SQL][CATALYST] Code Generation: Constant P...

2017-11-23 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19518 @kiszk I meant that `janinoc` creates a slightly different constant pool from `javac`. I am not sure about performance, but the number of constant pool entries is definitely different. For

[GitHub] spark pull request #19803: [SPARK-22596][SQL] set ctx.currentVars in Codegen...

2017-11-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19803#discussion_r152883562 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala --- @@ -56,9 +56,7 @@ case class ProjectExec(projectList:

[GitHub] spark issue #19518: [SPARK-18016][SQL][CATALYST] Code Generation: Constant P...

2017-11-23 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19518 Based on the performance results and the usage of constant pool entries, I would like to use a hybrid approach with flat global variables and an array. For example, the first 500 variables are stored into flat

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-11-23 Thread tengpeng
Github user tengpeng commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r152891005 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala --- @@ -108,26 +164,53 @@ final class Bucketizer @Since("1.4.0")

[GitHub] spark issue #19803: [SPARK-22596][SQL] set ctx.currentVars in CodegenSupport...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19803 Merged build finished. Test PASSed.

[GitHub] spark issue #19803: [SPARK-22596][SQL] set ctx.currentVars in CodegenSupport...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19803 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84146/ Test PASSed. ---

[GitHub] spark issue #19803: [SPARK-22596][SQL] set ctx.currentVars in CodegenSupport...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19803 **[Test build #84146 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84146/testReport)** for PR 19803 at commit

[GitHub] spark issue #19808: [SPARK-22597][SQL] Add spark-sql cmd script for Windows ...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19808 **[Test build #84149 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84149/testReport)** for PR 19808 at commit

[GitHub] spark issue #19810: [SQL][SPARK-22599] In-Memory Table Pruning without Extra...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19810 **[Test build #84151 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84151/testReport)** for PR 19810 at commit

[GitHub] spark issue #19800: [SPARK-22591][SQL] GenerateOrdering shouldn't change Cod...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19800 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84144/ Test PASSed. ---

[GitHub] spark issue #19800: [SPARK-22591][SQL] GenerateOrdering shouldn't change Cod...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19800 Merged build finished. Test PASSed.

[GitHub] spark issue #19803: [SPARK-22596][SQL] set ctx.currentVars in CodegenSupport...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19803 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84145/ Test PASSed. ---

[GitHub] spark issue #19803: [SPARK-22596][SQL] set ctx.currentVars in CodegenSupport...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19803 **[Test build #84145 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84145/testReport)** for PR 19803 at commit

[GitHub] spark issue #19803: [SPARK-22596][SQL] set ctx.currentVars in CodegenSupport...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19803 Merged build finished. Test PASSed.

[GitHub] spark pull request #19803: [SPARK-22596][SQL] set ctx.currentVars in Codegen...

2017-11-23 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19803#discussion_r152899229 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -108,20 +108,22 @@ trait CodegenSupport extends

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Merged build finished. Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84148/ Test FAILed. ---

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #84148 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84148/testReport)** for PR 19788 at commit

[GitHub] spark issue #19808: [SPARK-22597][SQL] Add spark-sql cmd script for Windows ...

2017-11-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19808 retest this please

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152908363 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "array in the

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152912084 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "array in the

[GitHub] spark pull request #19763: [SPARK-22537][core] Aggregation of map output sta...

2017-11-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19763#discussion_r152911936 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -485,4 +485,13 @@ package object config { "array in the

[GitHub] spark issue #19808: [SPARK-22597][SQL] Add spark-sql cmd script for Windows ...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19808 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84147/ Test FAILed. ---

[GitHub] spark issue #19808: [SPARK-22597][SQL] Add spark-sql cmd script for Windows ...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19808 Merged build finished. Test FAILed.

[GitHub] spark issue #19808: [SPARK-22597][SQL] Add spark-sql cmd script for Windows ...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19808 **[Test build #84147 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84147/testReport)** for PR 19808 at commit

[GitHub] spark pull request #19773: [SPARK-22546][SQL] Supporting for changing column...

2017-11-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/19773#discussion_r152753785 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -318,16 +318,26 @@ case class AlterTableChangeColumnCommand(

[GitHub] spark issue #19793: [SPARK-22574] [Mesos] [Submit] Check submission request ...

2017-11-23 Thread Gschiavon
Github user Gschiavon commented on the issue: https://github.com/apache/spark/pull/19793 ping @ArtRand

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Merged build finished. Test PASSed.

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84126/ Test PASSed. ---

[GitHub] spark pull request #19797: [SPARK-22570][SQL] Avoid to create a lot of globa...

2017-11-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19797#discussion_r152766978 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala --- @@ -851,9 +855,11 @@ case class Cast(child: Expression,

[GitHub] spark issue #19756: [SPARK-22527][SQL] Reuse coordinated exchanges if possib...

2017-11-23 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19756 then can't we correctly implement `equals` for the coordinator?

[GitHub] spark issue #19498: [SPARK-17756][PYTHON][STREAMING] Workaround to avoid ret...

2017-11-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19498 gentle ping @zsxwing, @rxin, @tdas and @holdenk

[GitHub] spark pull request #19799: [SPARK-17920][followup] simplify the schema file ...

2017-11-23 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/19799 [SPARK-17920][followup] simplify the schema file creation in test ## What changes were proposed in this pull request? a followup of https://github.com/apache/spark/pull/19779 , to

[GitHub] spark pull request #19800: [SPARK-22591][SQL] GenerateOrdering shouldn't cha...

2017-11-23 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/19800 [SPARK-22591][SQL] GenerateOrdering shouldn't change CodegenContext.INPUT_ROW ## What changes were proposed in this pull request? When I played with codegen in developing another PR, I

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84126 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84126/testReport)** for PR 19082 at commit

[GitHub] spark issue #19799: [SPARK-17920][followup] simplify the schema file creatio...

2017-11-23 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19799 cc @vinodkc @gatorsmile

[GitHub] spark issue #19518: [SPARK-18016][SQL][CATALYST] Code Generation: Constant P...

2017-11-23 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19518 You are comparing array vs member variables. Can we compare array vs inner class member variables? Also, too many classes will add overhead on the classloader, so we should test some extreme cases

[GitHub] spark pull request #19799: [SPARK-17920][followup] simplify the schema file ...

2017-11-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19799#discussion_r152781264 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/VersionsSuite.scala --- @@ -862,17 +859,17 @@ class VersionsSuite extends

[GitHub] spark issue #19621: [SPARK-11215][ML] Add multiple columns support to String...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19621 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84125/ Test FAILed. ---

[GitHub] spark issue #19621: [SPARK-11215][ML] Add multiple columns support to String...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19621 **[Test build #84125 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84125/testReport)** for PR 19621 at commit

[GitHub] spark issue #19621: [SPARK-11215][ML] Add multiple columns support to String...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19621 Merged build finished. Test FAILed.

[GitHub] spark issue #19799: [SPARK-17920][followup] simplify the schema file creatio...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19799 **[Test build #84127 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84127/testReport)** for PR 19799 at commit

[GitHub] spark issue #19799: [SPARK-17920][followup] simplify the schema file creatio...

2017-11-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19799 LGTM

[GitHub] spark issue #18692: [SPARK-21417][SQL] Infer join conditions using propagate...

2017-11-23 Thread SimonBin
Github user SimonBin commented on the issue: https://github.com/apache/spark/pull/18692 @aokolnychyi thank you for the clarification, I see now

[GitHub] spark issue #19621: [SPARK-11215][ML] Add multiple columns support to String...

2017-11-23 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19621 @viirya @MLnick Code updated. Thanks!

[GitHub] spark issue #19621: [SPARK-11215][ML] Add multiple columns support to String...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19621 **[Test build #84125 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84125/testReport)** for PR 19621 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84126 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84126/testReport)** for PR 19082 at commit

[GitHub] spark pull request #19758: [SPARK-3162][MLlib] Local Tree Training Pt 1: Ref...

2017-11-23 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19758#discussion_r152749515 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/TreeSplitUtilsSuite.scala --- @@ -0,0 +1,280 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #18692: [SPARK-21417][SQL] Infer join conditions using propagate...

2017-11-23 Thread Aklakan
Github user Aklakan commented on the issue: https://github.com/apache/spark/pull/18692 Hi @aokolnychyi, a note on @SimonBin's comment (I am his colleague): > The initial solution handled your case but then there was a decision to restrict the proposed rule to cross joins

[GitHub] spark issue #19714: [SPARK-22489][SQL] Shouldn't change broadcast join build...

2017-11-23 Thread wangyum
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/19714 cc @gatorsmile @hvanhovell

[GitHub] spark issue #19764: [SPARK-22539][SQL] Add second order for rangepartitioner...

2017-11-23 Thread caneGuy
Github user caneGuy commented on the issue: https://github.com/apache/spark/pull/19764 Firstly, thank you very much, @hvanhovell. And sorry for replying so late; I have had some other things to handle recently. As for the question, I think the ordering will not be broken. I

[GitHub] spark pull request #19799: [SPARK-17920][followup] simplify the schema file ...

2017-11-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19799#discussion_r152786374 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/VersionsSuite.scala --- @@ -862,17 +859,17 @@ class VersionsSuite extends

[GitHub] spark pull request #19799: [SPARK-17920][followup] simplify the schema file ...

2017-11-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19799#discussion_r152788814 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/VersionsSuite.scala --- @@ -862,17 +859,17 @@ class VersionsSuite extends

[GitHub] spark pull request #19799: [SPARK-17920][followup] simplify the schema file ...

2017-11-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19799#discussion_r152792399 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/VersionsSuite.scala --- @@ -862,17 +859,17 @@ class VersionsSuite extends

[GitHub] spark pull request #19800: [SPARK-22591][SQL] GenerateOrdering shouldn't cha...

2017-11-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19800#discussion_r152798521 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateOrdering.scala --- @@ -72,6 +72,7 @@ object

[GitHub] spark pull request #19800: [SPARK-22591][SQL] GenerateOrdering shouldn't cha...

2017-11-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19800#discussion_r152798443 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/OrderingSuite.scala --- @@ -156,4 +156,13 @@ class OrderingSuite extends

[GitHub] spark pull request #19800: [SPARK-22591][SQL] GenerateOrdering shouldn't cha...

2017-11-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19800#discussion_r152798856 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateOrdering.scala --- @@ -72,6 +72,7 @@ object

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #84132 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84132/testReport)** for PR 19788 at commit

[GitHub] spark issue #19792: [SPARK-22566][PYTHON] Better error message for `_merge_t...

2017-11-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19792 D'oh, you mean a performance regression test. Manual tests should be fine. If you share the code you ran, maybe we can double-check.
