[GitHub] spark issue #19840: [SPARK-22640][PYSPARK][YARN]switch python exec on execut...

2017-12-03 Thread yaooqinn
Github user yaooqinn commented on the issue: https://github.com/apache/spark/pull/19840 use spark-2.2.0-bin-hadoop2.7 numpy examples/src/main/python/mllib/correlations_example.py ### case 1 |key|value| |---|---|

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84408/ Test PASSed. ---

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #84408 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84408/testReport)** for PR 19868 at commit

[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-12-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17770 @cloud-fan Sure. Seems there is no option to reopen it as it was merged before. Should I create another PR for it?

[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17770 Hi @viirya , since it's close to Spark 2.3, would you like to reopen this PR? Thanks!

[GitHub] spark pull request #19871: [SPARK-20728][SQL] Make OrcFileFormat configurabl...

2017-12-03 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/19871#discussion_r154558750 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -568,8 +570,13 @@ object DataSource extends

[GitHub] spark pull request #19871: [SPARK-20728][SQL] Make OrcFileFormat configurabl...

2017-12-03 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/19871#discussion_r154558153 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -363,6 +363,11 @@ object SQLConf {

[GitHub] spark issue #19869: [SPARK-22677][SQL] cleanup whole stage codegen for hash ...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19869 **[Test build #84410 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84410/testReport)** for PR 19869 at commit

[GitHub] spark pull request #19869: [SPARK-22677][SQL] cleanup whole stage codegen fo...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19869#discussion_r154556638 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala --- @@ -672,48 +668,56 @@ case class

[GitHub] spark issue #18995: [SPARK-21787][SPARK-22672][SQL] Support for pushing down...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18995 maybe have a PR to move the tests first?

[GitHub] spark pull request #19871: [SPARK-20728][SQL] Make OrcFileFormat configurabl...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19871#discussion_r154556299 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/DDLSourceLoadSuite.scala --- @@ -54,11 +55,17 @@ class DDLSourceLoadSuite extends

[GitHub] spark pull request #19871: [SPARK-20728][SQL] Make OrcFileFormat configurabl...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19871#discussion_r154556243 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -568,8 +570,13 @@ object DataSource extends Logging

[GitHub] spark pull request #19871: [SPARK-20728][SQL] Make OrcFileFormat configurabl...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19871#discussion_r154556215 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -568,8 +570,13 @@ object DataSource extends Logging

[GitHub] spark pull request #19871: [SPARK-20728][SQL] Make OrcFileFormat configurabl...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19871#discussion_r154556132 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -363,6 +363,11 @@ object SQLConf {

[GitHub] spark issue #19840: [SPARK-22640][PYSPARK][YARN]switch python exec on execut...

2017-12-03 Thread yaooqinn
Github user yaooqinn commented on the issue: https://github.com/apache/spark/pull/19840 @vanzin PYSPARK_DRIVER_PYTHON won't work because [context.py#L191](https://github.com/yaooqinn/spark/blob/8ff5663fe9a32eae79c8ee6bc310409170a8da64/python/pyspark/context.py#L191) doesn't deal with

[GitHub] spark pull request #19717: [SPARK-22646] [Submission] Spark on Kubernetes - ...

2017-12-03 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19717#discussion_r154554648 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/steps/DriverServiceBootstrapStep.scala --- @@ -0,0 +1,103 @@

[GitHub] spark issue #19871: [SPARK-20728][SQL] Make OrcFileFormat configurable betwe...

2017-12-03 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19871 This is a second PR after #19651 .

[GitHub] spark pull request #19717: [SPARK-22646] [Submission] Spark on Kubernetes - ...

2017-12-03 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19717#discussion_r154549074 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -251,6 +252,7 @@ object SparkSubmit extends CommandLineUtils with Logging {

[GitHub] spark pull request #19717: [SPARK-22646] [Submission] Spark on Kubernetes - ...

2017-12-03 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19717#discussion_r154549172 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -296,6 +298,12 @@ object SparkSubmit extends CommandLineUtils with Logging {

[GitHub] spark pull request #19717: [SPARK-22646] [Submission] Spark on Kubernetes - ...

2017-12-03 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19717#discussion_r154549733 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala --- @@ -590,6 +604,11 @@ private[deploy] class SparkSubmitArguments(args:

[GitHub] spark pull request #19717: [SPARK-22646] [Submission] Spark on Kubernetes - ...

2017-12-03 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19717#discussion_r154553669 --- Diff: resource-managers/kubernetes/docker/src/main/dockerfiles/executor/Dockerfile --- @@ -0,0 +1,31 @@ +# +# Licensed to the Apache Software

[GitHub] spark issue #19871: [SPARK-20728][SQL] Make OrcFileFormat configurable betwe...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19871 **[Test build #84409 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84409/testReport)** for PR 19871 at commit

[GitHub] spark pull request #19871: [SPARK-20728][SQL] Make OrcFileFormat configurabl...

2017-12-03 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/19871 [SPARK-20728][SQL] Make OrcFileFormat configurable between sql/hive and sql/core ## What changes were proposed in this pull request? This PR aims to provide a configuration to

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #84408 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84408/testReport)** for PR 19868 at commit

[GitHub] spark issue #19855: [SPARK-22662] [SQL] Failed to prune columns after rewrit...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19855 **[Test build #84407 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84407/testReport)** for PR 19855 at commit

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 Jenkins, retest this please.

[GitHub] spark issue #19870: [SPARK-22665][SQL] Avoid repartitioning with empty list ...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19870 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84406/ Test PASSed. ---

[GitHub] spark issue #19870: [SPARK-22665][SQL] Avoid repartitioning with empty list ...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19870 Merged build finished. Test PASSed.

[GitHub] spark issue #19870: [SPARK-22665][SQL] Avoid repartitioning with empty list ...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19870 **[Test build #84406 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84406/testReport)** for PR 19870 at commit

[GitHub] spark issue #18995: [SPARK-21787][SPARK-22672][SQL] Support for pushing down...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18995 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84405/ Test PASSed. ---

[GitHub] spark issue #18995: [SPARK-21787][SPARK-22672][SQL] Support for pushing down...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18995 Merged build finished. Test PASSed.

[GitHub] spark issue #18995: [SPARK-21787][SPARK-22672][SQL] Support for pushing down...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18995 **[Test build #84405 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84405/testReport)** for PR 18995 at commit

[GitHub] spark issue #19756: [SPARK-22527][SQL] Reuse coordinated exchanges if possib...

2017-12-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19756 ping @cloud-fan @hvanhovell

[GitHub] spark issue #19813: [SPARK-22600][SQL] Fix 64kb limit for deeply nested expr...

2017-12-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19813 ping @cloud-fan @kiszk Do you have more review comments on this? Thanks.

[GitHub] spark issue #19870: [SPARK-22665][SQL] Avoid repartitioning with empty list ...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19870 **[Test build #84406 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84406/testReport)** for PR 19870 at commit

[GitHub] spark pull request #19870: [SPARK-22665][SQL] Avoid repartitioning with empt...

2017-12-03 Thread mgaido91
GitHub user mgaido91 opened a pull request: https://github.com/apache/spark/pull/19870 [SPARK-22665][SQL] Avoid repartitioning with empty list of expressions ## What changes were proposed in this pull request? Repartitioning by empty set of expressions is currently

[GitHub] spark issue #18995: [SPARK-21787][SQL] Support for pushing down filters for ...

2017-12-03 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18995 Hi, @cloud-fan . This is the first followup after #19651 . Could you review this PR, too?

[GitHub] spark pull request #18995: [SPARK-21787][SQL] Support for pushing down filte...

2017-12-03 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/18995#discussion_r154537895 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcTest.scala --- @@ -15,20 +15,21 @@ * limitations under the

[GitHub] spark pull request #18995: [SPARK-21787][SQL] Support for pushing down filte...

2017-12-03 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/18995#discussion_r154537824 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilterSuite.scala --- @@ -0,0 +1,362 @@ +/* + * Licensed

[GitHub] spark pull request #18995: [SPARK-21787][SQL] Support for pushing down filte...

2017-12-03 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/18995#discussion_r154537798 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala --- @@ -82,8 +82,7 @@ private[orc] object

[GitHub] spark issue #18995: [SPARK-21787][SQL] Support for pushing down filters for ...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18995 **[Test build #84405 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84405/testReport)** for PR 18995 at commit

[GitHub] spark pull request #19717: [SPARK-22646] [Submission] Spark on Kubernetes - ...

2017-12-03 Thread liyinan926
Github user liyinan926 commented on a diff in the pull request: https://github.com/apache/spark/pull/19717#discussion_r154537698 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala --- @@ -304,7 +313,9 @@ private[deploy] class

[GitHub] spark pull request #19717: [SPARK-22646] [Submission] Spark on Kubernetes - ...

2017-12-03 Thread liyinan926
Github user liyinan926 commented on a diff in the pull request: https://github.com/apache/spark/pull/19717#discussion_r154537691 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala --- @@ -294,7 +301,9 @@ private[deploy] class

[GitHub] spark pull request #19717: [SPARK-22646] [Submission] Spark on Kubernetes - ...

2017-12-03 Thread liyinan926
Github user liyinan926 commented on a diff in the pull request: https://github.com/apache/spark/pull/19717#discussion_r154537632 --- Diff: resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/submit/steps/DependencyResolutionStepSuite.scala --- @@ -0,0

[GitHub] spark pull request #19717: [SPARK-22646] [Submission] Spark on Kubernetes - ...

2017-12-03 Thread liyinan926
Github user liyinan926 commented on a diff in the pull request: https://github.com/apache/spark/pull/19717#discussion_r154537561 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala --- @@ -119,5 +139,60 @@ private[spark] object

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16578 Merged build finished. Test PASSed.

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16578 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84404/ Test PASSed. ---

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16578 **[Test build #84404 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84404/testReport)** for PR 16578 at commit

[GitHub] spark pull request #19867: [SPARK-22675] [SQL] Deduplicate PropagateTypes in...

2017-12-03 Thread gatorsmile
Github user gatorsmile closed the pull request at: https://github.com/apache/spark/pull/19867

[GitHub] spark issue #19867: [SPARK-22675] [SQL] Deduplicate PropagateTypes in TypeCo...

2017-12-03 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19867 After rethinking it, this is not the right fix. I will close it first. There are multiple issues in the existing `PropagateTypes` ---

[GitHub] spark pull request #19869: [SPARK-22677][SQL] cleanup whole stage codegen fo...

2017-12-03 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19869#discussion_r154534333 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala --- @@ -672,48 +668,56 @@ case class HashAggregateExec(

[GitHub] spark issue #19869: [SPARK-22677][SQL] cleanup whole stage codegen for hash ...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19869 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84403/ Test PASSed. ---

[GitHub] spark issue #19869: [SPARK-22677][SQL] cleanup whole stage codegen for hash ...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19869 Merged build finished. Test PASSed.

[GitHub] spark issue #19869: [SPARK-22677][SQL] cleanup whole stage codegen for hash ...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19869 **[Test build #84403 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84403/testReport)** for PR 19869 at commit

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16578 **[Test build #84404 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84404/testReport)** for PR 16578 at commit

[GitHub] spark issue #19863: [SPARK-22672][TEST][SQL] Move OrcTest to `sql/core`

2017-12-03 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19863 @gatorsmile . Since the main PR is merged, I'll include this into the others. Thanks!

[GitHub] spark pull request #19863: [SPARK-22672][TEST][SQL] Move OrcTest to `sql/cor...

2017-12-03 Thread dongjoon-hyun
Github user dongjoon-hyun closed the pull request at: https://github.com/apache/spark/pull/19863

[GitHub] spark issue #16735: [SPARK-19228][SQL] Introduce tryParseDate method to proc...

2017-12-03 Thread sergey-rubtsov
Github user sergey-rubtsov commented on the issue: https://github.com/apache/spark/pull/16735 I will try to complete it this month.

[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...

2017-12-03 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18953 This is resolved via https://github.com/apache/spark/pull/19651 .

[GitHub] spark pull request #18953: [SPARK-20682][SQL] Update ORC data source based o...

2017-12-03 Thread dongjoon-hyun
Github user dongjoon-hyun closed the pull request at: https://github.com/apache/spark/pull/18953

[GitHub] spark pull request #19571: [SPARK-15474][SQL] Write and read back non-emtpy ...

2017-12-03 Thread dongjoon-hyun
Github user dongjoon-hyun closed the pull request at: https://github.com/apache/spark/pull/19571

[GitHub] spark issue #19571: [SPARK-15474][SQL] Write and read back non-emtpy schema ...

2017-12-03 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19571 This is resolved in https://github.com/apache/spark/pull/19651 .

[GitHub] spark issue #19651: [SPARK-20682][SPARK-15474][SPARK-21791] Add new ORCFileF...

2017-12-03 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19651 Thank you so much for making ORC move forward, @cloud-fan ! Also, thank you, @HyukjinKwon , @gatorsmile , @viirya , @kiszk . ---

[GitHub] spark issue #19869: [SPARK-22677][SQL] cleanup whole stage codegen for hash ...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19869 cc @juliuszsompolski @kiszk @viirya @maropu

[GitHub] spark pull request #19869: [SPARK-22677][SQL] cleanup whole stage codegen fo...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19869#discussion_r154528578 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala --- @@ -882,45 +851,65 @@ case class

[GitHub] spark pull request #19869: [SPARK-22677][SQL] cleanup whole stage codegen fo...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19869#discussion_r154528498 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala --- @@ -784,86 +774,65 @@ case class

[GitHub] spark issue #19869: [SPARK-22677][SQL] cleanup whole stage codegen for hash ...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19869 **[Test build #84403 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84403/testReport)** for PR 19869 at commit

[GitHub] spark pull request #19869: [SPARK-22677][SQL] cleanup whole stage codegen fo...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19869#discussion_r154528455 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala --- @@ -784,86 +774,65 @@ case class

[GitHub] spark pull request #19869: [SPARK-22677][SQL] cleanup whole stage codegen fo...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19869#discussion_r154528431 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala --- @@ -672,48 +668,56 @@ case class

[GitHub] spark pull request #19869: [SPARK-22677][SQL] cleanup whole stage codegen fo...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19869#discussion_r154528409 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala --- @@ -672,48 +668,56 @@ case class

[GitHub] spark pull request #19869: [SPARK-22677][SQL] cleanup whole stage codegen fo...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19869#discussion_r154528386 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala --- @@ -444,6 +444,7 @@ case class HashAggregateExec(

[GitHub] spark pull request #19869: [SPARK-22677][SQL] cleanup whole stage codegen fo...

2017-12-03 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/19869 [SPARK-22677][SQL] cleanup whole stage codegen for hash aggregate ## What changes were proposed in this pull request? The `HashAggregateExec` whole stage codegen path is a little messy

[GitHub] spark pull request #19831: [SPARK-22626][SQL] It deals with wrong Hive's sta...

2017-12-03 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19831

[GitHub] spark issue #19831: [SPARK-22626][SQL] It deals with wrong Hive's statistics...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19831 thanks, merging to master!

[GitHub] spark issue #19865: [SPARK-22668][SQL] Exclude global variables from argumen...

2017-12-03 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19865 I am neutral on how to fix this problem in the current master. What I have been saying from the beginning is that this problem exists not only in #19811 but also in the current master. I am

[GitHub] spark pull request #19860: [SPARK-22669][SQL] Avoid unnecessary function cal...

2017-12-03 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19860

[GitHub] spark issue #19865: [SPARK-22668][SQL] Exclude global variables from argumen...

2017-12-03 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19865 @kiszk I think that in the cases you hit, this might also have been done deliberately, relying on the way Java behaves, i.e. that it uses the local variable and the global one is not used
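
The Java behavior referred to in this comment can be illustrated with a minimal sketch. It is not taken from Spark's generated code: the class `ShadowDemo`, its field `value`, and its methods are hypothetical names chosen for illustration. The sketch only shows standard Java name resolution, where a method parameter with the same name as a field shadows that field inside the method body.

```java
// Hypothetical illustration of a parameter shadowing a field; not Spark code.
class ShadowDemo {
    // Plays the role of a "global" variable (an instance field).
    private int value = 1;

    // The parameter shares the field's name, so the plain name `value`
    // refers to the parameter inside this method.
    int addToParameter(int value) {
        value = value + 10; // updates only the local parameter
        return value;
    }

    // Using `this.value` reaches the field explicitly.
    int addToField(int value) {
        this.value = this.value + value; // updates the field
        return this.value;
    }

    int field() {
        return value; // no parameter in scope, so this is the field
    }

    public static void main(String[] args) {
        ShadowDemo d = new ShadowDemo();
        System.out.println(d.addToParameter(5)); // 15
        System.out.println(d.field());           // 1, the field is untouched
        System.out.println(d.addToField(5));     // 6
        System.out.println(d.field());           // 6
    }
}
```

In this sketch, writes inside `addToParameter` never reach the field, which is the kind of silent divergence the discussion is concerned with when a global variable is also passed as an argument.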

[GitHub] spark issue #19860: [SPARK-22669][SQL] Avoid unnecessary function calls in c...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19860 LGTM, merging to master!

[GitHub] spark issue #19865: [SPARK-22668][SQL] Exclude global variables from argumen...

2017-12-03 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19865 @cloud-fan I see. As I pointed out, there are several places to set a global variable `ExprCode.value` that is passed to successor operations. Should we make lifetime of global time local in an

[GitHub] spark issue #19865: [SPARK-22668][SQL] Exclude global variables from argumen...

2017-12-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19865 I think for this case, shouldn't we fix it and not pass a global variable into `splitExpressions`?

[GitHub] spark issue #19865: [SPARK-22668][SQL] Exclude global variables from argumen...

2017-12-03 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19865 @mgaido91 @viirya As you can see, we get an assertion failure. This is evidence that we pass a global variable as an argument of a split function. In practice, we did not guarantee that we do not pass

[GitHub] spark issue #19865: [SPARK-22668][SQL] Exclude global variables from argumen...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19865 Making a variable global needs to be done manually (call `ctx.mutableState`); splitting the code into methods also needs to be done manually (call `ctx.splitExpressions`). If we hit a problem here,

[GitHub] spark pull request #19651: [SPARK-20682][SPARK-15474][SPARK-21791] Add new O...

2017-12-03 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19651

[GitHub] spark issue #19651: [SPARK-20682][SPARK-15474][SPARK-21791] Add new ORCFileF...

2017-12-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19651 thanks, merging to master! followups: 1. add a config to use new orc by default 2. move orc test to sql core 3. columnar orc reader ---

[GitHub] spark issue #19865: [SPARK-22668][SQL] Exclude global variables from argumen...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19865 Merged build finished. Test FAILed.

[GitHub] spark issue #19865: [SPARK-22668][SQL] Exclude global variables from argumen...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19865 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84402/ Test FAILed. ---

[GitHub] spark issue #19865: [SPARK-22668][SQL] Exclude global variables from argumen...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19865 **[Test build #84402 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84402/testReport)** for PR 19865 at commit

[GitHub] spark pull request #19824: [SPARK][STREAMING] Invoke onBatchCompletion() onl...

2017-12-03 Thread victor-wong
Github user victor-wong closed the pull request at: https://github.com/apache/spark/pull/19824

[GitHub] spark issue #19824: [SPARK][STREAMING] Invoke onBatchCompletion() only when ...

2017-12-03 Thread victor-wong
Github user victor-wong commented on the issue: https://github.com/apache/spark/pull/19824 @CodingCat Thank you:)

[GitHub] spark issue #19865: [SPARK-22668][SQL] Exclude global variables from argumen...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19865 **[Test build #84402 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84402/testReport)** for PR 19865 at commit

[GitHub] spark issue #19865: [SPARK-22668][SQL] Exclude global variables from argumen...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19865 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84401/ Test FAILed. ---

[GitHub] spark issue #19865: [SPARK-22668][SQL] Exclude global variables from argumen...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19865 Merged build finished. Test FAILed.

[GitHub] spark issue #19865: [SPARK-22668][SQL] Exclude global variables from argumen...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19865 **[Test build #84401 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84401/testReport)** for PR 19865 at commit

[GitHub] spark issue #19867: [SPARK-22675] [SQL] Deduplicate PropagateTypes in TypeCo...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19867 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84400/ Test PASSed. ---

[GitHub] spark issue #19867: [SPARK-22675] [SQL] Deduplicate PropagateTypes in TypeCo...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19867 Merged build finished. Test PASSed.

[GitHub] spark issue #19867: [SPARK-22675] [SQL] Deduplicate PropagateTypes in TypeCo...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19867 **[Test build #84400 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84400/testReport)** for PR 19867 at commit

[GitHub] spark issue #19865: [SPARK-22668][SQL] Exclude global variables from argumen...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19865 **[Test build #84401 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84401/testReport)** for PR 19865 at commit

[GitHub] spark issue #19860: [SPARK-22669][SQL] Avoid unnecessary function calls in c...

2017-12-03 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19860 @kiszk @viirya I made the following performance test: ``` val a = (1 to 10).map(x => 1).toDS val filtered = a.where($"value".isin((1 to 10): _*)) (1 to

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84399/ Test FAILed. ---
