[GitHub] spark pull request #16243: [SPARK-18815] [SQL] Fix NPE when collecting colum...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16243#discussion_r91831031 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala --- @@ -180,35 +214,28 @@ abstract class

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-12-09 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13909 Ah, I see my misunderstanding. In this discussion, by "intermediate array" you mean "new int[]". Yes, let me do it. --- If your project is set up for it, you can reply to this email and

[GitHub] spark issue #16193: [SPARK-18766] [SQL] Push Down Filter Through BatchEvalPy...

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16193 **[Test build #69961 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69961/consoleFull)** for PR 16193 at commit

[GitHub] spark issue #16244: [SQL][minor] simplify a test to fix the maven tests

2016-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16244 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69955/ Test PASSed.

[GitHub] spark issue #16244: [SQL][minor] simplify a test to fix the maven tests

2016-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16244 Merged build finished. Test PASSed.

[GitHub] spark issue #16244: [SQL][minor] simplify a test to fix the maven tests

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16244 **[Test build #69955 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69955/consoleFull)** for PR 16244 at commit

[GitHub] spark issue #16243: [SPARK-18815] [SQL] Fix NPE when collecting column stats...

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16243 **[Test build #69960 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69960/consoleFull)** for PR 16243 at commit

[GitHub] spark pull request #16243: [SPARK-18815] [SQL] Fix NPE when collecting colum...

2016-12-09 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/16243#discussion_r91830908 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala --- @@ -180,35 +214,28 @@ abstract class StatisticsCollectionTestBase

[GitHub] spark pull request #16243: [SPARK-18815] [SQL] Fix NPE when collecting colum...

2016-12-09 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16243#discussion_r91830853 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala --- @@ -180,35 +214,28 @@ abstract class StatisticsCollectionTestBase

[GitHub] spark issue #16243: [SPARK-18815] [SQL] Fix NPE when collecting column stats...

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16243 **[Test build #69959 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69959/consoleFull)** for PR 16243 at commit

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13909 I think it has nothing to do with physical operators; we are talking about `CreateArray`, right? Avoiding the creation of the intermediate primitive array should generally be faster.

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-12-09 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13909 I understand what you want to do. I agree with this in this case. This is because the child of `Project` is an array creation. == Physical Plan == *Project [array((value#2 + 1.1),

[GitHub] spark pull request #16243: [SPARK-18815] [SQL] Fix NPE when collecting colum...

2016-12-09 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/16243#discussion_r91830643 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala --- @@ -133,6 +133,79 @@ class StatisticsCollectionSuite extends

[GitHub] spark issue #16220: [SPARK-18796][SS]StreamingQueryManager should not block ...

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16220 **[Test build #69958 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69958/consoleFull)** for PR 16220 at commit

[GitHub] spark pull request #16238: [SPARK-18811] StreamSource resolution should happ...

2016-12-09 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16238#discussion_r91830440 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -214,6 +228,10 @@ class StreamExecution(

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-12-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/13909 Yea we can avoid the intermediate primitive array. Maybe we can benchmark against it also.

[GitHub] spark issue #16244: [SQL][minor] simplify a test to fix the maven tests

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16244 Actually I can't reproduce this issue locally, but by looking at the logs, I'm 90% sure this is the cause. The only way to verify it may be merging and checking the Jenkins Maven status

[GitHub] spark issue #15071: [SPARK-17517][SQL]Improve generated Code for BroadcastHa...

2016-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15071 Can one of the admins verify this patch?

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13909 In `CreateArray` we can just evaluate each expression and write its result to the unsafe array with the writer, so no intermediate data needs to be created.
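
A minimal sketch of the idea in this comment, written in Python rather than Spark's generated Java. It contrasts building a packed array through a temporary list with writing each evaluated result straight into the final buffer. Everything here is an assumption for illustration: the function names, the 8-byte little-endian double layout, and the second expression are hypothetical and do not reproduce Spark's unsafe array format or its writer API.

```python
import struct

def build_via_intermediate(exprs, row):
    # Step 1: evaluate every expression into a temporary list
    # (this plays the role of the "intermediate array").
    tmp = [expr(row) for expr in exprs]
    # Step 2: bulk-copy the temporary values into the final packed buffer.
    return bytearray(struct.pack(f"<{len(tmp)}d", *tmp))

def build_direct(exprs, row):
    # Preallocate the final buffer, then write each result into its slot
    # as soon as it is evaluated; no temporary list is created.
    buf = bytearray(8 * len(exprs))
    for i, expr in enumerate(exprs):
        struct.pack_into("<d", buf, 8 * i, expr(row))
    return buf

# Hypothetical expressions; only the "value + 1.1" form appears in the thread.
exprs = [lambda r: r["value"] + 1.1, lambda r: r["value"] + 2.2]
row = {"value": 1.0}

# Both strategies produce the same bytes; the second skips one copy.
same = build_via_intermediate(exprs, row) == build_direct(exprs, row)
```

Whether the direct path is faster in practice is exactly what the thread proposes to benchmark; the sketch only shows that the two paths are equivalent in output.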

[GitHub] spark issue #16193: [SPARK-18766] [SQL] Push Down Filter Through BatchEvalPy...

2016-12-09 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/16193 Pushing down predicates into the data source is also an optimization done in the planner, so I think this is not the first case of doing optimization outside the Optimizer.

[GitHub] spark issue #16193: [SPARK-18766] [SQL] Push Down Filter Through BatchEvalPy...

2016-12-09 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/16193 The reason we moved the PythonUDFEvaluator from the logical plan into the physical plan is that this one-off node breaks many things; many rules need to treat it specially.

[GitHub] spark issue #16213: [SPARK-18020][Streaming][Kinesis] Checkpoint SHARD_END t...

2016-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69956/ Test PASSed.

[GitHub] spark pull request #16238: [SPARK-18811] StreamSource resolution should happ...

2016-12-09 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16238

[GitHub] spark issue #16213: [SPARK-18020][Streaming][Kinesis] Checkpoint SHARD_END t...

2016-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16213 Merged build finished. Test PASSed.

[GitHub] spark issue #16213: [SPARK-18020][Streaming][Kinesis] Checkpoint SHARD_END t...

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16213 **[Test build #69956 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69956/consoleFull)** for PR 16213 at commit

[GitHub] spark issue #16238: [SPARK-18811] StreamSource resolution should happen in s...

2016-12-09 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16238 Merging to master and 2.1.

[GitHub] spark pull request #16238: [SPARK-18811] StreamSource resolution should happ...

2016-12-09 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16238#discussion_r91830357 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -214,6 +228,10 @@ class StreamExecution(

[GitHub] spark issue #16193: [SPARK-18766] [SQL] Push Down Filter Through BatchEvalPy...

2016-12-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16193 If we add a logical node for the Python evaluator, we'd push the Filter down through it, so the optimizer rule won't combine the two Filters into one again?

[GitHub] spark pull request #16238: [SPARK-18811] StreamSource resolution should happ...

2016-12-09 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16238#discussion_r91830324 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -214,6 +228,10 @@ class StreamExecution(

[GitHub] spark issue #16193: [SPARK-18766] [SQL] Push Down Filter Through BatchEvalPy...

2016-12-09 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/16193 @cloud-fan It's not trivial to do this in the optimizer. For example, we would have to split one Filter into two, which would conflict with another optimizer rule that combines two Filters into one.
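
The conflict described in this comment can be sketched with a toy plan tree. This is a hedged illustration in Python, not Catalyst code: `Scan`, `Filter`, `split_filter`, and `combine_filters` are hypothetical stand-ins, and the predicate strings are made up. It only shows that a rule that splits a Filter and a rule that merges adjacent Filters undo each other, so an optimizer that runs both to a fixed point would keep flipping between the two shapes.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Scan:
    name: str

@dataclass(frozen=True)
class Filter:
    predicates: Tuple[str, ...]          # conjuncts, implicitly ANDed
    child: Union["Filter", Scan]

def split_filter(plan):
    # Hypothetical rule: turn Filter(a AND b) into Filter(a) over Filter(b),
    # e.g. to evaluate the cheap predicate below an expensive UDF predicate.
    if isinstance(plan, Filter) and len(plan.predicates) > 1:
        head, rest = plan.predicates[:1], plan.predicates[1:]
        return Filter(head, Filter(rest, plan.child))
    return plan

def combine_filters(plan):
    # Hypothetical rule: merge two adjacent Filters into a single Filter.
    if isinstance(plan, Filter) and isinstance(plan.child, Filter):
        return Filter(plan.predicates + plan.child.predicates, plan.child.child)
    return plan

original = Filter(("udf(a) > 1", "b = 2"), Scan("t"))
split = split_filter(original)          # two stacked Filters
recombined = combine_filters(split)     # back to the single Filter
```

Here `recombined` equals `original`, which is the oscillation the comment warns about; doing the split in the planner, after logical optimization has finished, is one way to sidestep it.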

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-12-09 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13909 @cloud-fan Yeah, keeping the primitive array and using `UnsafeArrayWrite` can avoid the extra bulk copy. To achieve this avoidance, I think that we need to create `GenericArrayData` after enabling

[GitHub] spark issue #16244: [SQL][minor] simplify a test to fix the maven tests

2016-12-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16244 How do we make sure the fixed test passes Maven-based tests?

[GitHub] spark pull request #16134: [SPARK-18703] [SQL] Drop Staging Directories and ...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16134#discussion_r91829975 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala --- @@ -166,6 +166,30 @@ class InsertIntoHiveTableSuite

[GitHub] spark issue #16243: [SPARK-18815] [SQL] Fix NPE when collecting column stats...

2016-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16243 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69953/ Test PASSed.

[GitHub] spark issue #16243: [SPARK-18815] [SQL] Fix NPE when collecting column stats...

2016-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16243 Merged build finished. Test PASSed.

[GitHub] spark pull request #16134: [SPARK-18703] [SQL] Drop Staging Directories and ...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16134#discussion_r91829943 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -85,6 +87,7 @@ case class InsertIntoHiveTable(

[GitHub] spark issue #16243: [SPARK-18815] [SQL] Fix NPE when collecting column stats...

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16243 **[Test build #69953 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69953/consoleFull)** for PR 16243 at commit

[GitHub] spark pull request #16244: [SQL][minor] simplify a test to fix the maven tes...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16244#discussion_r91829910 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala --- @@ -98,20 +98,15 @@ class

[GitHub] spark issue #16222: [SPARK-18797][SparkR]:Update spark.logit in sparkr-vigne...

2016-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16222 Merged build finished. Test PASSed.

[GitHub] spark issue #16193: [SPARK-18766] [SQL] Push Down Filter Through BatchEvalPy...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16193 It's a little hacky to me that we do optimization in a planner. How hard is it if we introduce a logical node for python evaluator? We can define an interface in catalyst, e.g.

[GitHub] spark issue #16222: [SPARK-18797][SparkR]:Update spark.logit in sparkr-vigne...

2016-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16222 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69954/ Test PASSed.

[GitHub] spark issue #16222: [SPARK-18797][SparkR]:Update spark.logit in sparkr-vigne...

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16222 **[Test build #69954 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69954/consoleFull)** for PR 16222 at commit

[GitHub] spark pull request #16193: [SPARK-18766] [SQL] Push Down Filter Through Batc...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16193#discussion_r91829876 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/python/BatchEvalPythonExecSuite.scala --- @@ -0,0 +1,109 @@ +/* + * Licensed to

[GitHub] spark issue #16175: [SPARK-17460][SQL]Make sure sizeInBytes in Statistics wi...

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16175 **[Test build #69957 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69957/consoleFull)** for PR 16175 at commit

[GitHub] spark issue #16213: [SPARK-18020][Streaming][Kinesis] Checkpoint SHARD_END t...

2016-12-09 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/16213 Thanks for these comments! Yeah, I do not like this approach either. But, since those who reshard streams always hit this issue and resharding is important for load-balancing in Kinesis streams

[GitHub] spark issue #16213: [SPARK-18020][Streaming][Kinesis] Checkpoint SHARD_END t...

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16213 **[Test build #69956 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69956/consoleFull)** for PR 16213 at commit

[GitHub] spark pull request #16213: [SPARK-18020][Streaming][Kinesis] Checkpoint SHAR...

2016-12-09 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/16213#discussion_r91829798 --- Diff: external/kinesis-asl/src/test/scala/org/apache/spark/streaming/kinesis/KinesisStreamSuite.scala --- @@ -225,6 +225,74 @@ abstract class

[GitHub] spark issue #16175: [SPARK-17460][SQL]Make sure sizeInBytes in Statistics wi...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16175 retest this please

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13909 Creating a primitive array and setting element values is also an element-wise copy, right? And we need an extra bulk copy to write the primitive array to the unsafe array. By using the unsafe array writer, we

[GitHub] spark issue #16222: [SPARK-18797][SparkR]:Update spark.logit in sparkr-vigne...

2016-12-09 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/16222 @mengxr I used the following R code and glmnet to check whether `regParam = 0.5` fits a good model. > iris2 <- iris[iris$Species %in% c("versicolor", "virginica"), ] >

[GitHub] spark pull request #16213: [SPARK-18020][Streaming][Kinesis] Checkpoint SHAR...

2016-12-09 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/16213#discussion_r91829740 --- Diff: external/kinesis-asl/src/main/java/com/amazonaws/services/kinesis/clientlibrary/lib/worker/CheckpointerShim.java --- @@ -0,0 +1,43 @@ +/*

[GitHub] spark pull request #16244: [SQL][minor] simplify a test to fix the maven tes...

2016-12-09 Thread kapilsingh5050
Github user kapilsingh5050 commented on a diff in the pull request: https://github.com/apache/spark/pull/16244#discussion_r91829731 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala --- @@ -98,20 +98,15 @@ class

[GitHub] spark pull request #16228: [WIP] [SPARK-17076] [SQL] Cardinality estimation ...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16228#discussion_r91829681 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/estimation/JoinEstimation.scala --- @@ -0,0 +1,175 @@ +/* + *

[GitHub] spark pull request #16244: [SQL][minor] simplify a test to fix the maven tes...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16244#discussion_r91829627 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala --- @@ -98,20 +98,15 @@ class

[GitHub] spark pull request #16244: [SQL][minor] simplify a test to fix the maven tes...

2016-12-09 Thread kapilsingh5050
Github user kapilsingh5050 commented on a diff in the pull request: https://github.com/apache/spark/pull/16244#discussion_r91829600 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala --- @@ -98,20 +98,15 @@ class

[GitHub] spark pull request #16243: [SPARK-18815] [SQL] Fix NPE when collecting colum...

2016-12-09 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/16243#discussion_r91829574 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala --- @@ -213,7 +214,8 @@ object ColumnStat extends Logging

[GitHub] spark pull request #16179: [SPARK-18752][hive] "isSrcLocal" value should be ...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16179#discussion_r91829517 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveCommandSuite.scala --- @@ -418,4 +431,19 @@ class HiveCommandSuite extends

[GitHub] spark pull request #16243: [SPARK-18815] [SQL] Fix NPE when collecting colum...

2016-12-09 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16243#discussion_r91829511 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala --- @@ -133,6 +133,79 @@ class StatisticsCollectionSuite extends

[GitHub] spark issue #16244: [SQL][minor] simplify a test to fix the maven tests

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16244 **[Test build #69955 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69955/consoleFull)** for PR 16244 at commit

[GitHub] spark issue #16244: [SQL][minor] simplify a test to fix the maven tests

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16244 cc @srowen @kapilsingh5050 @ueshin @viirya

[GitHub] spark pull request #16244: [SQL][minor] simplify a test to fix the maven tes...

2016-12-09 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/16244 [SQL][minor] simplify a test to fix the maven tests ## What changes were proposed in this pull request? After https://github.com/apache/spark/pull/15620 , all of the Maven-based 2.0

[GitHub] spark issue #16222: [SPARK-18797][SparkR]:Update spark.logit in sparkr-vigne...

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16222 **[Test build #69954 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69954/consoleFull)** for PR 16222 at commit

[GitHub] spark pull request #16243: [SPARK-18815] [SQL] Fix NPE when collecting colum...

2016-12-09 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16243#discussion_r91829428 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala --- @@ -133,6 +133,79 @@ class StatisticsCollectionSuite extends

[GitHub] spark pull request #16243: [SPARK-18815] [SQL] Fix NPE when collecting colum...

2016-12-09 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16243#discussion_r91829432 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala --- @@ -133,6 +133,79 @@ class StatisticsCollectionSuite extends

[GitHub] spark pull request #16179: [SPARK-18752][hive] "isSrcLocal" value should be ...

2016-12-09 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/16179#discussion_r91829413 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveCommandSuite.scala --- @@ -418,4 +431,19 @@ class HiveCommandSuite extends

[GitHub] spark issue #16179: [SPARK-18752][hive] "isSrcLocal" value should be set fro...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16179 The changes LGTM, as we do propagate the `isSrcLocal` incorrectly. It would be better if we could also fix the inconsistent behavior of `LOAD DATA` between Spark and Hive, and improve the test

[GitHub] spark pull request #16243: [SPARK-18815] [SQL] Fix NPE when collecting colum...

2016-12-09 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16243#discussion_r91829403 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala --- @@ -213,7 +214,8 @@ object ColumnStat extends Logging {

[GitHub] spark pull request #16179: [SPARK-18752][hive] "isSrcLocal" value should be ...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16179#discussion_r91829355 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveCommandSuite.scala --- @@ -418,4 +431,19 @@ class HiveCommandSuite extends

[GitHub] spark pull request #16222: [SPARK-18797][SparkR]:Update spark.logit in spark...

2016-12-09 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/16222#discussion_r91829349 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -768,8 +768,46 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2)))

[GitHub] spark pull request #16243: [SPARK-18815] [SQL] Fix NPE when collecting colum...

2016-12-09 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16243#discussion_r91829332 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala --- @@ -133,6 +133,79 @@ class StatisticsCollectionSuite extends

[GitHub] spark pull request #15620: [SPARK-18091] [SQL] Deep if expressions cause Gen...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15620#discussion_r91829129 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala --- @@ -97,6 +97,27 @@ class

[GitHub] spark issue #16230: [SPARK-13747][Core]Fix potential ThreadLocal leaks in RP...

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16230 **[Test build #3486 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3486/consoleFull)** for PR 16230 at commit

[GitHub] spark issue #15717: [SPARK-17910][SQL] Allow users to update the comment of ...

2016-12-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15717 OK, then the parser rule looks good to me. My only concern is the new APIs in `ExternalCatalog`; I don't think they are necessary. @jiangxb1987 what was your motivation for adding them?

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-09 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91828427 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -53,6 +56,18 @@ private[hive] class

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-09 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91828417 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -53,6 +56,18 @@ private[hive] class

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-09 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91828385 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -53,6 +56,18 @@ private[hive] class

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-09 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91828390 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -53,6 +56,18 @@ private[hive] class

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-09 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91828378 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -33,6 +35,7 @@ import

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-09 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91828410 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/PartitionedTablePerfStatsSuite.scala --- @@ -352,4 +353,28 @@ class

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-09 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91828396 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/PartitionedTablePerfStatsSuite.scala --- @@ -352,4 +353,28 @@ class

[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

2016-12-09 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/16135#discussion_r91828375 --- Diff: core/src/main/scala/org/apache/spark/metrics/source/StaticSources.scala --- @@ -97,6 +97,12 @@ object HiveCatalogMetrics extends Source {

[GitHub] spark issue #16204: [SPARK-18775][SQL] Limit the max number of records writt...

2016-12-09 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/16204 LGTM

[GitHub] spark issue #16243: [SPARK-18815] [SQL] Fix NPE when collecting column stats...

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16243 **[Test build #69953 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69953/consoleFull)** for PR 16243 at commit

[GitHub] spark issue #16238: [SPARK-18811] StreamSource resolution should happen in s...

2016-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16238 Merged build finished. Test PASSed.

[GitHub] spark issue #16238: [SPARK-18811] StreamSource resolution should happen in s...

2016-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16238 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69952/ Test PASSed. ---

[GitHub] spark issue #16238: [SPARK-18811] StreamSource resolution should happen in s...

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16238 **[Test build #69952 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69952/consoleFull)** for PR 16238 at commit

[GitHub] spark issue #16243: [SPARK-18815] [SQL] Fix NPE when collecting column stats...

2016-12-09 Thread wzhfy
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/16243 cc @cloud-fan @rxin

[GitHub] spark pull request #16243: [SPARK-18815] [SQL] Fix NPE when collecting colum...

2016-12-09 Thread wzhfy
GitHub user wzhfy opened a pull request: https://github.com/apache/spark/pull/16243 [SPARK-18815] [SQL] Fix NPE when collecting column stats for string/binary column having only null values ## What changes were proposed in this pull request? During column stats collection,

[GitHub] spark issue #16220: [SPARK-18796][SS]StreamingQueryManager should not block ...

2016-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16220 Merged build finished. Test PASSed.

[GitHub] spark issue #16220: [SPARK-18796][SS]StreamingQueryManager should not block ...

2016-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16220 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69951/ Test PASSed. ---

[GitHub] spark issue #16220: [SPARK-18796][SS]StreamingQueryManager should not block ...

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16220 **[Test build #69951 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69951/consoleFull)** for PR 16220 at commit

[GitHub] spark pull request #16237: [SPARK-18807][SPARKR] Should suppress output prin...

2016-12-09 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16237

[GitHub] spark issue #16237: [SPARK-18807][SPARKR] Should suppress output print for c...

2016-12-09 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/16237 Merging into master, branch-2.1

[GitHub] spark issue #15936: [SPARK-18504][SQL] Scalar subquery with extra group by c...

2016-12-09 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/15936 This seems to be causing https://issues.apache.org/jira/browse/SPARK-18814

[GitHub] spark pull request #16228: [WIP] [SPARK-17076] [SQL] Cardinality estimation ...

2016-12-09 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/16228#discussion_r91826623 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/estimation/JoinEstimation.scala --- @@ -0,0 +1,175 @@ +/* + *

[GitHub] spark pull request #16228: [WIP] [SPARK-17076] [SQL] Cardinality estimation ...

2016-12-09 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/16228#discussion_r91826564 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/estimation/JoinEstimation.scala --- @@ -0,0 +1,175 @@ +/* + *

[GitHub] spark issue #16228: [WIP] [SPARK-17076] [SQL] Cardinality estimation for joi...

2016-12-09 Thread wzhfy
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/16228 @Tagar Thanks for sharing this information. Yes, it would be better to use PK/FK, but it won't be done in this pr, and we need to implement PK/FK constraints in Spark first. > the

[GitHub] spark issue #16230: [SPARK-13747][Core]Fix potential ThreadLocal leaks in RP...

2016-12-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16230 **[Test build #3486 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3486/consoleFull)** for PR 16230 at commit

[GitHub] spark issue #16230: [SPARK-13747][Core]Fix potential ThreadLocal leaks in RP...

2016-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16230 Merged build finished. Test FAILed.

[GitHub] spark issue #16230: [SPARK-13747][Core]Fix potential ThreadLocal leaks in RP...

2016-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16230 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69950/ Test FAILed. ---
