[GitHub] spark pull request #15054: [SPARK-17502] [SQL] Fix Multiple Bugs in DDL Stat...

2016-09-19 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15054#discussion_r79540609 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala --- @@ -444,7 +444,7 @@ class SessionCatalogSuite

[GitHub] spark issue #15135: [pyspark][group]pyspark GroupedData can't apply agg func...

2016-09-19 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15135 I understand the reasons why you want to add this -- but I feel this is too esoteric and if we add this one, there are also a lot of other cases that can be added and I don't know where we would stop.

[GitHub] spark pull request #15054: [SPARK-17502] [SQL] Fix Multiple Bugs in DDL Stat...

2016-09-19 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15054#discussion_r79540494 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -282,6 +271,24 @@ class SessionCatalog( }

[GitHub] spark issue #14959: [SPARK-17387][PYSPARK] Creating SparkContext() from pyth...

2016-09-19 Thread zjffdu
Github user zjffdu commented on the issue: https://github.com/apache/spark/pull/14959 ``` The internal SparkConf of the context will not be the same instance as conf. ``` This is the existing behavior, in which Python differs from Scala. But I think it is correct. I
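
The copy semantics quoted in the comment above can be illustrated with a small sketch. `Conf` and `Context` below are hypothetical stand-ins written for this illustration, not the PySpark classes; the only thing carried over from the comment is the assumption that the context keeps its own copy of the configuration it is given.

```python
import copy


class Conf:
    """Hypothetical stand-in for a configuration object (not pyspark.SparkConf)."""

    def __init__(self):
        self._settings = {}

    def set(self, key, value):
        self._settings[key] = value
        return self  # allow chained calls

    def get(self, key, default=None):
        return self._settings.get(key, default)


class Context:
    """Hypothetical stand-in for a context that clones the conf it receives."""

    def __init__(self, conf):
        # Keep a private copy so that later changes to the caller's conf
        # do not leak into the already-created context.
        self._conf = copy.deepcopy(conf)

    def get_conf(self):
        return self._conf


conf = Conf().set("spark.app.name", "demo")
ctx = Context(conf)
conf.set("spark.app.name", "changed-later")  # mutates only the caller's object
```

Under this assumption the context's internal conf is a different instance from `conf`, so the later `set` call is not visible through `ctx.get_conf()`.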

[GitHub] spark issue #13513: [SPARK-15698][SQL][Streaming] Add the ability to remove ...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13513 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65631/ Test PASSed. ---

[GitHub] spark issue #13513: [SPARK-15698][SQL][Streaming] Add the ability to remove ...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13513 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #13513: [SPARK-15698][SQL][Streaming] Add the ability to remove ...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13513 **[Test build #65631 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65631/consoleFull)** for PR 13513 at commit

[GitHub] spark issue #15102: [SPARK-17346][SQL] Add Kafka source for Structured Strea...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15102 **[Test build #65636 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65636/consoleFull)** for PR 15102 at commit

[GitHub] spark issue #15054: [SPARK-17502] [SQL] Fix Multiple Bugs in DDL Statements ...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15054 **[Test build #65637 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65637/consoleFull)** for PR 15054 at commit

[GitHub] spark issue #15135: [pyspark][group]pyspark GroupedData can't apply agg func...

2016-09-19 Thread citoubest
Github user citoubest commented on the issue: https://github.com/apache/spark/pull/15135 @rxin @davies @srowen

[GitHub] spark issue #14808: [SPARK-17156][ML][EXAMPLE] Add multiclass logistic regre...

2016-09-19 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/14808 https://github.com/apache/spark/pull/14834 is merged now. We did not implement a new API, but we can still update the logistic regression examples to show the new multiclass functionality.

[GitHub] spark issue #15147: [SPARK-17545] [SQL] Handle additional time offset format...

2016-09-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15147 I mean the problem in the JIRA is not reproduced in the master branch and therefore I believe we need another JIRA to describe the support for other time formats as the same one as casting

[GitHub] spark pull request #15054: [SPARK-17502] [SQL] Fix Multiple Bugs in DDL Stat...

2016-09-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15054#discussion_r79539702 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -357,6 +346,21 @@ class SessionCatalog(

[GitHub] spark pull request #14959: [SPARK-17387][PYSPARK] Creating SparkContext() fr...

2016-09-19 Thread zjffdu
Github user zjffdu commented on a diff in the pull request: https://github.com/apache/spark/pull/14959#discussion_r79539542 --- Diff: python/pyspark/java_gateway.py --- @@ -50,13 +50,18 @@ def launch_gateway(): # proper classpath and settings from spark-env.sh

[GitHub] spark issue #14452: [SPARK-16849][SQL][WIP] Improve subquery execution by de...

2016-09-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14452 @davies Thanks for the comment. In our initial benchmark of the 13 TPC-DS queries that use CTEs, this PR helps about half (6) of them; 5 queries are not affected and 2 queries regress.

[GitHub] spark issue #14852: [WIP][SPARK-17138][ML][MLib] Add Python API for multinom...

2016-09-19 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/14852 Now that https://github.com/apache/spark/pull/14834 has been merged, we can make the updates to Python API. There is no new interface to implement, but it would be great if this PR could take care

[GitHub] spark issue #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-09-19 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/15148 A few high-level comments/questions: * Should this go into the `feature` package as a feature estimator/transformer? That is where other dimensionality reduction techniques have gone and

[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14803 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65627/ Test PASSed. ---

[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14803 Merged build finished. Test PASSed.

[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14803 **[Test build #65627 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65627/consoleFull)** for PR 14803 at commit

[GitHub] spark issue #13705: [SPARK-15472][SQL] Add support for writing in `csv` form...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13705 **[Test build #65635 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65635/consoleFull)** for PR 13705 at commit

[GitHub] spark pull request #15134: [SPARK-17580][CORE]Add random UUID as app name wh...

2016-09-19 Thread phalodi
Github user phalodi closed the pull request at: https://github.com/apache/spark/pull/15134

[GitHub] spark pull request #15133: [SPARK-17578][Docs] Add spark.app.name default va...

2016-09-19 Thread phalodi
Github user phalodi closed the pull request at: https://github.com/apache/spark/pull/15133

[GitHub] spark pull request #15126: [SPARK-17513][SQL] Make StreamExecution garbage-c...

2016-09-19 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15126

[GitHub] spark issue #15126: [SPARK-17513][SQL] Make StreamExecution garbage-collect ...

2016-09-19 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15126 Merging in master/2.0.

[GitHub] spark issue #15067: [SPARK-17513] [STREAMING] [SQL] Make StreamExecution gar...

2016-09-19 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15067 @frreiss can you close this now?

[GitHub] spark issue #15126: [SPARK-17513][SQL] Make StreamExecution garbage-collect ...

2016-09-19 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15126 Since @frreiss hasn't updated the pr yet, I'm going to merge this one and assign the jira ticket to Fred.

[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14634 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65629/ Test PASSed. ---

[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14634 Merged build finished. Test PASSed.

[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14634 **[Test build #65629 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65629/consoleFull)** for PR 14634 at commit

[GitHub] spark issue #15157: Revert "[SPARK-17549][SQL] Only collect table size stat ...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15157 Merged build finished. Test PASSed.

[GitHub] spark issue #15157: Revert "[SPARK-17549][SQL] Only collect table size stat ...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15157 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65624/ Test PASSed. ---

[GitHub] spark issue #15157: Revert "[SPARK-17549][SQL] Only collect table size stat ...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15157 **[Test build #65624 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65624/consoleFull)** for PR 15157 at commit

[GitHub] spark issue #15158: [SPARK-17603] [SQL] Utilize Hive-generated Statistics Fo...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15158 **[Test build #65634 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65634/consoleFull)** for PR 15158 at commit

[GitHub] spark issue #15147: [SPARK-17545] [SQL] Handle additional time offset format...

2016-09-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15147 @nbeyer Thanks for your investigation. I think that sounds reasonable though I think it might be arguable because adding more cases virtually means more time and computation to parse/infer

[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14803 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65625/ Test PASSed. ---

[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14803 Merged build finished. Test PASSed.

[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14803 **[Test build #65625 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65625/consoleFull)** for PR 14803 at commit

[GitHub] spark pull request #15158: [SPARK-17603] [SQL] Utilize Hive-generated Statis...

2016-09-19 Thread gatorsmile
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/15158 [SPARK-17603] [SQL] Utilize Hive-generated Statistics For Partitioned Tables ### What changes were proposed in this pull request? For non-partitioned tables, Hive-generated statistics are

[GitHub] spark issue #15090: [SPARK-17073] [SQL] generate column-level statistics

2016-09-19 Thread wzhfy
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/15090 This PR has been updated based on all the above comments; the changes are as follows: 1. Modify the analyze syntax a little bit: `identifierSeq` is now non-optional, i.e. users must specify column names

[GitHub] spark issue #15090: [SPARK-17073] [SQL] generate column-level statistics

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15090 **[Test build #65633 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65633/consoleFull)** for PR 15090 at commit

[GitHub] spark pull request #14834: [SPARK-17163][ML] Unified LogisticRegression inte...

2016-09-19 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14834

[GitHub] spark issue #14834: [SPARK-17163][ML] Unified LogisticRegression interface

2016-09-19 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/14834 Merged into master. Thanks.

[GitHub] spark issue #15147: [SPARK-17545] [SQL] Handle additional time offset format...

2016-09-19 Thread nbeyer
Github user nbeyer commented on the issue: https://github.com/apache/spark/pull/15147 @HyukjinKwon Based on my further reading of the code, I'd like to suggest that we add a deprecation to the stringToTime method and then update the stringToTimestamp method, specifically here

[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14803 Merged build finished. Test PASSed.

[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14803 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65623/ Test PASSed. ---

[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14803 **[Test build #65623 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65623/consoleFull)** for PR 14803 at commit

[GitHub] spark issue #15150: [SPARK-17595] [MLLib] Use a bounded priority queue to fi...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15150 Merged build finished. Test FAILed.

[GitHub] spark issue #15150: [SPARK-17595] [MLLib] Use a bounded priority queue to fi...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15150 **[Test build #65632 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65632/consoleFull)** for PR 15150 at commit

[GitHub] spark issue #15150: [SPARK-17595] [MLLib] Use a bounded priority queue to fi...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15150 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65632/ Test FAILed. ---

[GitHub] spark issue #15082: [SPARK-17528][SQL] MutableProjection should not cache co...

2016-09-19 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15082 I re-targeted it to 2.1 only.

[GitHub] spark pull request #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-09-19 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15148#discussion_r79533403 --- Diff: mllib/src/main/scala/org/apache/spark/ml/lsh/LSH.scala --- @@ -0,0 +1,270 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark issue #15150: [SPARK-17595] [MLLib] Use a bounded priority queue to fi...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15150 **[Test build #65632 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65632/consoleFull)** for PR 15150 at commit

[GitHub] spark pull request #15046: [SPARK-17492] [SQL] Fix Reading Cataloged Data So...

2016-09-19 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15046#discussion_r79533281 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala --- @@ -293,6 +293,39 @@ class DataFrameReaderWriterSuite

[GitHub] spark pull request #15046: [SPARK-17492] [SQL] Fix Reading Cataloged Data So...

2016-09-19 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15046#discussion_r79533310 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -327,8 +327,13 @@ case class DataSource(

[GitHub] spark pull request #15046: [SPARK-17492] [SQL] Fix Reading Cataloged Data So...

2016-09-19 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15046#discussion_r79533043 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/TableScanSuite.scala --- @@ -345,34 +345,72 @@ class TableScanSuite extends DataSourceTest

[GitHub] spark pull request #15046: [SPARK-17492] [SQL] Fix Reading Cataloged Data So...

2016-09-19 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15046#discussion_r79532870 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/TableScanSuite.scala --- @@ -345,34 +345,72 @@ class TableScanSuite extends DataSourceTest

[GitHub] spark pull request #15046: [SPARK-17492] [SQL] Fix Reading Cataloged Data So...

2016-09-19 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15046#discussion_r79532807 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala --- @@ -65,6 +65,26 @@ class InsertSuite extends DataSourceTest with

[GitHub] spark issue #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-09-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15148 @Yunni Thanks for working on this.

[GitHub] spark issue #13513: [SPARK-15698][SQL][Streaming] Add the ability to remove ...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13513 **[Test build #65631 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65631/consoleFull)** for PR 13513 at commit

[GitHub] spark pull request #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-09-19 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15148#discussion_r79532298 --- Diff: mllib/src/main/scala/org/apache/spark/ml/lsh/LSH.scala --- @@ -0,0 +1,270 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark issue #15155: [SPARK-17477][SQL] SparkSQL cannot handle schema evoluti...

2016-09-19 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15155 Yea. I meant that if we want to read old/new Parquet files without a user-given schema and with schema merging enabled, then we'd face SPARK-15516 first. This is why I thought that JIRA blocks this

[GitHub] spark issue #14784: [SPARK-17210][SPARKR] sparkr.zip is not distributed to e...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14784 Merged build finished. Test PASSed.

[GitHub] spark issue #15146: [SPARK-17590][SQL] Analyze CTE definitions at once and a...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15146 **[Test build #65630 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65630/consoleFull)** for PR 15146 at commit

[GitHub] spark issue #14784: [SPARK-17210][SPARKR] sparkr.zip is not distributed to e...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14784 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65626/ Test PASSed. ---

[GitHub] spark issue #14784: [SPARK-17210][SPARKR] sparkr.zip is not distributed to e...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14784 **[Test build #65626 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65626/consoleFull)** for PR 14784 at commit

[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14634 **[Test build #65629 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65629/consoleFull)** for PR 14634 at commit

[GitHub] spark issue #15146: [SPARK-17590][SQL] Analyze CTE definitions at once and a...

2016-09-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15146 I guess if using the same analyzed plan increases the chance to reuse exchange, then it may improve the performance. Anyway, it is not the purpose of this change. Because the analyzed subquery plan

[GitHub] spark issue #13513: [SPARK-15698][SQL][Streaming] Add the ability to remove ...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13513 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65628/ Test FAILed. ---

[GitHub] spark issue #13513: [SPARK-15698][SQL][Streaming] Add the ability to remove ...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13513 **[Test build #65628 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65628/consoleFull)** for PR 13513 at commit

[GitHub] spark issue #13513: [SPARK-15698][SQL][Streaming] Add the ability to remove ...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13513 Merged build finished. Test FAILed.

[GitHub] spark issue #15155: [SPARK-17477][SQL] SparkSQL cannot handle schema evoluti...

2016-09-19 Thread wgtmac
Github user wgtmac commented on the issue: https://github.com/apache/spark/pull/15155 @HyukjinKwon Yup this PR is very similar to yours. For merging parquet schema, it won't work. Think about this: the table contains two parquet files, one has int, one has long. The DataFrame
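
The int-versus-long situation described in the comment above can be sketched as follows. This is a minimal illustration, not Spark's Parquet schema-merging code: the dictionary-based schemas, the `WIDENING_ORDER` list, and both merge functions are hypothetical names introduced here, and the sketch takes no position on what the truncated comment goes on to conclude.

```python
# Hypothetical widening order: a type may be promoted to any type to its right.
WIDENING_ORDER = ["int", "long"]


def merge_strict(left, right):
    """Merge two {column: type} schemas, rejecting any type conflict."""
    merged = dict(left)
    for name, dtype in right.items():
        if name in merged and merged[name] != dtype:
            raise ValueError(
                f"cannot merge column {name!r}: {merged[name]} vs {dtype}"
            )
        merged[name] = dtype
    return merged


def merge_widening(left, right):
    """Merge two schemas, promoting conflicting types to the wider one."""
    merged = dict(left)
    for name, dtype in right.items():
        if name not in merged or merged[name] == dtype:
            merged[name] = dtype
        elif merged[name] in WIDENING_ORDER and dtype in WIDENING_ORDER:
            # Pick whichever type appears later in the widening order.
            merged[name] = max(merged[name], dtype, key=WIDENING_ORDER.index)
        else:
            raise ValueError(
                f"cannot merge column {name!r}: {merged[name]} vs {dtype}"
            )
    return merged


old_file = {"id": "int"}   # schema of the first file (illustrative)
new_file = {"id": "long"}  # schema of the second file (illustrative)
widened = merge_widening(old_file, new_file)
```

With these definitions, `merge_strict(old_file, new_file)` raises a `ValueError`, while `merge_widening` resolves the conflict by promoting `id` to `long`.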

[GitHub] spark issue #15146: [SPARK-17590][SQL] Analyze CTE definitions at once and a...

2016-09-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15146 @hvanhovell This is an analyzer change that adds the CTE-in-CTE feature. I don't expect any performance improvement.

[GitHub] spark issue #13513: [SPARK-15698][SQL][Streaming] Add the ability to remove ...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13513 **[Test build #65628 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65628/consoleFull)** for PR 13513 at commit

[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14803 @marmbrus > * I think that for all but text you have to include the partition columns in the schema if inference is turned off (which it is by default). For text format, when

[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14803 **[Test build #65627 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65627/consoleFull)** for PR 14803 at commit

[GitHub] spark pull request #15156: [SPARK-17160] Properly escape field names in code...

2016-09-19 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15156

[GitHub] spark issue #15156: [SPARK-17160] Properly escape field names in code-genera...

2016-09-19 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/15156 I'm going to merge this to master and branch-2.0. Thanks!

[GitHub] spark pull request #13513: [SPARK-15698][SQL][Streaming] Add the ability to ...

2016-09-19 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/13513#discussion_r79530152 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSourceLog.scala --- @@ -0,0 +1,132 @@ +/* + * Licensed to

[GitHub] spark issue #14639: [SPARK-17054][SPARKR] SparkR can not run in yarn-cluster...

2016-09-19 Thread zjffdu
Github user zjffdu commented on the issue: https://github.com/apache/spark/pull/14639 Close it as it is resolved somewhere else.

[GitHub] spark pull request #14639: [SPARK-17054][SPARKR] SparkR can not run in yarn-...

2016-09-19 Thread zjffdu
Github user zjffdu closed the pull request at: https://github.com/apache/spark/pull/14639

[GitHub] spark issue #14784: [SPARK-17210][SPARKR] sparkr.zip is not distributed to e...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14784 **[Test build #65626 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65626/consoleFull)** for PR 14784 at commit

[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14803 **[Test build #65625 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65625/consoleFull)** for PR 14803 at commit

[GitHub] spark issue #15157: Revert "[SPARK-17549][SQL] Only collect table size stat ...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15157 **[Test build #65624 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65624/consoleFull)** for PR 15157 at commit

[GitHub] spark issue #14834: [SPARK-17163][ML] Unified LogisticRegression interface

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14834 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65622/ Test PASSed. ---

[GitHub] spark issue #14834: [SPARK-17163][ML] Unified LogisticRegression interface

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14834 Merged build finished. Test PASSed.

[GitHub] spark issue #14834: [SPARK-17163][ML] Unified LogisticRegression interface

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14834 **[Test build #65622 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65622/consoleFull)** for PR 14834 at commit

[GitHub] spark issue #15157: Revert "[SPARK-17549][SQL] Only collect table size stat ...

2016-09-19 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15157 cc @vanzin

[GitHub] spark pull request #15157: Revert "[SPARK-17549][SQL] Only collect table siz...

2016-09-19 Thread yhuai
GitHub user yhuai opened a pull request: https://github.com/apache/spark/pull/15157 Revert "[SPARK-17549][SQL] Only collect table size stat in driver for cached relation." This reverts commit 39e2bad6a866d27c3ca594d15e574a1da3ee84cc because of the problem mentioned at

[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14803 > * If the partition directories are not present when the stream starts then I believe this breaks. Yes. Schema inference only happens when starting the stream. > * I

[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/14634 This change looks good. Let's add a regression test.

[GitHub] spark pull request #15054: [SPARK-17502] [SQL] Fix Multiple Bugs in DDL Stat...

2016-09-19 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15054#discussion_r79528505 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -357,6 +346,21 @@ class SessionCatalog(

[GitHub] spark pull request #14803: [SPARK-17153][SQL] Should read partition data whe...

2016-09-19 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/14803#discussion_r79526950 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -608,6 +608,34 @@ class FileStreamSourceSuite extends

[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14803 **[Test build #65623 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65623/consoleFull)** for PR 14803 at commit

[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14634 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65620/ Test PASSed. ---

[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14634 Merged build finished. Test PASSed.

[GitHub] spark issue #14634: [SPARK-17051][SQL] we should use hadoopConf in InsertInt...

2016-09-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14634 **[Test build #65620 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65620/consoleFull)** for PR 14634 at commit

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79521372 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsColumnSuite.scala --- @@ -0,0 +1,228 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #15034: [SPARK-16240][ML] ML persistence backward compatibility ...

2016-09-19 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/15034 @jkbradley, it looks like this is legitimately failing MiMa (not sure why it passed on the first run...): ``` [error] * the type hierarchy of object

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-19 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79520564 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsColumnSuite.scala --- @@ -0,0 +1,228 @@ +/* + * Licensed to the Apache Software
