[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-24 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16944#discussion_r103067980 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -161,22 +164,51 @@ private[hive] class

[GitHub] spark issue #17061: [SPARK-13446] [SQL] Support reading data from Hive 2.0.0...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17061 **[Test build #73458 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73458/testReport)** for PR 17061 at commit

[GitHub] spark pull request #17061: [SPARK-13446] [SQL] Support reading data from Hiv...

2017-02-24 Thread gatorsmile
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/17061 [SPARK-13446] [SQL] Support reading data from Hive 2.0.0 metastore [WIP] ### What changes were proposed in this pull request? This PR is to make Spark work with Hive 2.x's metastores.

[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-02-24 Thread windpiger
Github user windpiger commented on a diff in the pull request: https://github.com/apache/spark/pull/17001#discussion_r103067748 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -408,7 +408,15 @@ private[spark] class

[GitHub] spark issue #16976: [SPARK-19610][SQL] Support parsing multiline CSV files

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16976 **[Test build #73457 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73457/testReport)** for PR 16976 at commit

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103067630 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -274,6 +274,8 @@ private[hive] class HiveClientImpl(

[GitHub] spark pull request #17037: [MINOR][DOCS] Fix few typos in structured streami...

2017-02-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17037 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

[GitHub] spark issue #17037: [MINOR][DOCS] Fix few typos in structured streaming doc

2017-02-24 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17037 Merged to master

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-24 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r103067531 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala --- @@ -243,12 +244,15 @@ class CSVSuite extends

[GitHub] spark issue #16965: [Spark-18450][ML] Scala API Change for LSH AND-amplifica...

2017-02-24 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/16965 @Yunni Thanks. The L I mentioned is the number of hash tables. This way, the memory usage would be O(L*N), and the approximate NN search cost in one partition is O(L*N'). Where N

[GitHub] spark issue #17059: [SPARK-19733][ML]Removed unnecessary castings and refact...

2017-02-24 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17059 You are effectively handling the cases that casting handles. Doesn't a Scala long get boxed another time now? I bet it works, just wondering why that's not handled like Int. I'm neutral on this and

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Merged build finished. Test PASSed.

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73453/ Test PASSed.

[GitHub] spark issue #16976: [SPARK-19610][SQL] Support parsing multiline CSV files

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16976 **[Test build #73456 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73456/testReport)** for PR 16976 at commit

[GitHub] spark issue #17059: [SPARK-19733][ML]Removed unnecessary castings and refact...

2017-02-24 Thread datumbox
Github user datumbox commented on the issue: https://github.com/apache/spark/pull/17059 Could you explain what you mean by "duplicating"? It is safe with Scala Long; I did lots of tests to ensure that it works well. If you require any changes, I'm happy to update the PR.

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73453 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73453/testReport)** for PR 15415 at commit

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73452/ Test PASSed.

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Merged build finished. Test PASSed.

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73452 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73452/testReport)** for PR 15415 at commit

[GitHub] spark issue #17059: [SPARK-19733][ML]Removed unnecessary castings and refact...

2017-02-24 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17059 I get it, though the drawback is mostly that you've duplicated in a way the machinery that would construe any number as something comparable to the min/max int value. (Does this not miss the case of

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103066371 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala --- @@ -17,43 +17,60 @@ package org.apache.spark.sql.internal

[GitHub] spark issue #17059: [SPARK-19733][ML]Removed unnecessary castings and refact...

2017-02-24 Thread datumbox
Github user datumbox commented on the issue: https://github.com/apache/spark/pull/17059 Concerning the failed scala style test, this is caused by the long in-line comment that I added to explain the if statement. If you decide to approve the PR, I can remove it. Cheers! :)

[GitHub] spark issue #17037: [MINOR][DOCS] Fix few typos in structured streaming doc

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17037 **[Test build #3583 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3583/testReport)** for PR 17037 at commit

[GitHub] spark issue #16976: [SPARK-19610][SQL] Support parsing multiline CSV files

2017-02-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16976 @cloud-fan, I am running a build with 2.10 per b4e6983. I think it is ready for another look.

[GitHub] spark issue #17059: [SPARK-19733][ML]Removed unnecessary castings and refact...

2017-02-24 Thread datumbox
Github user datumbox commented on the issue: https://github.com/apache/spark/pull/17059 @srowen I believe that this needs to be fixed for 2 reasons: 1. Casting the ids to double just to convert them back to integers is not an elegant solution and it is rather confusing. 2. The

[GitHub] spark issue #17059: [SPARK-19733][ML]Removed unnecessary castings and refact...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17059 **[Test build #3582 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3582/testReport)** for PR 17059 at commit

[GitHub] spark issue #16976: [SPARK-19610][SQL] Support parsing multiline CSV files

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16976 **[Test build #73455 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73455/testReport)** for PR 16976 at commit

[GitHub] spark issue #17059: [SPARK-19733][ML]Removed unnecessary castings and refact...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17059 **[Test build #3582 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3582/testReport)** for PR 17059 at commit

[GitHub] spark issue #17037: [MINOR][DOCS] Fix few typos in structured streaming doc

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17037 **[Test build #3583 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3583/testReport)** for PR 17037 at commit

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103061676 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala --- @@ -1196,4 +1198,28 @@ class SessionCatalogSuite

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103061990 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala --- @@ -17,43 +17,60 @@ package org.apache.spark.sql.internal

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103061563 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala --- @@ -1196,4 +1198,28 @@ class SessionCatalogSuite

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103065209 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SessionStateSuite.scala --- @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103064184 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala --- @@ -136,6 +139,26 @@ private[sql] class SharedState(val sparkContext:

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103064301 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala --- @@ -212,3 +247,31 @@ private[sql] class HiveSessionCatalog(

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103062959 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala --- @@ -65,22 +82,118 @@ private[sql] class SessionState(sparkSession:

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103063696 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionState.scala --- @@ -146,4 +107,153 @@ private[hive] class

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103063798 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/test/TestHive.scala --- @@ -144,11 +145,37 @@ private[hive] class TestHiveSparkSession(

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103064415 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionState.scala --- @@ -146,4 +107,153 @@ private[hive] class

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103062683 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala --- @@ -90,110 +203,37 @@ private[sql] class SessionState(sparkSession:

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103065045 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSessionCatalogSuite.scala --- @@ -0,0 +1,58 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103063348 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala --- @@ -136,6 +139,26 @@ private[sql] class SharedState(val sparkContext:

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103064385 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionState.scala --- @@ -17,89 +17,50 @@ package org.apache.spark.sql.hive

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103063700 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionState.scala --- @@ -146,4 +107,153 @@ private[hive] class

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103065116 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSessionStateSuite.scala --- @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103064464 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSessionStateSuite.scala --- @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103062140 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala --- @@ -90,110 +203,37 @@ private[sql] class SessionState(sparkSession:

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103063173 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala --- @@ -136,6 +139,26 @@ private[sql] class SharedState(val sparkContext:

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103063544 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala --- @@ -493,6 +493,28 @@ class CatalogSuite } } +

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103065263 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SessionStateSuite.scala --- @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103062876 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala --- @@ -65,22 +82,118 @@ private[sql] class SessionState(sparkSession:

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103063507 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SessionStateSuite.scala --- @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103062312 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala --- @@ -90,110 +203,37 @@ private[sql] class SessionState(sparkSession:

[GitHub] spark issue #17059: [SPARK-19733][ML]Removed unnecessary castings and refact...

2017-02-24 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17059 I get it, but is that cast actually slowing things down measurably? I ask because this change also adds some overhead in the checks it does. I think it's better to let the cast to double deal with the

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103064447 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionState.scala --- @@ -146,4 +107,153 @@ private[hive] class

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103062258 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala --- @@ -90,110 +208,29 @@ private[sql] class SessionState(sparkSession:

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103063053 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala --- @@ -65,22 +82,118 @@ private[sql] class SessionState(sparkSession:

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103061761 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -1178,4 +1181,36 @@ class SessionCatalog( }

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103062718 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala --- @@ -65,22 +82,118 @@ private[sql] class SessionState(sparkSession:

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103062697 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala --- @@ -90,110 +203,37 @@ private[sql] class SessionState(sparkSession:

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103064704 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SessionStateSuite.scala --- @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103064055 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala --- @@ -17,43 +17,60 @@ package org.apache.spark.sql.internal

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103062936 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala --- @@ -65,22 +82,118 @@ private[sql] class SessionState(sparkSession:

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103065284 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SessionStateSuite.scala --- @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #16976: [SPARK-19610][SQL] Support parsing multiline CSV files

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16976 **[Test build #73454 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73454/testReport)** for PR 16976 at commit

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread hhbyyh
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/15415 > Btw, I could imagine us wanting to change this later. If we're recommending items a user could add to their basket, then we might want to suggest the most frequent item rather than nothing.

[GitHub] spark issue #17018: [SPARK-19684] [DOCS] Remove developer info from docs.

2017-02-24 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/17018 Thanks - will keep an eye out to see if the time out errors are gone.

[GitHub] spark pull request #16774: [SPARK-19357][ML] Adding parallel model evaluatio...

2017-02-24 Thread devesh
Github user devesh commented on a diff in the pull request: https://github.com/apache/spark/pull/16774#discussion_r103021457 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala --- @@ -100,31 +105,50 @@ class CrossValidator @Since("1.2.0")

[GitHub] spark issue #17018: [SPARK-19684] [DOCS] Remove developer info from docs.

2017-02-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17018 I have a confirmation that it is increased to 90 mins now.

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73453 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73453/testReport)** for PR 15415 at commit

[GitHub] spark issue #16965: [Spark-18450][ML] Scala API Change for LSH AND-amplifica...

2017-02-24 Thread Yunni
Github user Yunni commented on the issue: https://github.com/apache/spark/pull/16965 @merlintang Not exactly. Each row will explode to L rows, where L is the number of hash tables. Like the following: ``` ++-++ |

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73452 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73452/testReport)** for PR 15415 at commit

[GitHub] spark issue #16907: [SPARK-19572][SPARKR] Allow to disable hive in sparkR sh...

2017-02-24 Thread zjffdu
Github user zjffdu commented on the issue: https://github.com/apache/spark/pull/16907 Yeah, it would be nice to be merged into 2.1 as well. Thanks

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14963 **[Test build #73451 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73451/testReport)** for PR 14963 at commit

[GitHub] spark issue #16626: [SPARK-19261][SQL] Alter add columns for Hive serde and ...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16626 **[Test build #73450 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73450/testReport)** for PR 16626 at commit

[GitHub] spark issue #17060: [SQL] Duplicate test exception in SQLQueryTestSuite due ...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17060 **[Test build #73449 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73449/testReport)** for PR 17060 at commit

[GitHub] spark pull request #17060: [SQL] Duplicate test exception in SQLQueryTestSui...

2017-02-24 Thread dilipbiswal
GitHub user dilipbiswal opened a pull request: https://github.com/apache/spark/pull/17060 [SQL] Duplicate test exception in SQLQueryTestSuite due to meta files(.DS_Store) on Mac ## What changes were proposed in this pull request? After adding the tests for subquery, we now have

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread kunalkhamar
Github user kunalkhamar commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103060505 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -1178,4 +1178,36 @@ class SessionCatalog(

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread kunalkhamar
Github user kunalkhamar commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103061069 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala --- @@ -77,10 +71,44 @@ private[sql] class HiveSessionCatalog(

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread kunalkhamar
Github user kunalkhamar commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103060937 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/TestSQLContext.scala --- @@ -34,9 +40,32 @@ private[sql] class TestSparkSession(sc:

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103060863 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -1178,4 +1178,36 @@ class SessionCatalog(

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread kunalkhamar
Github user kunalkhamar commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103060748 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSessionStateSuite.scala --- @@ -0,0 +1,54 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #16826: [SPARK-19540][SQL] Add ability to clone SparkSession whe...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16826 **[Test build #73448 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73448/testReport)** for PR 16826 at commit

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread kunalkhamar
Github user kunalkhamar commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103060717 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveSparkSubmitSuite.scala --- @@ -217,7 +217,8 @@ class HiveSparkSubmitSuite

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-24 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r103060555 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala --- @@ -0,0 +1,238 @@ +/* + * Licensed to

[GitHub] spark issue #16826: [SPARK-19540][SQL] Add ability to clone SparkSession whe...

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16826 **[Test build #73447 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73447/testReport)** for PR 16826 at commit

[GitHub] spark issue #16965: [Spark-18450][ML] Scala API Change for LSH AND-amplifica...

2017-02-24 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/16965 @Yunni Ok, if we want to move this quicker, we can keep the current AND-OR implementation. (2)(3) you mention that you explode the inner table (dataset). Does it mean for each tuple of

[GitHub] spark issue #16826: [SPARK-19540][SQL] Add ability to clone SparkSession whe...

2017-02-24 Thread kunalkhamar
Github user kunalkhamar commented on the issue: https://github.com/apache/spark/pull/16826 jenkins retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-24 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r103060062 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala --- @@ -961,56 +978,135 @@ class CSVSuite extends

[GitHub] spark issue #16826: [SPARK-19540][SQL] Add ability to clone SparkSession whe...

2017-02-24 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16826 add to whitelist

[GitHub] spark pull request #16826: [SPARK-19540][SQL] Add ability to clone SparkSess...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16826#discussion_r103059989 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -1178,4 +1178,36 @@ class SessionCatalog( }

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread hhbyyh
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/15415 Thanks @jkbradley for contributing the code. That helps a lot.

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-24 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r103059746 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala --- @@ -958,4 +975,77 @@ class CSVSuite extends

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-24 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r103059540 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -269,3 +273,89 @@ private[csv] class

[GitHub] spark pull request #17012: [SPARK-19677][SS] Renaming a file atop an existin...

2017-02-24 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/17012#discussion_r103059294 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala --- @@ -274,6 +274,11 @@

[GitHub] spark issue #16954: [SPARK-18874][SQL] First phase: Deferring the correlated...

2017-02-24 Thread dilipbiswal
Github user dilipbiswal commented on the issue: https://github.com/apache/spark/pull/16954 @hvanhovell Hi Herman, I was wondering if you had some time to look into this PR? Please let me know your thoughts.

[GitHub] spark pull request #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15415#discussion_r103050909 --- Diff: mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala --- @@ -0,0 +1,347 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #17043: [SPARK-19719][SS][WIP] Kafka writer for both stru...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/17043#discussion_r103057183 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSinkSuite.scala --- @@ -0,0 +1,413 @@ +/* + * Licensed to the

[GitHub] spark pull request #16976: [SPARK-19610][SQL] Support parsing multiline CSV ...

2017-02-24 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16976#discussion_r103057146 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/CodecStreams.scala --- @@ -86,4 +88,11 @@ object CodecStreams {

[GitHub] spark pull request #17043: [SPARK-19719][SS][WIP] Kafka writer for both stru...

2017-02-24 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/17043#discussion_r103056768 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaWriteTask.scala --- @@ -0,0 +1,119 @@ +/* + * Licensed to the
