[GitHub] spark pull request #15513: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-10-22 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15513#discussion_r84590609 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflection.scala --- @@ -43,11 +43,20 @@ import

[GitHub] spark pull request #15513: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-10-22 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15513#discussion_r84590562 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala --- @@ -125,7 +129,7 @@ case class DescribeFunctionCommand(

[GitHub] spark pull request #15595: [SPARK-18058][SQL] Comparing column types ignorin...

2016-10-22 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15595#discussion_r84590334 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala --- @@ -377,4 +377,14 @@ class AnalysisSuite extends

[GitHub] spark issue #15219: [SPARK-14098][SQL] Generate Java code to build CachedCol...

2016-10-22 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/15219 @davies @rxin, would it be possible to review this? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this

[GitHub] spark issue #15588: [SPARK-18039][Scheduler] fix bug maxRegisteredWaitingTim...

2016-10-22 Thread Astralidea
Github user Astralidea commented on the issue: https://github.com/apache/spark/pull/15588 @lw-lin Thanks for your reply. Running Spark in my private cluster is a little different (I start the driver and executors myself). I had tried maxRegisteredWaitingTime, but I had not tried

[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...

2016-10-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15575 LGTM, since the scope of this PR is just refactoring. Let me first post the existing code for `outputPartitioning` in `ExpandExec`: ```Scala // The GroupExpressions can output

[GitHub] spark pull request #15361: [SPARK-17765][SQL] Support for writing out user-d...

2016-10-22 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15361#discussion_r84590191 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala --- @@ -91,6 +91,16 @@ class OrcQuerySuite extends QueryTest with

[GitHub] spark pull request #15361: [SPARK-17765][SQL] Support for writing out user-d...

2016-10-22 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15361#discussion_r84590147 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala --- @@ -246,6 +246,9 @@ private[hive] trait HiveInspectors { *

[GitHub] spark issue #15441: [SPARK-4411] [Web UI] Add "kill" link for jobs in the UI

2016-10-22 Thread ajbozarth
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/15441 @srowen I addressed most of your comments except the one about the try-finally I commented on above

[GitHub] spark issue #15441: [SPARK-4411] [Web UI] Add "kill" link for jobs in the UI

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15441 **[Test build #67406 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67406/consoleFull)** for PR 15441 at commit

[GitHub] spark issue #15588: [SPARK-18039][Scheduler] fix bug maxRegisteredWaitingTim...

2016-10-22 Thread lw-lin
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/15588 Spark Streaming would do a very simple dummy job to ensure that all slaves have registered before scheduling the `Receiver`s; please see

[GitHub] spark pull request #15441: [SPARK-4411] [Web UI] Add "kill" link for jobs in...

2016-10-22 Thread ajbozarth
Github user ajbozarth commented on a diff in the pull request: https://github.com/apache/spark/pull/15441#discussion_r84589829 --- Diff: core/src/test/scala/org/apache/spark/ui/UISeleniumSuite.scala --- @@ -651,6 +671,15 @@ class UISeleniumSuite extends SparkFunSuite with

[GitHub] spark issue #15484: [SPARK-17868][SQL] Do not use bitmasks during parsing an...

2016-10-22 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/15484 @tejasapatil @rxin I've addressed most of your comments, thanks for reviewing this!

[GitHub] spark pull request #15484: [SPARK-17868][SQL] Do not use bitmasks during par...

2016-10-22 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/15484#discussion_r84589691 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -216,10 +216,16 @@ class Analyzer( *

[GitHub] spark issue #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSu...

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15582 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67403/ Test PASSed. ---

[GitHub] spark issue #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSu...

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15582 Merged build finished. Test PASSed.

[GitHub] spark issue #15484: [SPARK-17868][SQL] Do not use bitmasks during parsing an...

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15484 **[Test build #67405 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67405/consoleFull)** for PR 15484 at commit

[GitHub] spark issue #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSu...

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15582 **[Test build #67403 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67403/consoleFull)** for PR 15582 at commit

[GitHub] spark issue #13705: [SPARK-15472][SQL] Add support for writing in `csv` form...

2016-10-22 Thread lw-lin
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/13705 Closing this in favor of SPARK-17924.

[GitHub] spark pull request #13705: [SPARK-15472][SQL] Add support for writing in `cs...

2016-10-22 Thread lw-lin
Github user lw-lin closed the pull request at: https://github.com/apache/spark/pull/13705

[GitHub] spark issue #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSu...

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15582 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67404/ Test PASSed. ---

[GitHub] spark issue #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSu...

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15582 Merged build finished. Test PASSed.

[GitHub] spark issue #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSu...

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15582 **[Test build #67404 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67404/consoleFull)** for PR 15582 at commit

[GitHub] spark pull request #15573: [SPARK-18035] [SQL] Introduce performant and memo...

2016-10-22 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15573

[GitHub] spark issue #15573: [SPARK-18035] [SQL] Introduce performant and memory effi...

2016-10-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15573 Merging to master! Thanks!

[GitHub] spark issue #15573: [SPARK-18035] [SQL] Introduce performant and memory effi...

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15573 Merged build finished. Test PASSed.

[GitHub] spark issue #15573: [SPARK-18035] [SQL] Introduce performant and memory effi...

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15573 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67402/ Test PASSed. ---

[GitHub] spark issue #15573: [SPARK-18035] [SQL] Introduce performant and memory effi...

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15573 **[Test build #67402 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67402/consoleFull)** for PR 15573 at commit

[GitHub] spark pull request #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15148#discussion_r84589297 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/RandomProjectionSuite.scala --- @@ -0,0 +1,148 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15148#discussion_r84588285 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/MinHash.scala --- @@ -0,0 +1,118 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15148#discussion_r84588545 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/MinHashSuite.scala --- @@ -0,0 +1,84 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15148#discussion_r84589114 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/MinHashSuite.scala --- @@ -0,0 +1,84 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSu...

2016-10-22 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15582 It'd be great to move those as well!

[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...

2016-10-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15575 In practice, setting the `outputPartitioning` of a physical plan like `ExpandExec` to `child.outputPartitioning` doesn't cause any real problem, even though this physical plan doesn't keep the same row

[GitHub] spark pull request #15588: [SPARK-18039][Scheduler] fix bug maxRegisteredWai...

2016-10-22 Thread Astralidea
Github user Astralidea commented on a diff in the pull request: https://github.com/apache/spark/pull/15588#discussion_r84588811 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala --- @@ -440,7 +430,10 @@ class ReceiverTracker(ssc:

[GitHub] spark issue #15588: [SPARK-18039][Scheduler] fix bug maxRegisteredWaitingTim...

2016-10-22 Thread Astralidea
Github user Astralidea commented on the issue: https://github.com/apache/spark/pull/15588 @srowen But in my cluster I tested 10 times: 9 runs succeeded and 1 failed. Why is it not necessary? Balanced receiver scheduling affects performance. If a new executor is added to the driver with a delay.

[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...

2016-10-22 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/15354 It would be really nice to fail in analysis rather than execution. What if it only fails after hours of computation? As a user I'd be upset. I'm also concerned they will think it's a spark bug.

[GitHub] spark pull request #15361: [SPARK-17765][SQL] Support for writing out user-d...

2016-10-22 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/15361#discussion_r84588700 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala --- @@ -91,6 +91,16 @@ class OrcQuerySuite extends QueryTest with

[GitHub] spark pull request #15361: [SPARK-17765][SQL] Support for writing out user-d...

2016-10-22 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/15361#discussion_r84588678 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala --- @@ -246,6 +246,9 @@ private[hive] trait HiveInspectors { *

[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...

2016-10-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15575 @tejasapatil Yeah, that is correct. However, I am wondering if we can say this `ExpandExec` has the same distribution of rows as its child... because it doesn't even have the `col`...

[GitHub] spark issue #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15148 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67401/ Test PASSed. ---

[GitHub] spark issue #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15148 Merged build finished. Test PASSed.

[GitHub] spark issue #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15148 **[Test build #67401 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67401/consoleFull)** for PR 15148 at commit

[GitHub] spark issue #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSu...

2016-10-22 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/15582 @rxin I've moved the test cases added in this PR to a query file test. Do we need to move other test cases for `ROLLUP/CUBE/GROUPING-SETS` too? Currently in `SQLQuerySuite` we have the

[GitHub] spark issue #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSu...

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15582 **[Test build #67404 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67404/consoleFull)** for PR 15582 at commit

[GitHub] spark issue #15463: [SPARK-17894] [CORE] Ensure uniqueness of TaskSetManager...

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15463 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67399/ Test PASSed. ---

[GitHub] spark issue #15463: [SPARK-17894] [CORE] Ensure uniqueness of TaskSetManager...

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15463 Merged build finished. Test PASSed.

[GitHub] spark issue #15463: [SPARK-17894] [CORE] Ensure uniqueness of TaskSetManager...

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15463 **[Test build #67399 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67399/consoleFull)** for PR 15463 at commit

[GitHub] spark pull request #15595: [SPARK-18058][SQL] Comparing column types ignorin...

2016-10-22 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/15595#discussion_r84588354 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala --- @@ -377,4 +377,14 @@ class AnalysisSuite extends

[GitHub] spark issue #15582: [SPARK-18045][SQL][TESTS] Move `HiveDataFrameAnalyticsSu...

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15582 **[Test build #67403 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67403/consoleFull)** for PR 15582 at commit

[GitHub] spark pull request #15595: [SPARK-18058][SQL] Comparing column types ignorin...

2016-10-22 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/15595#discussion_r84588308 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala --- @@ -377,4 +377,14 @@ class AnalysisSuite extends

[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...

2016-10-22 Thread tejasapatil
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/15575 @viirya : As per my understanding, if the child operator emits `col`, after applying `ExpandExec`, the output is `col'`. The original child partitioning being over `col`, `ExpandExec` does not

[GitHub] spark issue #15573: [SPARK-18035] [SQL] Unwrapping java maps in HiveInspecto...

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15573 **[Test build #67402 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67402/consoleFull)** for PR 15573 at commit

[GitHub] spark issue #15573: [SPARK-18035] [SQL] Unwrapping java maps in HiveInspecto...

2016-10-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15573 LGTM pending test

[GitHub] spark pull request #15573: [SPARK-18035] [SQL] Unwrapping java maps in HiveI...

2016-10-22 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/15573#discussion_r84588059 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala --- @@ -433,18 +413,12 @@ object

[GitHub] spark issue #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15148 **[Test build #67401 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67401/consoleFull)** for PR 15148 at commit

[GitHub] spark pull request #15573: [SPARK-18035] [SQL] Unwrapping java maps in HiveI...

2016-10-22 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15573#discussion_r84587828 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala --- @@ -433,18 +413,12 @@ object CatalystTypeConverters

[GitHub] spark issue #15595: [SPARK-18058][SQL] Comparing column types ignoring Nulla...

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15595 Merged build finished. Test PASSed.

[GitHub] spark issue #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14547 **[Test build #67400 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67400/consoleFull)** for PR 14547 at commit

[GitHub] spark issue #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14547 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67400/ Test FAILed. ---

[GitHub] spark issue #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14547 Merged build finished. Test FAILed.

[GitHub] spark issue #15595: [SPARK-18058][SQL] Comparing column types ignoring Nulla...

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15595 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67397/ Test PASSed. ---

[GitHub] spark issue #15595: [SPARK-18058][SQL] Comparing column types ignoring Nulla...

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15595 **[Test build #67397 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67397/consoleFull)** for PR 15595 at commit

[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...

2016-10-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15575 @rxin yeah, I am curious why `ExpandExec` and `GenerateExec` have different `outputPartitioning`...

[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...

2016-10-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15575 @tejasapatil I see there is a 1:1 mapping between the output partitions of the child operator and the output partitions of `ExpandExec`. For example, we have an Expand applied to a data set like col: [1, 2,

[GitHub] spark issue #14547: [SPARK-16718][MLlib] gbm-style treeboost

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14547 **[Test build #67400 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67400/consoleFull)** for PR 14547 at commit

[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...

2016-10-22 Thread tejasapatil
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/15575 @viirya >> However, if its child has certain partition such as HashPartition, after ExpandExec it becomes a UnknownPartitioning The notion of `Partitioning` in Spark is the

[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...

2016-10-22 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15575 The current thing LGTM. cc @yhuai do you have any other feedback?

[GitHub] spark pull request #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15148#discussion_r84587197 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/MinHashSuite.scala --- @@ -0,0 +1,84 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15148#discussion_r84587191 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala --- @@ -0,0 +1,340 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15148#discussion_r84587195 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/MinHashSuite.scala --- @@ -0,0 +1,84 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15148 Merged build finished. Test PASSed.

[GitHub] spark issue #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15148 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67398/ Test PASSed. ---

[GitHub] spark issue #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15148 **[Test build #67398 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67398/consoleFull)** for PR 15148 at commit

[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...

2016-10-22 Thread tejasapatil
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/15575 @viirya >> In the table in the description, CoalesceExec output UnknownPartitioning Yes. Since partitions == 1 is a corner case, I did not put that in the table. If you look

[GitHub] spark pull request #15600: [SPARK-17698] [SQL] Join predicates should not co...

2016-10-22 Thread tejasapatil
Github user tejasapatil closed the pull request at: https://github.com/apache/spark/pull/15600

[GitHub] spark issue #15463: [SPARK-17894] [CORE] Ensure uniqueness of TaskSetManager...

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15463 **[Test build #67399 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67399/consoleFull)** for PR 15463 at commit

[GitHub] spark issue #15600: [SPARK-17698] [SQL] Join predicates should not contain f...

2016-10-22 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15600 Thanks - merging in. Can you close this?

[GitHub] spark issue #15463: [SPARK-17894] [CORE] Ensure uniqueness of TaskSetManager...

2016-10-22 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/15463 Jenkins, retest this please

[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...

2016-10-22 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15575 @viirya of course if you say coalesce(1) it is a single partition -- any operator that changes partition to 1 partition is single partition. For Expand isn't it just the same as Generate?

[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...

2016-10-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15575 @rxin In the table in the description, `CoalesceExec` outputs `UnknownPartitioning`; actually it can be `SinglePartition` if what you do is `coalesce(1)`. `ExpandExec` doesn't actually move

[GitHub] spark pull request #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread Yunni
Github user Yunni commented on a diff in the pull request: https://github.com/apache/spark/pull/15148#discussion_r84586831 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala --- @@ -0,0 +1,343 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread Yunni
Github user Yunni commented on a diff in the pull request: https://github.com/apache/spark/pull/15148#discussion_r84586829 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/LSH.scala --- @@ -0,0 +1,343 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark issue #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread Yunni
Github user Yunni commented on the issue: https://github.com/apache/spark/pull/15148 Thanks @jkbradley. I have removed BitSampling and SignRandomProjection for a follow-up PR.

[GitHub] spark issue #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15148 **[Test build #67398 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67398/consoleFull)** for PR 15148 at commit

[GitHub] spark issue #14529: [TRIVIAL][SQL] Match the name of OrcRelation companion o...

2016-10-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14529 Thanks, I am closing this!

[GitHub] spark pull request #14529: [TRIVIAL][SQL] Match the name of OrcRelation comp...

2016-10-22 Thread HyukjinKwon
Github user HyukjinKwon closed the pull request at: https://github.com/apache/spark/pull/14529

[GitHub] spark issue #15600: [SPARK-17698] [SQL] Join predicates should not contain f...

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15600 Merged build finished. Test PASSed.

[GitHub] spark issue #15600: [SPARK-17698] [SQL] Join predicates should not contain f...

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15600 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67396/

[GitHub] spark issue #15600: [SPARK-17698] [SQL] Join predicates should not contain f...

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15600 **[Test build #67396 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67396/consoleFull)** for PR 15600 at commit

[GitHub] spark issue #15595: [SPARK-18058][SQL] Comparing column types ignoring Nulla...

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15595 **[Test build #67397 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67397/consoleFull)** for PR 15595 at commit

[GitHub] spark issue #15541: [SPARK-17637][Scheduler]Packed scheduling for Spark task...

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15541 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67395/

[GitHub] spark issue #15541: [SPARK-17637][Scheduler]Packed scheduling for Spark task...

2016-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15541 Merged build finished. Test PASSed.

[GitHub] spark issue #15541: [SPARK-17637][Scheduler]Packed scheduling for Spark task...

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15541 **[Test build #67395 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67395/consoleFull)** for PR 15541 at commit

[GitHub] spark pull request #15484: [SPARK-17868][SQL] Do not use bitmasks during par...

2016-10-22 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/15484#discussion_r84585794 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -255,98 +265,125 @@ class Analyzer( expr

[GitHub] spark pull request #15484: [SPARK-17868][SQL] Do not use bitmasks during par...

2016-10-22 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/15484#discussion_r84585577 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -216,10 +216,16 @@ class Analyzer( *

[GitHub] spark pull request #15484: [SPARK-17868][SQL] Do not use bitmasks during par...

2016-10-22 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/15484#discussion_r84585881 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -255,98 +265,125 @@ class Analyzer( expr

[GitHub] spark issue #15600: [SPARK-17698] [SQL] Join predicates should not contain f...

2016-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15600 **[Test build #67396 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67396/consoleFull)** for PR 15600 at commit

[GitHub] spark issue #15272: [SPARK-17698] [SQL] Join predicates should not contain f...

2016-10-22 Thread tejasapatil
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/15272 @rxin : Here is the backport for 2.0 branch: https://github.com/apache/spark/pull/15600

[GitHub] spark pull request #15600: [SPARK-17698] [SQL] Join predicates should not co...

2016-10-22 Thread tejasapatil
GitHub user tejasapatil opened a pull request: https://github.com/apache/spark/pull/15600 [SPARK-17698] [SQL] Join predicates should not contain filter clauses ## What changes were proposed in this pull request? This is a backport of
