[GitHub] spark pull request #17633: [SPARK-20331][SQL] Enhanced Hive partition prunin...

2017-05-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17633#discussion_r118207319 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala --- @@ -589,18 +590,39 @@ private[client] class Shim_v0_13 extends

[GitHub] spark issue #18085: [SPARK-20631][FOLLOW-UP] Fix incorrect tests.

2017-05-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18085 **[Test build #77298 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77298/testReport)** for PR 18085 at commit

[GitHub] spark issue #18085: [SPARK-20631][FOLLOW-UP] Fix incorrect tests.

2017-05-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18085 **[Test build #77299 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77299/testReport)** for PR 18085 at commit

[GitHub] spark issue #18085: [SPARK-20631][FOLLOW-UP] Fix incorrect tests.

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18085 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77299/ Test PASSed. ---

[GitHub] spark issue #18085: [SPARK-20631][FOLLOW-UP] Fix incorrect tests.

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18085 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #18068: [SPARK-20764][ML][PySpark][FOLLOWUP]Fix visibility discr...

2017-05-24 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18068 LGTM, merged into master and branch-2.2. Thanks for all.

[GitHub] spark issue #17633: [SPARK-20331][SQL] Enhanced Hive partition pruning predi...

2017-05-24 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17633 It's really hard to review the tests... Can we just add some simple tests and refactor the test suites in a follow-up PR?

[GitHub] spark pull request #18009: [SPARK-18891][SQL] Support for specific Java List...

2017-05-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18009#discussion_r118209944 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetPrimitiveSuite.scala --- @@ -28,6 +28,8 @@ case class SeqClass(s: Seq[Int]) case

[GitHub] spark issue #17308: [SPARK-19968][SPARK-20737][SS] Use a cached instance of ...

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17308 Merged build finished. Test PASSed.

[GitHub] spark pull request #17891: [SPARK-20631][PYTHON][ML] LogisticRegression._che...

2017-05-24 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/17891#discussion_r118221814 --- Diff: python/pyspark/ml/tests.py --- @@ -807,6 +807,18 @@ def test_logistic_regression(self): except OSError: pass

[GitHub] spark issue #18085: [SPARK-20631][FOLLOW-UP] Fix incorrect tests.

2017-05-24 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18085 LGTM, merged into master and branch-2.2. Thanks.

[GitHub] spark issue #17308: [SPARK-19968][SPARK-20737][SS] Use a cached instance of ...

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17308 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77297/ Test PASSed. ---

[GitHub] spark issue #18085: [SPARK-20631][FOLLOW-UP] Fix incorrect tests.

2017-05-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18085 **[Test build #77299 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77299/testReport)** for PR 18085 at commit

[GitHub] spark issue #18060: [SPARK-20835][Core]It should exit directly when the --to...

2017-05-24 Thread eatoncys
Github user eatoncys commented on the issue: https://github.com/apache/spark/pull/18060 @SparkQA Retest this please, thanks!

[GitHub] spark pull request #18068: [SPARK-20764][ML][PySpark][FOLLOWUP]Fix visibilit...

2017-05-24 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18068#discussion_r118231105 --- Diff: python/pyspark/ml/tests.py --- @@ -1075,7 +1076,8 @@ def test_linear_regression_summary(self): pValues = s.pValues

[GitHub] spark pull request #18085: [SPARK-20631][FOLLOW-UP] Fix incorrect tests.

2017-05-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18085

[GitHub] spark issue #17308: [SPARK-19968][SPARK-20737][SS] Use a cached instance of ...

2017-05-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17308 **[Test build #77297 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77297/testReport)** for PR 17308 at commit

[GitHub] spark issue #18014: [SPARK-20783][SQL] Enhance ColumnVector to keep UnsafeAr...

2017-05-24 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18014 I took a look at `ColumnVector.getArray`; it seems it already has no cost? Writing needs some copying, though.

[GitHub] spark issue #18085: [SPARK-20631][FOLLOW-UP] Fix incorrect tests.

2017-05-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18085 **[Test build #77298 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77298/testReport)** for PR 18085 at commit

[GitHub] spark issue #18085: [SPARK-20631][FOLLOW-UP] Fix incorrect tests.

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18085 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77298/ Test FAILed. ---

[GitHub] spark issue #18085: [SPARK-20631][FOLLOW-UP] Fix incorrect tests.

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18085 Merged build finished. Test FAILed.

[GitHub] spark pull request #18087: [SPARK-20867][SQL] Move hints from Statistics int...

2017-05-24 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/18087 [SPARK-20867][SQL] Move hints from Statistics into HintInfo class ## What changes were proposed in this pull request? This is a follow-up to SPARK-20857 to move the broadcast hint from Statistics

[GitHub] spark pull request #18086: [SPARK-20854][SQL] Extend hint syntax to support ...

2017-05-24 Thread bogdanrdc
GitHub user bogdanrdc opened a pull request: https://github.com/apache/spark/pull/18086 [SPARK-20854][SQL] Extend hint syntax to support expressions ## What changes were proposed in this pull request? SQL hint syntax: * support expressions such as strings, numbers, etc.

[GitHub] spark pull request #18068: [SPARK-20764][ML][PySpark][FOLLOWUP]Fix visibilit...

2017-05-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18068

[GitHub] spark pull request #18009: [SPARK-18891][SQL] Support for specific Java List...

2017-05-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18009#discussion_r118210207 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetPrimitiveSuite.scala --- @@ -28,6 +28,8 @@ case class SeqClass(s: Seq[Int]) case

[GitHub] spark issue #18082: [SPARK-20665][SQL][FOLLOW-UP]Move test case to SQLQueryT...

2017-05-24 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18082 Hm, I'm not sure it is a good idea to run so many "unit test" style tests for expressions in the end-to-end suites. It takes a lot more time than just running unit tests.

[GitHub] spark pull request #18085: [SPARK-20631][FOLLOW-UP] Fix incorrect tests.

2017-05-24 Thread zero323
GitHub user zero323 opened a pull request: https://github.com/apache/spark/pull/18085 [SPARK-20631][FOLLOW-UP] Fix incorrect tests. ## What changes were proposed in this pull request? - Fix incorrect tests for `_check_thresholds`. - Move test to `ParamTests`.

[GitHub] spark issue #17891: [SPARK-20631][PYTHON][ML] LogisticRegression._checkThres...

2017-05-24 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/17891 @jkbradley It shouldn't. It is not a correct test #18085

[GitHub] spark issue #18073: [SPARK-20848][SQL] Shutdown the pool after reading parqu...

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18073 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77296/ Test PASSed. ---

[GitHub] spark issue #18073: [SPARK-20848][SQL] Shutdown the pool after reading parqu...

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18073 Merged build finished. Test PASSed.

[GitHub] spark issue #18073: [SPARK-20848][SQL] Shutdown the pool after reading parqu...

2017-05-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18073 **[Test build #77296 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77296/testReport)** for PR 18073 at commit

[GitHub] spark issue #18087: [SPARK-20867][SQL] Move hints from Statistics into HintI...

2017-05-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18087 **[Test build #77301 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77301/testReport)** for PR 18087 at commit

[GitHub] spark issue #18086: [SPARK-20854][SQL] Extend hint syntax to support express...

2017-05-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18086 **[Test build #77300 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77300/testReport)** for PR 18086 at commit

[GitHub] spark pull request #18075: [SPARK-18016][SQL][CATALYST] Code Generation: Con...

2017-05-24 Thread bdrillard
Github user bdrillard commented on a diff in the pull request: https://github.com/apache/spark/pull/18075#discussion_r118263506 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -792,7 +887,18 @@ class

[GitHub] spark issue #18086: [SPARK-20854][SQL] Extend hint syntax to support express...

2017-05-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18086 **[Test build #77300 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77300/testReport)** for PR 18086 at commit

[GitHub] spark issue #18081: [SPARK-20862][MLLIB][PYTHON] Avoid passing float to ndar...

2017-05-24 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18081 LGTM, merged into master/branch-2.2/branch-2.1. Thanks for all.

[GitHub] spark pull request #18081: [SPARK-20862][MLLIB][PYTHON] Avoid passing float ...

2017-05-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18081

[GitHub] spark issue #16989: [SPARK-19659] Fetch big blocks to disk when shuffle-read...

2017-05-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16989 **[Test build #77302 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77302/testReport)** for PR 16989 at commit

[GitHub] spark pull request #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (...

2017-05-24 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18058#discussion_r118283005 --- Diff: python/pyspark/ml/fpm.py --- @@ -49,6 +49,32 @@ def getMinSupport(self): return self.getOrDefault(self.minSupport)

[GitHub] spark issue #18090: [SPARK-20250][Core]Improper OOM error when a task been k...

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18090 Can one of the admins verify this patch?

[GitHub] spark issue #18086: [SPARK-20854][SQL] Extend hint syntax to support express...

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18086 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77300/ Test PASSed. ---

[GitHub] spark issue #12414: [SPARK-14657][SPARKR][ML] RFormula w/o intercept should ...

2017-05-24 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/12414 @felixcheung @actuaryzhang Would you mind having a look at this? Thanks.

[GitHub] spark issue #18073: [SPARK-20848][SQL] Shutdown the pool after reading parqu...

2017-05-24 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18073 @cloud-fan My dev environment is not convenient for running GUI-based tools like jconsole. I use a command-line tool, [jvmtop](https://github.com/patric-r/jvmtop). Screenshots (the column "#T"

[GitHub] spark issue #17113: [SPARK-13669][Core] Improve the blacklist mechanism to h...

2017-05-24 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/17113 @jerryshao sorry for my delay on this. We have a rough design of what we want to do for future changes, but I think those are going to take a while, and in the meantime I think this is a useful addition

[GitHub] spark pull request #18089: [SPARK-19281][FOLLOWUP][ML] Minor fix for PySpark...

2017-05-24 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/18089 [SPARK-19281][FOLLOWUP][ML] Minor fix for PySpark FPGrowth. ## What changes were proposed in this pull request? Minor fix for PySpark ```FPGrowth```. ## How was this patch tested?

[GitHub] spark pull request #17113: [SPARK-13669][Core] Improve the blacklist mechani...

2017-05-24 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/17113#discussion_r118261422 --- Diff: core/src/main/scala/org/apache/spark/scheduler/BlacklistTracker.scala --- @@ -145,6 +146,75 @@ private[scheduler] class BlacklistTracker (

[GitHub] spark pull request #17113: [SPARK-13669][Core] Improve the blacklist mechani...

2017-05-24 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/17113#discussion_r118260480 --- Diff: core/src/main/scala/org/apache/spark/scheduler/BlacklistTracker.scala --- @@ -145,6 +146,75 @@ private[scheduler] class BlacklistTracker (

[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula consistent with R when handli...

2017-05-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17967 Personally, I would prefer an HTML list or table. But I am fine with the current status if this is okay with all of you here (as I guess none of them is particularly better given all the

[GitHub] spark pull request #17113: [SPARK-13669][Core] Improve the blacklist mechani...

2017-05-24 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/17113#discussion_r118260789 --- Diff: core/src/main/scala/org/apache/spark/scheduler/BlacklistTracker.scala --- @@ -145,6 +146,75 @@ private[scheduler] class BlacklistTracker (

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-24 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r118272605 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -163,6 +170,11 @@ final class

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-24 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r118272761 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -175,33 +187,49 @@ final class

[GitHub] spark pull request #18075: [SPARK-18016][SQL][CATALYST] Code Generation: Con...

2017-05-24 Thread bdrillard
Github user bdrillard commented on a diff in the pull request: https://github.com/apache/spark/pull/18075#discussion_r118274669 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -629,7 +736,9 @@ class CodegenContext

[GitHub] spark pull request #18090: [SPARK-20250][Core]Improper OOM error when a task...

2017-05-24 Thread ConeyLiu
GitHub user ConeyLiu opened a pull request: https://github.com/apache/spark/pull/18090 [SPARK-20250][Core]Improper OOM error when a task been killed while spilling data ## What changes were proposed in this pull request? Currently, when a task is calling spill() but

[GitHub] spark issue #15009: [SPARK-17443][SPARK-11035] Stop Spark Application if lau...

2017-05-24 Thread kishorvpatil
Github user kishorvpatil commented on the issue: https://github.com/apache/spark/pull/15009 @vanzin I added a `waitFor` method to ChildSparkHandle, allowing it to wait for the launched process or thread. Now the test waits for the thread to finish.

[GitHub] spark issue #18086: [SPARK-20854][SQL] Extend hint syntax to support express...

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18086 Merged build finished. Test PASSed.

[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula consistent with R when handli...

2017-05-24 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/17967 @actuaryzhang Thanks for your clarification, it makes sense. This looks good to me. @HyukjinKwon @felixcheung What do you think of the documentation issue?

[GitHub] spark pull request #17770: [SPARK-20392][SQL] Set barrier to prevent re-ente...

2017-05-24 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/17770#discussion_r118261213 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -174,17 +174,19 @@ class Dataset[T] private[sql](

[GitHub] spark pull request #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (...

2017-05-24 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18058#discussion_r118264465 --- Diff: python/pyspark/ml/fpm.py --- @@ -49,6 +49,32 @@ def getMinSupport(self): return self.getOrDefault(self.minSupport)

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-24 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r118272653 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -175,33 +187,49 @@ final class

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-24 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r118272532 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -287,4 +287,10 @@ package object config {

[GitHub] spark pull request #18075: [SPARK-18016][SQL][CATALYST] Code Generation: Con...

2017-05-24 Thread bdrillard
Github user bdrillard commented on a diff in the pull request: https://github.com/apache/spark/pull/18075#discussion_r118274597 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -792,7 +887,18 @@ class

[GitHub] spark pull request #18089: [SPARK-19281][FOLLOWUP][ML] Minor fix for PySpark...

2017-05-24 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18089#discussion_r118280610 --- Diff: python/pyspark/ml/fpm.py --- @@ -23,17 +23,17 @@ __all__ = ["FPGrowth", "FPGrowthModel"] -class HasSupport(Params):

[GitHub] spark pull request #18088: [Minor] document edge case of updateFunc usage

2017-05-24 Thread wselwood
GitHub user wselwood opened a pull request: https://github.com/apache/spark/pull/18088 [Minor] document edge case of updateFunc usage ## What changes were proposed in this pull request? Include documentation of the fact that the updateFunc is sometimes called with no new

[GitHub] spark issue #18088: [Minor] document edge case of updateFunc usage

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18088 Can one of the admins verify this patch?

[GitHub] spark pull request #18075: [SPARK-18016][SQL][CATALYST] Code Generation: Con...

2017-05-24 Thread bdrillard
Github user bdrillard commented on a diff in the pull request: https://github.com/apache/spark/pull/18075#discussion_r118274094 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -233,10 +223,129 @@ class

[GitHub] spark pull request #18075: [SPARK-18016][SQL][CATALYST] Code Generation: Con...

2017-05-24 Thread bdrillard
Github user bdrillard commented on a diff in the pull request: https://github.com/apache/spark/pull/18075#discussion_r118274179 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -233,10 +223,129 @@ class

[GitHub] spark issue #18089: [SPARK-19281][FOLLOWUP][ML] Minor fix for PySpark FPGrow...

2017-05-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18089 **[Test build #77303 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77303/testReport)** for PR 18089 at commit

[GitHub] spark issue #18089: [SPARK-19281][FOLLOWUP][ML] Minor fix for PySpark FPGrow...

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18089 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77303/ Test PASSed. ---

[GitHub] spark issue #18089: [SPARK-19281][FOLLOWUP][ML] Minor fix for PySpark FPGrow...

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18089 Merged build finished. Test PASSed.

[GitHub] spark pull request #18075: [SPARK-18016][SQL][CATALYST] Code Generation: Con...

2017-05-24 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18075#discussion_r118300406 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -233,10 +223,124 @@ class CodegenContext

[GitHub] spark pull request #18075: [SPARK-18016][SQL][CATALYST] Code Generation: Con...

2017-05-24 Thread bdrillard
Github user bdrillard commented on a diff in the pull request: https://github.com/apache/spark/pull/18075#discussion_r118302075 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala --- @@ -299,6 +297,9 @@ case class SampleExec(

[GitHub] spark pull request #18075: [SPARK-18016][SQL][CATALYST] Code Generation: Con...

2017-05-24 Thread bdrillard
Github user bdrillard commented on a diff in the pull request: https://github.com/apache/spark/pull/18075#discussion_r118302508 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -233,10 +223,124 @@ class

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r118305148 --- Diff: docs/configuration.md --- @@ -520,6 +520,14 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r118305105 --- Diff: core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala --- @@ -401,4 +411,61 @@ class

[GitHub] spark pull request #18079: [SPARK-20841][SQL] Support column aliases for cat...

2017-05-24 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18079#discussion_r118305063 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala --- @@ -49,7 +49,7 @@ class

[GitHub] spark pull request #12414: [SPARK-14657][SPARKR][ML] RFormula w/o intercept ...

2017-05-24 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/12414#discussion_r118307768 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -163,12 +163,20 @@ class RFormula @Since("1.5.0") (@Since("1.5.0")

[GitHub] spark issue #16989: [SPARK-19659] Fetch big blocks to disk when shuffle-read...

2017-05-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16989 **[Test build #77302 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77302/testReport)** for PR 16989 at commit

[GitHub] spark issue #18079: [SPARK-20841][SQL] Support column aliases for catalog ta...

2017-05-24 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18079 Could you check the alias precedence of the other database? ```SQL select col1 as a, col2 as b from t1 as (c, d); ``` Which alias should be used as the output schema? `(a, b)`

[GitHub] spark pull request #18075: [SPARK-18016][SQL][CATALYST] Code Generation: Con...

2017-05-24 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18075#discussion_r118320606 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -233,10 +223,124 @@ class CodegenContext

[GitHub] spark issue #18087: [SPARK-20867][SQL] Move hints from Statistics into HintI...

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18087 Merged build finished. Test PASSed.

[GitHub] spark issue #18087: [SPARK-20867][SQL] Move hints from Statistics into HintI...

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18087 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77301/ Test PASSed. ---

[GitHub] spark issue #18091: [SPARK-20868][CORE] UnsafeShuffleWriter should verify th...

2017-05-24 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18091 cc @joshRosan

[GitHub] spark pull request #18073: [SPARK-20848][SQL] Shutdown the pool after readin...

2017-05-24 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18073

[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula consistent with R when handli...

2017-05-24 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/17967 I think a html table is better? https://github.com/apache/spark/pull/17967#discussion_r117917444 + @srowen for your opinion- to be honest I don't think I've actually seen a table in Spark

[GitHub] spark issue #18051: [SPARK-18825][SPARKR][DOCS][WIP] Eliminate duplicate lin...

2017-05-24 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/18051 Improvements to R API doc online can be useful. But I think that is somewhat independent of this change/PR. Let's focus on your points on `Cleaning duplicate links` and `Trying to clean see

[GitHub] spark issue #18051: [SPARK-18825][SPARKR][DOCS][WIP] Eliminate duplicate lin...

2017-05-24 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/18051 If we consider improvement of the online documentation to be a separate problem, then I fully agree with @actuaryzhang.

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-05-24 Thread kevinyu98
Github user kevinyu98 commented on the issue: https://github.com/apache/spark/pull/12646 test it please. Thanks.

[GitHub] spark pull request #12414: [SPARK-14657][SPARKR][ML] RFormula w/o intercept ...

2017-05-24 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/12414#discussion_r118306841 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RFormula.scala --- @@ -163,12 +163,20 @@ class RFormula @Since("1.5.0") (@Since("1.5.0")

[GitHub] spark issue #16989: [SPARK-19659] Fetch big blocks to disk when shuffle-read...

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16989 Merged build finished. Test FAILed.

[GitHub] spark issue #16989: [SPARK-19659] Fetch big blocks to disk when shuffle-read...

2017-05-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16989 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77302/ Test FAILed.

[GitHub] spark pull request #18067: [SPARK-20849][DOC][SPARKR] Document R DecisionTre...

2017-05-24 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/18067#discussion_r118315143 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -776,6 +778,19 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2)))

[GitHub] spark issue #17633: [SPARK-20331][SQL] Enhanced Hive partition pruning predi...

2017-05-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17633 **[Test build #77306 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77306/testReport)** for PR 17633 at commit

[GitHub] spark issue #18087: [SPARK-20867][SQL] Move hints from Statistics into HintI...

2017-05-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18087 **[Test build #77301 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77301/testReport)** for PR 18087 at commit

[GitHub] spark pull request #18091: [SPARK-20868][CORE] UnsafeShuffleWriter should ve...

2017-05-24 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/18091 [SPARK-20868][CORE] UnsafeShuffleWriter should verify the position after FileChannel.transferTo ## What changes were proposed in this pull request? Long time ago we fixed a

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r118304766 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -154,7 +161,7 @@ final class

[GitHub] spark pull request #17750: [SPARK-4899][MESOS] Support for checkpointing on ...

2017-05-24 Thread mgummelt
Github user mgummelt commented on a diff in the pull request: https://github.com/apache/spark/pull/17750#discussion_r118310789 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala --- @@ -158,7 +158,7

[GitHub] spark pull request #17113: [SPARK-13669][Core] Improve the blacklist mechani...

2017-05-24 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/17113#discussion_r118315661 --- Diff: core/src/main/scala/org/apache/spark/scheduler/BlacklistTracker.scala --- @@ -145,6 +146,75 @@ private[scheduler] class BlacklistTracker (

[GitHub] spark pull request #17633: [SPARK-20331][SQL] Enhanced Hive partition prunin...

2017-05-24 Thread mallman
Github user mallman commented on a diff in the pull request: https://github.com/apache/spark/pull/17633#discussion_r118318733 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala --- @@ -589,18 +590,34 @@ private[client] class Shim_v0_13 extends

[GitHub] spark issue #18089: [SPARK-19281][FOLLOWUP][ML] Minor fix for PySpark FPGrow...

2017-05-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18089 **[Test build #77303 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77303/testReport)** for PR 18089 at commit

[GitHub] spark issue #18089: [SPARK-19281][FOLLOWUP][ML] Minor fix for PySpark FPGrow...

2017-05-24 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/18089 Thanks @yanboliang
