[GitHub] spark issue #18107: [SPARK-20883][SPARK-20376][SS] Refactored StateStore API...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18107 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #18084: [SPARK-19900][core]Remove driver when relaunching.

2017-05-25 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/18084 Maybe some more actions should be done in `relaunchDriver()`, such as having `driver.worker` remove the dependency of the relaunched driver, but it would be somewhat wasteful to remove and

[GitHub] spark pull request #18114: [SPARK-20889][SparkR] Grouped documentation for d...

2017-05-25 Thread actuaryzhang
GitHub user actuaryzhang opened a pull request: https://github.com/apache/spark/pull/18114 [SPARK-20889][SparkR] Grouped documentation for datetime column methods ## What changes were proposed in this pull request? Grouped documentation for datetime column methods. You

[GitHub] spark pull request #18078: [SPARK-10643] Make spark-submit download remote f...

2017-05-25 Thread loneknightpy
Github user loneknightpy commented on a diff in the pull request: https://github.com/apache/spark/pull/18078#discussion_r118583624 --- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala --- @@ -535,7 +538,7 @@ class SparkSubmitSuite

[GitHub] spark pull request #18098: [SPARK-16944][Mesos] Improve data locality when l...

2017-05-25 Thread mgummelt
Github user mgummelt commented on a diff in the pull request: https://github.com/apache/spark/pull/18098#discussion_r118586090 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala --- @@ -502,6 +521,25

[GitHub] spark pull request #18078: [SPARK-10643] [Core] Make spark-submit download r...

2017-05-25 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/18078#discussion_r118593252 --- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala --- @@ -535,7 +538,7 @@ class SparkSubmitSuite test("resolves

[GitHub] spark issue #17864: [SPARK-20604][ML] Allow imputer to handle numeric types

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/17864 @MLnick Thanks much for your comments. Yes, I think always returning Double is consistent with Python and R and also other transformers in ML. Plus, as @hhbyyh mentioned, this makes the

[GitHub] spark pull request #17864: [SPARK-20604][ML] Allow imputer to handle numeric...

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/17864#discussion_r118600408 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Imputer.scala --- @@ -94,12 +94,13 @@ private[feature] trait ImputerParams extends Params

[GitHub] spark issue #18114: [SPARK-20889][SparkR] Grouped documentation for datetime...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18114 **[Test build #77391 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77391/testReport)** for PR 18114 at commit

[GitHub] spark issue #18114: [SPARK-20889][SparkR] Grouped documentation for datetime...

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18114 @felixcheung Created this PR to update the doc for the date time methods, similar to #18114. About 27 date time methods are documented on one page. I'm attaching the snapshot of

[GitHub] spark pull request #18114: [SPARK-20889][SparkR] Grouped documentation for d...

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/18114#discussion_r118605422 --- Diff: R/pkg/R/functions.R --- @@ -2476,24 +2430,27 @@ setMethod("from_json", signature(x = "Column", schema = "structType"),

[GitHub] spark issue #11974: [SPARK-14174][ML] Accelerate KMeans via Mini-Batch EM

2017-05-25 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/11974 Mini-batching in Spark generally isn't that efficient, since to extract a mini-batch you still need to iterate over the entire dataset - and that means reading it from disk if it doesn't fit into

[GitHub] spark issue #17343: [SPARK-20014] Optimize mergeSpillsWithFileStream method

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17343 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77382/ Test PASSed.

[GitHub] spark pull request #18113: [SPARK-20890][SQL] Added min and max typed aggreg...

2017-05-25 Thread setjet
GitHub user setjet opened a pull request: https://github.com/apache/spark/pull/18113 [SPARK-20890][SQL] Added min and max typed aggregation functions ## What changes were proposed in this pull request? Typed Min and Max functions are missing for aggregations done on dataset.

[GitHub] spark issue #18113: [SPARK-20890][SQL] Added min and max typed aggregation f...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18113 Can one of the admins verify this patch?

[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

2017-05-25 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/18078 LGTM

[GitHub] spark issue #18078: [SPARK-10643] [Core] Make spark-submit download remote f...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18078 **[Test build #77389 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77389/testReport)** for PR 18078 at commit

[GitHub] spark pull request #18078: [SPARK-10643] Make spark-submit download remote f...

2017-05-25 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/18078#discussion_r118580623 --- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala --- @@ -535,7 +538,7 @@ class SparkSubmitSuite test("resolves

[GitHub] spark pull request #18078: [SPARK-10643] Make spark-submit download remote f...

2017-05-25 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/18078#discussion_r118580006 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -308,6 +311,15 @@ object SparkSubmit extends CommandLineUtils {

[GitHub] spark issue #18098: [SPARK-16944][Mesos] Improve data locality when launchin...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18098 **[Test build #77387 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77387/testReport)** for PR 18098 at commit

[GitHub] spark issue #18078: [SPARK-10643] Make spark-submit download remote files to...

2017-05-25 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/18078 Could you also add "[Core]" tag in the title? @loneknightpy Also cc @cloud-fan @gatorsmile

[GitHub] spark issue #17343: [SPARK-20014] Optimize mergeSpillsWithFileStream method

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17343 Merged build finished. Test PASSed.

[GitHub] spark issue #17343: [SPARK-20014] Optimize mergeSpillsWithFileStream method

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17343 **[Test build #77382 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77382/testReport)** for PR 17343 at commit

[GitHub] spark pull request #18107: [SPARK-20883][SPARK-20376][SS] Refactored StateSt...

2017-05-25 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/18107#discussion_r118595928 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FlatMapGroupsWithStateSuite.scala --- @@ -508,22 +508,6 @@ class FlatMapGroupsWithStateSuite

[GitHub] spark issue #18094: [Spark-20775][SQL] Added scala support from_json

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18094 **[Test build #77379 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77379/testReport)** for PR 18094 at commit

[GitHub] spark pull request #18101: [SPARK-20874][Examples]Add Structured Streaming K...

2017-05-25 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18101

[GitHub] spark issue #18051: [SPARK-18825][SPARKR][DOCS][WIP] Eliminate duplicate lin...

2017-05-25 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18051 That makes sense!

[GitHub] spark issue #18105: [SPARK-20881] [SQL] Use Hive's stats in metastore when c...

2017-05-25 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18105 If users have not analyzed the table in Spark yet, we should respect the stats from the Hive metastore. But if users have already run the analyze table command in Spark, I think it's fair to ask them

[GitHub] spark pull request #18112: [SPARK-20888][SQL][DOCS] Document change of defau...

2017-05-25 Thread mallman
GitHub user mallman opened a pull request: https://github.com/apache/spark/pull/18112 [SPARK-20888][SQL][DOCS] Document change of default setting of spark.sql.hive.caseSensitiveInferenceMode (Link to Jira: https://issues.apache.org/jira/browse/SPARK-20888) ## What changes

[GitHub] spark issue #18094: [Spark-20775][SQL] Added scala support from_json

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18094 **[Test build #77379 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77379/testReport)** for PR 18094 at commit

[GitHub] spark issue #18094: [Spark-20775][SQL] Added scala support from_json

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18094 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77379/ Test FAILed.

[GitHub] spark pull request #11746: [SPARK-13602][CORE] Add shutdown hook to DriverRu...

2017-05-25 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/11746#discussion_r118544204 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala --- @@ -53,9 +53,11 @@ private[deploy] class DriverRunner( @volatile

[GitHub] spark issue #18094: [Spark-20775][SQL] Added scala support from_json

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18094 Merged build finished. Test FAILed.

[GitHub] spark issue #18112: [SPARK-20888][SQL][DOCS] Document change of default sett...

2017-05-25 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/18112 @budde Can you please review (urgently) for inclusion as a migration note for 2.2?

[GitHub] spark issue #18112: [SPARK-20888][SQL][DOCS] Document change of default sett...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18112 **[Test build #77380 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77380/testReport)** for PR 18112 at commit

[GitHub] spark issue #18094: [Spark-20775][SQL] Added scala support from_json

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18094 **[Test build #77381 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77381/testReport)** for PR 18094 at commit

[GitHub] spark issue #18092: [SPARK-20640][CORE]Make rpc timeout and retry for shuffl...

2017-05-25 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18092 >> I can not think of meaningful test cases, are there any suggestions? How about just "unit tests"?

[GitHub] spark pull request #11974: [SPARK-14174][ML] Accelerate KMeans via Mini-Batc...

2017-05-25 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/11974#discussion_r118546602 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/KMeansSuite.scala --- @@ -89,6 +92,9 @@ class KMeansSuite extends SparkFunSuite with

[GitHub] spark issue #18110: [SPARK-20887][CORE] support alternative keys in ConfigBu...

2017-05-25 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/18110 @cloud-fan, what about `SparkConf`'s `configsWithAlternatives`: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkConf.scala#L596

[GitHub] spark issue #17471: [SPARK-3577] Report Spill size on disk for UnsafeExterna...

2017-05-25 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/17471 @sameeragarwal - Thanks for taking a look. I will update the PR with a test case soon.

[GitHub] spark issue #18110: [SPARK-20887][CORE] support alternative keys in ConfigBu...

2017-05-25 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18110 It's only used in the `SparkConf.get(key: String)` code path, not the `SparkConf.get(entry: ConfigEntry[T])` code path. That's why we only support alternative keys if users get conf value by

[GitHub] spark pull request #18092: [SPARK-20640][CORE]Make rpc timeout and retry for...

2017-05-25 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/18092#discussion_r118547246 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -170,11 +170,17 @@ private[spark] class BlockManager( // service,

[GitHub] spark issue #18112: [SPARK-20888][SQL][DOCS] Document change of default sett...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18112 **[Test build #77380 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77380/testReport)** for PR 18112 at commit

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-05-25 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/16578 Also, I'm confused about something: who has Jenkins retest privileges? And can I get them?

[GitHub] spark issue #18112: [SPARK-20888][SQL][DOCS] Document change of default sett...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18112 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77380/ Test PASSed.

[GitHub] spark issue #18112: [SPARK-20888][SQL][DOCS] Document change of default sett...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18112 Merged build finished. Test PASSed.

[GitHub] spark pull request #11974: [SPARK-14174][ML] Accelerate KMeans via Mini-Batc...

2017-05-25 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/11974#discussion_r118548583 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -85,6 +85,20 @@ private[clustering] trait KMeansParams extends Params

[GitHub] spark issue #18110: [SPARK-20887][CORE] support alternative keys in ConfigBu...

2017-05-25 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/18110 Ahhh, makes sense. Thanks for the clarification.

[GitHub] spark pull request #11974: [SPARK-14174][ML] Accelerate KMeans via Mini-Batc...

2017-05-25 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/11974#discussion_r118548684 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -85,6 +85,20 @@ private[clustering] trait KMeansParams extends Params

[GitHub] spark pull request #11974: [SPARK-14174][ML] Accelerate KMeans via Mini-Batc...

2017-05-25 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/11974#discussion_r118548792 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -85,6 +85,20 @@ private[clustering] trait KMeansParams extends Params

[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18064 **[Test build #77374 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77374/testReport)** for PR 18064 at commit

[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18064 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77374/ Test FAILed.

[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18064 Merged build finished. Test FAILed.

[GitHub] spark issue #11974: [SPARK-14174][ML] Accelerate KMeans via Mini-Batch EM

2017-05-25 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/11974 cc @srowen @setha also

[GitHub] spark issue #18051: [SPARK-18825][SPARKR][DOCS][WIP] Eliminate duplicate lin...

2017-05-25 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/18051 Exactly my point. Run examples internally ([it is not hard to patch knitr](https://github.com/zero323/knitr/commit/7a0d8f9ddb9d77a9c235f25aca26131e83c1f6cc) or even `tools::Rd2ex`) to validate

[GitHub] spark pull request #18051: [SPARK-18825][SPARKR][DOCS][WIP] Eliminate duplic...

2017-05-25 Thread zero323
Github user zero323 closed the pull request at: https://github.com/apache/spark/pull/18051

[GitHub] spark issue #18112: [SPARK-20888][SQL][DOCS] Document change of default sett...

2017-05-25 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/18112 CC @cloud-fan @ericl

[GitHub] spark issue #18083: [SPARK-20863] Add metrics/instrumentation to LiveListene...

2017-05-25 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/18083 > I am not sure that monitoring (with real metrics) the number of dropped events really worth it. You just want to know if messages have been dropped (and having the number in the log is fine).

[GitHub] spark pull request #18098: [SPARK-16944][Mesos] Improve data locality when l...

2017-05-25 Thread mgummelt
Github user mgummelt commented on a diff in the pull request: https://github.com/apache/spark/pull/18098#discussion_r118550928 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala --- @@ -393,7 +409,30

[GitHub] spark pull request #17094: [SPARK-19762][ML] Hierarchy for consolidating ML ...

2017-05-25 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/17094#discussion_r118475804 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/aggregator/LeastSquaresAggregator.scala --- @@ -0,0 +1,224 @@ +/* + * Licensed to the

[GitHub] spark issue #18106: [SPARK-20754][SQL] Support TRUNC (number)

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18106 Merged build finished. Test FAILed.

[GitHub] spark issue #18106: [SPARK-20754][SQL] Support TRUNC (number)

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18106 **[Test build #77361 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77361/testReport)** for PR 18106 at commit

[GitHub] spark issue #18106: [SPARK-20754][SQL] Support TRUNC (number)

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18106 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77361/ Test FAILed.

[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-25 Thread facaiy
Github user facaiy commented on the issue: https://github.com/apache/spark/pull/18058 Resolved. By the way, which one is preferable, rebase or merge?

[GitHub] spark pull request #17972: [SPARK-20723][ML]Add intermediate storage level t...

2017-05-25 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17972#discussion_r118479964 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala --- @@ -398,6 +398,14 @@ class

[GitHub] spark pull request #17972: [SPARK-20723][ML]Add intermediate storage level t...

2017-05-25 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17972#discussion_r118479347 --- Diff: mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala --- @@ -406,4 +409,21 @@ private[ml] trait HasAggregationDepth extends

[GitHub] spark pull request #17770: [SPARK-20392][SQL] Set barrier to prevent re-ente...

2017-05-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17770#discussion_r118479956 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -573,7 +576,7 @@ class Dataset[T] private[sql]( Dataset.ofRows(

[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18064 Merged build finished. Test FAILed.

[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18064 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77366/ Test FAILed.

[GitHub] spark pull request #18109: Merge pull request #1 from apache/master

2017-05-25 Thread WindCanDie
GitHub user WindCanDie opened a pull request: https://github.com/apache/spark/pull/18109 Merge pull request #1 from apache/master 2017/5/23 ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this

[GitHub] spark issue #18019: [SPARK-20748][SQL] Add built-in SQL function CH[A]R.

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18019 **[Test build #77371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77371/testReport)** for PR 18019 at commit

[GitHub] spark issue #18075: [SPARK-18016][SQL][CATALYST] Code Generation: Constant P...

2017-05-25 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18075 Thanks, sounds good to me for now. cc @ueshin

[GitHub] spark pull request #18110: [SPARK-20887][CORE] support alternative keys in C...

2017-05-25 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/18110 [SPARK-20887][CORE] support alternative keys in ConfigBuilder ## What changes were proposed in this pull request? `ConfigBuilder` builds `ConfigEntry` which can only read value with one

[GitHub] spark issue #18110: [SPARK-20887][CORE] support alternative keys in ConfigBu...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18110 **[Test build #77372 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77372/testReport)** for PR 18110 at commit

[GitHub] spark issue #17113: [SPARK-13669][Core] Improve the blacklist mechanism to h...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17113 **[Test build #77367 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77367/testReport)** for PR 17113 at commit

[GitHub] spark issue #17150: [SPARK-19810][BUILD][CORE] Remove support for Scala 2.10

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17150 **[Test build #3756 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3756/testReport)** for PR 17150 at commit

[GitHub] spark issue #17113: [SPARK-13669][Core] Improve the blacklist mechanism to h...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17113 Merged build finished. Test PASSed.

[GitHub] spark issue #17113: [SPARK-13669][Core] Improve the blacklist mechanism to h...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17113 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77367/ Test PASSed.

[GitHub] spark issue #18110: [SPARK-20887][CORE] support alternative keys in ConfigBu...

2017-05-25 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18110 cc @JoshRosen @dhruve

[GitHub] spark pull request #18111: [SPARK-20886][CORE] HadoopMapReduceCommitProtocol...

2017-05-25 Thread steveloughran
GitHub user steveloughran opened a pull request: https://github.com/apache/spark/pull/18111 [SPARK-20886][CORE] HadoopMapReduceCommitProtocol to fail meaningfully if FileOutputCommitter.getWorkPath==null ## What changes were proposed in this pull request? Handles the

[GitHub] spark issue #18110: [SPARK-20887][CORE] support alternative keys in ConfigBu...

2017-05-25 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/18110 FWIW I didn't actually say that we should rename that key since the cost of the confusing name isn't that high right now. So while I don't oppose this mechanism I'm neutral on it given that

[GitHub] spark issue #18105: [SPARK-20881] [SQL] Use Hive's stats in metastore when c...

2017-05-25 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18105 I think we should always trust Spark's table stats over Hive's, whether CBO is on or not. If users update the stats on the Hive side, it's their own responsibility to update them on the Spark side.

[GitHub] spark issue #18107: [SPARK-20883][SPARK-20376][SS] Refactored StateStore API...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18107 Merged build finished. Test FAILed.

[GitHub] spark issue #18107: [SPARK-20883][SPARK-20376][SS] Refactored StateStore API...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18107 **[Test build #77363 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77363/testReport)** for PR 18107 at commit

[GitHub] spark issue #18107: [SPARK-20883][SPARK-20376][SS] Refactored StateStore API...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18107 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77363/ Test FAILed. ---

[GitHub] spark issue #17770: [SPARK-20392][SQL] Set barrier to prevent re-entering a ...

2017-05-25 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17770 LGTM except some minor comments

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-25 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r118482160 --- Diff: core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala --- @@ -401,4 +413,64 @@ class

[GitHub] spark issue #18064: [SPARK-20213][SQL] Fix DataFrameWriter operations in SQL...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18064 **[Test build #77366 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77366/testReport)** for PR 18064 at commit

[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-25 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18058 Merged into master and branch-2.2. Thanks for all.

[GitHub] spark issue #18091: [SPARK-20868][CORE] UnsafeShuffleWriter should verify th...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18091 **[Test build #77364 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77364/testReport)** for PR 18091 at commit

[GitHub] spark pull request #18075: [SPARK-18016][SQL][CATALYST] Code Generation: Con...

2017-05-25 Thread bdrillard
Github user bdrillard commented on a diff in the pull request: https://github.com/apache/spark/pull/18075#discussion_r118497931 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -233,10 +223,124 @@ class

[GitHub] spark issue #17310: [SPARK-18579][SQL] Use ignoreLeadingWhiteSpace and ignor...

2017-05-25 Thread prabcs
Github user prabcs commented on the issue: https://github.com/apache/spark/pull/17310 OK, great then !! We'll use 2.2 Thanks !

[GitHub] spark issue #17150: [SPARK-19810][BUILD][CORE] Remove support for Scala 2.10

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17150 **[Test build #3756 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3756/testReport)** for PR 17150 at commit

[GitHub] spark issue #18107: [SPARK-20883][SPARK-20376][SS] Refactored StateStore API...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18107 **[Test build #3755 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3755/testReport)** for PR 18107 at commit

[GitHub] spark issue #18015: [SAPRK-20785][WEB-UI][SQL]Spark should provide jump link...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18015 **[Test build #3758 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3758/testReport)** for PR 18015 at commit

[GitHub] spark issue #18060: [SPARK-20835][Core]It should exit directly when the --to...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18060 **[Test build #3757 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3757/testReport)** for PR 18060 at commit

[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-25 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/18058 I personally prefer merging when the PR is still in progress - it preserves the commit history for reviewers.

[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18058 Merged build finished. Test PASSed.

[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18058 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77369/ Test PASSed. ---

[GitHub] spark issue #18058: [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert)...

2017-05-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18058 **[Test build #77369 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77369/testReport)** for PR 18058 at commit
