[GitHub] spark pull request #13557: [SPARK-15819][PYSPARK][ML] Add KMeanSummary in KM...

2016-11-28 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/13557#discussion_r89817603 --- Diff: python/pyspark/ml/clustering.py --- @@ -330,6 +357,20 @@ class KMeans(JavaEstimator, HasFeaturesCol, HasPredictionCol, HasMaxIter, HasTol >

[GitHub] spark pull request #13557: [SPARK-15819][PYSPARK][ML] Add KMeanSummary in KM...

2016-11-28 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/13557#discussion_r89817165 --- Diff: python/pyspark/ml/clustering.py --- @@ -330,6 +357,20 @@ class KMeans(JavaEstimator, HasFeaturesCol, HasPredictionCol, HasMaxIter, HasTol >

[GitHub] spark pull request #13557: [SPARK-15819][PYSPARK][ML] Add KMeanSummary in KM...

2016-11-28 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/13557#discussion_r89817695 --- Diff: python/pyspark/ml/clustering.py --- @@ -316,7 +318,32 @@ def computeCost(self, dataset): """ return self._call_java("comput

[GitHub] spark pull request #16040: [SPARK-18612][MLLIB] Delete broadcasted variable ...

2016-11-28 Thread AnthonyTruchet
GitHub user AnthonyTruchet opened a pull request: https://github.com/apache/spark/pull/16040 [SPARK-18612][MLLIB] Delete broadcasted variable in LBFGS CostFun ## What changes were proposed in this pull request? Fix a broadcasted variable leak occurring at each invocation of

[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-11-28 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/16017#discussion_r89819028 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala --- @@ -134,27 +150,31 @@ private[ml] trait DecisionTreeParams extends PredictorPa

[GitHub] spark issue #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter methods to...

2016-11-28 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16017 LGTM pending a check on whether the deprecations affect Estimators in addition to Models Thanks for the follow-up! --- If your project is set up for it, you can reply to this email and have y

[GitHub] spark issue #16040: [SPARK-18612][MLLIB] Delete broadcasted variable in LBFG...

2016-11-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16040 Can one of the admins verify this patch?

[GitHub] spark issue #16040: [SPARK-18612][MLLIB] Delete broadcasted variable in LBFG...

2016-11-28 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16040 Jenkins test this please

[GitHub] spark issue #16040: [SPARK-18612][MLLIB] Delete broadcasted variable in LBFG...

2016-11-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16040 **[Test build #69251 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69251/consoleFull)** for PR 16040 at commit [`7aa7976`](https://github.com/apache/spark/commit/7

[GitHub] spark pull request #16041: [SPARK-18058][SQL][TRIVIAL] Use dataType.sameResu...

2016-11-28 Thread hvanhovell
GitHub user hvanhovell opened a pull request: https://github.com/apache/spark/pull/16041 [SPARK-18058][SQL][TRIVIAL] Use dataType.sameResult(...) instead equality on asNullable datatypes ## What changes were proposed in this pull request? This is absolutely minor. PR https://git

[GitHub] spark issue #16040: [SPARK-18612][MLLIB] Delete broadcasted variable in LBFG...

2016-11-28 Thread AnthonyTruchet
Github user AnthonyTruchet commented on the issue: https://github.com/apache/spark/pull/16040 Agreed, and we might also consider making the broadcasted variable compatible with some Automatic Resource Management to avoid such leaks (like https://github.com/jsuereth/scala-arm).

[GitHub] spark issue #16036: [SPARK-17732][SQL] Revert ALTER TABLE DROP PARTITION sho...

2016-11-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16036 **[Test build #69248 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69248/consoleFull)** for PR 16036 at commit [`b4b8f20`](https://github.com/apache/spark/commit/

[GitHub] spark issue #16041: [SPARK-18058][SQL][TRIVIAL] Use dataType.sameResult(...)...

2016-11-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16041 **[Test build #69252 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69252/consoleFull)** for PR 16041 at commit [`37b6760`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #16036: [SPARK-17732][SQL] Revert ALTER TABLE DROP PARTITION sho...

2016-11-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16036 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69248/ Test PASSed.

[GitHub] spark issue #16036: [SPARK-17732][SQL] Revert ALTER TABLE DROP PARTITION sho...

2016-11-28 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/16036 LGTM. Merging to master (2.1 was already reverted). Thanks!

[GitHub] spark issue #16040: [SPARK-18612][MLLIB] Delete broadcasted variable in LBFG...

2016-11-28 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16040 If that's just a Scala version of try-with-resources, it is unfortunately insufficient. It's not clear when the broadcast is actually no longer used because it may be used in computations that are co

[GitHub] spark issue #16036: [SPARK-17732][SQL] Revert ALTER TABLE DROP PARTITION sho...

2016-11-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16036 Merged build finished. Test PASSed.

[GitHub] spark issue #16038: [SPARK-18471][CORE] New treeAggregate overload for big l...

2016-11-28 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16038 If you're right here, then https://github.com/apache/spark/pull/16037 does nothing, right? In which case my understanding at https://github.com/apache/spark/pull/15905#pullrequestreview-8844

[GitHub] spark issue #15994: [SPARK-18555][SQL]DataFrameNaFunctions.fill miss up orig...

2016-11-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15994 **[Test build #69249 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69249/consoleFull)** for PR 15994 at commit [`9f28b83`](https://github.com/apache/spark/commit/

[GitHub] spark pull request #16036: [SPARK-17732][SQL] Revert ALTER TABLE DROP PARTIT...

2016-11-28 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16036

[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-11-28 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/16017#discussion_r89824958 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala --- @@ -107,25 +107,41 @@ private[ml] trait DecisionTreeParams extends PredictorPa

[GitHub] spark issue #15994: [SPARK-18555][SQL]DataFrameNaFunctions.fill miss up orig...

2016-11-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15994 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69249/ Test PASSed.

[GitHub] spark issue #15994: [SPARK-18555][SQL]DataFrameNaFunctions.fill miss up orig...

2016-11-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15994 Merged build finished. Test PASSed.

[GitHub] spark issue #16011: [SPARK-18587][ML] Remove handleInvalid from QuantileDisc...

2016-11-28 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16011 I agree with @MLnick that we should not make this change. We need users to be able to set all Params in Estimators, including the Params of the Models they produce. If this is confusing for Quan

[GitHub] spark issue #15971: [SPARK-18535][UI][YARN] Redact sensitive information fro...

2016-11-28 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/15971 Merging to master.

[GitHub] spark issue #15874: [Spark-18408][ML] API Improvements for LSH

2016-11-28 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15874 LGTM Thanks everyone! Merging with master and branch-2.1

[GitHub] spark issue #15981: [SPARK-18547][core] Propagate I/O encryption key when ex...

2016-11-28 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/15981 Ping.

[GitHub] spark issue #15982: [SPARK-18546][core] Fix merging shuffle spills when usin...

2016-11-28 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/15982 Ping.

[GitHub] spark pull request #15971: [SPARK-18535][UI][YARN] Redact sensitive informat...

2016-11-28 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15971

[GitHub] spark pull request #16026: [SPARK-18597][SQL] Do not push-down join conditio...

2016-11-28 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16026#discussion_r89830125 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -932,7 +932,7 @@ object PushPredicateThroughJoin extends

[GitHub] spark pull request #16042: Fix of dev scripts and new, Criteo specific, ones...

2016-11-28 Thread AnthonyTruchet
GitHub user AnthonyTruchet opened a pull request: https://github.com/apache/spark/pull/16042 Fix of dev scripts and new, Criteo specific, ones WIP You can merge this pull request into a Git repository by running: $ git pull https://github.com/AnthonyTruchet/spark dev-tools A

[GitHub] spark issue #16042: Fix of dev scripts and new, Criteo specific, ones WIP

2016-11-28 Thread AnthonyTruchet
Github user AnthonyTruchet commented on the issue: https://github.com/apache/spark/pull/16042 Sorry, created by mistake

[GitHub] spark issue #15874: [Spark-18408][ML] API Improvements for LSH

2016-11-28 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15874 Well, I'm having trouble merging b/c of bad wifi during travel. Ping @yanboliang @MLnick @mengxr would one of you mind merging this with master and branch-2.1? @sethah and I having both given LG

[GitHub] spark pull request #16042: Fix of dev scripts and new, Criteo specific, ones...

2016-11-28 Thread AnthonyTruchet
Github user AnthonyTruchet closed the pull request at: https://github.com/apache/spark/pull/16042

[GitHub] spark issue #16042: Fix of dev scripts and new, Criteo specific, ones WIP

2016-11-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16042 Can one of the admins verify this patch?

[GitHub] spark issue #15358: [SPARK-17783] [SQL] Hide Credentials in CREATE and DESC ...

2016-11-28 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15358 Will do it soon. Thanks!

[GitHub] spark issue #16040: [SPARK-18612][MLLIB] Delete broadcasted variable in LBFG...

2016-11-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16040 **[Test build #69251 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69251/consoleFull)** for PR 16040 at commit [`7aa7976`](https://github.com/apache/spark/commit/

[GitHub] spark issue #16040: [SPARK-18612][MLLIB] Delete broadcasted variable in LBFG...

2016-11-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16040 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69251/ Test PASSed.

[GitHub] spark issue #16040: [SPARK-18612][MLLIB] Delete broadcasted variable in LBFG...

2016-11-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16040 Merged build finished. Test PASSed.

[GitHub] spark issue #16036: [SPARK-17732][SQL] Revert ALTER TABLE DROP PARTITION sho...

2016-11-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16036 @hvanhovell, could you give me some direction on #15987 before updating? The as-is direction of #15987 does not seem proper.

[GitHub] spark pull request #15998: [SPARK-18572][SQL] Add a method `listPartitionNam...

2016-11-28 Thread mallman
Github user mallman commented on a diff in the pull request: https://github.com/apache/spark/pull/15998#discussion_r89839435 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -730,6 +730,23 @@ class SessionCatalog( }

[GitHub] spark issue #15998: [SPARK-18572][SQL] Add a method `listPartitionNames` to ...

2016-11-28 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/15998 > where is the speed-up come from? Is it because the hive API getPartitionNames is faster than getPartitions? Or is it because we generate the partition string(a=1/b=2/c=3) at hive side and it's fas

[GitHub] spark pull request #14038: [SPARK-16317][SQL] Add a new interface to filter ...

2016-11-28 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/14038#discussion_r89839965 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategySuite.scala --- @@ -441,6 +441,44 @@ class FileSourceSt

[GitHub] spark pull request #15788: [SPARK-18291][SparkR][ML] SparkR glm predict shou...

2016-11-28 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15788#discussion_r89840613 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/GeneralizedLinearRegressionWrapper.scala --- @@ -52,7 +59,15 @@ private[r] class GeneralizedLinearReg

[GitHub] spark issue #15788: [SPARK-18291][SparkR][ML] SparkR glm predict should outp...

2016-11-28 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15788 I agree that R makes it much easier to convert back to the string label. My main worries are: * changing the API, which may break some user code * hiding probabilities from R users (since

[GitHub] spark issue #16025: [SPARK-18602] Set the version of org.codehaus.janino:com...

2016-11-28 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16025 Thanks. Since we started using janino 3.0.0 in Spark 2.1, I am merging this PR to both master and branch-2.1.

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-28 Thread jiexiong
Github user jiexiong commented on the issue: https://github.com/apache/spark/pull/15722 @hvanhovell, I have already updated the description and explained how the PR fixed it. Could you please take another look?

[GitHub] spark pull request #16025: [SPARK-18602] Set the version of org.codehaus.jan...

2016-11-28 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16025

[GitHub] spark pull request #15975: [SPARK-18538] [SQL] Fix Concurrent Table Fetching...

2016-11-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15975#discussion_r89844825 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala --- @@ -404,6 +425,7 @@ class JDBCSuite extends SparkFunSuite numPa

[GitHub] spark issue #15975: [SPARK-18538] [SQL] Fix Concurrent Table Fetching Using ...

2016-11-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15975 **[Test build #69253 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69253/consoleFull)** for PR 15975 at commit [`c29199c`](https://github.com/apache/spark/commit/c

[GitHub] spark pull request #16030: [SPARK-18108][SQL] Fix a bug to fail partition sc...

2016-11-28 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/16030#discussion_r89845728 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetPartitionDiscoverySuite.scala --- @@ -969,4 +969,15 @@ class Parq

[GitHub] spark issue #16039: [SPARK-18597][SQL] Do not push-down join conditions to t...

2016-11-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16039 **[Test build #69250 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69250/consoleFull)** for PR 16039 at commit [`f36c16d`](https://github.com/apache/spark/commit/

[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-11-28 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/16030 @maropu I wouldn't say this is a regression. I would say that this working for 2.0.2 was a bug in 2.0.2. If you want the column `a` to be interpreted as a `LongType` instead of `IntegerType`, you sho

[GitHub] spark issue #16039: [SPARK-18597][SQL] Do not push-down join conditions to t...

2016-11-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16039 Merged build finished. Test PASSed.

[GitHub] spark issue #16039: [SPARK-18597][SQL] Do not push-down join conditions to t...

2016-11-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16039 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69250/ Test PASSed.

[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-11-28 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/16030 Or what we should fix here is to throw an exception if a partition column is also found as part of the dataSchema.

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-11-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13909 **[Test build #69254 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69254/consoleFull)** for PR 13909 at commit [`8b2c787`](https://github.com/apache/spark/commit/8

[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-11-28 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16014#discussion_r89850470 --- Diff: dev/make-distribution.sh --- @@ -208,11 +212,24 @@ cp -r "$SPARK_HOME/data" "$DISTDIR" # Make pip package if [ "$MAKE_PIP" == "true" ]; t

[GitHub] spark pull request #16014: [SPARK-18590][SPARKR] build R source package when...

2016-11-28 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16014#discussion_r89849402 --- Diff: R/check-cran.sh --- @@ -82,4 +83,20 @@ else # This will run tests and/or build vignettes, and require SPARK_HOME SPARK_HOME="${SPARK_

[GitHub] spark pull request #15949: [SPARK-18339] [SPARK-18513] [SQL] Don't push down...

2016-11-28 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/15949#discussion_r89851054 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamExecutionMetadataSuite.scala --- @@ -0,0 +1,99 @@ +/* + * Licensed to the Apa

[GitHub] spark pull request #15910: [SPARK-18476][SPARKR][ML]:SparkR Logistic Regress...

2016-11-28 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/15910#discussion_r89852551 --- Diff: R/pkg/inst/tests/testthat/test_mllib.R --- @@ -674,6 +674,16 @@ test_that("spark.logit", { expect_error(summary(blr_model2)) unli

[GitHub] spark issue #16035: [SQL][minor] DESC should use 'Catalog' as partition prov...

2016-11-28 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16035 Merging in master/branch-2.1.

[GitHub] spark pull request #15910: [SPARK-18476][SPARKR][ML]:SparkR Logistic Regress...

2016-11-28 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/15910#discussion_r89852751 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/LogisticRegressionWrapper.scala --- @@ -84,14 +93,12 @@ private[r] object LogisticRegressionWrapper

[GitHub] spark issue #16002: [SPARK-18341][ML] Eliminate use of SingularMatrixExcepti...

2016-11-28 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/16002 I understand the desire to avoid exception handling, but I am not convinced this is a better solution. Some comments: * We are pushing an integer "error code" from lapack up about 3 levels of

[GitHub] spark pull request #16035: [SQL][minor] DESC should use 'Catalog' as partiti...

2016-11-28 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16035

[GitHub] spark pull request #16002: [SPARK-18341][ML] Eliminate use of SingularMatrix...

2016-11-28 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/16002#discussion_r89854110 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/NormalEquationSolver.scala --- @@ -33,11 +33,16 @@ import org.apache.spark.mllib.linalg.CholeskyDeco

[GitHub] spark issue #15910: [SPARK-18476][SPARKR][ML]:SparkR Logistic Regression sho...

2016-11-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15910 **[Test build #69255 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69255/consoleFull)** for PR 15910 at commit [`57cf430`](https://github.com/apache/spark/commit/5

[GitHub] spark issue #15992: [SPARK-18560][CORE][STREAMING] Receiver data can not be ...

2016-11-28 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/15992 Sorry for the delay. My high-level comment is that this is a critical bug and we also need to fix 2.0.x. However, we usually don't want to break APIs between maintenance releases. So I'm thinkin

[GitHub] spark issue #15986: [SPARK-18553][CORE][branch-2.0] Fix leak of TaskSetManag...

2016-11-28 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/15986 @kayousterhout, @markhamstra, I believe that I've addressed all of the outstanding review comments, so do you want to take a final look at this to see whether it's ready to go?

[GitHub] spark issue #15961: [SPARK-18523][PySpark]Make SparkContext.stop more reliab...

2016-11-28 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15961 Great, can you maybe rebase one more time? For some reason in the PR dashboard it's showing up as not mergeable even though GitHub thinks it is.

[GitHub] spark issue #15462: [SPARK-17680] [SQL] [TEST] Added test cases for InMemory...

2016-11-28 Thread andrewor14
Github user andrewor14 commented on the issue: https://github.com/apache/spark/pull/15462 LGTM, merging into master and 2.1, thanks.

[GitHub] spark issue #15817: [SPARK-18366][PYSPARK][ML] Add handleInvalid to Pyspark ...

2016-11-28 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15817 OK, let's re-ping @MLnick / @sethah - I know we asked to update the docstring - but the current one is consistent with the Scala docstring so maybe it makes sense as is (otherwise we should probably a

[GitHub] spark issue #15462: [SPARK-17680] [SQL] [TEST] Added test cases for InMemory...

2016-11-28 Thread andrewor14
Github user andrewor14 commented on the issue: https://github.com/apache/spark/pull/15462 @kiszk is there a JIRA associated specifically with adding tests for `InMemoryRelation`?

[GitHub] spark pull request #15462: [SPARK-17680] [SQL] [TEST] Added test cases for I...

2016-11-28 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15462

[GitHub] spark pull request #14136: [SPARK-16282][SQL] Implement percentile SQL funct...

2016-11-28 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14136

[GitHub] spark issue #16039: [SPARK-18597][SQL] Do not push-down join conditions to t...

2016-11-28 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/16039 I am merging this.

[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.

2016-11-28 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/14136 LGTM. Merging to master/2.1. Thanks!

[GitHub] spark pull request #15949: [SPARK-18339] [SPARK-18513] [SQL] Don't push down...

2016-11-28 Thread tcondie
Github user tcondie commented on a diff in the pull request: https://github.com/apache/spark/pull/15949#discussion_r89856411 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamExecutionMetadataSuite.scala --- @@ -0,0 +1,99 @@ +/* + * Licensed to the Apa

[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.

2016-11-28 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14136 @hvanhovell why did this go into branch-2.1? It's way past branch cut time.

[GitHub] spark issue #16041: [SPARK-18058][SQL][TRIVIAL] Use dataType.sameResult(...)...

2016-11-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16041 **[Test build #69252 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69252/consoleFull)** for PR 16041 at commit [`37b6760`](https://github.com/apache/spark/commit/

[GitHub] spark issue #16041: [SPARK-18058][SQL][TRIVIAL] Use dataType.sameResult(...)...

2016-11-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16041 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69252/ Test PASSed.

[GitHub] spark issue #15986: [SPARK-18553][CORE][branch-2.0] Fix leak of TaskSetManag...

2016-11-28 Thread markhamstra
Github user markhamstra commented on the issue: https://github.com/apache/spark/pull/15986 If Kay is happy with the last couple of changes, then I'm fine with this, too. The only tiny nit I've still got is a change from `runningTasksByExecutors()` to `runningTasksByExecutors`. Outsi

[GitHub] spark issue #16024: [MINOR][DOCS] Updates to the Accumulator example in the ...

2016-11-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16024 **[Test build #69256 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69256/consoleFull)** for PR 16024 at commit [`0366a3c`](https://github.com/apache/spark/commit/0

[GitHub] spark pull request #15949: [SPARK-18339] [SPARK-18513] [SQL] Don't push down...

2016-11-28 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/15949#discussion_r89859904 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/WatermarkSuite.scala --- @@ -96,27 +96,42 @@ class WatermarkSuite extends StreamTest with B

[GitHub] spark issue #15949: [SPARK-18339] [SPARK-18513] [SQL] Don't push down curren...

2016-11-28 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15949 The implementation LGTM! +1 to @tdas 's comments for tests

[GitHub] spark pull request #16039: [SPARK-18597][SQL] Do not push-down join conditio...

2016-11-28 Thread hvanhovell
Github user hvanhovell closed the pull request at: https://github.com/apache/spark/pull/16039

[GitHub] spark issue #16041: [SPARK-18058][SQL][TRIVIAL] Use dataType.sameResult(...)...

2016-11-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16041 Merged build finished. Test PASSed.

[GitHub] spark pull request #15949: [SPARK-18339] [SPARK-18513] [SQL] Don't push down...

2016-11-28 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15949#discussion_r89861730 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamExecutionMetadataSuite.scala --- @@ -0,0 +1,99 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #15961: [SPARK-18523][PySpark]Make SparkContext.stop more reliab...

2016-11-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15961 **[Test build #3441 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3441/consoleFull)** for PR 15961 at commit [`518e31b`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15795: [SPARK-18081] Add user guide for Locality Sensitive Hash...

2016-11-28 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/15795 Is this still targeted for 2.1?

[GitHub] spark issue #15986: [SPARK-18553][CORE][branch-2.0] Fix leak of TaskSetManag...

2016-11-28 Thread kayousterhout
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/15986 Thanks for all of the work on this Josh! Happy to review the version for master if the merge isn't clean.

[GitHub] spark issue #16000: [SPARK-18537][Web UI]Add a REST api to spark streaming

2016-11-28 Thread ajbozarth
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/16000 @ChorPangChan thanks for asking, I'll take a look through the code by EOD(PST)

[GitHub] spark pull request #16026: [SPARK-18597][SQL] Do not push-down join conditio...

2016-11-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16026#discussion_r89865122 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -932,7 +932,7 @@ object PushPredicateThroughJoin ext

[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.

2016-11-28 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/14136 @rxin this is a very contained patch. It only adds the percentile function. The advantage here is that this further reduces our dependency on relatively slow Hive UDAFs (one more to go), and that

[GitHub] spark pull request #15730: [SPARK-18218][ML][MLLib] Optimize BlockMatrix mul...

2016-11-28 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15730#discussion_r89867556 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/BlockMatrix.scala --- @@ -456,15 +461,20 @@ class BlockMatrix @Since("1.3.0") (

[GitHub] spark issue #16024: [MINOR][DOCS] Updates to the Accumulator example in the ...

2016-11-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16024 **[Test build #69256 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69256/consoleFull)** for PR 16024 at commit [`0366a3c`](https://github.com/apache/spark/commit/

[GitHub] spark issue #16024: [MINOR][DOCS] Updates to the Accumulator example in the ...

2016-11-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16024 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69256/ Test PASSed.

[GitHub] spark issue #16024: [MINOR][DOCS] Updates to the Accumulator example in the ...

2016-11-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16024 Merged build finished. Test PASSed.

[GitHub] spark pull request #15730: [SPARK-18218][ML][MLLib] Optimize BlockMatrix mul...

2016-11-28 Thread brkyvz
Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15730#discussion_r89868225 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/BlockMatrix.scala --- @@ -456,15 +461,20 @@ class BlockMatrix @Since("1.3.0") (

[GitHub] spark pull request #16009: [SPARK-18318][ML] ML, Graph 2.1 QA: API: New Scal...

2016-11-28 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/16009#discussion_r89866977 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/ChiSqSelector.scala --- @@ -49,15 +49,13 @@ private[feature] trait ChiSqSelectorParams extends Par
