[GitHub] spark pull request #21195: [Spark-23975][ML] Add support of array input for ...

2018-05-03 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/21195#discussion_r185984894 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/LDASuite.scala --- @@ -323,4 +324,44 @@ class LDASuite extends SparkFunSuite

[GitHub] spark pull request #21195: [Spark-23975][ML] Add support of array input for ...

2018-05-03 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/21195#discussion_r185983646 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/BisectingKMeansSuite.scala --- @@ -182,6 +184,40 @@ class BisectingKMeansSuite

[GitHub] spark pull request #21195: [Spark-23975][ML] Add support of array input for ...

2018-05-03 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/21195#discussion_r185984527 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/GaussianMixtureSuite.scala --- @@ -256,6 +258,42 @@ class GaussianMixtureSuite extends

[GitHub] spark pull request #21195: [Spark-23975][ML] Add support of array input for ...

2018-05-03 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/21195#discussion_r185971647 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/BisectingKMeansSuite.scala --- @@ -182,6 +184,40 @@ class BisectingKMeansSuite

[GitHub] spark pull request #21195: [Spark-23975][ML] Add support of array input for ...

2018-05-03 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/21195#discussion_r185984500 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/BisectingKMeansSuite.scala --- @@ -182,6 +184,40 @@ class BisectingKMeansSuite

[GitHub] spark pull request #21195: [Spark-23975][ML] Add support of array input for ...

2018-05-03 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/21195#discussion_r185971385 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/SchemaUtils.scala --- @@ -101,4 +102,17 @@ private[spark] object SchemaUtils { require

[GitHub] spark issue #20929: [SPARK-23772][SQL][WIP] Provide an option to ignore colu...

2018-04-30 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/20929 @maropu Any updates? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark pull request #20929: [SPARK-23772][SQL][WIP] Provide an option to igno...

2018-04-09 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/20929#discussion_r180291983 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -887,6 +887,14 @@ object SQLConf { .booleanConf

[GitHub] spark pull request #20929: [SPARK-23772][SQL][WIP] Provide an option to igno...

2018-04-09 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/20929#discussion_r180293096 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala --- @@ -624,6 +624,42 @@ class FileStreamSourceSuite extends

[GitHub] spark pull request #20929: [SPARK-23772][SQL][WIP] Provide an option to igno...

2018-04-09 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/20929#discussion_r180291486 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/TypePlaceholder.scala --- @@ -0,0 +1,23 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #20929: [SPARK-23772][SQL][WIP] Provide an option to igno...

2018-04-09 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/20929#discussion_r180291939 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -887,6 +887,14 @@ object SQLConf { .booleanConf

[GitHub] spark issue #20285: [SPARK-22735][ML][DOC] Added VectorSizeHint docs and exa...

2018-01-23 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/20285 LGTM. Merged into master and branch-2.3. Thanks!

[GitHub] spark pull request #20285: [SPARK-22735][ML][DOC] Added VectorSizeHint docs ...

2018-01-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/20285#discussion_r163378065 --- Diff: docs/ml-features.md --- @@ -1283,6 +1283,56 @@ for more details on the API. +## VectorSizeHint + +It can sometimes

[GitHub] spark pull request #20285: [SPARK-22735][ML][DOC] Added VectorSizeHint docs ...

2018-01-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/20285#discussion_r163377934 --- Diff: docs/ml-features.md --- @@ -1283,6 +1283,56 @@ for more details on the API. +## VectorSizeHint + +It can sometimes

[GitHub] spark pull request #20285: [SPARK-22735][ML][DOC] Added VectorSizeHint docs ...

2018-01-23 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/20285#discussion_r163377373 --- Diff: docs/ml-features.md --- @@ -1283,6 +1283,56 @@ for more details on the API. +## VectorSizeHint + +It can sometimes

[GitHub] spark pull request #19387: [SPARK-22160][SQL] Allow changing sample points p...

2017-09-28 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/19387#discussion_r141755198 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -108,9 +108,17 @@ class HashPartitioner(partitions: Int) extends Partitioner

[GitHub] spark issue #18988: [SPARK-21778][SQL] Simpler Dataset.sample API in Scala /...

2017-08-17 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/18988 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark issue #17742: [Spark-11968][ML][MLLIB]Optimize MLLIB ALS recommendForA...

2017-05-10 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/17742 @mpjlu Could you try not linking with native BLAS or system BLAS in your test? Just let it fall back to f2j BLAS. I can do some tests on my end too.

[GitHub] spark issue #17742: [Spark-11968][ML][MLLIB]Optimize MLLIB ALS recommendForA...

2017-05-10 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/17742 A single buffer doesn't lead to a long GC pause. If it requests a lot of memory, it might trigger GC to collect other objects. But it is itself a single object, which can be easily GC'ed. The problem here

[GitHub] spark issue #17742: [Spark-11968][ML][MLLIB]Optimize MLLIB ALS recommendForA...

2017-05-09 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/17742 I think the problem is not BLAS-3 ops, nor the 256MB total memory. The `val output = new Array[(Int, (Int, Double))](m * n)` is not specialized. Each element holds two references. If `m=4096` and `n
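The point about the unspecialized tuple array can be illustrated with a minimal sketch. It is not code from the PR: the object name `FlatOutputSketch`, the class `Flat`, the field names, and the score formula are all hypothetical, and the sizes in the usage note are arbitrary. It only contrasts an array of nested tuples, where each slot points to heap-allocated tuple objects, with a flat layout of primitive arrays that allocates no per-element objects.

```scala
object FlatOutputSketch {
  // Boxed layout, as in the snippet quoted above: each slot references an
  // outer tuple, which in turn references an inner tuple on the heap.
  def boxed(m: Int, n: Int): Array[(Int, (Int, Double))] = {
    val output = new Array[(Int, (Int, Double))](m * n)
    for (i <- 0 until m; j <- 0 until n) {
      // Placeholder score; the real computation is not shown in the thread.
      output(i * n + j) = (i, (j, i.toDouble * j))
    }
    output
  }

  // Flat layout (hypothetical alternative): three primitive arrays,
  // so the garbage collector sees three objects instead of many small ones.
  final class Flat(
      val srcIds: Array[Int],
      val dstIds: Array[Int],
      val scores: Array[Double])

  def flat(m: Int, n: Int): Flat = {
    val srcIds = new Array[Int](m * n)
    val dstIds = new Array[Int](m * n)
    val scores = new Array[Double](m * n)
    for (i <- 0 until m; j <- 0 until n) {
      val k = i * n + j
      srcIds(k) = i
      dstIds(k) = j
      scores(k) = i.toDouble * j
    }
    new Flat(srcIds, dstIds, scores)
  }
}
```

Calling `FlatOutputSketch.flat(2, 3)` and `FlatOutputSketch.boxed(2, 3)` produces the same logical entries; only the number of allocated objects differs, which is the cost the comment attributes to the tuple array.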

[GitHub] spark issue #17423: [SPARK-20088] Do not create new SparkContext in SparkR c...

2017-03-27 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/17423 LGTM. Merged into master. The failed tests are irrelevant to this PR, fixed in https://github.com/apache/spark/commit/a2ce0a2e309e70d74ae5d2ed203f7919a0f79397.

[GitHub] spark pull request #16864: [SPARK-19527][Core] Approximate Size of Intersect...

2017-02-10 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16864#discussion_r100569750 --- Diff: common/sketch/src/main/java/org/apache/spark/util/sketch/BloomFilter.java --- @@ -81,6 +81,11 @@ int getVersionNumber() { public abstract

[GitHub] spark pull request #16864: [SPARK-19527][Core] Approximate Size of Intersect...

2017-02-10 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16864#discussion_r100571446 --- Diff: common/sketch/src/main/java/org/apache/spark/util/sketch/BloomFilter.java --- @@ -148,6 +153,24 @@ int getVersionNumber() { public abstract

[GitHub] spark pull request #16864: [SPARK-19527][Core] Approximate Size of Intersect...

2017-02-10 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16864#discussion_r100570007 --- Diff: common/sketch/src/main/java/org/apache/spark/util/sketch/BloomFilter.java --- @@ -81,6 +81,11 @@ int getVersionNumber() { public abstract

[GitHub] spark pull request #16864: [SPARK-19527][Core] Approximate Size of Intersect...

2017-02-10 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16864#discussion_r100571673 --- Diff: common/sketch/src/main/java/org/apache/spark/util/sketch/BloomFilterImpl.java --- @@ -221,6 +221,49 @@ public BloomFilter mergeInPlace

[GitHub] spark pull request #16301: [SPARK-18849][ML][SPARKR][DOC] vignettes final ch...

2016-12-16 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16301#discussion_r92879246 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -496,9 +508,114 @@ count(carsDF_test) head(carsDF_test) ``` - ### Models

[GitHub] spark issue #16286: [SPARK-18849][ML][SPARKR][DOC] vignettes final check upd...

2016-12-15 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16286 I would vote for some logical grouping (instead of alphabetical ordering) and keep the sections in the same order, but it is not very necessary.

[GitHub] spark issue #16264: [SPARK-18793] [SPARK-18794] [R] add spark.randomForest/s...

2016-12-13 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16264 @HyukjinKwon Could you take a look at the AppVeyor error?

[GitHub] spark issue #16264: [SPARK-18793] [SPARK-18794] [R] add spark.randomForest/s...

2016-12-13 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16264 Merged into master and branch-2.1. The last commit passed Jenkins.

[GitHub] spark issue #16264: [SPARK-18793] [SPARK-18794] [R] add spark.randomForest/s...

2016-12-13 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16264 Agree that it is not necessary to mention the version added here. I will send a follow-up PR after this one to rearrange the ordering of the ML algorithms. There is no logical ordering now

[GitHub] spark pull request #16264: [SPARK-18793] [SPARK-18794] [R] add spark.randomF...

2016-12-13 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16264#discussion_r92257313 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -526,6 +530,34 @@ gaussianFitted <- predict(gaussianGLM, carsDF) head(select(gaussianFitted, "

[GitHub] spark pull request #16264: [SPARK-18793] [SPARK-18794] [R] add spark.randomF...

2016-12-13 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16264#discussion_r92257291 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -539,7 +539,7 @@ In the following example, we use the `longley` dataset to train a random forest

[GitHub] spark pull request #16264: SPARK-18792] [R] add spark.randomForest to vignet...

2016-12-12 Thread mengxr
GitHub user mengxr opened a pull request: https://github.com/apache/spark/pull/16264 SPARK-18792] [R] add spark.randomForest to vignettes ## What changes were proposed in this pull request? Mention `spark.randomForest` in vignettes. Keep the content minimal since users can

[GitHub] spark issue #16222: [SPARK-18797][SparkR]:Update spark.logit in sparkr-vigne...

2016-12-12 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16222 LGTM. Merged into master and branch-2.1. I will change the `regParam` value in a follow-up PR.

[GitHub] spark issue #16241: [SPARK-18812] [MLLIB] explain "Spark ML"

2016-12-09 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16241 Merged into master and branch-2.1.

[GitHub] spark pull request #16222: [SPARK-18797][SparkR]:Update spark.logit in spark...

2016-12-09 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16222#discussion_r91822977 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -768,8 +768,46 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2))) head(predict(isoregMo

[GitHub] spark pull request #16222: [SPARK-18797][SparkR]:Update spark.logit in spark...

2016-12-09 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16222#discussion_r91822954 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -768,8 +768,46 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2))) head(predict(isoregMo

[GitHub] spark pull request #16222: [SPARK-18797][SparkR]:Update spark.logit in spark...

2016-12-09 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16222#discussion_r91822499 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -768,8 +768,46 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2))) head(predict(isoregMo

[GitHub] spark pull request #16222: [SPARK-18797][SparkR]:Update spark.logit in spark...

2016-12-09 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16222#discussion_r91823285 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -768,8 +768,46 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2))) head(predict(isoregMo

[GitHub] spark pull request #16222: [SPARK-18797][SparkR]:Update spark.logit in spark...

2016-12-09 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16222#discussion_r91823306 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -768,8 +768,46 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2))) head(predict(isoregMo

[GitHub] spark pull request #16222: [SPARK-18797][SparkR]:Update spark.logit in spark...

2016-12-09 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16222#discussion_r91822509 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -768,8 +768,46 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2))) head(predict(isoregMo

[GitHub] spark pull request #16222: [SPARK-18797][SparkR]:Update spark.logit in spark...

2016-12-09 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16222#discussion_r91823013 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -768,8 +768,46 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2))) head(predict(isoregMo

[GitHub] spark pull request #16222: [SPARK-18797][SparkR]:Update spark.logit in spark...

2016-12-09 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16222#discussion_r91822502 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -768,8 +768,46 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2))) head(predict(isoregMo

[GitHub] spark pull request #16222: [SPARK-18797][SparkR]:Update spark.logit in spark...

2016-12-09 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16222#discussion_r91823358 --- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd --- @@ -768,8 +768,46 @@ newDF <- createDataFrame(data.frame(x = c(1.5, 3.2))) head(predict(isoregMo

[GitHub] spark pull request #16241: [SPARK-18812] [MLLIB] explain "Spark ML"

2016-12-09 Thread mengxr
GitHub user mengxr opened a pull request: https://github.com/apache/spark/pull/16241 [SPARK-18812] [MLLIB] explain "Spark ML" ## What changes were proposed in this pull request? There has been some confusion around "Spark ML" vs. "MLlib". This

[GitHub] spark issue #16154: [SPARK-17822] [R] Make JVMObjectTracker a member variabl...

2016-12-09 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16154 Merged into master and branch-2.1.

[GitHub] spark issue #16222: [SPARK-18797][SparkR]:Update spark.logit in sparkr-vigne...

2016-12-09 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16222 @wangmiao1981 Sorry, I was actually suggesting removing the math part. Those are standard logistic regression formulations, which could be found in many other places. We don't really need to repeat

[GitHub] spark issue #16154: [SPARK-17822] [R] Make JVMObjectTracker a member variabl...

2016-12-09 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16154 test this please (unrelated test failures)

[GitHub] spark issue #16222: [SPARK-18797][SparkR]:Update spark.logit in sparkr-vigne...

2016-12-08 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16222 @wangmiao1981 Do we need to explain what "logistic regression" is?

[GitHub] spark pull request #16224: [SPARK-18792] [R] mention spark.logit in vignette...

2016-12-08 Thread mengxr
Github user mengxr closed the pull request at: https://github.com/apache/spark/pull/16224

[GitHub] spark issue #16224: [SPARK-18792] [R] mention spark.logit in vignettes

2016-12-08 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16224 Duplicates #16222

[GitHub] spark pull request #16224: [SPARK-18792] [R] mention spark.logit in vignette...

2016-12-08 Thread mengxr
GitHub user mengxr opened a pull request: https://github.com/apache/spark/pull/16224 [SPARK-18792] [R] mention spark.logit in vignettes ## What changes were proposed in this pull request? Mention `spark.logit` in vignettes. I didn't add example code because it will mostly

[GitHub] spark issue #16154: [SPARK-17822] [R] Make JVMObjectTracker a member variabl...

2016-12-08 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16154 All test failures seem unrelated.

[GitHub] spark pull request #16154: [SPARK-17822] [R] Make JVMObjectTracker a member ...

2016-12-08 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16154#discussion_r91625634 --- Diff: core/src/main/scala/org/apache/spark/api/r/JVMObjectTracker.scala --- @@ -0,0 +1,87 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark issue #16154: [SPARK-17822] [R] Make JVMObjectTracker a member variabl...

2016-12-08 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16154 test this please

[GitHub] spark issue #16154: [SPARK-17822] [R] Make JVMObjectTracker a member variabl...

2016-12-08 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16154 Confirmed this patch fixed the issue in our environment:)

[GitHub] spark issue #16154: [SPARK-17822] [R] Make JVMObjectTracker a member variabl...

2016-12-08 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16154 @yhuai @jkbradley I'm testing the patch in our environment to see whether it actually solves the problem.

[GitHub] spark issue #16214: [SPARK-18325][SPARKR] Add example for using native R pac...

2016-12-08 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16214 @yanboliang What happens if there are multiple executors on the same machine? Are there concurrency issues?

[GitHub] spark issue #16154: [SPARK-17822] [R] Make JVMObjectTracker a member variabl...

2016-12-08 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16154 @falaki I'm happy to add the first Scala unit test for `RBackend`:)

[GitHub] spark pull request #16154: [SPARK-17822] [R] Make JVMObjectTracker a member ...

2016-12-08 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16154#discussion_r91586125 --- Diff: core/src/main/scala/org/apache/spark/api/r/RBackendHandler.scala --- @@ -143,12 +142,8 @@ private[r] class RBackendHandler(server: RBackend

[GitHub] spark issue #16190: [SPARK-18762][WEBUI] Web UI should be http:4040 instead ...

2016-12-07 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16190 @sarutak Thanks for the quick fix!

[GitHub] spark issue #16190: [SPARK-18762][WEBUI] Web UI should be http:4040 instead ...

2016-12-07 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16190 @sarutak Do you know why we needed this change to fix ssl-enabled history server in #15611?

[GitHub] spark pull request #16154: [SPARK-17822] [R] Make JVMObjectTracker a member ...

2016-12-06 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16154#discussion_r91152561 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala --- @@ -247,7 +247,7 @@ private[sql] object SQLUtils extends Logging

[GitHub] spark pull request #16154: [SPARK-17822] [R] Make JVMObjectTracker a member ...

2016-12-06 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16154#discussion_r91151520 --- Diff: core/src/main/scala/org/apache/spark/api/r/RBackendHandler.scala --- @@ -143,12 +142,8 @@ private[r] class RBackendHandler(server: RBackend

[GitHub] spark pull request #16154: [SPARK-17822] [R] Make JVMObjectTracker a member ...

2016-12-06 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/16154#discussion_r91151225 --- Diff: core/src/main/scala/org/apache/spark/api/r/JVMObjectTracker.scala --- @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark issue #16154: [SPARK-17822] [R] Make JVMObjectTracker a member variabl...

2016-12-05 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16154 @shivaram PR seems ready for you to make a pass:)

[GitHub] spark issue #16154: [SPARK-17822] Make JVMObjectTracker a member variable of...

2016-12-05 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/16154 @shivaram Let's wait for Jenkins first:) I'm not sure it will pass.

[GitHub] spark pull request #16154: [SPARK-17822] Make JVMObjectTracker a member vari...

2016-12-05 Thread mengxr
GitHub user mengxr opened a pull request: https://github.com/apache/spark/pull/16154 [SPARK-17822] Make JVMObjectTracker a member variable of RBackend ## What changes were proposed in this pull request? * This PR changes `JVMObjectTracker` from `object` to `class` and let

[GitHub] spark issue #15567: [SPARK-14393][SQL] values generated by non-deterministic...

2016-11-02 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/15567 I reverted the changes I made to enforce `Projection.initialize`, which touched too many files, most of which don't really need to handle nondeterministic expressions. The current implementation

[GitHub] spark pull request #15567: [SPARK-14393][SQL] values generated by non-determ...

2016-11-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/15567#discussion_r86019635 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -264,23 +264,43 @@ trait NonSQLExpression extends

[GitHub] spark issue #15567: [SPARK-14393][SQL] values generated by non-deterministic...

2016-11-01 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/15567 @rxin I updated the implementation to force initialization in Projection/Expression. This will fail many tests. I fixed all in `catalyst`, but not yet in `sql`. I want to propose the following

[GitHub] spark pull request #15567: [SPARK-14393][SQL] values generated by non-determ...

2016-11-01 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/15567#discussion_r85882058 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SparkPartitionID.scala --- @@ -17,16 +17,15 @@ package

[GitHub] spark issue #13891: [SPARK-6685][MLLIB]Use DSYRK to compute AtA in ALS

2016-10-21 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/13891 @hqzizania Thanks for the performance tests! This matches my guess. I'm not sure how often people use a rank greater than 1000 or even 250. But I think it is good to use BLAS level-3 routines. We

[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...

2016-10-20 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/15398 Thanks for checking the standard! I think that behavior is what most people expect:)

[GitHub] spark pull request #15567: [SPARK-14393][SQL] values generated by non-determ...

2016-10-20 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/15567#discussion_r84340352 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GeneratePredicate.scala --- @@ -25,19 +25,26 @@ import

[GitHub] spark pull request #15567: [SPARK-14393][SQL] values generated by non-determ...

2016-10-20 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/15567#discussion_r84339837 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SparkPartitionID.scala --- @@ -17,16 +17,15 @@ package

[GitHub] spark pull request #15567: [SPARK-14393][SQL] values generated by non-determ...

2016-10-20 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/15567#discussion_r84339569 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -185,6 +185,20 @@ class CodegenContext

[GitHub] spark pull request #15567: [SPARK-14393][SQL] values generated by non-determ...

2016-10-20 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/15567#discussion_r84339070 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -274,12 +274,12 @@ trait Nondeterministic extends

[GitHub] spark pull request #15567: [SPARK-14393][SQL] values generated by non-determ...

2016-10-20 Thread mengxr
GitHub user mengxr opened a pull request: https://github.com/apache/spark/pull/15567 [SPARK-14393][SQL] values generated by non-deterministic functions shouldn't change after coalesce or union ## What changes were proposed in this pull request? When a user appended

[GitHub] spark pull request #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIK...

2016-10-19 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/15398#discussion_r84007236 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala --- @@ -68,7 +68,20 @@ trait StringRegexExpression

[GitHub] spark pull request #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIK...

2016-10-19 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/15398#discussion_r84007087 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala --- @@ -68,7 +68,20 @@ trait StringRegexExpression

[GitHub] spark pull request #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIK...

2016-10-19 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/15398#discussion_r84008009 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala --- @@ -68,7 +68,20 @@ trait StringRegexExpression

[GitHub] spark issue #15285: [SPARK-17711] Compress rolled executor log

2016-09-29 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/15285 add to whitelist

[GitHub] spark issue #15285: [SPARK-17711] Compress rolled executor log

2016-09-29 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/15285 ok to test

[GitHub] spark issue #14853: [SparkR][Minor] Fix LDA doc

2016-08-29 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/14853 LGTM. Merged into master.

[GitHub] spark issue #14524: [SPARK-16832] [ML] [WIP] CrossValidator and TrainValidat...

2016-08-22 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/14524 Sorry for the late response! I'm against this change since it introduces nondeterministic behavior and makes applications hard to debug. For example, I want to cross validate some estimator that accepts

[GitHub] spark pull request #14761: [SparkR][Minor] Add installation message for remo...

2016-08-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/14761#discussion_r75797308 --- Diff: R/pkg/R/utils.R --- @@ -697,3 +697,20 @@ is_master_local <- function(master) { is_sparkR_shell <- function() { grepl("

[GitHub] spark pull request #14761: [SparkR][Minor] Add installation message for remo...

2016-08-22 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/14761#discussion_r75797228 --- Diff: R/pkg/R/sparkR.R --- @@ -367,19 +367,21 @@ sparkR.session <- function( overrideEnvs(sparkConfigMap, paramMap) } #

[GitHub] spark issue #14764: [SPARKR][MINOR] Update R DESCRIPTION file

2016-08-22 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/14764 LGTM. Merged into master and branch-2.0. Thanks!

[GitHub] spark issue #14761: [SparkR][Minor] Add installation message for remote mast...

2016-08-22 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/14761 I'm making a pass.

[GitHub] spark issue #14740: [MINOR] [R] add SparkR.Rcheck/ and SparkR_*.tar.gz to R/...

2016-08-21 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/14740 Merged into branch-2.0 and master.

[GitHub] spark pull request #14740: [MINOR] [R] add SparkR.Rcheck/ and SparkR_*.tar.g...

2016-08-21 Thread mengxr
GitHub user mengxr opened a pull request: https://github.com/apache/spark/pull/14740 [MINOR] [R] add SparkR.Rcheck/ and SparkR_*.tar.gz to R/.gitignore ## What changes were proposed in this pull request? Ignore temp files generated by `check-cran.sh`. You can merge

[GitHub] spark issue #14735: [SPARK-17173][SPARKR] R MLlib refactor, cleanup, reforma...

2016-08-21 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/14735 @junyangq Could you help review this PR?

[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-08-19 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/14384 LGTM. Merged into master. Thanks!

[GitHub] spark issue #14705: [SPARK-16508][SparkR] Fix CRAN undocumented/duplicated a...

2016-08-19 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/14705 Agree with @felixcheung. Let's do the `setGeneric(, ...)` and not use `...` in the function definition if it is not required. Note that having every param documented is not a strict requirement

[GitHub] spark issue #14558: [SPARK-16508][SparkR] Fix warnings on undocumented/dupli...

2016-08-17 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/14558 @felixcheung I agree that we shouldn't put real documentation in `generics.R`. I discussed this with @junyangq offline and we suggest the following: 1) If we have to put "..." in `setGen

[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-17 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r75215338 --- Diff: R/pkg/R/generics.R --- @@ -1251,10 +1311,57 @@ setGeneric("year", function(x) { standardGeneric("year") }) #' @ex

[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...

2016-08-17 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r75182227 --- Diff: .gitignore --- @@ -77,3 +77,8 @@ spark-warehouse/ # For R session data .RData .RHistory +.Rhistory --- End diff

[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-08-17 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/14384 @junyangq Could you merge recent master and update this PR?

[GitHub] spark pull request #14384: [Spark-16443][SparkR] Alternating Least Squares (...

2016-08-17 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/14384#discussion_r75181798 --- Diff: R/pkg/inst/tests/testthat/test_mllib.R --- @@ -454,4 +454,61 @@ test_that("spark.survreg", { } }) +test_that(
