[GitHub] spark pull request #13476: [SPARK-15684][SparkR]Not mask startsWith and ends...

2016-06-02 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/13476 [SPARK-15684][SparkR]Not mask startsWith and endsWith in R ## What changes were proposed in this pull request? In R 3.3.0, startsWith and endsWith are added. In this PR, I make

[GitHub] spark issue #13476: [SPARK-15684][SparkR]Not mask startsWith and endsWith in...

2016-06-02 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/13476 @shivaram It seems that the integration server uses a lower version of R. Do you know whether there is a mechanism to optionally compile functions by checking the version? For example, in C, we can use

[GitHub] spark pull request #13476: [SPARK-15684][SparkR]Not mask startsWith and ends...

2016-06-03 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/13476#discussion_r65742476 --- Diff: R/pkg/R/column.R --- @@ -151,6 +151,40 @@ setMethod("substr", signature(x = "Column"),

[GitHub] spark issue #13508: [SPARK-15766][SparkR]:R should export is.nan

2016-06-04 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/13508 @felixcheung As we discussed, I opened a JIRA on exporting the is.nan. Can you review it? Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear

[GitHub] spark issue #13476: [SPARK-15684][SparkR]Not mask startsWith and endsWith in...

2016-06-04 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/13476 @felixcheung Any more comments? The version check is needed; otherwise it can't pass the unit tests. All other review comments are done.

[GitHub] spark issue #13508: [SPARK-15766][SparkR]:R should export is.nan

2016-06-06 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/13508 @felixcheung Add isnan in the NAMESPACE file.

[GitHub] spark pull request #13508: [SPARK-15766][SparkR]:R should export is.nan

2016-06-03 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/13508 [SPARK-15766][SparkR]:R should export is.nan ## What changes were proposed in this pull request? When reviewing SPARK-15545, we found that is.nan is not exported, which should

[GitHub] spark pull request #13476: [SPARK-15684][SparkR]Not mask startsWith and ends...

2016-06-02 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/13476#discussion_r65657352 --- Diff: R/pkg/R/generics.R --- @@ -695,10 +695,6 @@ setGeneric("desc", function(x) { standardGeneric("desc") })

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-25 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221656579 Re-tested on Ubuntu; the pipedRDD test case still fails. R version 3.3.0 beta (2016-03-30 r70404)

[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

2016-05-25 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13301#issuecomment-221706864 @srowen ML uses the sample_libsvm_data.txt in all three examples. sample_naive_bayes_data.txt is not in libsvm format. The format is shown below: 0,1

[GitHub] spark pull request: [SPARK-15492][ML][DOC]:Binarization scala exam...

2016-05-25 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13266#issuecomment-221681098 @MLnick Done. Thanks!

[GitHub] spark pull request: [SPARK-15492][ML][DOC]:Binarization scala exam...

2016-05-25 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13266#issuecomment-221670964 @MLnick Sure. I will do it soon. Right now, I am debugging an R bug. Thanks!

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-24 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221476011 @shivaram The pipedRDD one seems to work when using sudo on Linux. It does not work on my Mac, though.

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-24 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221477072 @shivaram I will make the change with an R version check.

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-25 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221483353 @shivaram @felixcheung For the failing pipedRDD test case, if I copy and paste it into SparkR, it works fine.

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-25 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221668874 @shivaram I am debugging and trying to find a hint.

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-24 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221475397 R version 3.3.0 (2016-05-03) -- "Supposedly Educational" Copyright (C) 2016 The R Foundation for Statistical Computing Platform: x86_64-apple-da

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-24 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221475769 @felixcheung > conflicts(detail = TRUE) $.GlobalEnv [1] "df" $`package:SparkR` [1] "alias"

[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

2016-05-25 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13301#issuecomment-221722166 @srowen I think we can delete it. Let me double check it and update this PR. Thanks!

[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

2016-05-25 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13301#issuecomment-221722860 I grepped all scala/java/py files and there is no reference to the data file.

[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

2016-05-25 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/13301 [SPARK-15449][MLlib][Example]:Wrong Data Format - Documentation Issue ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-25 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/13284#discussion_r64520919 --- Diff: R/pkg/inst/tests/testthat/test_context.R --- @@ -24,10 +24,18 @@ test_that("Check masked functions", { func <- lapply(mas

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-25 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221486087 @vectorijk As we discussed above, the R version differs between local and Jenkins. We installed R 3.3.0 locally, while Jenkins still uses the old version

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-25 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221487340 @vectorijk please run conflicts(detail = TRUE) and check the $package:SparkR with test_context.R, namesOfMasked. 3.2.0 should have more methods than 3.1.0

[GitHub] spark pull request #13647: [SPARK-15784][ML][WIP]:Add Power Iteration Cluste...

2016-06-13 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/13647 [SPARK-15784][ML][WIP]:Add Power Iteration Clustering to spark.ml ## What changes were proposed in this pull request? This PR is to add Power Iteration Clustering to spark.ml

[GitHub] spark pull request #13476: [SPARK-15684][SparkR]Not mask startsWith and ends...

2016-06-06 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/13476#discussion_r65842260 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -1137,6 +1137,13 @@ test_that("string operators", { expect_equal(count(wher

[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

2016-05-27 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13301#issuecomment-20897 @MLnick Done. Removed the seed in Python and Java.

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-26 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13284#issuecomment-221786544 @shivaram I will create a JIRA soon. I will be traveling to NYC on Thursday and Friday, so I will do it on Saturday.

[GitHub] spark pull request: [SPARK-15449][MLlib][Example]:Wrong Data Forma...

2016-05-26 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13301#issuecomment-222040536 @MLnick I am traveling now. I will update it on Saturday. Thanks!

[GitHub] spark issue #13647: [SPARK-15784][ML]:Add Power Iteration Clustering to spar...

2016-06-21 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/13647 @yanboliang As we discussed offline, I made the implementation. You might want to take a pass on the implementation. cc @mengxr @josephkb

[GitHub] spark pull request #13755: [SPARK-16040][MLlib][DOC]:spark.mllib PIC documen...

2016-06-17 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/13755 [SPARK-16040][MLlib][DOC]:spark.mllib PIC document extra line of reference ## What changes were proposed in this pull request? In the 2.0 document, Line "A full ex

[GitHub] spark pull request: merge code

2016-02-25 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/11380 merge code ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested? (Please explain how

[GitHub] spark pull request: merge code

2016-02-25 Thread wangmiao1981
Github user wangmiao1981 closed the pull request at: https://github.com/apache/spark/pull/11380

[GitHub] spark pull request: merge code

2016-02-25 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/11380#issuecomment-189038485 Sorry for mistakenly sending it out. I wanted to merge the master code into my own branch.

[GitHub] spark pull request: [SPARK-13034] Add export/import for all estima...

2016-03-09 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/11552#issuecomment-194441901 @GayathriMurali Thanks! I see you added one more classification other than logistic regression and naive Bayes. When I was working on my code base, that classifier

[GitHub] spark pull request: SPARK-13034[ML]:PySpark ml.classification supp...

2016-03-08 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/11582#issuecomment-194014892 @srowen I added the title in the pull request. Sorry for causing the confusion here. I only made changes in one python file. All other changes are merged from

[GitHub] spark pull request: [SPARK-14071][PySpark][ML]Change MLWritable.wr...

2016-03-28 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/11945#discussion_r57632372 --- Diff: python/pyspark/ml/tests.py --- @@ -655,6 +656,20 @@ def test_nested_pipeline_persistence(self): except OSError

[GitHub] spark pull request: [SPARK-14071][PySpark][ML]Change MLWritable.wr...

2016-03-28 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/11945#discussion_r57636219 --- Diff: python/pyspark/ml/tests.py --- @@ -655,6 +656,20 @@ def test_nested_pipeline_persistence(self): except OSError

[GitHub] spark pull request: [SPARK-14392][ML]CountVectorizer Estimator sho...

2016-04-06 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12200#discussion_r58754307 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/CountVectorizerSuite.scala --- @@ -183,6 +183,26 @@ class CountVectorizerSuite extends

[GitHub] spark pull request: [SPARK-14392][ML]CountVectorizer Estimator sho...

2016-04-06 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12200#discussion_r58748929 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/CountVectorizerSuite.scala --- @@ -183,6 +183,26 @@ class CountVectorizerSuite extends

[GitHub] spark pull request: [SPARK-14392][ML]CountVectorizer Estimator sho...

2016-04-06 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12200#discussion_r58750775 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/CountVectorizerSuite.scala --- @@ -183,6 +183,26 @@ class CountVectorizerSuite extends

[GitHub] spark pull request: [SPARK-14392][ML]CountVectorizer Estimator sho...

2016-04-06 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12200#discussion_r58744522 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/CountVectorizerSuite.scala --- @@ -183,6 +183,26 @@ class CountVectorizerSuite extends

[GitHub] spark pull request: [SPARK-14392][ML]CountVectorizer Estimator sho...

2016-04-06 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12200#discussion_r58746316 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/CountVectorizerSuite.scala --- @@ -183,6 +183,26 @@ class CountVectorizerSuite extends

[GitHub] spark pull request: [SPARK-12569][PySpark][ML]:DecisionTreeRegress...

2016-04-07 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12116#discussion_r58909275 --- Diff: python/pyspark/ml/regression.py --- @@ -425,6 +425,10 @@ class DecisionTreeRegressor(JavaEstimator, HasFeaturesCol, HasLabelCol, HasPredi

[GitHub] spark pull request: [SPARK-14392][ML]CountVectorizer Estimator sho...

2016-04-07 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12200#issuecomment-206999812 @MLnick I will revise the test accordingly. I think after testing the estimator, I need to turn off the flag of the trained model first. Otherwise, the binary

[GitHub] spark pull request: [SPARK-14392][ML]CountVectorizer Estimator sho...

2016-04-07 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12200#discussion_r58929398 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -127,6 +146,9 @@ class CountVectorizer(override val uid: String

[GitHub] spark pull request: [SPARK-12569][PySpark][ML]:DecisionTreeRegress...

2016-04-05 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12116#issuecomment-205896732 @holdenk Thanks for your comments! I will make changes accordingly.

[GitHub] spark pull request: [SPARK-14392][ML]CountVectorizer Estimator sho...

2016-04-06 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12200#discussion_r58662007 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/CountVectorizerSuite.scala --- @@ -115,6 +115,27 @@ class CountVectorizerSuite extends

[GitHub] spark pull request: [SPARK-14392][ML]CountVectorizer Estimator sho...

2016-04-06 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/12200 [SPARK-14392][ML]CountVectorizer Estimator should include binary toggle Param ## What changes were proposed in this pull request? CountVectorizerModel has a binary toggle param
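The snippet above is cut off, so what follows is only a sketch of what a "binary toggle" on a count vectorizer usually means, under the assumption that it reports every nonzero term count as 1. It is written in plain Python rather than against the Spark API, and the function name count_vector and the sample data are hypothetical.

```python
from collections import Counter


def count_vector(tokens, vocabulary, binary=False):
    """Hypothetical helper: count how often each vocabulary term occurs."""
    counts = Counter(tokens)
    vector = [counts.get(term, 0) for term in vocabulary]
    if binary:
        # Assumed meaning of the toggle: presence (1) or absence (0) only.
        vector = [1 if count > 0 else 0 for count in vector]
    return vector


vocab = ["a", "b", "c"]
doc = ["a", "b", "a", "a"]
plain = count_vector(doc, vocab)
toggled = count_vector(doc, vocab, binary=True)
print(plain, toggled)
```

With the toggle off the vector carries raw counts; with it on, only presence or absence of each term survives.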

[GitHub] spark pull request: [SPARK-14392][ML]CountVectorizer Estimator sho...

2016-04-06 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12200#issuecomment-206181665 @MLnick Can you trigger the auto test? It seems that I am not on the whitelist. I had one JIRA merged to master. Thanks! Miao

[GitHub] spark pull request: [SPARK-12569][PySpark][ML]:DecisionTreeRegress...

2016-04-06 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12116#issuecomment-206145891 @jkbradley Can you add me to the whitelist to trigger the integration test? Thanks! Miao

[GitHub] spark pull request: [SPARK-12569][PySpark][ML]:DecisionTreeRegress...

2016-04-06 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12116#issuecomment-206449257 @holdenk I am thinking about what tests should be added. Do you have any suggestions? Thanks! Miao

[GitHub] spark pull request: [SPARK-14392][ML]CountVectorizer Estimator sho...

2016-04-06 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12200#discussion_r58739580 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -42,7 +42,8 @@ private[feature] trait CountVectorizerParams

[GitHub] spark pull request: [SPARK-14392][ML]CountVectorizer Estimator sho...

2016-04-06 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12200#discussion_r58740250 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/CountVectorizer.scala --- @@ -100,6 +103,24 @@ private[feature] trait CountVectorizerParams

[GitHub] spark pull request: [SPARK-12569][PySpark][ML]:DecisionTreeRegress...

2016-04-07 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12116#discussion_r58981236 --- Diff: python/pyspark/ml/regression.py --- @@ -433,12 +440,12 @@ class DecisionTreeRegressor(JavaEstimator, HasFeaturesCol, HasLabelCol, HasPredi

[GitHub] spark pull request: SPARK-13034[ML]:PySpark ml.classification supp...

2016-03-19 Thread wangmiao1981
Github user wangmiao1981 closed the pull request at: https://github.com/apache/spark/pull/11582

[GitHub] spark pull request: SPARK-13034[ML]:PySpark ml.classification supp...

2016-03-19 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/11582#issuecomment-198064456 Closing this one, as it has been merged with 11707.

[GitHub] spark pull request: [SPARK-14071][PySpark][ML]Change MLWritable.wr...

2016-03-24 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/11945#issuecomment-201146670 Found the issue: PEP8 checks failed. ./python/pyspark/ml/tests.py:658:5: E301 expected 1 blank line, found 0

[GitHub] spark pull request: [SPARK-14071][PySpark][ML]Change MLWritable.wr...

2016-03-24 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/11945#issuecomment-201146467 Build finished. The HTML pages are in _build/html. [error] running /home/jenkins/workspace/SparkPullRequestBuilder@3/dev/lint-python ; received return code 1

[GitHub] spark pull request: [SPARK-14071][PySpark][ML]Change MLWritable.wr...

2016-03-25 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/11945#issuecomment-201174996 Jenkins, test this please

[GitHub] spark pull request: [SPARK-14071][PySpark][ML]Change MLWritable.wr...

2016-03-25 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/11945#issuecomment-201375961 Jenkins, test this please

[GitHub] spark pull request: [SPARK-14071][PySpark][ML]Change MLWritable.wr...

2016-03-24 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/11945 [SPARK-14071][PySpark][ML]Change MLWritable.write to be a property Add property to MLWritable.write method, so we can use .write instead of .write() Add a new test to ml/test.py
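The change proposed above, making write a property so that callers use .write instead of .write(), can be sketched in plain Python. This is only an illustration under stated assumptions: SketchWriter and SketchWritable are hypothetical stand-ins, not the pyspark classes, and save here only returns a message instead of persisting anything.

```python
class SketchWriter:
    """Hypothetical stand-in for a writer object (not pyspark's writer)."""

    def __init__(self, owner_name):
        self.owner_name = owner_name

    def save(self, path):
        # A real writer would persist the instance; this sketch only reports it.
        return "saved {} to {}".format(self.owner_name, path)


class SketchWritable:
    """Hypothetical stand-in for a writable ML object."""

    def __init__(self, name):
        self.name = name

    @property
    def write(self):
        # With @property, plain attribute access returns a writer,
        # so callers write obj.write.save(...) with no parentheses on write.
        return SketchWriter(self.name)


model = SketchWritable("demo-model")
result = model.write.save("/tmp/demo")
print(result)
```

One consequence of the property form is that existing callers using model.write() would fail, because the returned writer object is not callable.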

[GitHub] spark pull request: [SPARK-14071][PySpark][ML]Change MLWritable.wr...

2016-03-26 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/11945#issuecomment-201947768 @jkbradley I am not sure whether the property tag will change the appearance of the members in the doc. I can do a quick check by rolling back the change to check

[GitHub] spark pull request: SPARK-13034

2016-03-08 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/11582 SPARK-13034 I added Import and Export for Logisticregression and Naive Bayes Test ./python/run-tests --python-executables=python2.7 --modules=pyspark-ml Result: Running

[GitHub] spark pull request: [SPARK-13034] Add export/import for all estima...

2016-03-08 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/11552#issuecomment-193931859 Hi Gayathri, I put my comments in the JIRA about 2 weeks ago and worked with Yanbo on putting some code. Can we work together to get it merged? I

[GitHub] spark pull request: [SPARK-12569][PySpark][ML]:DecisionTreeRegress...

2016-04-01 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/12116 [SPARK-12569][PySpark][ML]:DecisionTreeRegressor: provide variance of prediction: Python AP ## What changes were proposed in this pull request? A new column VarianceCol has been

[GitHub] spark pull request: [SPARK-14071][PySpark][ML]Change MLWritable.wr...

2016-03-27 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/11945#discussion_r57545765 --- Diff: python/pyspark/ml/tests.py --- @@ -655,6 +656,20 @@ def test_nested_pipeline_persistence(self): except OSError

[GitHub] spark pull request: [SPARK-12569][PySpark][ML]:DecisionTreeRegress...

2016-04-04 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12116#issuecomment-205513619 @holdenk Thanks for pointing it out. I will revise it.

[GitHub] spark pull request: [SPARK-12569][PySpark][ML]:DecisionTreeRegress...

2016-04-04 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12116#issuecomment-205520304 @holdenk I made the changes and tested the gen code. Can you review it? Thanks!

[GitHub] spark pull request: [SPARK-14571][ML]Log instrumentation in ALS

2016-04-25 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12560#issuecomment-214515243 @MLnick I agree. I will remove the feature log now and only log parameters. I will keep the named feature method.

[GitHub] spark pull request: [SPARK-14571][ML]Log instrumentation in ALS

2016-04-25 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12560#issuecomment-214474647 @MLnick Yanbo does not like the change of train() API. The new parameter is optional, so the user of train should not be aware of this change. In addition, I

[GitHub] spark pull request: [SPARK-14433][PySpark][ML]:PySpark ml Gaussian...

2016-04-25 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12402#discussion_r60963521 --- Diff: python/pyspark/ml/clustering.py --- @@ -22,7 +22,151 @@ from pyspark.mllib.common import inherit_doc __all__

[GitHub] spark pull request: [SPARK-14571][ML]Log instrumentation in ALS

2016-04-22 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12560#discussion_r60701976 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala --- @@ -607,7 +611,8 @@ object ALS extends DefaultParamsReadable[ALS

[GitHub] spark pull request: [SPARK-14571][ML]Log instrumentation in ALS

2016-04-21 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12560#issuecomment-213264393 retest this please

[GitHub] spark pull request: [SPARK-14433][PySpark][ML]:PySpark ml Gaussian...

2016-04-22 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12402#discussion_r60694336 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala --- @@ -104,6 +105,17 @@ class GaussianMixtureModel private[ml

[GitHub] spark pull request: [SPARK-14433][PySpark][ML]:PySpark ml Gaussian...

2016-04-22 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12402#issuecomment-213306253 retest this please.

[GitHub] spark pull request: [SPARK-14571][ML]Log instrumentation in ALS

2016-04-21 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12560#issuecomment-213023729 Thanks all for your comments! Let me figure out how to collect the information without slowing the algorithm. @MLnick The names are passed to the log. For example

[GitHub] spark pull request: [SPARK-14571][ML]Log instrumentation in ALS

2016-04-21 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12560#issuecomment-213059111 @thunterdb train method has count information, but it will change the signature of the train method. I am learning how to avoid collect and changing signature

[GitHub] spark pull request: [SPARK-14433][PySpark][ML]:PySpark ml Gaussian...

2016-04-21 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12402#issuecomment-213096256 @jkbradley @yanboliang I made the changes and removed the unused import.

[GitHub] spark pull request: [SPARK-14433][PySpark][ML]:PySpark ml Gaussian...

2016-04-22 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12402#issuecomment-213514940 @yanboliang @jkbradley I made all the suggested changes and improved the documentation in the comments. Thanks!

[GitHub] spark pull request: [SPARK-14937][ML][Document]spark.ml LogisticRe...

2016-04-26 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/12717 [SPARK-14937][ML][Document]spark.ml LogisticRegression sqlCtx in scala is inconsistent with java and python ## What changes were proposed in this pull request? In spark.ml document

[GitHub] spark pull request: [SPARK-14937][ML][Document]spark.ml LogisticRe...

2016-04-26 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12717#issuecomment-214919250 retest this please.

[GitHub] spark pull request: [SPARK-14937][ML][Document]spark.ml LogisticRe...

2016-04-27 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12717#discussion_r61305059 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/LogisticRegressionSummaryExample.scala --- @@ -30,11 +30,11 @@ object

[GitHub] spark pull request: [SPARK-14571][ML]Log instrumentation in ALS

2016-04-27 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12560#issuecomment-215135795 @MLnick @yanboliang Any further comments? Thanks!

[GitHub] spark pull request: [SPARK-14937][ML][Document]spark.ml LogisticRe...

2016-04-27 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12717#discussion_r61288394 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/LogisticRegressionSummaryExample.scala --- @@ -30,11 +30,11 @@ object

[GitHub] spark pull request: [SPARK-14937][ML][Document]spark.ml LogisticRe...

2016-04-27 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12717#discussion_r61306574 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/LogisticRegressionSummaryExample.scala --- @@ -30,11 +30,11 @@ object

[GitHub] spark pull request: [SPARK-14571][ML]Log instrumentation in ALS

2016-04-27 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12560#issuecomment-215235728 @thunterdb What do you think about our discussions? Thanks! Miao

[GitHub] spark pull request: [SPARK-14937][ML][Document]spark.ml LogisticRe...

2016-04-26 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12717#issuecomment-214970534 @yanboliang Can you take a look? It is a simple fix. Thanks!

[GitHub] spark pull request: [SPARK-14434][ML]:User guide doc and examples ...

2016-04-30 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12788#issuecomment-215999406 cc @yanboliang @jkbradley @MLnick @holdenk

[GitHub] spark pull request: [SPARK-14433][PySpark][ML]:PySpark ml Gaussian...

2016-04-21 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12402#issuecomment-213232640 @jkbradley Thanks for your review! I will make the changes accordingly.

[GitHub] spark pull request: [SPARK-15360][Spark-Submit]Should print spark-...

2016-05-19 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/13163#discussion_r63970749 --- Diff: launcher/src/test/java/org/apache/spark/launcher/SparkSubmitCommandBuilderSuite.java --- @@ -59,6 +59,17 @@ public void

[GitHub] spark pull request: [SPARK-15360][Spark-Submit]Should print spark-...

2016-05-18 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/13163 [SPARK-15360][Spark-Submit]Should print spark-submit usage when no arguments is specified ## What changes were proposed in this pull request? (Please fill in changes proposed

[GitHub] spark pull request: [SPARK-15360][Spark-Submit]Should print spark-...

2016-05-19 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/13163#discussion_r63988573 --- Diff: launcher/src/test/java/org/apache/spark/launcher/SparkSubmitCommandBuilderSuite.java --- @@ -59,6 +59,18 @@ public void

[GitHub] spark pull request: [SPARK-15363][ML][Example]:Example code should...

2016-05-19 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/13213 [SPARK-15363][ML][Example]:Example code shouldn't use VectorImplicits._, asML/fromML ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix

[GitHub] spark pull request: [SPARK-15492][ML][DOC]:Binarization scala exam...

2016-05-23 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13266#issuecomment-221168009 @jerryshao We have several similar bugs fixed. I am doing QA for ML 2.0 document now.

[GitHub] spark pull request: [SPARK-15439][SparkR]:Failed to run unit test ...

2016-05-24 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/13284 [SPARK-15439][SparkR]:Failed to run unit test in SparkR ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) There are some failures

[GitHub] spark pull request: [SPARK-15492][ML][DOC]:Binarization scala exam...

2016-05-23 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/13266 [SPARK-15492][ML][DOC]:Binarization scala example copy & paste to spark-shell error ## What changes were proposed in this pull request? (Please fill in changes proposed in this

[GitHub] spark pull request: [SPARK-15360][Spark-Submit]Should print spark-...

2016-05-18 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13163#issuecomment-220163385 retest this please

[GitHub] spark pull request: [SPARK-15360][Spark-Submit]Should print spark-...

2016-05-18 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13163#issuecomment-220198683 @vanzin Yes, that is what I mean and I want to confirm with you. I only check no exception.

[GitHub] spark pull request: [SPARK-15360][Spark-Submit]Should print spark-...

2016-05-18 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/13163#issuecomment-220186565 retest this please
