[GitHub] spark issue #13514: [SPARK-15770][ML] Annotation audit for Experimental and ...

2016-06-04 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/13514 **[Test build #60002 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60002/consoleFull)** for PR 13514 at commit [`42adf3f`](https://github.com/apache/spark

[GitHub] spark issue #13514: [SPARK-15770][ML] 'Experimental' annotation audit

2016-06-04 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/13514 **[Test build #60001 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60001/consoleFull)** for PR 13514 at commit [`546987f`](https://github.com/apache/spark

[GitHub] spark pull request #13514: [SPARK-15770][ML] 'Experimental' annotation audit

2016-06-04 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/13514 [SPARK-15770][ML] 'Experimental' annotation audit ## What changes were proposed in this pull request? 1, remove comments `:: Experimental ::` for non-experimental API 2, add comments

[GitHub] spark issue #13511: [SPARK-14900][ML][PySpark] Add accuracy and deprecate pr...

2016-06-04 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/13511 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #13511: [SPARK-14900][ML][PySpark] Add accuracy and deprecate pr...

2016-06-04 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/13511 **[Test build #59997 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59997/consoleFull)** for PR 13511 at commit [`652b9ef`](https://github.com/apache/spark

[GitHub] spark issue #13511: [SPARK-14900][ML][PySpark] Add accuracy and deprecate pr...

2016-06-04 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/13511 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59997/ Test PASSed

[GitHub] spark issue #13511: [SPARK-14900][ML][PySpark] Add accuracy and deprecate pr...

2016-06-04 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/13511 @srowen It is just the local test code, please ignore it.

[GitHub] spark issue #13511: [SPARK-14900][ML][PySpark] Add accuracy and deprecate pr...

2016-06-04 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/13511 Looks correct, what are you pointing out?

[GitHub] spark issue #13511: [SPARK-14900][ML][PySpark] Add accuracy and deprecate pr...

2016-06-04 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/13511 **[Test build #59997 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59997/consoleFull)** for PR 13511 at commit [`652b9ef`](https://github.com/apache/spark

[GitHub] spark issue #13511: [SPARK-14900][ML][PySpark] Add accuracy and deprecate pr...

2016-06-04 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/13511 cc @srowen

[GitHub] spark issue #13511: [SPARK-14900][ML][PySpark] Add accuracy and deprecate pr...

2016-06-04 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/13511 ``` >>> from pyspark.mllib.evaluation import * >>> predictionAndLabels = sc.parallelize([(0.0, 0.0), (0.0, 1.0), (0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (1.0, 1.0), (1.

[GitHub] spark pull request #13511: [SPARK-14900][ML][PySpark] Add accuracy and depre...

2016-06-04 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/13511 [SPARK-14900][ML][PySpark] Add accuracy and deprecate precison,recall,f1 ## What changes were proposed in this pull request? 1, add accuracy for MulticlassMetrics 2, deprecate overall

[GitHub] spark issue #13390: [SPARK-15617][ML][DOC] Clarify that fMeasure in Multicla...

2016-06-04 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/13390 Merged to master/2.0

[GitHub] spark pull request #13390: [SPARK-15617][ML][DOC] Clarify that fMeasure in M...

2016-06-03 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/13390#discussion_r65692163 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/MulticlassClassificationEvaluator.scala --- @@ -39,16 +39,16 @@ class

[GitHub] spark issue #13390: [SPARK-15617][ML][DOC] Clarify that fMeasure in Multicla...

2016-06-01 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/13390 cc @srowen

[GitHub] spark pull request: [SPARK-15650][ML] Add correctness test for Mul...

2016-05-30 Thread zhengruifeng
Github user zhengruifeng closed the pull request at: https://github.com/apache/spark/pull/13398

[GitHub] spark pull request: [SPARK-15650][ML] Add correctness test for Mul...

2016-05-30 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/13398#discussion_r65077584 --- Diff: mllib/src/test/scala/org/apache/spark/ml/evaluation/MulticlassClassificationEvaluatorSuite.scala --- @@ -40,4 +41,51 @@ class

[GitHub] spark pull request: [SPARK-15650][ML] Add correctness test for Mul...

2016-05-30 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/13398#discussion_r65048853 --- Diff: mllib/src/test/scala/org/apache/spark/ml/evaluation/MulticlassClassificationEvaluatorSuite.scala --- @@ -40,4 +41,51 @@ class

[GitHub] spark pull request: [SPARK-15650][ML] Add correctness test for Mul...

2016-05-30 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/13398#discussion_r65046391 --- Diff: mllib/src/test/scala/org/apache/spark/ml/evaluation/MulticlassClassificationEvaluatorSuite.scala --- @@ -40,4 +41,51 @@ class

[GitHub] spark pull request: [SPARK-15650][ML] Add correctness test for Mul...

2016-05-30 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/13398#issuecomment-222445158 cc @yanboliang

[GitHub] spark pull request: [SPARK-15650][ML] Add correctness test for Mul...

2016-05-30 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/13398 [SPARK-15650][ML] Add correctness test for MulticlassClassification ## What changes were proposed in this pull request? Add tests to verify the correctness

[GitHub] spark pull request: [SPARK-15617][ML][DOC] Clarify that fMeasure i...

2016-05-30 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/13390#discussion_r65030639 --- Diff: docs/mllib-evaluation-metrics.md --- @@ -182,17 +183,22 @@ $$\hat{\delta}(x) = \begin{cases}1 & \text{if $x = 0$}, \\ 0 & \text{o

[GitHub] spark pull request: [SPARK-15617][ML][DOC] Clarify that fMeasure i...

2016-05-30 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/13390#discussion_r65027076 --- Diff: docs/mllib-evaluation-metrics.md --- @@ -182,17 +183,22 @@ $$\hat{\delta}(x) = \begin{cases}1 & \text{if $x = 0$}, \\ 0 & \text{o

[GitHub] spark pull request: [SPARK-15617][ML][DOC] Clarify that fMeasure i...

2016-05-29 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/13390#issuecomment-222364467 test this please

[GitHub] spark pull request: [SPARK-15617][ML][DOC] Clarify that fMeasure i...

2016-05-29 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/13390#discussion_r65005147 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/evaluation/MulticlassMetrics.scala --- @@ -153,7 +153,7 @@ class MulticlassMetrics @Since("

[GitHub] spark pull request: [SPARK-15617][ML][DOC] Clarify that fMeasure i...

2016-05-29 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/13390 [SPARK-15617][ML][DOC] Clarify that fMeasure in MulticlassMetrics is "micro" f1_score ## What changes were proposed in this pull request? clarify that fMeasure in Multicl
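
The "micro" averaging named in this PR title can be illustrated without Spark. The following is a minimal sketch in plain Python, assuming single-label multiclass data; the helper names `micro_f1` and `accuracy` and the sample pairs are hypothetical and are not part of `MulticlassMetrics`. It pools true positives, false positives, and false negatives over all classes, which makes micro-averaged F1 coincide with plain accuracy in this setting.

```python
from collections import Counter


def micro_f1(pairs):
    """Micro-averaged F1 over (prediction, label) pairs.

    Hypothetical helper for illustration; not Spark's implementation.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for pred, label in pairs:
        if pred == label:
            tp[label] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[label] += 1  # the true class gains a false negative
    total_tp = sum(tp.values())
    if total_tp == 0:
        return 0.0  # no correct predictions: avoid dividing by zero
    precision = total_tp / (total_tp + sum(fp.values()))
    recall = total_tp / (total_tp + sum(fn.values()))
    return 2 * precision * recall / (precision + recall)


def accuracy(pairs):
    """Fraction of pairs whose prediction equals the label."""
    return sum(1 for pred, label in pairs if pred == label) / len(pairs)


# Illustrative (prediction, label) pairs, not taken from the PR.
pairs = [(0.0, 0.0), (0.0, 1.0), (0.0, 0.0), (1.0, 0.0),
         (1.0, 1.0), (1.0, 1.0), (2.0, 2.0), (2.0, 0.0)]
```

Every misclassified sample adds exactly one false positive and one false negative, so the pooled precision and recall share the same denominator and both reduce to the fraction of correct predictions.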

[GitHub] spark pull request: [DO_NOT_MERGE][ML] Enable java save/load tests...

2016-05-28 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/13376 [DO_NOT_MERGE][ML] Enable java save/load tests for ML ## What changes were proposed in this pull request? Because `SPARK-6725` is already Resolved, the `TODO` save/load tests depending

[GitHub] spark pull request: [SPARK-15610][ML] PCA should not support k == ...

2016-05-27 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/13356#issuecomment-77181 So I will just change the error message.

[GitHub] spark pull request: [SPARK-15607][ML] Remove redundant toArray in ...

2016-05-27 Thread zhengruifeng
Github user zhengruifeng closed the pull request at: https://github.com/apache/spark/pull/13354

[GitHub] spark pull request: [SPARK-15291][GraphX] Remove redundant operati...

2016-05-27 Thread zhengruifeng
Github user zhengruifeng closed the pull request at: https://github.com/apache/spark/pull/13075

[GitHub] spark pull request: [SPARK-15610][ML] PCA should not support k == ...

2016-05-27 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/13356#issuecomment-222109494 cc @jkbradley @mengxr

[GitHub] spark pull request: [SPARK-15610][ML] PCA should not support k == ...

2016-05-27 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/13356 [SPARK-15610][ML] PCA should not support k == numFeatures ## What changes were proposed in this pull request? Fix the wrong bound of `k` in `PCA` `require(k <= sources.first().s

[GitHub] spark pull request: [SPARK-15607][ML] Remove redundant toArray in ...

2016-05-27 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/13354 [SPARK-15607][ML] Remove redundant toArray in ml.linalg ## What changes were proposed in this pull request? `sliceInds, sliceVals` are already of type `Array`, so remove `toArray

[GitHub] spark pull request: [MINOR] Fix Typos

2016-05-26 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/13317#issuecomment-221803163 @holdenk Thanks. Good night.

[GitHub] spark pull request: [MINOR] Fix Typos

2016-05-26 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/13317#issuecomment-221801013 @holdenk Excuse me, how should I deal with the MiMa test failures? They seem to be caused by a change in `core`.

[GitHub] spark pull request: [MINOR] Fix Typos

2016-05-26 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/13317#issuecomment-221795573 @holdenk Yes. I use a command like `grep -in ' a [aeiou]' mllib/src/main/scala/org/apache/spark/ml/*/*scala` to list the potentially wrong lines.
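
The search described in this comment can be mirrored in a few lines of Python. This is a minimal sketch that assumes the same pattern as the quoted `grep -in ' a [aeiou]'` call; the function name `candidate_lines` and the sample text are made up for illustration. Like the grep call, it only produces candidates: a phrase such as "a one-..." is matched even though it is correct, so each hit still needs a manual check.

```python
import re

# Same pattern as `grep -in ' a [aeiou]'`: case-insensitive match,
# reported with 1-based line numbers.
PATTERN = re.compile(r" a [aeiou]", re.IGNORECASE)


def candidate_lines(text):
    """Return (line_number, line) pairs that may need `a` -> `an`."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if PATTERN.search(line)
    ]


# Made-up sample lines, not taken from the Spark sources.
sample = "\n".join([
    "returns a array of weights",    # real typo: should be "an array"
    "uses a one-vs-rest reduction",  # false positive: "a one-..." is correct
    "fits a model",                  # not matched
])
hits = candidate_lines(sample)
```

Here `hits` contains the first two sample lines; only the first is an actual typo.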

[GitHub] spark pull request: [MINOR] Fix Typos

2016-05-26 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/13317#issuecomment-221792040 @holdenk Thanks. I think you are right. I will revert `an one-xxx` to `a one-xxx`.

[GitHub] spark pull request: [MINOR] Fix Typos

2016-05-26 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/13317#issuecomment-221790963 @holdenk Thanks. I have fixed this and ran `lint-java` to check the Java files.

[GitHub] spark pull request: [MINOR] Fix Typos

2016-05-26 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/13317 [MINOR] Fix Typos ## What changes were proposed in this pull request? `a` -> `an` ## How was this patch tested? local build You can me

[GitHub] spark pull request: [SPARK-14634][ML] Add BisectingKMeansSummary

2016-05-25 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/12394#issuecomment-221764870 cc @mengxr

[GitHub] spark pull request: [SPARK-15031][EXAMPLE] Use SparkSession in exa...

2016-05-20 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/13164#issuecomment-220589176 @andrewor14 @dongjoon-hyun `lint-java` checks pass now. I will run `lint-java` in the future whenever a Java file is modified. Thanks!

[GitHub] spark pull request: [SPARK-15031][EXAMPLE] Use SparkSession in exa...

2016-05-20 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/13164#issuecomment-220579481 @andrewor14 ok, I will do it.

[GitHub] spark pull request: [SPARK-15398][ML] Update the warning message t...

2016-05-19 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/13190 [SPARK-15398][ML] Update the warning message to recommend ML usage ## What changes were proposed in this pull request? MLlib is not recommended for use, and some methods are even

[GitHub] spark pull request: [SPARK-15031][EXAMPLE] Use SparkSession in exa...

2016-05-18 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/13164#issuecomment-219958157 @dongjoon-hyun I don't mind. It may be nice to have a progress bar or scrolling log output during the `lint` check.

[GitHub] spark pull request: [SPARK-15031][EXAMPLE] Use SparkSession in exa...

2016-05-18 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/13164#issuecomment-219953877 @dongjoon-hyun Do you mean running `./dev/lint-java` locally? It is too time-consuming...

[GitHub] spark pull request: [SPARK-15031][EXAMPLE] Use SparkSession in exa...

2016-05-18 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/13164 [SPARK-15031][EXAMPLE] Use SparkSession in examples ## What changes were proposed in this pull request? Use `SparkSession` according to [SPARK-15031](https://issues.apache.org/jira/browse

[GitHub] spark pull request: [SPARK-15031][EXAMPLES][FOLLOW-UP] Make Python...

2016-05-17 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/13135#discussion_r63638842 --- Diff: examples/src/main/python/ml/simple_params_example.py --- @@ -36,18 +35,20 @@ if len(sys.argv) > 1: --- End d

[GitHub] spark pull request: [SPARK-15031][EXAMPLES][FOLLOW-UP] Make Python...

2016-05-17 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/13135#discussion_r63638222 --- Diff: examples/src/main/python/ml/simple_params_example.py --- @@ -36,18 +35,20 @@ if len(sys.argv) > 1: --- End d

[GitHub] spark pull request: [SPARK-15305][ML][DOC]:spark.ml document Bisec...

2016-05-13 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/13083#issuecomment-218962997 LGTM

[GitHub] spark pull request: [MINOR] Fix Typos

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/13078#discussion_r63128336 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/Layer.scala --- @@ -64,8 +64,9 @@ private[ann] trait Layer extends Serializable

[GitHub] spark pull request: [MINOR] Fix Typos

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/13078#discussion_r63128009 --- Diff: mllib/src/main/scala/org/apache/spark/ml/ann/BreezeUtil.scala --- @@ -55,7 +55,7 @@ private[ann] object BreezeUtil { * @param y y

[GitHub] spark pull request: Branch 2.0

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/13089#issuecomment-218940318 @ahnqirage please close it

[GitHub] spark pull request: [SPARK-14434][ML]:User guide doc and examples ...

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12788#discussion_r63126914 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/GaussianMixtureExample.scala --- @@ -0,0 +1,64 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14434][ML]:User guide doc and examples ...

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12788#discussion_r63126939 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/GaussianMixtureExample.scala --- @@ -0,0 +1,64 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14434][ML]:User guide doc and examples ...

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12788#discussion_r63126885 --- Diff: examples/src/main/python/ml/gaussian_mixture_example.py --- @@ -0,0 +1,55 @@ +# +# Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-14434][ML]:User guide doc and examples ...

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12788#discussion_r63125817 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/GaussianMixtureExample.scala --- @@ -0,0 +1,64 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14434][ML]:User guide doc and examples ...

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12788#discussion_r63126132 --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaGaussianMixtureExample.java --- @@ -0,0 +1,64 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14434][ML]:User guide doc and examples ...

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12788#discussion_r63125992 --- Diff: examples/src/main/python/ml/gaussian_mixture_example.py --- @@ -0,0 +1,55 @@ +# +# Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-14434][ML]:User guide doc and examples ...

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12788#discussion_r63125902 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/GaussianMixtureExample.scala --- @@ -0,0 +1,64 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14434][ML]:User guide doc and examples ...

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12788#discussion_r63125785 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/GaussianMixtureExample.scala --- @@ -0,0 +1,64 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14434][ML]:User guide doc and examples ...

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12788#discussion_r63125752 --- Diff: examples/src/main/python/ml/gaussian_mixture_example.py --- @@ -0,0 +1,55 @@ +# +# Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-14434][ML]:User guide doc and examples ...

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12788#discussion_r63125728 --- Diff: examples/src/main/python/ml/gaussian_mixture_example.py --- @@ -0,0 +1,55 @@ +# +# Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-14434][ML]:User guide doc and examples ...

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12788#discussion_r63125610 --- Diff: examples/src/main/python/ml/gaussian_mixture_example.py --- @@ -0,0 +1,55 @@ +# +# Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-14434][ML]:User guide doc and examples ...

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12788#discussion_r63125532 --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaGaussianMixtureExample.java --- @@ -0,0 +1,64 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14434][ML]:User guide doc and examples ...

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12788#discussion_r63125508 --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaGaussianMixtureExample.java --- @@ -0,0 +1,64 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14434][ML]:User guide doc and examples ...

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12788#discussion_r63125365 --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaGaussianMixtureExample.java --- @@ -0,0 +1,64 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14434][ML]:User guide doc and examples ...

2016-05-12 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12788#discussion_r63125262 --- Diff: docs/ml-clustering.md --- @@ -148,5 +148,89 @@ Refer to the [Python API docs](api/python/pyspark.ml.html#pyspark.ml.clustering

[GitHub] spark pull request: [MINOR] Fix Typos

2016-05-12 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/13078 [MINOR] Fix Typos ## What changes were proposed in this pull request? Fix several typos in ML and SQL ## How was this patch tested? manual tests You can merge

[GitHub] spark pull request: [TEST] Remove redundant codes in SVD++

2016-05-12 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/13075 [TEST] Remove redundant codes in SVD++ ## What changes were proposed in this pull request? ``` val newVertices = g.vertices.mapValues(v => (v._1.toArray, v._2.toArray, v._3, v

[GitHub] spark pull request: [SPARK-15031][SPARK-15134][EXAMPLE][DOC] Use S...

2016-05-11 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/13050#discussion_r62955620 --- Diff: examples/src/main/python/ml/simple_params_example.py --- @@ -18,36 +18,30 @@ from __future__ import print_function import

[GitHub] spark pull request: [SPARK-15031][SPARK-15134][EXAMPLE][DOC] Use S...

2016-05-11 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/13050#issuecomment-218460211 cc @andrewor14

[GitHub] spark pull request: [SPARK-15031][SPARK-15134][EXAMPLE][DOC] Use S...

2016-05-11 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/13050 [SPARK-15031][SPARK-15134][EXAMPLE][DOC] Use SparkSession and update indent in examples ## What changes were proposed in this pull request? 1, use `SparkSession` according to [SPARK

[GitHub] spark pull request: [SPARK-15141][EXAMPLE][DOC] Update OneVsRest E...

2016-05-11 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12920#discussion_r62801050 --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaOneVsRestExample.java --- @@ -17,222 +17,68 @@ package

[GitHub] spark pull request: [SPARK-14340][EXAMPLE][DOC] Update Examples an...

2016-05-11 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/11844#discussion_r62798957 --- Diff: examples/src/main/python/ml/bisecting_k_means_example.py --- @@ -36,21 +35,20 @@ .getOrCreate() # $example

[GitHub] spark pull request: [SPARK-14340][EXAMPLE][DOC] Update Examples an...

2016-05-11 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/11844#discussion_r62798712 --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaBisectingKMeansExample.java --- @@ -48,26 +43,19 @@ public static void main(String

[GitHub] spark pull request: [MINOR][PySpark] update _shared_params_code_ge...

2016-05-11 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/12996#issuecomment-218375184 @holdenk Ok, I will update it.

[GitHub] spark pull request: [SPARK-14340][EXAMPLE][DOC] Update Examples an...

2016-05-10 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/11844#issuecomment-218345784 @MLnick Thanks. Updated

[GitHub] spark pull request: [SPARK-14340][EXAMPLE][DOC] Update Examples an...

2016-05-10 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/11844#discussion_r62783082 --- Diff: docs/ml-clustering.md --- @@ -104,4 +104,48 @@ Refer to the [Java API docs](api/java/org/apache/spark/ml/clustering/LDA.html) f

[GitHub] spark pull request: [SPARK-15149][EXAMPLE][DOC] update kmeans exam...

2016-05-10 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/12925#issuecomment-218343025 @MLnick Thanks. Updated

[GitHub] spark pull request: [SPARK-15149][EXAMPLE][DOC] update kmeans exam...

2016-05-10 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12925#discussion_r62781433 --- Diff: examples/src/main/python/ml/kmeans_example.py --- @@ -17,55 +17,43 @@ from __future__ import print_function -import sys

[GitHub] spark pull request: [SPARK-15141][EXAMPLE][DOC] Update OneVsRest E...

2016-05-10 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/12920#issuecomment-218340225 @MLnick Thanks. Updated

[GitHub] spark pull request: [MINOR][PySpark] update _shared_params_code_ge...

2016-05-09 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/12996#issuecomment-218015762 @yanboliang I have checked all the SharedParams. I will review the other params in PySpark.

[GitHub] spark pull request: [SPARK-14340][EXAMPLE][DOC] Update Examples an...

2016-05-09 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/11844#issuecomment-218015365 @wangmiao1981 Once this PR is merged, you can directly load the datafile in your PR.

[GitHub] spark pull request: [SPARK-14340][EXAMPLE][DOC] Update Examples an...

2016-05-09 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/11844#discussion_r62587969 --- Diff: docs/ml-clustering.md --- @@ -104,4 +104,48 @@ Refer to the [Java API docs](api/java/org/apache/spark/ml/clustering/LDA.html) f

[GitHub] spark pull request: [SPARK-14340][EXAMPLE][DOC] Update Examples an...

2016-05-09 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/11844#issuecomment-217860041 @MLnick Sorry for pulling other people's commits into this; I had to recreate this PR. I have updated it according to your comments. Thanks.

[GitHub] spark pull request: [SPARK-14340][EXAMPLE][DOC] Update Examples an...

2016-05-09 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/11844#issuecomment-217797280 @MLnick The results are OK now.

[GitHub] spark pull request: [MINOR][PySpark] update _shared_params_code_ge...

2016-05-08 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/12996 [MINOR][PySpark] update _shared_params_code_gen.py ## What changes were proposed in this pull request? 1, add argument checks for `tol` and `stepSize` to keep in line

[GitHub] spark pull request: [SPARK-15141][EXAMPLE][DOC] Update OneVsRest E...

2016-05-08 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12920#discussion_r62429495 --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaOneVsRestExample.java --- @@ -17,222 +17,69 @@ package

[GitHub] spark pull request: [DO-NOT-MERGE][TEST] Unify 'range' usage

2016-05-08 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/12983 [DO-NOT-MERGE][TEST] Unify 'range' usage ## What changes were proposed in this pull request? Most Python files use `range` directly, ignoring the difference in implementation between Python 2 and 3

[GitHub] spark pull request: [SPARK-15210][SQL] Add missing @DeveloperApi a...

2016-05-07 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/12982 [SPARK-15210][SQL] Add missing @DeveloperApi annotation in sql.types ## What changes were proposed in this pull request? add @DeveloperApi annotation for `AbstractDataType` `MapType

[GitHub] spark pull request: [SPARK-14340][DOC] Add Scala Example and User ...

2016-05-07 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/11844#issuecomment-217688171 @MLnick I updated the PRs of KMeans and BisectingKMeans to directly load data file `data/mllib/sample_kmeans_data.txt`. However, due to issue

[GitHub] spark pull request: [SPARK-15149][EXAMPLE][DOC] update kmeans exam...

2016-05-07 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/12925#issuecomment-217688076 `data/mllib/sample_kmeans_data.txt` was created in [BisectingKMeans examples](https://github.com/apache/spark/pull/11844)

[GitHub] spark pull request: [SPARK-15141][EXAMPLE][DOC] Add python example...

2016-05-07 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/12920#issuecomment-217683598 @MLnick `MulticlassClassificationEvaluator` does not support `confusionMatrix` now, so I will just remove the computation of `confusionMatrix`.

[GitHub] spark pull request: [SPARK-14340][DOC] Add Scala Example and User ...

2016-05-06 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/11844#issuecomment-217595504 The type of the features after `spark.read.format("libsvm").load(..)` is `mllib.SparseVector`. Is it that `DataSet` cannot handle `mllib.SparseVector`?

[GitHub] spark pull request: [SPARK-14340][DOC] Add Scala Example and User ...

2016-05-06 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/11844#issuecomment-217595289 @MLnick @sethah @yanboliang There is something wrong: ``` val dataset = spark.read.format("libsvm").load("data/mllib/sample_

[GitHub] spark pull request: [SPARK-14340][DOC] Add Scala Example and User ...

2016-05-06 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/11844#discussion_r62326629 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/BisectingKMeansExample.scala --- @@ -0,0 +1,69 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-15150][EXAMPLE][DOC] Add python example...

2016-05-06 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12927#discussion_r62324992 --- Diff: examples/src/main/python/ml/lda_example.py --- @@ -0,0 +1,64 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one

[GitHub] spark pull request: [SPARK-15141][EXAMPLE][DOC] Add python example...

2016-05-05 Thread zhengruifeng
Github user zhengruifeng commented on the pull request: https://github.com/apache/spark/pull/12920#issuecomment-217347142 @MLnick Argument parsing was removed from those examples.

[GitHub] spark pull request: [SPARK-15149][EXAMPLE] include python example ...

2016-05-05 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12925#discussion_r62282664 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/KMeansExample.scala --- @@ -40,20 +42,17 @@ object KMeansExample

[GitHub] spark pull request: [SPARK-15149][EXAMPLE] include python example ...

2016-05-05 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/12925#discussion_r62282570 --- Diff: examples/src/main/python/ml/kmeans_example.py --- @@ -17,52 +17,39 @@ from __future__ import print_function -import sys
