[GitHub] spark pull request: [SPARK-15608][ml][doc] add_isotonic_regression_doc

2016-05-31 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/13381#discussion_r65206320 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/IsotonicRegressionExample.scala --- @@ -0,0 +1,73 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-15608][ml][doc] add_isotonic_regression_doc

2016-05-31 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/13381#discussion_r65207453 --- Diff: docs/ml-classification-regression.md --- @@ -685,6 +685,88 @@ The implementation matches the result from R's survival function

[GitHub] spark pull request: [SPARK-15608][ml][doc] add_isotonic_regression_doc

2016-05-31 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/13381#discussion_r65207910 --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaIsotonicRegressionExample.java --- @@ -0,0 +1,77 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-15608][ml][doc] add_isotonic_regression_doc

2016-05-31 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/13381#discussion_r65207877 --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaIsotonicRegressionExample.java --- @@ -0,0 +1,77 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-15605] [ML] [Examples] Remove JavaDeveloperApiExa...

2016-05-31 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/13353 cc @mengxr @jkbradley --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled

[GitHub] spark pull request: [SPARK-13590] [ML] [Doc] Document spark.ml LiR, LoR and ...

2016-05-31 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/12731 ping @mengxr

[GitHub] spark pull request: [SPARK-15177] [SparkR] [ML] SparkR 2.0 QA: New R APIs an...

2016-05-31 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/13023 @vectorijk There is a separate PR focused on updating the machine learning section of the SparkR user guide. FYI #13285. Thanks.

[GitHub] spark pull request: [SPARK-15587] [ML] ML 2.0 QA: Scala APIs audit for ml.fe...

2016-05-31 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/13410#discussion_r65192762 --- Diff: python/pyspark/ml/feature.py --- @@ -1481,6 +1474,10 @@ class StandardScaler(JavaEstimator, HasInputCol, HasOutputCol, JavaMLReadable, J

[GitHub] spark pull request #13675: [SPARK-15957] [ML] RFormula supports forcing to i...

2016-06-14 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/13675 [SPARK-15957] [ML] RFormula supports forcing to index label ## What changes were proposed in this pull request? Add a param so that users can force indexing of the label whether it is numeric

[GitHub] spark issue #13662: [SPARK-15945] [MLLIB] Conversion between old/new vector ...

2016-06-14 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/13662 It looks like the merge script is not happy, I will retry later.

[GitHub] spark issue #13662: [SPARK-15945] [MLLIB] Conversion between old/new vector ...

2016-06-14 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/13662 LGTM, merged into master and branch-2.0. Thanks!

[GitHub] spark issue #13675: [SPARK-15957] [ML] RFormula supports forcing to index la...

2016-06-16 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/13675 cc @mengxr @jkbradley

[GitHub] spark issue #13381: [SPARK-15608][ml][examples][doc] add examples and docume...

2016-06-16 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/13381 Merged into master and branch-2.0. Thanks!

[GitHub] spark issue #13731: [SPARK-15946] [MLLIB] Conversion between old/new vector ...

2016-06-17 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/13731 LGTM, merged into master and branch-2.0. Thanks!

[GitHub] spark issue #13641: [SPARK-10258][DOC][ML] Add @Since annotations to ml.feat...

2016-06-15 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/13641 @MLnick I found you did not add ```@Since``` to all param definitions. Is this expected? I think we should add them.

[GitHub] spark pull request #13641: [SPARK-10258][DOC][ML] Add @Since annotations to ...

2016-06-15 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/13641#discussion_r67245508 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/MaxAbsScaler.scala --- @@ -88,7 +91,7 @@ class MaxAbsScaler @Since("2.0.0") (overri

[GitHub] spark pull request #13641: [SPARK-10258][DOC][ML] Add @Since annotations to ...

2016-06-15 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/13641#discussion_r67245105 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/MaxAbsScaler.scala --- @@ -88,7 +91,7 @@ class MaxAbsScaler @Since("2.0.0") (overri

[GitHub] spark pull request: [SPARK-8519][SPARK-11560][SPARK-11559] [ML] [M...

2016-01-15 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/10306#issuecomment-171942083 @mengxr Thanks for the prompt. I will check my environment and re-run the test.

[GitHub] spark pull request: [SPARK-8519][SPARK-11560][SPARK-11559] [ML] [M...

2016-01-18 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/10806 [SPARK-8519][SPARK-11560][SPARK-11559] [ML] [MLlib] Optimize KMeans implementation * Use BLAS Level 3 matrix-matrix multiplications to compute pairwise distance in k-means. * Remove runs
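The first bullet in this PR description rests on a standard identity: ||x - c||^2 = ||x||^2 + ||c||^2 - 2 x.c, so all point-to-center distances can be obtained from a single matrix-matrix product of the points and the centers. The sketch below illustrates that identity in plain Python; it is not the code from the PR, the function names are hypothetical, and the hand-written `matmul_transposed` only stands in for what would be a BLAS Level 3 `gemm` call.

```python
def matmul_transposed(a, b):
    """Return a @ b^T for row-major lists of lists (stand-in for a BLAS gemm call)."""
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in b] for row_a in a]


def pairwise_sq_distances(points, centers):
    """Squared Euclidean distances using ||x - c||^2 = ||x||^2 + ||c||^2 - 2 x.c."""
    point_norms = [sum(v * v for v in p) for p in points]
    center_norms = [sum(v * v for v in c) for c in centers]
    # The only matrix-matrix product: cross[i][j] = points[i] . centers[j]
    cross = matmul_transposed(points, centers)
    # Clamp at zero to guard against small negative values from rounding.
    return [
        [max(pn + cn - 2.0 * cross[i][j], 0.0) for j, cn in enumerate(center_norms)]
        for i, pn in enumerate(point_norms)
    ]


def assign_clusters(points, centers):
    """Index of the nearest center for each point."""
    dists = pairwise_sq_distances(points, centers)
    return [min(range(len(row)), key=row.__getitem__) for row in dists]
```

The per-point and per-center norms are computed once, so the cost of the distance step is dominated by the single product, which is the part an optimized BLAS routine would accelerate.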

[GitHub] spark pull request: [SPARK-8519][SPARK-11560][SPARK-11559] [ML] [M...

2016-01-18 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/10306#issuecomment-172558579 @mengxr I found the misconfiguration of my test environment and updated it, thanks! Now ```gemm``` is about 20-30 times faster than ```axpy/dot

[GitHub] spark pull request: [SPARK-8519][SPARK-11560][SPARK-11559] [ML] [M...

2016-01-18 Thread yanboliang
Github user yanboliang closed the pull request at: https://github.com/apache/spark/pull/10306

[GitHub] spark pull request: [SPARK-8519][SPARK-11560][SPARK-11559] [ML] [M...

2016-01-18 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/10306#issuecomment-172562607 @mengxr I have a new and advanced implementation for this issue at #10806; let's move the discussion there. I will close this PR now.

[GitHub] spark pull request: [SPARK-12645] [SparkR] SparkR support hash fun...

2016-01-15 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/10597#issuecomment-171910873 @shivaram As @felixcheung commented, the ```hash``` function was added only in 2.0.0, so reverting it from branch 1.6 will fix the broken test.

[GitHub] spark pull request: [SPARK-12905] [ML] [PySpark] PCAModel return e...

2016-01-19 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/10830 [SPARK-12905] [ML] [PySpark] PCAModel return eigenvalues for PySpark ```PCAModel``` return eigenvalues for PySpark You can merge this pull request into a Git repository by running: $ git

[GitHub] spark pull request: [SPARK-12903] [SparkR] Add covar_samp and cova...

2016-01-19 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/10829#issuecomment-172784554 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-12903] [SparkR] Add covar_samp and cova...

2016-01-19 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/10829 [SPARK-12903] [SparkR] Add covar_samp and covar_pop for SparkR Add ```covar_samp``` and ```covar_pop``` for SparkR. You can merge this pull request into a Git repository by running: $ git

[GitHub] spark issue #13378: [SPARK-15643] [Doc] [ML] Update spark.ml and spark.mllib...

2016-06-27 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/13378 @MLnick I have updated the new deprecations in the [JIRA] (https://issues.apache.org/jira/browse/SPARK-15643?focusedCommentId=15343059=com.atlassian.jira.plugin.system.issuetabpanels:comment

[GitHub] spark pull request #13935: [SPARK-16242] [MLlib] [PySpark] Conversion betwee...

2016-06-27 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/13935 [SPARK-16242] [MLlib] [PySpark] Conversion between old/new matrix columns in a DataFrame (Python) ## What changes were proposed in this pull request? This PR implements python wrappers

[GitHub] spark issue #13935: [SPARK-16242] [MLlib] [PySpark] Conversion between old/n...

2016-06-27 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/13935 cc @hhbyyh @mengxr

[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/13937 cc @hhbyyh @mengxr

[GitHub] spark pull request #13937: [SPARK-16245] [ML] model loading backward compati...

2016-06-27 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/13937#discussion_r68697383 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala --- @@ -206,24 +206,21 @@ object PCAModel extends MLReadable[PCAModel

[GitHub] spark pull request #13937: [SPARK-16245] [ML] model loading backward compati...

2016-06-27 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/13937 [SPARK-16245] [ML] model loading backward compatibility for ml.feature.PCA ## What changes were proposed in this pull request? model loading backward compatibility for ml.feature.PCA

[GitHub] spark pull request #13023: [SPARK-15177] [SparkR] [ML] SparkR 2.0 QA: New R ...

2016-06-24 Thread yanboliang
Github user yanboliang closed the pull request at: https://github.com/apache/spark/pull/13023

[GitHub] spark pull request #13888: [SPARK-16187] [ML] Implement util method for ML M...

2016-06-25 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/13888#discussion_r68486013 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala --- @@ -309,8 +309,8 @@ object MLUtils extends Logging

[GitHub] spark pull request: [SPARK-13037][ML][PySpark] PySpark ml.recommen...

2016-02-05 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11044#issuecomment-180395685 LGTM

[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...

2016-02-09 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11136#discussion_r52330009 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -0,0 +1,472 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...

2016-02-09 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11136#discussion_r52331693 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -0,0 +1,472 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...

2016-02-09 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/11136 [SPARK-12811] [ML] Estimator for Generalized Linear Models(GLMs) Estimator for Generalized Linear Models(GLMs) You can merge this pull request into a Git repository by running: $ git pull

[GitHub] spark pull request: [SPARK-12962] [SQL] [PySpark] PySpark support ...

2016-02-04 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/10876#issuecomment-180175154 ping @davies

[GitHub] spark pull request: [SPARK-12974] [ML] [PySpark] Add Python API fo...

2016-02-12 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/10889#issuecomment-183240342 @mengxr Actually, I have already updated this PR after #10216 got merged.

[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...

2016-02-11 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11136#discussion_r52610253 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -0,0 +1,472 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...

2016-02-11 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11136#discussion_r52611868 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -0,0 +1,472 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...

2016-02-11 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11136#discussion_r52611593 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -0,0 +1,472 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...

2016-02-11 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11136#issuecomment-182905997 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-11939] [ML] [PySpark] PySpark support m...

2016-01-27 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/10469#issuecomment-175472122 @jkbradley Thanks for your comments! I have made ```MLReadable``` and ```MLWritable``` more general and not specific to Java wrappers, addressed all comments except

[GitHub] spark pull request: [SPARK-11939] [ML] [PySpark] PySpark support m...

2016-01-26 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/10469#discussion_r50950853 --- Diff: python/pyspark/ml/wrapper.py --- @@ -82,13 +71,16 @@ def _transfer_params_to_java(self): pair = self._make_java_param_pair

[GitHub] spark pull request: [SPARK-11939] [ML] [PySpark] PySpark support m...

2016-01-26 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/10469#discussion_r50948067 --- Diff: python/pyspark/ml/util.py --- @@ -52,3 +71,141 @@ def _randomUID(cls): concatenates the class name, "_", and 12 random

[GitHub] spark pull request: [SPARK-13047][PYSPARK][ML] Pyspark Params.hasP...

2016-01-27 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/10962#discussion_r51087340 --- Diff: python/pyspark/ml/param/__init__.py --- @@ -152,13 +152,17 @@ def isDefined(self, param): return self.isSet(param

[GitHub] spark pull request: [SPARK-13032] [ML] [PySpark] PySpark support m...

2016-01-28 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/10469#issuecomment-176070210 @jkbradley Your PR looks good and got merged, thanks!

[GitHub] spark pull request: [Minor] [ML] [PySpark] Cleanup test cases of c...

2016-01-31 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/10975#issuecomment-177743797 @mengxr I did not find any class other than ```KMeans``` with a similar test. Is this by design, or are some test cases missing?

[GitHub] spark pull request: [SPARK-13035] [ML] [PySpark] PySpark ml.cluste...

2016-01-31 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/10999#discussion_r51375854 --- Diff: python/pyspark/ml/clustering.py --- @@ -69,6 +70,25 @@ class KMeans(JavaEstimator, HasFeaturesCol, HasPredictionCol, HasMaxIter, HasTol

[GitHub] spark pull request: [SPARK-13033][ML][PySpark] Add import/export f...

2016-02-01 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11000#discussion_r51400507 --- Diff: python/pyspark/ml/regression.py --- @@ -447,7 +447,7 @@ def _create_model(self, java_model): @inherit_doc -class

[GitHub] spark pull request: [SPARK-13037][ML][PySpark] PySpark ml.recommen...

2016-02-03 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11044#discussion_r51690764 --- Diff: python/pyspark/ml/recommendation.py --- @@ -81,6 +82,23 @@ class ALS(JavaEstimator, HasCheckpointInterval, HasMaxIter, HasPredictionCol, Ha

[GitHub] spark pull request: [SPARK-13037][ML][PySpark] PySpark ml.recommen...

2016-02-03 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11044#discussion_r51690527 --- Diff: python/pyspark/ml/recommendation.py --- @@ -81,6 +82,23 @@ class ALS(JavaEstimator, HasCheckpointInterval, HasMaxIter, HasPredictionCol, Ha

[GitHub] spark pull request: [SPARK-13035] [ML] [PySpark] PySpark ml.cluste...

2016-01-31 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/10999 [SPARK-13035] [ML] [PySpark] PySpark ml.clustering support export/import PySpark ml.clustering support export/import. You can merge this pull request into a Git repository by running

[GitHub] spark pull request: [SPARK-13033][ML][PySpark] Add import/export f...

2016-02-02 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11000#issuecomment-178575141 @Wenpei It looks like ```_transfer_params_from_java``` does not consider the params that have no default value, and we should handle them. Would you mind

[GitHub] spark pull request: [SPARK-13153][PySpark] ML persistence failed w...

2016-02-02 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11043#discussion_r51674846 --- Diff: python/pyspark/ml/wrapper.py --- @@ -79,8 +79,9 @@ def _transfer_params_from_java(self): for param in self.params

[GitHub] spark pull request: [SPARK-13153][PySpark] ML persistence failed w...

2016-02-02 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11043#issuecomment-178983977 ping @mengxr @jkbradley Could you add @Wenpei to the whitelist? This is an obvious bug and we should fix it.

[GitHub] spark pull request: [SPARK-13033][ML][PySpark] Add import/export f...

2016-02-02 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11000#issuecomment-178966434 We should not give every parameter a default value, because some params deliberately have no default. I think we should modify

[GitHub] spark pull request: [SPARK-13047][PYSPARK][ML] Pyspark Params.hasP...

2016-01-27 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/10962#discussion_r51075753 --- Diff: python/pyspark/ml/param/__init__.py --- @@ -152,13 +152,17 @@ def isDefined(self, param): return self.isSet(param

[GitHub] spark pull request: [SPARK-12962] [SQL] [PySpark] PySpark support ...

2016-01-28 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/10876#discussion_r51226739 --- Diff: python/pyspark/sql/functions.py --- @@ -263,6 +263,38 @@ def corr(col1, col2): return Column(sc._jvm.functions.corr(_to_java_column

[GitHub] spark pull request: [Minor] [ML] [PySpark] Cleanup test cases of c...

2016-01-28 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/10975 [Minor] [ML] [PySpark] Cleanup test cases of clustering.py You can merge this pull request into a Git repository by running: $ git pull https://github.com/yanboliang/spark clustering

[GitHub] spark pull request: [SPARK-13033][ML][PySpark] Add import/export f...

2016-02-24 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11000#issuecomment-188217429 LGTM

[GitHub] spark pull request: [Minor] [ML] [Doc] Cleanup dots at the end of ...

2016-02-24 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11344#issuecomment-188213659 @srowen Some ScalaDoc will end with two dots if we don't fix this; see [here](https://github.com/apache/spark/pull/11344/files#diff

[GitHub] spark pull request: [SPARK-13033][ML][PySpark] Add import/export f...

2016-02-24 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11000#issuecomment-188217276 LGTM

[GitHub] spark pull request: [SPARK-13033][ML][PySpark] Add import/export f...

2016-02-24 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11000#issuecomment-188216762 LGTM

[GitHub] spark pull request: [SPARK-13033][ML][PySpark] Add import/export f...

2016-02-24 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11000#issuecomment-188217094 LGTM

[GitHub] spark pull request: [Minor] [ML] [Doc] Cleanup dots at the end of ...

2016-02-24 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11344#issuecomment-188281271 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...

2016-02-24 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11136#discussion_r53911268 --- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/GeneralizedLinearRegressionSuite.scala --- @@ -0,0 +1,499 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...

2016-02-24 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11136#discussion_r53911596 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -0,0 +1,565 @@ +/* + * Licensed

[GitHub] spark pull request: [] [] [] Clean up sharedParams

2016-02-24 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/11344 [] [] [] Clean up sharedParams ## What changes were proposed in this pull request? Remove duplicated dot at the end of some sharedParams in ScalaDoc. cc @mengxr @srowen ## How

[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...

2016-02-24 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11136#discussion_r53909986 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -0,0 +1,565 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-7106][MLlib][PySpark] Support model sav...

2016-02-24 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11321#discussion_r53908665 --- Diff: python/pyspark/mllib/fpm.py --- @@ -40,6 +41,11 @@ class FPGrowthModel(JavaModelWrapper): >>> model = FPGrowth.train(rd

[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...

2016-02-24 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11136#discussion_r53909458 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -0,0 +1,565 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-13504] [SparkR] Add approxQuantile for ...

2016-02-25 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/11383 [SPARK-13504] [SparkR] Add approxQuantile for SparkR ## What changes were proposed in this pull request? Add ```approxQuantile``` for SparkR. ## How was this patch tested? unit tests

[GitHub] spark pull request: [SPARK-13036][SPARK-13318][SPARK-13319] Add sa...

2016-02-26 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11203#discussion_r54218770 --- Diff: python/pyspark/ml/feature.py --- @@ -1330,6 +1448,21 @@ class StringIndexer(JavaEstimator, HasInputCol, HasOutputCol, HasHandleInvalid

[GitHub] spark pull request: [SPARK-13036][SPARK-13318][SPARK-13319] Add sa...

2016-02-26 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11203#discussion_r54217956 --- Diff: python/pyspark/ml/feature.py --- @@ -443,6 +477,12 @@ class HashingTF(JavaTransformer, HasInputCol, HasOutputCol, HasNumFeatures

[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...

2016-02-25 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11136#discussion_r54206474 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/WeightedLeastSquares.scala --- @@ -157,6 +157,12 @@ private[ml] class WeightedLeastSquares

[GitHub] spark pull request: [SPARK-13545] [MLlib] [PySpark] Make MLlib LR'...

2016-02-28 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/11424 [SPARK-13545] [MLlib] [PySpark] Make MLlib LR's default parameters consistent in Scala and Python ## What changes were proposed in this pull request? Make MLlib LR's default parameters

[GitHub] spark pull request: [SPARK-13322] [ML] AFTSurvivalRegression suppo...

2016-02-25 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/11365 [SPARK-13322] [ML] AFTSurvivalRegression supports feature standardization ## What changes were proposed in this pull request? AFTSurvivalRegression should support feature

[GitHub] spark pull request: [SPARK-13372] [ML] Fix LogisticRegression when...

2016-02-25 Thread yanboliang
Github user yanboliang closed the pull request at: https://github.com/apache/spark/pull/11247

[GitHub] spark pull request: [SPARK-13372] [ML] Fix LogisticRegression when...

2016-02-25 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11247#issuecomment-188668664 @dbtsai I think you convinced me, and I have also checked the R glmnet implementation. The current behavior may make more sense, so I will close this PR. Thanks

[GitHub] spark pull request: [Minor] [ML] [Doc] Cleanup dots at the end of ...

2016-02-25 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11344#issuecomment-188778682 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-13490] [ML] ML LinearRegression should ...

2016-02-25 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/11367 [SPARK-13490] [ML] ML LinearRegression should cache standardization param value ## What changes were proposed in this pull request? Like [SPARK-13132](https://issues.apache.org/jira/browse

[GitHub] spark pull request: [SPARK-11940][PYSPARK] Python API for ml.clust...

2016-02-29 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/10242#discussion_r54395410 --- Diff: python/pyspark/ml/clustering.py --- @@ -291,6 +292,317 @@ def _create_model(self, java_model): return BisectingKMeansModel

[GitHub] spark pull request: [SPARK-11940][PYSPARK] Python API for ml.clust...

2016-02-29 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/10242#discussion_r54394942 --- Diff: python/pyspark/ml/clustering.py --- @@ -291,6 +292,317 @@ def _create_model(self, java_model): return BisectingKMeansModel

[GitHub] spark pull request: [SPARK-11940][PYSPARK] Python API for ml.clust...

2016-02-29 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/10242#discussion_r54395971 --- Diff: python/pyspark/ml/clustering.py --- @@ -291,6 +292,317 @@ def _create_model(self, java_model): return BisectingKMeansModel

[GitHub] spark pull request: [SPARK-11940][PYSPARK] Python API for ml.clust...

2016-02-29 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/10242#discussion_r54394622 --- Diff: python/pyspark/ml/clustering.py --- @@ -167,6 +167,200 @@ def getInitSteps(self): return self.getOrDefault(self.initSteps

[GitHub] spark pull request: [SPARK-11940][PYSPARK] Python API for ml.clust...

2016-02-29 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/10242#discussion_r54394752 --- Diff: python/pyspark/ml/clustering.py --- @@ -291,6 +292,317 @@ def _create_model(self, java_model): return BisectingKMeansModel

[GitHub] spark pull request: [SPARK-11940][PYSPARK] Python API for ml.clust...

2016-02-29 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/10242#discussion_r54395661 --- Diff: python/pyspark/ml/clustering.py --- @@ -291,6 +292,317 @@ def _create_model(self, java_model): return BisectingKMeansModel

[GitHub] spark pull request: [SPARK-13506] [MLlib] Fix the wrong parameter ...

2016-02-26 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11387#issuecomment-189200629 LGTM cc @mengxr

[GitHub] spark pull request: [SPARK-13036][SPARK-13318][SPARK-13319] Add sa...

2016-02-26 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11203#issuecomment-189188086 @yinxusen Looks good overall, I left some inline comments. Thanks!

[GitHub] spark pull request: [SPARK-13033][ML][PySpark] Add import/export f...

2016-02-22 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11000#discussion_r53635317 --- Diff: python/pyspark/ml/regression.py --- @@ -172,6 +172,16 @@ class IsotonicRegression(JavaEstimator, HasFeaturesCol, HasLabelCol, HasPredicti

[GitHub] spark pull request: [SPARK-13033][ML][PySpark] Add import/export f...

2016-02-22 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11000#issuecomment-187221567 Looks good except minor issues.

[GitHub] spark pull request: [SPARK-13033][ML][PySpark] Add import/export f...

2016-02-22 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11000#issuecomment-187222806 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-13033][ML][PySpark] Add import/export f...

2016-02-22 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11000#discussion_r53635702 --- Diff: python/pyspark/ml/regression.py --- @@ -690,6 +700,18 @@ class AFTSurvivalRegression(JavaEstimator, HasFeaturesCol, HasLabelCol, HasPredi

[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...

2016-02-24 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11136#discussion_r53913710 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -0,0 +1,565 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-12811] [ML] Estimator for Generalized L...

2016-02-24 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11136#issuecomment-188165176 @mengxr This PR is ready for another pass. Thanks!

[GitHub] spark pull request: [SPARK-9835] [ML] IterativelyReweightedLeastSq...

2016-01-21 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/10639#discussion_r50390322 --- Diff: mllib/src/main/scala/org/apache/spark/ml/glm/Families.scala --- @@ -0,0 +1,138 @@ +/* + * Licensed to the Apache Software Foundation
