[GitHub] spark pull request #14112: [SPARK-16240][ML] Model loading backward compatib...

2016-09-07 Thread GayathriMurali
Github user GayathriMurali closed the pull request at: https://github.com/apache/spark/pull/14112 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...

2016-09-07 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/14112 @jkbradley Sure! Thanks.

[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...

2016-09-06 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/14112 @jkbradley I am so sorry I couldn't respond to this on time! I am in a transition process and might not be able to drive this JIRA to completion at this point in time. Thanks!

[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...

2016-08-10 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/14112 @jkbradley Can you please help review this?

[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...

2016-07-26 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/14112 @jkbradley Please let me know if I can do anything to help get this merged.

[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...

2016-07-19 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/14112 @jkbradley Can you please help review this?

[GitHub] spark pull request #14112: [SPARK-16240][ML] Model loading backward compatib...

2016-07-18 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/14112#discussion_r71172720 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala --- @@ -728,16 +755,40 @@ object DistributedLDAModel extends MLReadable

[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...

2016-07-14 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/14112 @jkbradley I implemented model loading logic for DistributedLDA as well. I am using a versionRegex for robustness in version checking. Using `as[Data].head()` is producing a scala match
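A minimal sketch of the kind of regex-based version check described above, written in Python for illustration only (the PR itself is Scala). The pattern, the function names, and the assumption that models saved before Spark 2.0 take the legacy path are all hypothetical and are not taken from the PR.

```python
import re

# Hypothetical pattern: "major.minor" optionally followed by a patch level
# or a qualifier such as "-SNAPSHOT". Not the regex used in the PR.
VERSION_REGEX = re.compile(r"^(\d+)\.(\d+)(?:[.\-].*)?$")


def major_minor(spark_version):
    """Parse a version string such as "1.6.1" into (major, minor)."""
    match = VERSION_REGEX.match(spark_version)
    if match is None:
        raise ValueError(f"Unrecognized Spark version string: {spark_version!r}")
    return int(match.group(1)), int(match.group(2))


def choose_loader(spark_version):
    """Pick a loading code path from the version stored in model metadata.

    Assumption: anything saved before 2.0 needs the legacy path.
    """
    major, _minor = major_minor(spark_version)
    return "legacy" if major < 2 else "current"
```

Parsing into integers rather than comparing strings avoids surprises with qualifiers such as `2.0.0-SNAPSHOT`, and raising on an unrecognized string keeps a malformed metadata file from silently falling into one of the two paths.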

[GitHub] spark pull request #14112: [SPARK-16240][ML] Model loading backward compatib...

2016-07-13 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/14112#discussion_r70749752 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala --- @@ -566,26 +565,52 @@ object LocalLDAModel extends MLReadable

[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...

2016-07-13 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/14112 @jkbradley I am sorry, I have been held up with something else. I am looking into ways to add this to the DistributedLDA model. I will have something by EOD today.

[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...

2016-07-11 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/14112 +1 for separate loading logic. The recent commit includes separate code paths depending on sparkVersion.

[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...

2016-07-11 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/14112 @hhbyyh Thanks for helping out. Updated commit includes logic to include topicDistributionCol @yanboliang

[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...

2016-07-09 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/14112 retest this

[GitHub] spark issue #14112: [SPARK-16240][ML] Model loading backward compatibility f...

2016-07-09 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/14112 @hhbyyh Can you please help review? I am not sure if this is the right way to do it, as topicDistributionCol is not included in the MLWriter or load.

[GitHub] spark pull request #14112: [SPARK-16240][ML] Model loading backward compatib...

2016-07-09 Thread GayathriMurali
GitHub user GayathriMurali opened a pull request: https://github.com/apache/spark/pull/14112 [SPARK-16240][ML] Model loading backward compatibility for LDA ## What changes were proposed in this pull request? LDA model loading backward compatibility ## How was this patch

[GitHub] spark pull request #13745: [Spark-15997][DOC][ML] Update user guide for Hash...

2016-06-23 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/13745#discussion_r68245367 --- Diff: examples/src/main/python/ml/quantile_discretizer_example.py --- @@ -29,11 +29,12 @@ # $example on$ data = [(0, 18.0,), (1

[GitHub] spark issue #13745: [Spark-15997][DOC][ML] Update user guide for HashingTF, ...

2016-06-22 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13745 Oops! That works.

[GitHub] spark issue #13745: [Spark-15997][DOC][ML] Update user guide for HashingTF, ...

2016-06-22 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13745 @jkbradley Yes, that works

[GitHub] spark issue #13176: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-06-22 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13176 @jkbradley @MLnick My bad. Sorry about that!

[GitHub] spark issue #13745: [Spark-15997][DOC][ML] Update user guide for HashingTF, ...

2016-06-22 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13745 @jkbradley @MLnick `repartition` needs to be added along with the creation of the dataframe like this. `val df = spark.createDataFrame(data).toDF("id","hour").repar

[GitHub] spark pull request #12675: [SPARK-14894][PySpark] Add result summary api to ...

2016-06-21 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/12675#discussion_r67954415 --- Diff: python/pyspark/ml/tests.py --- @@ -1070,6 +1070,21 @@ def test_logistic_regression_summary(self): sameSummary = model.evaluate

[GitHub] spark issue #12675: [SPARK-14894][PySpark] Add result summary api to Gaussia...

2016-06-21 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/12675 @MLnick It would be great if you can help review this.

[GitHub] spark issue #13745: [Spark-15997][DOC][ML] Update user guide for HashingTF, ...

2016-06-21 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13745 @jkbradley @MLnick I agree with the repartition idea. Although I think that it may not be a bad idea to call out that the approxQuantile calculation for smaller datasets may be different on different

[GitHub] spark issue #13176: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-06-18 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13176 @MLnick I opened PR #13745 to track this as @jkbradley suggested. This JIRA covers only a partial list of the ml.feature audit. Please help review SPARK-15597.

[GitHub] spark pull request #13745: [Spark 15997][DOC][ML] Update user guide for Hash...

2016-06-17 Thread GayathriMurali
GitHub user GayathriMurali opened a pull request: https://github.com/apache/spark/pull/13745 [Spark 15997][DOC][ML] Update user guide for HashingTF, QuantileVectorizer and CountVectorizer ## What changes were proposed in this pull request? Made changes to HashingTF

[GitHub] spark pull request #13176: [SPARK-15997][DOC] Modified user guide and exampl...

2016-06-17 Thread GayathriMurali
Github user GayathriMurali closed the pull request at: https://github.com/apache/spark/pull/13176

[GitHub] spark issue #13176: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-06-16 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13176 @jkbradley @MLnick I have created SPARK-15997 to track the changes addressed in this PR.

[GitHub] spark issue #13176: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-06-16 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13176 @jkbradley I just tried this. https://cloud.githubusercontent.com/assets/7002441/16128207/94f835ea-33b4-11e6-9866-369672b7bdae.png and getting this output which is the s

[GitHub] spark issue #13285: [Spark-15129][R][DOC]R API changes in ML

2016-06-16 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13285 @jkbradley I addressed the review comment. Please let me know if there is anything else. Thanks!

[GitHub] spark issue #13176: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-06-15 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13176 @jkbradley The different results were due to the difference in the underlying core count (thread count). @MLnick and I were able to get the same results for `local[4]`. We could explicitly

[GitHub] spark issue #12675: [SPARK-14894][PySpark] Add result summary api to Gaussia...

2016-06-14 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/12675 @jkbradley This PR has been open for more than 30 days. Can you please help review?

[GitHub] spark issue #13285: [Spark-15129][R][DOC]R API changes in ML

2016-06-14 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13285 @yanboliang Please let me know if there is anything else I can do to help get this merged. Thanks!

[GitHub] spark issue #13176: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-06-14 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13176 @MLnick Please let me know if there is anything else I can do to help get this merged. Thanks!

[GitHub] spark issue #12675: [SPARK-14894][PySpark] Add result summary api to Gaussia...

2016-06-07 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/12675 @jkbradley @holdenk Can you please help review?

[GitHub] spark issue #13285: [Spark-15129][R][DOC]R API changes in ML

2016-06-06 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13285 @yanboliang Please let me know if there is anything else I can do to get this merged.

[GitHub] spark pull request #13176: [SPARK-15100][DOC] Modified user guide and exampl...

2016-06-06 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/13176#discussion_r66011880 --- Diff: docs/ml-features.md --- @@ -1092,14 +1095,11 @@ for more details on the API. ## QuantileDiscretizer `QuantileDiscretizer

[GitHub] spark pull request #13176: [SPARK-15100][DOC] Modified user guide and exampl...

2016-06-03 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/13176#discussion_r65790125 --- Diff: docs/ml-features.md --- @@ -1092,14 +1095,11 @@ for more details on the API. ## QuantileDiscretizer `QuantileDiscretizer

[GitHub] spark issue #13176: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-06-02 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13176 @MLnick I agree. Should I make those changes in this same PR?

[GitHub] spark issue #13176: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-06-02 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13176 @MLnick Please let me know if there is anything else I can help with on this PR.

[GitHub] spark issue #13285: [Spark-15129][R][DOC]R API changes in ML

2016-06-01 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13285 Also, #10219 uses include_example with different files, which is not the case here. @mengxr We need support for tags with include_example, or we need to reformat ml.R (or split every

[GitHub] spark issue #13285: [Spark-15129][R][DOC]R API changes in ML

2016-06-01 Thread GayathriMurali
Github user GayathriMurali commented on the issue: https://github.com/apache/spark/pull/13285 @yanboliang `$example on$` and `$example off$` need to be included in ml.R. All the code enclosed between example on and off would be joined and a single code block will be produced
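The joining behavior described above can be sketched as follows. This is an illustrative Python approximation, not the Spark docs build's actual `include_example` plugin; the function name and the sample input are hypothetical. It keeps only the lines that sit between an `$example on$` marker and the next `$example off$` marker and joins them into one block.

```python
def extract_example(source, on="$example on$", off="$example off$"):
    """Return the lines between on/off markers, joined into a single block.

    Marker lines themselves are dropped; text outside any on/off pair is
    ignored. This is a simplified sketch of the behavior described in the
    comment above, not the real plugin.
    """
    kept = []
    inside = False
    for line in source.splitlines():
        if on in line:
            inside = True
        elif off in line:
            inside = False
        elif inside:
            kept.append(line)
    return "\n".join(kept)


# Hypothetical input with two marked regions separated by setup code.
sample = "\n".join([
    "# $example on$",
    "a <- 1",
    "# $example off$",
    "setup()",
    "# $example on$",
    "b <- 2",
    "# $example off$",
])
print(extract_example(sample))
```

Because every marked region ends up in the same output, a file that holds several unrelated examples yields one merged block, which is the limitation raised in the comment about needing tags or separate files.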

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-05-31 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13176 @MLnick +1 for making the change in the example as well. Calling out the difference in results due to parallelism might be a little confusing in this document.

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-05-31 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13176 I just tried with `--master local[8]` and I get the same results as you do. Should I call this out in the example?

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-05-31 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13176 I just did. It is `local[4]`.

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-05-31 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13176 @MLnick I am using local. I haven't explicitly set up the thread count.

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-05-31 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13176 On Mac. Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_73). I checked again and I consistently get the same output on master. @MLnick Please let me know how you

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-05-31 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13176 @BryanCutler @oliverpierson Looks like something is wrong on my side. I just checked again on a fresh build and got the same results. Will dig deeper.

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-05-31 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13176 I get this: `Array[Double] = Array(5.0, 8.0)`

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-05-31 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13176 @MLnick @oliverpierson I checked again with a clean build off master. Here is the hash : 2bfc4f15214a870b3e067f06f37eb506b0070a1f. Here is what I see https

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and examples for ...

2016-05-31 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/13176#discussion_r65223909 --- Diff: docs/ml-features.md --- @@ -145,9 +148,11 @@ for more details on the API. passed to other algorithms like LDA. During

[GitHub] spark pull request: [Spark-15129][R][DOC]R API changes in ML

2016-05-31 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13285 @yanboliang I have included ml.r using include-example, wouldn't that cover all the examples?

[GitHub] spark pull request: [Spark-15129][R][DOC]R API changes in ML

2016-05-31 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/13285#discussion_r65221434 --- Diff: docs/sparkr.md --- @@ -285,71 +285,57 @@ head(teenagers) # Machine Learning -SparkR allows the fitting of generalized

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and exa...

2016-05-29 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13176#issuecomment-222409058 @MLnick Please let me know if there is anything else I can help with on this PR.

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and exa...

2016-05-26 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/13176#discussion_r64799207 --- Diff: docs/ml-features.md --- @@ -145,9 +148,11 @@ for more details on the API. passed to other algorithms like LDA. During

[GitHub] spark pull request: [Spark-15129][R][DOC]R API changes in ML

2016-05-26 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13285#issuecomment-221954150 @yanboliang Can you please help review?

[GitHub] spark pull request: [Spark-15129][R][DOC][WIP]R API changes in ML

2016-05-25 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13285#issuecomment-221764817 @yanboliang Thanks, that's a good idea. However, that would just include example code and not what the output of summary() looks like. It might be useful

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and exa...

2016-05-25 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/13176#discussion_r64683245 --- Diff: docs/ml-features.md --- @@ -53,7 +53,10 @@ collisions, where different raw features may become the same term after hashing. chance

[GitHub] spark pull request: [Spark 15129][R][DOC][WIP]R API changes in ML

2016-05-24 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13285#issuecomment-221428716 @jkbradley @MLnick I have marked this WIP, as I want to get your thoughts on whether the format looks OK. I can add examples to KMeans and SurvReg

[GitHub] spark pull request: [Spark 15129][R][DOC][WIP]R API changes in ML

2016-05-24 Thread GayathriMurali
GitHub user GayathriMurali opened a pull request: https://github.com/apache/spark/pull/13285 [Spark 15129][R][DOC][WIP]R API changes in ML ## What changes were proposed in this pull request? Make user guide changes to SparkR documentation for all changes that happened

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and exa...

2016-05-24 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/13176#discussion_r64476690 --- Diff: docs/ml-features.md --- @@ -1098,9 +1098,9 @@ for more details on the API. `QuantileDiscretizer` takes a column with continuous

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and exa...

2016-05-23 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13176#issuecomment-221020100 @MLnick I fixed all review comments. Can you please let me know if there is anything else to be done to help get this merged?

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and exa...

2016-05-20 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13176#issuecomment-220723548 @MLnick The latest commit includes just the ml-features.md changes. I removed all the other example files and feature.py.

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and exa...

2016-05-20 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/13176#discussion_r64101972 --- Diff: docs/ml-features.md --- @@ -1093,13 +,10 @@ for more details on the API. `QuantileDiscretizer` takes a column with continuous

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and exa...

2016-05-20 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13176#issuecomment-220698824 Something messed up the `git push`. I will send another commit.

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and exa...

2016-05-20 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/13176#discussion_r64087912 --- Diff: docs/ml-features.md --- @@ -26,7 +26,9 @@ This section covers algorithms for working with features, roughly divided into t

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and exa...

2016-05-20 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/13176#discussion_r64079147 --- Diff: docs/ml-features.md --- @@ -114,7 +116,10 @@ for more details on the API. During the fitting process, `CountVectorizer` will select

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and exa...

2016-05-20 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/13176#discussion_r64078981 --- Diff: docs/ml-features.md --- @@ -26,7 +26,9 @@ This section covers algorithms for working with features, roughly divided into t

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and exa...

2016-05-20 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/13176#discussion_r64075252 --- Diff: docs/ml-features.md --- @@ -1064,7 +1069,8 @@ categorical features. The bin ranges are chosen by taking a sample of the data

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and exa...

2016-05-20 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/13176#discussion_r64073253 --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaCountVectorizerExample.java --- @@ -54,6 +54,7 @@ public static void main(String

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and exa...

2016-05-19 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/13176#issuecomment-220513197 @hhbyyh Can you please help review this? I will resolve the branch conflict along with review comments.

[GitHub] spark pull request: [SPARK-15100][DOC] Modified user guide and exa...

2016-05-18 Thread GayathriMurali
GitHub user GayathriMurali opened a pull request: https://github.com/apache/spark/pull/13176 [SPARK-15100][DOC] Modified user guide and examples for CountVectoriz… ## What changes were proposed in this pull request? This is partial document changes to ml.feature. Made

[GitHub] spark pull request: [SPARK-14894][PySpark] Add result summary api ...

2016-05-11 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/12675#issuecomment-218664815 @holdenk I checked the ScalaDoc and removed the evaluate method. Thanks for pointing it out. Can you please help review?

[GitHub] spark pull request: [SPARK-14894][PySpark] Add result summary api ...

2016-05-05 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/12675#issuecomment-217318560 @holdenk I fixed the pydoc style issue. Can you please help review this?

[GitHub] spark pull request: [SPARK-14894][PySpark] Add result summary api ...

2016-04-28 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/12675#issuecomment-215529653 @jkbradley Can you please ok to test this

[GitHub] spark pull request: [SPARK-14315][SparkR]Add model persistence to ...

2016-04-28 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/12683#discussion_r61481184 --- Diff: R/pkg/inst/tests/testthat/test_mllib.R --- @@ -71,7 +71,25 @@ test_that("glm and predict", { data = iris, family

[GitHub] spark pull request: [SPARK-14315][SparkR]Add model persistence to ...

2016-04-27 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/12683#discussion_r61370053 --- Diff: R/pkg/R/mllib.R --- @@ -406,6 +432,8 @@ ml.load <- function(path) { jobj <- callJStatic("org.apache.spark.ml.r.RWrapp

[GitHub] spark pull request: [SPARK-14315][SparkR]Add model persistence to ...

2016-04-25 Thread GayathriMurali
GitHub user GayathriMurali opened a pull request: https://github.com/apache/spark/pull/12683 [SPARK-14315][SparkR]Add model persistence to GLMs ## What changes were proposed in this pull request? Add model persistence to GLMs in SparkR Unit tests added

[GitHub] spark pull request: [Spark-14314][SparkR] Add model persistence to...

2016-04-25 Thread GayathriMurali
GitHub user GayathriMurali opened a pull request: https://github.com/apache/spark/pull/12680 [Spark-14314][SparkR] Add model persistence to KMeans ## What changes were proposed in this pull request? Add model persistence to KMeans SparkR ## How was this patch

[GitHub] spark pull request: [SPARK-14894][PySpark] Add result summary api ...

2016-04-25 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/12675#issuecomment-214569241 @wangmiao1981 @jkbradley Please help review this PR

[GitHub] spark pull request: [SPARK-14894][PySpark] Add result summary api ...

2016-04-25 Thread GayathriMurali
GitHub user GayathriMurali opened a pull request: https://github.com/apache/spark/pull/12675 [SPARK-14894][PySpark] Add result summary api to Gaussian Mixture ## What changes were proposed in this pull request? Add summary API to Gaussian Mixture ## How

[GitHub] spark pull request: [SPARK-14894][Pyspark] Add result summary API ...

2016-04-25 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/12670#issuecomment-214565229 I am closing this PR as a file got added by mistake. Will open a new one.

[GitHub] spark pull request: [SPARK-14894][Pyspark] Add result summary API ...

2016-04-25 Thread GayathriMurali
Github user GayathriMurali closed the pull request at: https://github.com/apache/spark/pull/12670

[GitHub] spark pull request: [SPARK-14894][Pyspark] Add result summary API ...

2016-04-25 Thread GayathriMurali
GitHub user GayathriMurali opened a pull request: https://github.com/apache/spark/pull/12670 [SPARK-14894][Pyspark] Add result summary API to Gaussian Mixture ## What changes were proposed in this pull request? Add summary API to Gaussian Mixture in Pyspark ## How

[GitHub] spark pull request: [SPARK-13783] [ML] Model export/import for spa...

2016-04-08 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/12230#discussion_r59054623 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -257,12 +240,61 @@ final class GBTClassificationModel

[GitHub] spark pull request: [SPARK-13783] [ML] Model export/import for spa...

2016-04-07 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/12230#issuecomment-207051757 @yanboliang I did a quick first pass. I have some initial comments. Will stay tuned for updates. Thanks!

[GitHub] spark pull request: [SPARK-13783] [ML] Model export/import for spa...

2016-04-07 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/12230#discussion_r58926273 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -257,12 +240,61 @@ final class GBTClassificationModel

[GitHub] spark pull request: [SPARK-13783] [ML] Model export/import for spa...

2016-04-07 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/12230#discussion_r58926192 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -257,12 +240,61 @@ final class GBTClassificationModel

[GitHub] spark pull request: [SPARK-13783] [ML] Model export/import for spa...

2016-04-07 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/12230#discussion_r58925984 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -257,12 +240,61 @@ final class GBTClassificationModel

[GitHub] spark pull request: [SPARK-13783] [ML] Model export/import for spa...

2016-04-07 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/12230#discussion_r58925589 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala --- @@ -257,12 +240,61 @@ final class GBTClassificationModel

[GitHub] spark pull request: [SPARK-13784][ML] Persistence for RandomForest...

2016-04-01 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/12118#issuecomment-204635416 @jkbradley Thanks for this. This looks great and clarifies a lot of things I was trying to do. I had one minor comment, except that it looks fine to me

[GitHub] spark pull request: [SPARK-13784][ML] Persistence for RandomForest...

2016-04-01 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/12118#discussion_r58287249 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeModels.scala --- @@ -358,3 +376,100 @@ private[ml] object DecisionTreeModelReadWrite

[GitHub] spark pull request: [Spark-13784][ML][WIP] Model export/import for...

2016-04-01 Thread GayathriMurali
Github user GayathriMurali closed the pull request at: https://github.com/apache/spark/pull/12023

[GitHub] spark pull request: [Spark-13784][ML][WIP] Model export/import for...

2016-04-01 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/12023#issuecomment-204576825 @jkbradley I was just about to ping you regarding this. I would definitely love to help out. I was out at Strata all week and couldn't get to this. Please let

[GitHub] spark pull request: [Spark-13784][ML][WIP] Model export/import for...

2016-03-30 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/12023#issuecomment-203763228 @jkbradley I am sorry, I am afraid I will not be able to complete tonight. Can you please help me with reusing Splitdata/build code from DecisionTrees

[GitHub] spark pull request: [Spark-13784][ML][WIP] Model export/import for...

2016-03-30 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/12023#discussion_r57993953 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/RandomForestRegressor.scala --- @@ -199,21 +210,71 @@ final class

[GitHub] spark pull request: [Spark-13784][ML][WIP] Model export/import for...

2016-03-30 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/12023#discussion_r57968732 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/RandomForestClassifier.scala --- @@ -240,12 +250,66 @@ final class

[GitHub] spark pull request: [Spark-13784][ML][WIP] Model export/import for...

2016-03-30 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/12023#issuecomment-203645597 @jkbradley I should be able to update this by tonight. Would that work?

[GitHub] spark pull request: [Spark-13784][ML][WIP] Model export/import for...

2016-03-29 Thread GayathriMurali
Github user GayathriMurali commented on a diff in the pull request: https://github.com/apache/spark/pull/12023#discussion_r57787829 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/RandomForestClassifier.scala --- @@ -240,12 +250,66 @@ final class

[GitHub] spark pull request: [Spark 13784][ML][WIP] Model export/import for...

2016-03-28 Thread GayathriMurali
Github user GayathriMurali commented on the pull request: https://github.com/apache/spark/pull/12023#issuecomment-202671126 @yanboliang @jkbradley Please help review the code.

[GitHub] spark pull request: [Spark 13784][ML][WIP] Model export/import for...

2016-03-28 Thread GayathriMurali
GitHub user GayathriMurali opened a pull request: https://github.com/apache/spark/pull/12023 [Spark 13784][ML][WIP] Model export/import for spark.ml: RandomForests Please help review the code. I have the WIP included to make sure the changes look correct. ## What changes
