[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-04-11 Thread hhbyyh
Github user hhbyyh closed the pull request at: https://github.com/apache/spark/pull/11549 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is ena

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-04-11 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-208216908 @hhbyyh I sent #12294, please feel free to comment. Thanks! Would you mind closing this PR?

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-04-06 Thread hhbyyh
Github user hhbyyh commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-206641960 @jkbradley Thanks for the suggestion. @yanboliang Please start work on this if you are interested, since it's your idea that matches the best solution.

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-04-06 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-206625152 @hhbyyh @yanboliang I just wrote out my thoughts in the JIRA, and I think they match what @yanboliang suggested above (for option 3).

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-04-05 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-205844978 @jkbradley @hhbyyh I can work on this PR.

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-04-04 Thread hhbyyh
Github user hhbyyh commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-205629238 Thanks @jkbradley. We cannot decide which option to go with. I think @yanboliang and @thunterdb would both like to go with option 3, yet there are more details to be decided

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-04-04 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-205472279 Ping! Let me know if I can help get this in for 2.0

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-21 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-199541034 Looks like there are breaking signature changes - should we document that?

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-21 Thread thunterdb
Github user thunterdb commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-199516534 @hhbyyh thanks for the split. I have two small comments. Also, can you include some tests in `test_mllib.R`? It should be very close to the manual testing you did and

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-21 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/11549#discussion_r56908061 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/SparkRWrappers.scala --- @@ -17,15 +17,34 @@ package org.apache.spark.ml.api.r +i

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-21 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/11549#discussion_r56907102 --- Diff: R/pkg/R/mllib.R --- @@ -29,15 +29,10 @@ setClass("PipelineModel", representation(model = "jobj")) #' @param formula A symbolic description o

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-197791215 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-19 Thread hhbyyh
Github user hhbyyh commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-197773071 Thanks @mengxr @thunterdb @yanboliang for the review. Sent an update: 1. resolve the conflict with GLMSummary. 2. revert the summary statistics related part. 3

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-19 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-197905850 @hhbyyh I vote for option 3 in the JIRA. We already have ```GeneralizedLinearRegression``` in Scala, so it's better to call this implementation from SparkR directly. Due to

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-197791218 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-19 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-197774045 **[Test build #53410 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53410/consoleFull)** for PR 11549 at commit [`8b3dd3e`](https://gi

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-19 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-197791173 **[Test build #53410 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53410/consoleFull)** for PR 11549 at commit [`8b3dd3e`](https://g

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-16 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/11549#discussion_r56290850 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/SparkRWrappers.scala --- @@ -17,15 +17,41 @@ package org.apache.spark.ml.api.r +impo

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-16 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/11549#discussion_r56290664 --- Diff: R/pkg/R/mllib.R --- @@ -51,13 +45,12 @@ setClass("PipelineModel", representation(model = "jobj")) #' summary(model) #'} setMethod("gl

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-16 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/11549#discussion_r56290662 --- Diff: R/pkg/R/mllib.R --- @@ -29,15 +29,9 @@ setClass("PipelineModel", representation(model = "jobj")) #' @param formula A symbolic description of the

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-16 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/11549#discussion_r56290601 --- Diff: R/pkg/R/mllib.R --- @@ -51,13 +45,12 @@ setClass("PipelineModel", representation(model = "jobj")) #' summary(model) #'} setMethod("gl

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-16 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-197192755 @yanboliang @hhbyyh Let us do the summary statistics under another JIRA: https://issues.apache.org/jira/browse/SPARK-13925

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-16 Thread yanboliang
Github user yanboliang commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-197190040 @hhbyyh #11694 has been merged, so we can provide all R-like summary statistics for ```glm```. Thanks!

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-16 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/11549#discussion_r56289063 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/SparkRWrappers.scala --- @@ -17,15 +17,41 @@ package org.apache.spark.ml.api.r +

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-09 Thread thunterdb
Github user thunterdb commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-194528019 @hhbyyh thanks! I just have some small comments; my main comment being in the jira ticket regarding the choice of options 1/2/3.

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-09 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/11549#discussion_r55597240 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -569,9 +572,46 @@ class GeneralizedLinearRegression

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-09 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/11549#discussion_r55595546 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -569,9 +572,46 @@ class GeneralizedLinearRegression

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-09 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/11549#discussion_r55592623 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/SparkRWrappers.scala --- @@ -17,15 +17,41 @@ package org.apache.spark.ml.api.r +i

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-09 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/11549#discussion_r55591400 --- Diff: R/pkg/R/mllib.R --- @@ -51,13 +45,12 @@ setClass("PipelineModel", representation(model = "jobj")) #' summary(model) #'} setMethod(

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-09 Thread thunterdb
Github user thunterdb commented on a diff in the pull request: https://github.com/apache/spark/pull/11549#discussion_r55563547 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/SparkRWrappers.scala --- @@ -17,15 +17,41 @@ package org.apache.spark.ml.api.r +i

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-193020982 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-193020984 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-06 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-193020961 **[Test build #52534 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52534/consoleFull)** for PR 11549 at commit [`1cca19e`](https://g

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-06 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-193015610 **[Test build #52534 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52534/consoleFull)** for PR 11549 at commit [`1cca19e`](https://gi

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-06 Thread hhbyyh
Github user hhbyyh commented on the pull request: https://github.com/apache/spark/pull/11549#issuecomment-193015522 We already have a `glm` in SparkR, which is based on `LogisticRegressionModel` and `LinearRegressionModel`. As I understand it, there are three ways to extend it:

[GitHub] spark pull request: [SPARK-12566] [ML] [WIP] GLM model family, lin...

2016-03-06 Thread hhbyyh
GitHub user hhbyyh opened a pull request: https://github.com/apache/spark/pull/11549 [SPARK-12566] [ML] [WIP] GLM model family, link function support in SparkR:::glm ## What changes were proposed in this pull request? This JIRA is for extending the support of MLlib's Genera
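
Note on "model family" and "link function" as used in the PR title: in a generalized linear model, the family names the assumed distribution of the response, and the link function g relates the mean mu to the linear predictor, g(mu) = intercept + coefficients . features. The following is a minimal plain-Python sketch of that relationship only. It is not SparkR or MLlib code, and the names `CANONICAL_LINK`, `LINKS`, and `predict_mean` are hypothetical, chosen for illustration.

```python
import math

# Canonical link for each family (standard GLM theory).
# These dictionaries are illustrative, not taken from the PR.
CANONICAL_LINK = {
    "gaussian": "identity",
    "binomial": "logit",
    "poisson": "log",
    "gamma": "inverse",
}

# Each entry maps a link name to (g, g_inverse).
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (math.log, math.exp),
    "logit": (
        lambda mu: math.log(mu / (1.0 - mu)),
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    ),
    "inverse": (lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
}


def predict_mean(family, coefficients, intercept, features, link=None):
    """Return mu = g^{-1}(intercept + coefficients . features).

    If no link is given, the family's canonical link is used.
    Raises KeyError for an unknown family or link name.
    """
    link_name = link if link is not None else CANONICAL_LINK[family]
    _, inverse_link = LINKS[link_name]
    eta = intercept + sum(c * x for c, x in zip(coefficients, features))
    return inverse_link(eta)
```

For example, `predict_mean("binomial", [0.0], 0.0, [5.0])` evaluates the inverse logit at a linear predictor of zero, and passing `link="log"` with the `"gaussian"` family selects a non-canonical link. Which family and link combinations SparkR's `glm` accepts is decided by the implementation under discussion, not by this sketch.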