[GitHub] spark pull request #16150: [SPARK-18349][SparkR]:Update R API documentation ...

2016-12-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16150


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16150: [SPARK-18349][SparkR]:Update R API documentation ...

2016-12-07 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request:

https://github.com/apache/spark/pull/16150#discussion_r91341471
  
--- Diff: R/pkg/R/mllib.R ---
@@ -661,7 +665,10 @@ setMethod("fitted", signature(object = "KMeansModel"),
 #  Get the summary of a k-means model
 
 #' @param object a fitted k-means model.
-#' @return \code{summary} returns the model's features, coefficients, k, size and cluster.
+#' @return \code{summary} returns summary information of the fitted model, which is a list.
+#' The list includes the model's \code{coefficients} (model cluster centers),
--- End diff --

Do you think we should return these values? If so, I can file a followup PR.





[GitHub] spark pull request #16150: [SPARK-18349][SparkR]:Update R API documentation ...

2016-12-07 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/16150#discussion_r91247257
  
--- Diff: R/pkg/R/mllib.R ---
@@ -1852,9 +1867,9 @@ summary.treeEnsemble <- function(model) {
 
 #  Get the summary of a Random Forest Regression Model
 
-#' @return \code{summary} returns a summary object of the fitted model, a list of components
-#' including formula, number of features, list of features, feature importances, number of
-#' trees, and tree weights
+#' @return \code{summary} returns summary information of the fitted model, which is a list.
+#' The list of components includes \code{ans} (formula, number of features, list of features,
+#' feature importances, number of trees, and tree weights).
--- End diff --

that's a function call, see 
https://github.com/apache/spark/pull/16150/files/6039c1f742f77b4ceff624b3979afbaa9e80d77a#diff-7ede1519b4a56647801b51af33c2dd18R1851





[GitHub] spark pull request #16150: [SPARK-18349][SparkR]:Update R API documentation ...

2016-12-07 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/16150#discussion_r91247076
  
--- Diff: R/pkg/R/mllib.R ---
@@ -661,7 +665,10 @@ setMethod("fitted", signature(object = "KMeansModel"),
 #  Get the summary of a k-means model
 
 #' @param object a fitted k-means model.
-#' @return \code{summary} returns the model's features, coefficients, k, size and cluster.
+#' @return \code{summary} returns summary information of the fitted model, which is a list.
+#' The list includes the model's \code{coefficients} (model cluster centers),
--- End diff --

ah, sorry you are right, it's not in the returned list





[GitHub] spark pull request #16150: [SPARK-18349][SparkR]:Update R API documentation ...

2016-12-06 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request:

https://github.com/apache/spark/pull/16150#discussion_r91232450
  
--- Diff: R/pkg/R/mllib.R ---
@@ -1389,7 +1399,9 @@ setMethod("spark.gaussianMixture", signature(data = "SparkDataFrame", formula =
 #  Get the summary of a multivariate gaussian mixture model
 
 #' @param object a fitted gaussian mixture model.
-#' @return \code{summary} returns the model's lambda, mu, sigma, k, dim and posterior.
+#' @return \code{summary} returns summary of the fitted model, which is a list.
+#' The list includes the model's \code{lambda} (lambda), \code{mu} (mu),
+#' \code{sigma} (sigma), and \code{posterior} (posterior).
--- End diff --

Same reason as the above one.





[GitHub] spark pull request #16150: [SPARK-18349][SparkR]:Update R API documentation ...

2016-12-06 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request:

https://github.com/apache/spark/pull/16150#discussion_r91232063
  
--- Diff: R/pkg/R/mllib.R ---
@@ -1852,9 +1867,9 @@ summary.treeEnsemble <- function(model) {
 
 #  Get the summary of a Random Forest Regression Model
 
-#' @return \code{summary} returns a summary object of the fitted model, a list of components
-#' including formula, number of features, list of features, feature importances, number of
-#' trees, and tree weights
+#' @return \code{summary} returns summary information of the fitted model, which is a list.
+#' The list of components includes \code{ans} (formula, number of features, list of features,
+#' feature importances, number of trees, and tree weights).
--- End diff --

Both places return `summary.treeEnsemble(object)`. What should I put in the
`\code{}`?





[GitHub] spark pull request #16150: [SPARK-18349][SparkR]:Update R API documentation ...

2016-12-06 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request:

https://github.com/apache/spark/pull/16150#discussion_r91231575
  
--- Diff: R/pkg/R/mllib.R ---
@@ -661,7 +665,10 @@ setMethod("fitted", signature(object = "KMeansModel"),
 #  Get the summary of a k-means model
 
 #' @param object a fitted k-means model.
-#' @return \code{summary} returns the model's features, coefficients, k, size and cluster.
+#' @return \code{summary} returns summary information of the fitted model, which is a list.
+#' The list includes the model's \code{coefficients} (model cluster centers),
--- End diff --

I didn't see features and k in the returned list. Doesn't an R function
return only the value of its last expression?
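
To make the point about R's return value concrete, here is a minimal sketch. It is not the SparkR source: `summarize_model`, its arguments, and the toy `centers` matrix are hypothetical names used only for illustration. It shows that a value computed inside the function but left out of the final `list(...)` never reaches the caller.

```r
# Hypothetical stand-in for a model summary function; NOT the SparkR code.
summarize_model <- function(centers, k) {
  features <- colnames(centers)  # computed, but never placed in the result
  size <- rep(1L, k)             # illustrative cluster sizes
  # Without an explicit return(), the value of the last evaluated
  # expression is what the function returns.
  list(coefficients = centers, size = size)
}

centers <- matrix(c(1, 2, 3, 4), nrow = 2,
                  dimnames = list(NULL, c("x", "y")))
result <- summarize_model(centers, k = 2L)
print(names(result))
# [1] "coefficients" "size"
```

Under this reading, the `@return` text should list only the components that appear in the final list.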





[GitHub] spark pull request #16150: [SPARK-18349][SparkR]:Update R API documentation ...

2016-12-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/16150#discussion_r91231272
  
--- Diff: R/pkg/R/mllib.R ---
@@ -1389,7 +1399,9 @@ setMethod("spark.gaussianMixture", signature(data = "SparkDataFrame", formula =
 #  Get the summary of a multivariate gaussian mixture model
 
 #' @param object a fitted gaussian mixture model.
-#' @return \code{summary} returns the model's lambda, mu, sigma, k, dim and posterior.
+#' @return \code{summary} returns summary of the fitted model, which is a list.
+#' The list includes the model's \code{lambda} (lambda), \code{mu} (mu),
+#' \code{sigma} (sigma), and \code{posterior} (posterior).
--- End diff --

missing k, dim?





[GitHub] spark pull request #16150: [SPARK-18349][SparkR]:Update R API documentation ...

2016-12-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/16150#discussion_r91231151
  
--- Diff: R/pkg/R/mllib.R ---
@@ -661,7 +665,10 @@ setMethod("fitted", signature(object = "KMeansModel"),
 #  Get the summary of a k-means model
 
 #' @param object a fitted k-means model.
-#' @return \code{summary} returns the model's features, coefficients, k, size and cluster.
+#' @return \code{summary} returns summary information of the fitted model, which is a list.
+#' The list includes the model's \code{coefficients} (model cluster centers),
--- End diff --

are we missing features and k?





[GitHub] spark pull request #16150: [SPARK-18349][SparkR]:Update R API documentation ...

2016-12-05 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request:

https://github.com/apache/spark/pull/16150

[SPARK-18349][SparkR]:Update R API documentation on ml model summary

## What changes were proposed in this pull request?
In this PR, the documentation of the `summary` method is improved to follow
the format:

returns summary information of the fitted model, which is a list. The list
includes ...

Since `summary` in R is mainly about the model, which is not the same as the
`summary` object on the Scala side (if there is one), the Scala API doc is not
linked here.

In the current documentation, some `@return` entries end with `.` and some do
not. A `.` is added to the ones that were missing it.

Since the `summary` of `spark.logit` is undergoing a big refactoring, this PR
doesn't include it. It will be updated when the `spark.logit` PR is merged.

## How was this patch tested?

Manual build.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/wangmiao1981/spark audit2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16150.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16150


commit d02d0b39146d8b1e1ed49a9b71b2de6928a2752e
Author: wm...@hotmail.com 
Date:   2016-12-05T18:53:31Z

improve `summary` doc and other minor document improvements.



