spark git commit: [SPARKR][DOCS] fix broken url in doc

Repository: spark
Updated Branches:
  refs/heads/branch-2.0 57d65e511 -> d9bd066b9

[SPARKR][DOCS] fix broken url in doc

## What changes were proposed in this pull request?

Fix broken url. Also, the sparkR.session.stop doc page should have it in the header, instead of saying "sparkR.stop".

![image](https://cloud.githubusercontent.com/assets/8969467/17080129/26d41308-50d9-11e6-8967-79d6c920313f.png)

The data type section is in the middle of a list of gapply/gapplyCollect subsections:

![image](https://cloud.githubusercontent.com/assets/8969467/17080122/f992d00a-50d8-11e6-8f2c-fd5786213920.png)

## How was this patch tested?

Manual test.

Author: Felix Cheung

Closes #14329 from felixcheung/rdoclinkfix.

(cherry picked from commit b73defdd790cb823a4f9958ca89cec06fd198051)
Signed-off-by: Shivaram Venkataraman

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d9bd066b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d9bd066b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d9bd066b

Branch: refs/heads/branch-2.0
Commit: d9bd066b9f37cfd18037b9a600371d0342703c0f
Parents: 57d65e5
Author: Felix Cheung
Authored: Mon Jul 25 11:25:41 2016 -0700
Committer: Shivaram Venkataraman
Committed: Mon Jul 25 11:25:51 2016 -0700

--
 R/pkg/R/DataFrame.R |   2 +-
 R/pkg/R/sparkR.R    |  16 +++
 docs/sparkr.md      | 107 +++
 3 files changed, 62 insertions(+), 63 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/spark/blob/d9bd066b/R/pkg/R/DataFrame.R
--
diff --git a/R/pkg/R/DataFrame.R b/R/pkg/R/DataFrame.R
index 92c10f1..aa211b3 100644
--- a/R/pkg/R/DataFrame.R
+++ b/R/pkg/R/DataFrame.R
@@ -35,7 +35,7 @@ setOldClass("structType")
 #' @slot env An R environment that stores bookkeeping states of the SparkDataFrame
 #' @slot sdf A Java object reference to the backing Scala DataFrame
 #' @seealso \link{createDataFrame}, \link{read.json}, \link{table}
-#' @seealso \url{https://spark.apache.org/docs/latest/sparkr.html#sparkdataframe}
+#' @seealso \url{https://spark.apache.org/docs/latest/sparkr.html#sparkr-dataframes}
 #' @export
 #' @examples
 #'\dontrun{

http://git-wip-us.apache.org/repos/asf/spark/blob/d9bd066b/R/pkg/R/sparkR.R
--
diff --git a/R/pkg/R/sparkR.R b/R/pkg/R/sparkR.R
index ff5297f..524f7c4 100644
--- a/R/pkg/R/sparkR.R
+++ b/R/pkg/R/sparkR.R
@@ -28,14 +28,6 @@ connExists <- function(env) {
   })
 }

-#' @rdname sparkR.session.stop
-#' @name sparkR.stop
-#' @export
-#' @note sparkR.stop since 1.4.0
-sparkR.stop <- function() {
-  sparkR.session.stop()
-}
-
 #' Stop the Spark Session and Spark Context
 #'
 #' Stop the Spark Session and Spark Context.
@@ -90,6 +82,14 @@ sparkR.session.stop <- function() {
   clearJobjs()
 }

+#' @rdname sparkR.session.stop
+#' @name sparkR.stop
+#' @export
+#' @note sparkR.stop since 1.4.0
+sparkR.stop <- function() {
+  sparkR.session.stop()
+}
+
 #' (Deprecated) Initialize a new Spark Context
 #'
 #' This function initializes a new SparkContext.

http://git-wip-us.apache.org/repos/asf/spark/blob/d9bd066b/docs/sparkr.md
--
diff --git a/docs/sparkr.md b/docs/sparkr.md
index dfa5278..4bbc362 100644
--- a/docs/sparkr.md
+++ b/docs/sparkr.md
@@ -322,8 +322,59 @@ head(ldf, 3)
 Apply a function to each group of a `SparkDataFrame`. The function is to be applied to each group
 of the `SparkDataFrame` and should have only two parameters: grouping key and R `data.frame` corresponding
 to that key. The groups are chosen from `SparkDataFrame`s column(s). The output of function
 should be a `data.frame`. Schema specifies the row format of the resulting
-`SparkDataFrame`. It must represent R function's output schema on the basis of Spark data types. The column names of the returned `data.frame` are set by user. Below is the data type mapping between R
-and Spark.
+`SparkDataFrame`. It must represent R function's output schema on the basis of Spark [data types](#data-type-mapping-between-r-and-spark). The column names of the returned `data.frame` are set by user.
+
+{% highlight r %}
+
+# Determine six waiting times with the largest eruption time in minutes.
+schema <- structType(structField("waiting", "double"), structField("max_eruption", "double"))
+result <- gapply(
+    df,
+    "waiting",
+    function(key, x) {
+        y <- data.frame(key, max(x$eruptions))
+    },
+    schema)
+head(collect(arrange(result, "max_eruption", decreasing = TRUE)))
+
+##    waiting max_eruption
+##1        64        5.100
+##2        69        5.067
+##3        71        5.033
+##4        87        5.000
+##5        63        4.933
+##6        89        4.900
+{% endhighlight %}
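The doc section patched above closes its gapply subsection with the schema-based example; the commit message also refers to the neighboring gapplyCollect subsections. As a rough sketch only (assuming a local Spark installation, SparkR >= 2.0, and the base-R `faithful` dataset; the column name `max_eruption` is chosen here for illustration), the schema-free `gapplyCollect` variant of the same computation would look like:

```r
library(SparkR)
sparkR.session()

# Distributed copy of the base-R "faithful" dataset
df <- createDataFrame(faithful)

# gapplyCollect applies the function per group and collects the result
# back as a local R data.frame, so no structType schema is required;
# the column names of the returned data.frame are used as-is.
result <- gapplyCollect(
    df,
    "waiting",
    function(key, x) {
        data.frame(waiting = key, max_eruption = max(x$eruptions))
    })

# Largest eruption times, analogous to the arrange/collect step above
head(result[order(result$max_eruption, decreasing = TRUE), ])

sparkR.session.stop()
```

The trade-off is that `gapply` keeps the result as a distributed `SparkDataFrame` (hence the schema requirement), while `gapplyCollect` materializes everything on the driver.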
spark git commit: [SPARKR][DOCS] fix broken url in doc

Repository: spark
Updated Branches:
  refs/heads/master 7ea6d282b -> b73defdd7

[SPARKR][DOCS] fix broken url in doc

Same commit message and diff as the branch-2.0 cherry-pick above.

Author: Felix Cheung

Closes #14329 from felixcheung/rdoclinkfix.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b73defdd
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b73defdd
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b73defdd

Branch: refs/heads/master
Commit: b73defdd790cb823a4f9958ca89cec06fd198051
Parents: 7ea6d28
Author: Felix Cheung
Authored: Mon Jul 25 11:25:41 2016 -0700
Committer: Shivaram Venkataraman
Committed: Mon Jul 25 11:25:41 2016 -0700