spark git commit: [SPARKR][DOCS] fix broken url in doc

2016-07-25 Thread shivaram
Repository: spark
Updated Branches:
  refs/heads/branch-2.0 57d65e511 -> d9bd066b9


[SPARKR][DOCS] fix broken url in doc

## What changes were proposed in this pull request?

Fix broken URL. Also:

The sparkR.session.stop doc page should have "sparkR.session.stop" in the header, instead of saying "sparkR.stop":
![image](https://cloud.githubusercontent.com/assets/8969467/17080129/26d41308-50d9-11e6-8967-79d6c920313f.png)

The data type section is in the middle of a list of gapply/gapplyCollect subsections:
![image](https://cloud.githubusercontent.com/assets/8969467/17080122/f992d00a-50d8-11e6-8f2c-fd5786213920.png)

## How was this patch tested?

Manual test.

Author: Felix Cheung 

Closes #14329 from felixcheung/rdoclinkfix.

(cherry picked from commit b73defdd790cb823a4f9958ca89cec06fd198051)
Signed-off-by: Shivaram Venkataraman 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d9bd066b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d9bd066b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d9bd066b

Branch: refs/heads/branch-2.0
Commit: d9bd066b9f37cfd18037b9a600371d0342703c0f
Parents: 57d65e5
Author: Felix Cheung 
Authored: Mon Jul 25 11:25:41 2016 -0700
Committer: Shivaram Venkataraman 
Committed: Mon Jul 25 11:25:51 2016 -0700

----------------------------------------------------------------------
 R/pkg/R/DataFrame.R |   2 +-
 R/pkg/R/sparkR.R    |  16 +++
 docs/sparkr.md      | 107 +++
 3 files changed, 62 insertions(+), 63 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/d9bd066b/R/pkg/R/DataFrame.R
----------------------------------------------------------------------
diff --git a/R/pkg/R/DataFrame.R b/R/pkg/R/DataFrame.R
index 92c10f1..aa211b3 100644
--- a/R/pkg/R/DataFrame.R
+++ b/R/pkg/R/DataFrame.R
@@ -35,7 +35,7 @@ setOldClass("structType")
 #' @slot env An R environment that stores bookkeeping states of the SparkDataFrame
 #' @slot sdf A Java object reference to the backing Scala DataFrame
 #' @seealso \link{createDataFrame}, \link{read.json}, \link{table}
-#' @seealso \url{https://spark.apache.org/docs/latest/sparkr.html#sparkdataframe}
+#' @seealso \url{https://spark.apache.org/docs/latest/sparkr.html#sparkr-dataframes}
 #' @export
 #' @examples
 #'\dontrun{

http://git-wip-us.apache.org/repos/asf/spark/blob/d9bd066b/R/pkg/R/sparkR.R
----------------------------------------------------------------------
diff --git a/R/pkg/R/sparkR.R b/R/pkg/R/sparkR.R
index ff5297f..524f7c4 100644
--- a/R/pkg/R/sparkR.R
+++ b/R/pkg/R/sparkR.R
@@ -28,14 +28,6 @@ connExists <- function(env) {
   })
 }
 
-#' @rdname sparkR.session.stop
-#' @name sparkR.stop
-#' @export
-#' @note sparkR.stop since 1.4.0
-sparkR.stop <- function() {
-  sparkR.session.stop()
-}
-
 #' Stop the Spark Session and Spark Context
 #'
 #' Stop the Spark Session and Spark Context.
@@ -90,6 +82,14 @@ sparkR.session.stop <- function() {
   clearJobjs()
 }
 
+#' @rdname sparkR.session.stop
+#' @name sparkR.stop
+#' @export
+#' @note sparkR.stop since 1.4.0
+sparkR.stop <- function() {
+  sparkR.session.stop()
+}
+
 #' (Deprecated) Initialize a new Spark Context
 #'
 #' This function initializes a new SparkContext.

http://git-wip-us.apache.org/repos/asf/spark/blob/d9bd066b/docs/sparkr.md
----------------------------------------------------------------------
diff --git a/docs/sparkr.md b/docs/sparkr.md
index dfa5278..4bbc362 100644
--- a/docs/sparkr.md
+++ b/docs/sparkr.md
@@ -322,8 +322,59 @@ head(ldf, 3)
 Apply a function to each group of a `SparkDataFrame`. The function is to be applied to each group of the `SparkDataFrame` and should have only two parameters: grouping key and R `data.frame` corresponding to
 that key. The groups are chosen from `SparkDataFrame`s column(s).
 The output of function should be a `data.frame`. Schema specifies the row format of the resulting
-`SparkDataFrame`. It must represent R function's output schema on the basis of Spark data types. The column names of the returned `data.frame` are set by user. Below is the data type mapping between R
-and Spark.
+`SparkDataFrame`. It must represent R function's output schema on the basis of Spark [data types](#data-type-mapping-between-r-and-spark). The column names of the returned `data.frame` are set by user.
+
+
+{% highlight r %}
+
+# Determine six waiting times with the largest eruption time in minutes.
+schema <- structType(structField("waiting", "double"), structField("max_eruption", "double"))
+result <- gapply(
+    df,
+    "waiting",
+    function(key, x) {
+        y <- data.frame(key, max(x$eruptions))
+    },
+    schema)
+head(collect(arrange(result, "max_eruption", decreasing = TRUE)))
+
+##waiting   max_eruption
+##1  64   5.100
+##2  69   5.067
+##3  71   5.033
+##4  87   5.000
+##5  63   4.933
+##6  89   4.900
+{% endhighlight %}

spark git commit: [SPARKR][DOCS] fix broken url in doc

2016-07-25 Thread shivaram
Repository: spark
Updated Branches:
  refs/heads/master 7ea6d282b -> b73defdd7


[SPARKR][DOCS] fix broken url in doc

## What changes were proposed in this pull request?

Fix broken URL. Also:

The sparkR.session.stop doc page should have "sparkR.session.stop" in the header, instead of saying "sparkR.stop":
![image](https://cloud.githubusercontent.com/assets/8969467/17080129/26d41308-50d9-11e6-8967-79d6c920313f.png)

The data type section is in the middle of a list of gapply/gapplyCollect subsections:
![image](https://cloud.githubusercontent.com/assets/8969467/17080122/f992d00a-50d8-11e6-8f2c-fd5786213920.png)

## How was this patch tested?

Manual test.

Author: Felix Cheung 

Closes #14329 from felixcheung/rdoclinkfix.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b73defdd
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b73defdd
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b73defdd

Branch: refs/heads/master
Commit: b73defdd790cb823a4f9958ca89cec06fd198051
Parents: 7ea6d28
Author: Felix Cheung 
Authored: Mon Jul 25 11:25:41 2016 -0700
Committer: Shivaram Venkataraman 
Committed: Mon Jul 25 11:25:41 2016 -0700

----------------------------------------------------------------------
 R/pkg/R/DataFrame.R |   2 +-
 R/pkg/R/sparkR.R    |  16 +++
 docs/sparkr.md      | 107 +++
 3 files changed, 62 insertions(+), 63 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/b73defdd/R/pkg/R/DataFrame.R
----------------------------------------------------------------------
diff --git a/R/pkg/R/DataFrame.R b/R/pkg/R/DataFrame.R
index 2e99aa0..a473331 100644
--- a/R/pkg/R/DataFrame.R
+++ b/R/pkg/R/DataFrame.R
@@ -35,7 +35,7 @@ setOldClass("structType")
 #' @slot env An R environment that stores bookkeeping states of the SparkDataFrame
 #' @slot sdf A Java object reference to the backing Scala DataFrame
 #' @seealso \link{createDataFrame}, \link{read.json}, \link{table}
-#' @seealso \url{https://spark.apache.org/docs/latest/sparkr.html#sparkdataframe}
+#' @seealso \url{https://spark.apache.org/docs/latest/sparkr.html#sparkr-dataframes}
 #' @export
 #' @examples
 #'\dontrun{

http://git-wip-us.apache.org/repos/asf/spark/blob/b73defdd/R/pkg/R/sparkR.R
----------------------------------------------------------------------
diff --git a/R/pkg/R/sparkR.R b/R/pkg/R/sparkR.R
index ff5297f..524f7c4 100644
--- a/R/pkg/R/sparkR.R
+++ b/R/pkg/R/sparkR.R
@@ -28,14 +28,6 @@ connExists <- function(env) {
   })
 }
 
-#' @rdname sparkR.session.stop
-#' @name sparkR.stop
-#' @export
-#' @note sparkR.stop since 1.4.0
-sparkR.stop <- function() {
-  sparkR.session.stop()
-}
-
 #' Stop the Spark Session and Spark Context
 #'
 #' Stop the Spark Session and Spark Context.
@@ -90,6 +82,14 @@ sparkR.session.stop <- function() {
   clearJobjs()
 }
 
+#' @rdname sparkR.session.stop
+#' @name sparkR.stop
+#' @export
+#' @note sparkR.stop since 1.4.0
+sparkR.stop <- function() {
+  sparkR.session.stop()
+}
+
 #' (Deprecated) Initialize a new Spark Context
 #'
 #' This function initializes a new SparkContext.

http://git-wip-us.apache.org/repos/asf/spark/blob/b73defdd/docs/sparkr.md
----------------------------------------------------------------------
diff --git a/docs/sparkr.md b/docs/sparkr.md
index dfa5278..4bbc362 100644
--- a/docs/sparkr.md
+++ b/docs/sparkr.md
@@ -322,8 +322,59 @@ head(ldf, 3)
 Apply a function to each group of a `SparkDataFrame`. The function is to be applied to each group of the `SparkDataFrame` and should have only two parameters: grouping key and R `data.frame` corresponding to
 that key. The groups are chosen from `SparkDataFrame`s column(s).
 The output of function should be a `data.frame`. Schema specifies the row format of the resulting
-`SparkDataFrame`. It must represent R function's output schema on the basis of Spark data types. The column names of the returned `data.frame` are set by user. Below is the data type mapping between R
-and Spark.
+`SparkDataFrame`. It must represent R function's output schema on the basis of Spark [data types](#data-type-mapping-between-r-and-spark). The column names of the returned `data.frame` are set by user.
+
+
+{% highlight r %}
+
+# Determine six waiting times with the largest eruption time in minutes.
+schema <- structType(structField("waiting", "double"), structField("max_eruption", "double"))
+result <- gapply(
+    df,
+    "waiting",
+    function(key, x) {
+        y <- data.frame(key, max(x$eruptions))
+    },
+    schema)
+head(collect(arrange(result, "max_eruption", decreasing = TRUE)))
+
+##waiting   max_eruption
+##1  64   5.100
+##2  69   5.067
+##3  71   5.033
+##4  87   5.000
+##5  63   4.933
+##6  89   4.900
+{% endhighlight %}
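
The same per-group aggregation in the example above can also be written with `gapplyCollect`, which applies the function to each group and collects the result back as a local R `data.frame`, so no schema argument is needed. A sketch, assuming the same `df` over the `faithful` dataset as in the gapply example (column names here are chosen for illustration):

```r
# gapplyCollect returns a local R data.frame; column names come from
# the names in the function's returned data.frame, not from a schema.
result <- gapplyCollect(
    df,
    "waiting",
    function(key, x) {
        y <- data.frame(waiting = key, max_eruption = max(x$eruptions))
        y
    })

# Sort locally, since the result is an ordinary R data.frame.
head(result[order(result$max_eruption, decreasing = TRUE), ])
```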