[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...

2018-05-08 Thread huaxingao
Github user huaxingao commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186635312
  
--- Diff: R/pkg/R/functions.R ---
@@ -1253,19 +1256,6 @@ setMethod("quarter",
 column(jc)
   })
 
-#' @details
-#' \code{reverse}: Reverses the string column and returns it as a new string column.
-#'
-#' @rdname column_string_functions
-#' @aliases reverse reverse,Column-method
-#' @note reverse since 1.5.0
-setMethod("reverse",
-  signature(x = "Column"),
-  function(x) {
-jc <- callJStatic("org.apache.spark.sql.functions", "reverse", x@jc)
-column(jc)
-  })
-
--- End diff --

I will put them back to their original places and add back the examples. Thanks!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...

2018-05-08 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186634471
  
--- Diff: R/pkg/R/functions.R ---
@@ -2043,34 +2033,6 @@ setMethod("countDistinct",
 column(jc)
   })
 
-#' @details
-#' \code{concat}: Concatenates multiple input columns together into a single column.
-#' If all inputs are binary, concat returns an output as binary. Otherwise, it returns as string.
-#'
-#' @rdname column_string_functions
-#' @aliases concat concat,Column-method
-#' @examples
-#'
-#' \dontrun{
-#' # concatenate strings
-#' tmp <- mutate(df, s1 = concat(df$Class, df$Sex),
-#'   s2 = concat(df$Class, df$Sex, df$Age),
-#'   s3 = concat(df$Class, df$Sex, df$Age, df$Class),
-#'   s4 = concat_ws("_", df$Class, df$Sex),
-#'   s5 = concat_ws("+", df$Class, df$Sex, df$Age, df$Survived))
--- End diff --

yes, let's not lose the examples.
We should add concat to column_collection_functions
and concat_ws to column_string_functions.
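
A sketch of what the relocated roxygen block for concat might look like under
column_collection_functions (illustrative only - the roxygen tags follow the
existing functions.R conventions, but the method body shown here is an
assumption, not the actual SparkR implementation):

```r
#' @details
#' \code{concat}: Concatenates multiple input columns together into a single column.
#' If all inputs are binary, concat returns an output as binary. Otherwise, it returns as string.
#'
#' @rdname column_collection_functions
#' @aliases concat concat,Column-method
#' @note concat since 1.5.0
setMethod("concat",
          signature(x = "Column"),
          function(x, ...) {
            # collect the Java column references from all arguments
            jcols <- lapply(list(x, ...), function(arg) arg@jc)
            jc <- callJStatic("org.apache.spark.sql.functions", "concat", jcols)
            column(jc)
          })
```

concat_ws would keep its @rdname column_string_functions tag, so only the
documentation grouping changes, not the exported API.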


---




[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...

2018-05-08 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186633965
  
--- Diff: R/pkg/R/functions.R ---
@@ -1253,19 +1256,6 @@ setMethod("quarter",
 column(jc)
   })
 
-#' @details
-#' \code{reverse}: Reverses the string column and returns it as a new string column.
-#'
-#' @rdname column_string_functions
-#' @aliases reverse reverse,Column-method
-#' @note reverse since 1.5.0
-setMethod("reverse",
-  signature(x = "Column"),
-  function(x) {
-jc <- callJStatic("org.apache.spark.sql.functions", "reverse", x@jc)
-column(jc)
-  })
-
--- End diff --

right, but that's not strictly where the collection functions are.
I'm not saying we need to re-sort/group all functions - that would be
thousands of lines to change.


---




[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...

2018-05-08 Thread huaxingao
Github user huaxingao commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186631550
  
--- Diff: R/pkg/R/functions.R ---
@@ -2043,34 +2033,6 @@ setMethod("countDistinct",
 column(jc)
   })
 
-#' @details
-#' \code{concat}: Concatenates multiple input columns together into a single column.
-#' If all inputs are binary, concat returns an output as binary. Otherwise, it returns as string.
-#'
-#' @rdname column_string_functions
-#' @aliases concat concat,Column-method
-#' @examples
-#'
-#' \dontrun{
-#' # concatenate strings
-#' tmp <- mutate(df, s1 = concat(df$Class, df$Sex),
-#'   s2 = concat(df$Class, df$Sex, df$Age),
-#'   s3 = concat(df$Class, df$Sex, df$Age, df$Class),
-#'   s4 = concat_ws("_", df$Class, df$Sex),
-#'   s5 = concat_ws("+", df$Class, df$Sex, df$Age, df$Survived))
--- End diff --

Since I now put concat in the Collection functions section, I am not sure
if I should keep the column_string_functions examples there. I will add them
back if you prefer to keep these examples.


---




[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...

2018-05-08 Thread huaxingao
Github user huaxingao commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186630904
  
--- Diff: R/pkg/R/functions.R ---
@@ -1253,19 +1256,6 @@ setMethod("quarter",
 column(jc)
   })
 
-#' @details
-#' \code{reverse}: Reverses the string column and returns it as a new string column.
-#'
-#' @rdname column_string_functions
-#' @aliases reverse reverse,Column-method
-#' @note reverse since 1.5.0
-setMethod("reverse",
-  signature(x = "Column"),
-  function(x) {
-jc <- callJStatic("org.apache.spark.sql.functions", "reverse", x@jc)
-column(jc)
-  })
-
--- End diff --

@felixcheung Thanks for your comment. The reason I moved reverse and concat
around is that I wanted to put them under line 2943, in the
`## Collection functions ##` section.


---




[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...

2018-05-08 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186624915
  
--- Diff: R/pkg/R/functions.R ---
@@ -2043,34 +2033,6 @@ setMethod("countDistinct",
 column(jc)
   })
 
-#' @details
-#' \code{concat}: Concatenates multiple input columns together into a single column.
-#' If all inputs are binary, concat returns an output as binary. Otherwise, it returns as string.
-#'
-#' @rdname column_string_functions
-#' @aliases concat concat,Column-method
-#' @examples
-#'
-#' \dontrun{
-#' # concatenate strings
-#' tmp <- mutate(df, s1 = concat(df$Class, df$Sex),
-#'   s2 = concat(df$Class, df$Sex, df$Age),
-#'   s3 = concat(df$Class, df$Sex, df$Age, df$Class),
-#'   s4 = concat_ws("_", df$Class, df$Sex),
-#'   s5 = concat_ws("+", df$Class, df$Sex, df$Age, df$Survived))
--- End diff --

also, what happened to all the other examples?


---




[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...

2018-05-08 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186624766
  
--- Diff: R/pkg/R/functions.R ---
@@ -1253,19 +1256,6 @@ setMethod("quarter",
 column(jc)
   })
 
-#' @details
-#' \code{reverse}: Reverses the string column and returns it as a new string column.
-#'
-#' @rdname column_string_functions
-#' @aliases reverse reverse,Column-method
-#' @note reverse since 1.5.0
-setMethod("reverse",
-  signature(x = "Column"),
-  function(x) {
-jc <- callJStatic("org.apache.spark.sql.functions", "reverse", x@jc)
-column(jc)
-  })
-
--- End diff --

so these functions are not actually ordered or grouped - my preference would
be not to move them around unless we have to?



---




[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...

2018-05-08 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186624379
  
--- Diff: R/pkg/R/functions.R ---
@@ -209,6 +209,7 @@ NULL
 #' head(select(tmp, array_max(tmp$v1), array_min(tmp$v1)))
 #' head(select(tmp, array_position(tmp$v1, 21)))
 #' head(select(tmp, flatten(tmp$v1)))
+#' head(select(tmp, reverse(tmp$v1)))
--- End diff --

let's merge this into an existing line - we don't need one line per function;
we don't want each line to be too long, but we also don't want too many lines.


---




[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...

2018-05-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186612820
  
--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
@@ -1739,6 +1748,13 @@ test_that("string operators", {
 collect(select(df5, repeat_string(df5$a, -1)))[1, 1],
 ""
   )
+
+  l6 <- list(list(a = "abc"))
+  df6 <- createDataFrame(l6)
+  expect_equal(
+collect(select(df6, reverse(df6$a)))[1, 1],
+"cba"
+  )
--- End diff --

Let's make this inlined while we are here. I would also put this test
around line 1505 above.
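
One possible inlined form of that test (a sketch only; it assumes an active
SparkR session and the testthat helpers already in use in test_sparkSQL.R):

```r
# inline the data-frame construction directly into the assertion
df6 <- createDataFrame(list(list(a = "abc")))
expect_equal(collect(select(df6, reverse(df6$a)))[1, 1], "cba")
```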


---




[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...

2018-05-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186612845
  
--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
@@ -1739,6 +1748,13 @@ test_that("string operators", {
 collect(select(df5, repeat_string(df5$a, -1)))[1, 1],
 ""
   )
+
+  l6 <- list(list(a = "abc"))
+  df6 <- createDataFrame(l6)
--- End diff --

I would combine those two lines too.


---




[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...

2018-05-07 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186610290
  
--- Diff: R/pkg/R/functions.R ---
@@ -218,6 +219,8 @@ NULL
 #' head(select(tmp3, map_keys(tmp3$v3)))
 #' head(select(tmp3, map_values(tmp3$v3)))
 #' head(select(tmp3, element_at(tmp3$v3, "Valiant")))}
--- End diff --

The closing `}` for this `dontrun` block is wrongly placed here.


---




[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...

2018-05-07 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186604950
  
--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
@@ -1748,6 +1748,14 @@ test_that("string operators", {
 collect(select(df5, repeat_string(df5$a, -1)))[1, 1],
 ""
   )
+
+  l6 <- list(list(a = "abc"))
+  df6 <- createDataFrame(l6)
+  df7 <- select(df6, reverse(df6$a))
--- End diff --

`df7` is not used below?


---




[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...

2018-05-07 Thread huaxingao
Github user huaxingao commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186604721
  
--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
@@ -1502,12 +1502,21 @@ test_that("column functions", {
   result <- collect(select(df, sort_array(df[[1]])))[[1]]
   expect_equal(result, list(list(1L, 2L, 3L), list(4L, 5L, 6L)))
 
-  # Test flattern
+  result <- collect(select(df, reverse(df[[1]])))[[1]]
--- End diff --

@viirya Thanks for your comments. I will make the changes.


---




[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...

2018-05-07 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186368498
  
--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
@@ -1502,12 +1502,21 @@ test_that("column functions", {
   result <- collect(select(df, sort_array(df[[1]])))[[1]]
   expect_equal(result, list(list(1L, 2L, 3L), list(4L, 5L, 6L)))
 
-  # Test flattern
+  result <- collect(select(df, reverse(df[[1]])))[[1]]
--- End diff --

Seems we don't have a test for `reverse` on strings. Can you add one for it
too?


---




[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...

2018-05-07 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186369103
  
--- Diff: R/pkg/R/functions.R ---
@@ -209,6 +209,7 @@ NULL
 #' head(select(tmp, array_max(tmp$v1), array_min(tmp$v1)))
 #' head(select(tmp, array_position(tmp$v1, 21)))
 #' head(select(tmp, flatten(tmp$v1)))
+#' head(select(tmp, reverse(tmp$v1)))
--- End diff --

Also add `concat` here?


---




[GitHub] spark pull request #21255: [SPARK-24186][SparR][SQL]change reverse and conca...

2018-05-07 Thread huaxingao
GitHub user huaxingao opened a pull request:

https://github.com/apache/spark/pull/21255

[SPARK-24186][SparR][SQL]change reverse and concat to collection functions in R

## What changes were proposed in this pull request?

reverse and concat are already in functions.R as column string functions.
Since these two functions are now categorized as collection functions in Scala
and Python, we will do the same in R.

## How was this patch tested?

Add test in test_sparkSQL.R
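
For illustration, the collection-function behavior being added could be
exercised like this (a sketch; it assumes an active SparkR session, and the
sample data and column names are made up for this example):

```r
# hypothetical sample data: an array column v1 and a string column s
df <- createDataFrame(list(list(v1 = list(1L, 2L, 3L), s = "abc")))

head(select(df, reverse(df$v1)))        # reverses the array column
head(select(df, reverse(df$s)))         # still reverses a string column
head(select(df, concat(df$v1, df$v1)))  # concatenates two array columns
```

The point of the change is that reverse and concat now accept array columns
as well as strings, matching the Scala and Python APIs.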


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/huaxingao/spark spark-24186

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21255.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21255


commit 3985285089673e42a85a5d1ba3cd7419a6948909
Author: Huaxin Gao 
Date:   2018-05-07T06:47:34Z

[SPARK-24186][SparR][SQL]add array reverse and concat functions to R




---
