[GitHub] spark pull request #21255: [SPARK-24186][R][SQL]change reverse and concat to...

2018-05-11 Thread huaxingao
Github user huaxingao closed the pull request at:

https://github.com/apache/spark/pull/21255


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21255: [SPARK-24186][R][SQL]change reverse and concat to...

2018-05-10 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r187515108
  
--- Diff: R/pkg/R/functions.R ---
@@ -1256,7 +1259,7 @@ setMethod("quarter",
 #' @details
 #' \code{reverse}: Reverses the string column and returns it as a new 
string column.
--- End diff --

Don't we need to update this doc that adds the fact it supports array now?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21255: [SPARK-24186][R][SQL]change reverse and concat to...

2018-05-10 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r187515152
  
--- Diff: R/pkg/R/functions.R ---
@@ -2047,18 +2050,8 @@ setMethod("countDistinct",
 #' \code{concat}: Concatenates multiple input columns together into a 
single column.
 #' If all inputs are binary, concat returns an output as binary. 
Otherwise, it returns as string.
--- End diff --

ditto.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21255: [SPARK-24186][R][SQL]change reverse and concat to...

2018-05-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r187380148
  
--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
@@ -1502,12 +1502,25 @@ test_that("column functions", {
   result <- collect(select(df, sort_array(df[[1]])))[[1]]
   expect_equal(result, list(list(1L, 2L, 3L), list(4L, 5L, 6L)))
 
-  # Test flattern
+  result <- collect(select(df, reverse(df[[1]])))[[1]]
+  expect_equal(result, list(list(3L, 2L, 1L), list(4L, 5L, 6L)))
+
+  df2 <- createDataFrame(list(list("abc")))
+  result <- collect(select(df2, reverse(df2[[1]])))[[1]]
+  expect_equal(result, "cba")
+
+  # Test flattern()
--- End diff --

`flattern` -> `flatten`. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21255: [SPARK-24186][R][SQL]change reverse and concat to...

2018-05-10 Thread huaxingao
Github user huaxingao commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r187262627
  
--- Diff: R/pkg/R/functions.R ---
@@ -219,7 +219,8 @@ NULL
 #' head(select(tmp3, map_values(tmp3$v3)))
 #' head(select(tmp3, element_at(tmp3$v3, "Valiant")))
 #' tmp4 <- mutate(df, v4 = create_array(df$mpg, df$cyl), v5 = 
create_array(df$hp))
-#' head(select(tmp4, concat(tmp4$v4, tmp4$v5)))}
+#' head(select(tmp4, concat(tmp4$v4, tmp4$v5)))
+#' concat(df$mpg, df$cyl, df$hp)}
--- End diff --

@felixcheung 
Yes, the numeric columns work OK with concat. I tested and it has the 
following
++
|concat(mpg, cyl, hp)|
++
|21.06.0110.0|
|21.06.0110.0|
Is it OK with you to have the example as ```head(select(tmp, concat(df$mpg, 
df$cyl, df$hp)))``` ? Somehow, the test failed with ```head(mutate(df, s1 = 
concat(df$mpg, df$cyl, df$hp))```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21255: [SPARK-24186][R][SQL]change reverse and concat to...

2018-05-09 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r187238518
  
--- Diff: R/pkg/R/functions.R ---
@@ -219,7 +219,8 @@ NULL
 #' head(select(tmp3, map_values(tmp3$v3)))
 #' head(select(tmp3, element_at(tmp3$v3, "Valiant")))
 #' tmp4 <- mutate(df, v4 = create_array(df$mpg, df$cyl), v5 = 
create_array(df$hp))
-#' head(select(tmp4, concat(tmp4$v4, tmp4$v5)))}
+#' head(select(tmp4, concat(tmp4$v4, tmp4$v5)))
+#' concat(df$mpg, df$cyl, df$hp)}
--- End diff --

I'd perhaps do this as
```
tmp5 <- mutate(df, s1 = concat(df$mpg, df$cyl, df$hp)
head(tmp5)
```

or

```
head(mutate(df, s1 = concat(df$mpg, df$cyl, df$hp))
```

btw, aren't these numeric columns? does that work with concat?



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21255: [SPARK-24186][R][SQL]change reverse and concat to...

2018-05-09 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186942863
  
--- Diff: R/pkg/R/functions.R ---
@@ -2047,17 +2049,15 @@ setMethod("countDistinct",
 #' \code{concat}: Concatenates multiple input columns together into a 
single column.
 #' If all inputs are binary, concat returns an output as binary. 
Otherwise, it returns as string.
 #'
-#' @rdname column_string_functions
+#' @rdname column_collection_functions
 #' @aliases concat concat,Column-method
 #' @examples
 #'
 #' \dontrun{
 #' # concatenate strings
 #' tmp <- mutate(df, s1 = concat(df$Class, df$Sex),
 #'   s2 = concat(df$Class, df$Sex, df$Age),
--- End diff --

for background, we want to keep example runnable as much as possible..


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21255: [SPARK-24186][R][SQL]change reverse and concat to...

2018-05-09 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/21255#discussion_r186942776
  
--- Diff: R/pkg/R/functions.R ---
@@ -2047,17 +2049,15 @@ setMethod("countDistinct",
 #' \code{concat}: Concatenates multiple input columns together into a 
single column.
 #' If all inputs are binary, concat returns an output as binary. 
Otherwise, it returns as string.
 #'
-#' @rdname column_string_functions
+#' @rdname column_collection_functions
 #' @aliases concat concat,Column-method
 #' @examples
 #'
 #' \dontrun{
 #' # concatenate strings
 #' tmp <- mutate(df, s1 = concat(df$Class, df$Sex),
 #'   s2 = concat(df$Class, df$Sex, df$Age),
--- End diff --

here `df` is referencing 
https://github.com/huaxingao/spark/blob/b7b50b6ad11569c8e2c77f300a2c8cdd7064c959/R/pkg/R/functions.R#L141

since rdname is changed, df no longer work here. since you have added an 
example (perhaps add more? or just add a simple one like the ones here) 
https://github.com/apache/spark/pull/21255/files#diff-d97f9adc2dcac0703568c799ff106987R222
 let's remove the whole example block L 2054-2061 here, if we add one like 
`concat(df$Class, df$Sex, df$Age, df$Class)` (change to valid column names) to 
after L 222?



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org