[GitHub] spark pull request #21255: [SPARK-24186][R][SQL]change reverse and concat to...
Github user huaxingao closed the pull request at: https://github.com/apache/spark/pull/21255 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21255: [SPARK-24186][R][SQL]change reverse and concat to...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21255#discussion_r187515108 --- Diff: R/pkg/R/functions.R --- @@ -1256,7 +1259,7 @@ setMethod("quarter", #' @details #' \code{reverse}: Reverses the string column and returns it as a new string column. --- End diff -- Don't we need to update this doc that adds the fact it supports array now? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21255: [SPARK-24186][R][SQL]change reverse and concat to...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21255#discussion_r187515152 --- Diff: R/pkg/R/functions.R --- @@ -2047,18 +2050,8 @@ setMethod("countDistinct", #' \code{concat}: Concatenates multiple input columns together into a single column. #' If all inputs are binary, concat returns an output as binary. Otherwise, it returns as string. --- End diff -- ditto. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21255: [SPARK-24186][R][SQL]change reverse and concat to...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21255#discussion_r187380148 --- Diff: R/pkg/tests/fulltests/test_sparkSQL.R --- @@ -1502,12 +1502,25 @@ test_that("column functions", { result <- collect(select(df, sort_array(df[[1]])))[[1]] expect_equal(result, list(list(1L, 2L, 3L), list(4L, 5L, 6L))) - # Test flattern + result <- collect(select(df, reverse(df[[1]])))[[1]] + expect_equal(result, list(list(3L, 2L, 1L), list(4L, 5L, 6L))) + + df2 <- createDataFrame(list(list("abc"))) + result <- collect(select(df2, reverse(df2[[1]])))[[1]] + expect_equal(result, "cba") + + # Test flattern() --- End diff -- `flattern` -> `flatten`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21255: [SPARK-24186][R][SQL]change reverse and concat to...
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/21255#discussion_r187262627 --- Diff: R/pkg/R/functions.R --- @@ -219,7 +219,8 @@ NULL #' head(select(tmp3, map_values(tmp3$v3))) #' head(select(tmp3, element_at(tmp3$v3, "Valiant"))) #' tmp4 <- mutate(df, v4 = create_array(df$mpg, df$cyl), v5 = create_array(df$hp)) -#' head(select(tmp4, concat(tmp4$v4, tmp4$v5)))} +#' head(select(tmp4, concat(tmp4$v4, tmp4$v5))) +#' concat(df$mpg, df$cyl, df$hp)} --- End diff -- @felixcheung Yes, the numeric columns work OK with concat. I tested and it has the following ++ |concat(mpg, cyl, hp)| ++ |21.06.0110.0| |21.06.0110.0| Is it OK with you to have the example as ```head(select(tmp, concat(df$mpg, df$cyl, df$hp)))``` ? Somehow, the test failed with ```head(mutate(df, s1 = concat(df$mpg, df$cyl, df$hp))``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21255: [SPARK-24186][R][SQL]change reverse and concat to...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/21255#discussion_r187238518 --- Diff: R/pkg/R/functions.R --- @@ -219,7 +219,8 @@ NULL #' head(select(tmp3, map_values(tmp3$v3))) #' head(select(tmp3, element_at(tmp3$v3, "Valiant"))) #' tmp4 <- mutate(df, v4 = create_array(df$mpg, df$cyl), v5 = create_array(df$hp)) -#' head(select(tmp4, concat(tmp4$v4, tmp4$v5)))} +#' head(select(tmp4, concat(tmp4$v4, tmp4$v5))) +#' concat(df$mpg, df$cyl, df$hp)} --- End diff -- I'd perhaps do this as ``` tmp5 <- mutate(df, s1 = concat(df$mpg, df$cyl, df$hp) head(tmp5) ``` or ``` head(mutate(df, s1 = concat(df$mpg, df$cyl, df$hp)) ``` btw, aren't these numeric columns? does that work with concat? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21255: [SPARK-24186][R][SQL]change reverse and concat to...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/21255#discussion_r186942863 --- Diff: R/pkg/R/functions.R --- @@ -2047,17 +2049,15 @@ setMethod("countDistinct", #' \code{concat}: Concatenates multiple input columns together into a single column. #' If all inputs are binary, concat returns an output as binary. Otherwise, it returns as string. #' -#' @rdname column_string_functions +#' @rdname column_collection_functions #' @aliases concat concat,Column-method #' @examples #' #' \dontrun{ #' # concatenate strings #' tmp <- mutate(df, s1 = concat(df$Class, df$Sex), #' s2 = concat(df$Class, df$Sex, df$Age), --- End diff -- for background, we want to keep example runnable as much as possible.. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21255: [SPARK-24186][R][SQL]change reverse and concat to...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/21255#discussion_r186942776 --- Diff: R/pkg/R/functions.R --- @@ -2047,17 +2049,15 @@ setMethod("countDistinct", #' \code{concat}: Concatenates multiple input columns together into a single column. #' If all inputs are binary, concat returns an output as binary. Otherwise, it returns as string. #' -#' @rdname column_string_functions +#' @rdname column_collection_functions #' @aliases concat concat,Column-method #' @examples #' #' \dontrun{ #' # concatenate strings #' tmp <- mutate(df, s1 = concat(df$Class, df$Sex), #' s2 = concat(df$Class, df$Sex, df$Age), --- End diff -- here `df` is referencing https://github.com/huaxingao/spark/blob/b7b50b6ad11569c8e2c77f300a2c8cdd7064c959/R/pkg/R/functions.R#L141 since rdname is changed, df no longer work here. since you have added an example (perhaps add more? or just add a simple one like the ones here) https://github.com/apache/spark/pull/21255/files#diff-d97f9adc2dcac0703568c799ff106987R222 let's remove the whole example block L 2054-2061 here, if we add one like `concat(df$Class, df$Sex, df$Age, df$Class)` (change to valid column names) to after L 222? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org