ianmcook commented on a change in pull request #10190:
URL: https://github.com/apache/arrow/pull/10190#discussion_r626022790
##########
File path: r/tests/testthat/test-dplyr-string-functions.R
##########
@@ -239,6 +239,53 @@ test_that("str_replace and str_replace_all", {
})
+test_that("strsplit and str_split", {
+  df <- tibble(x = c("Foo and bar", "baz and qux and quux"))
+
+  expect_dplyr_equal(
+    input %>%
+      transmute(x = strsplit(x, "and")) %>%
+      collect(),
+    df
+  )
+
+  expect_warning(
+    df %>%
+      Table$create() %>%
+      mutate(x = strsplit(x, "and.*", fixed = FALSE)) %>%
+      collect(),
+    regexp = "not supported in Arrow"
+  )
+
+  expect_dplyr_equal(
+    input %>%
+      mutate(x = strsplit(x, "and.*", fixed = TRUE)) %>%
+      collect(),
+    df
+  )
+
+  expect_dplyr_equal(
+    input %>%
+      transmute(x = str_split(x, "and")) %>%
+      collect(),
+    df
+  )
+
+  expect_dplyr_equal(
+    input %>%
+      transmute(x = str_split(x, "and", n = 2)) %>%
+      collect(),
+    df
+  )
+
+  expect_warning(
+    df %>%
+      Table$create() %>%
+      mutate(x = str_split(x, "and.?")) %>%
+      collect()
+  )
+})
+
Review comment:
Please add some tests here that exercise `str_split()` with the stringr
pattern modifier functions:
- `fixed(pattern)` (should succeed)
- `regex(pattern)` with a non-regex pattern (should succeed)
- `regex(pattern)` with an actual regex pattern (should error)
- `fixed(pattern, ignore_case = TRUE)` with a non-regex pattern (should
error because of the "Case-insensitive string splitting" code I added)
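
For reference, here is a rough sketch of the four cases in plain stringr, outside the Arrow test harness. It only shows the R-side results that the Arrow bindings would be compared against. The variable names are made up for illustration, and the uppercase input in the last case is my own assumption, chosen so that `ignore_case = TRUE` actually changes the result.

```r
library(stringr)

x <- c("Foo and bar", "baz and qux and quux")

# fixed(pattern): the pattern is matched literally
split_fixed <- str_split(x, fixed("and"))

# regex(pattern) with a pattern that has no regex metacharacters:
# yields the same pieces as the literal split
split_plain_regex <- str_split(x, regex("and"))

# regex(pattern) with an actual regex: "and.?" also consumes the
# character following "and", so the pieces differ from the literal split
split_real_regex <- str_split(x, regex("and.?"))

# fixed(pattern, ignore_case = TRUE): literal match, case-insensitive
# (hypothetical uppercase input so the option has a visible effect)
split_ignore_case <- str_split("Foo AND bar", fixed("and", ignore_case = TRUE))

print(split_fixed)
print(split_plain_regex)
print(split_real_regex)
print(split_ignore_case)
```

In the test file, I would expect the first two calls to go inside `expect_dplyr_equal()` and the last two inside `expect_warning()` on a `Table`, following the pattern of the existing cases above. That mapping is a suggestion, not something I have run against this branch.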