[
https://issues.apache.org/jira/browse/ARROW-13188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17369607#comment-17369607
]
Ian Cook commented on ARROW-13188:
----------------------------------
Dup of ARROW-12992?
> [R] [C++] Implement substr/str_sub for dplyr queries
> ----------------------------------------------------
>
> Key: ARROW-13188
> URL: https://issues.apache.org/jira/browse/ARROW-13188
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++, R
> Affects Versions: 4.0.1
> Reporter: Mauricio 'Pachá' Vargas Sepúlveda
> Priority: Minor
>
> It would be highly desirable to be able to use (base) substr and/or (stringr)
> str_sub in dplyr queries, like
> {code:r}
> library(arrow)
> library(dplyr)
> library(stringr)
> # get animal products, year 2019
> open_dataset(
>   "../cepii-datasets-arrow/parquet/baci_hs92",
>   partitioning = c("year", "reporter_iso")
> ) %>%
>   filter(
>     year == 2019,
>     str_sub(product_code, 1, 2) == "01"
>   ) %>%
>   collect()
> Error: Filter expression not supported for Arrow Datasets:
> str_sub(product_code, 1, 2) == "01"
> Call collect() first to pull data into R.
> {code}
> Of course, this needs to be implemented but, like ARROW-13107, it points
> toward easier integration with dplyr.