[ https://issues.apache.org/jira/browse/ARROW-12992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17371744#comment-17371744 ]
Mauricio 'Pachá' Vargas Sepúlveda commented on ARROW-12992:
-----------------------------------------------------------
Status so far (OK = works, BAD = does not work yet):
{code:r}
d %>%
  mutate(
    foo = "Hadley Wickham",
    bar1 = str_sub(foo, 1, 6),     # Hadley - OK
    bar2 = str_sub(foo, end = 6),  # Hadley - OK
    bar3 = str_sub(foo, 8, 14),    # Wickham - OK
    bar4 = str_sub(foo, 8),        # Wickham - BAD
    bar5 = str_sub(foo, -1),       # m - BAD
    bar6 = str_sub(foo, -7),       # Wickham - BAD
    bar7 = str_sub(foo, end = -7)  # Hadley W - BAD
  ) %>%
  filter(
    year == 2000
  ) %>%
  select(bar1:bar7) %>%
  collect()
{code}
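The BAD cases all have either a negative position or an omitted end, so those arguments probably need to be resolved to explicit 1-based positions before being passed on. The sketch below shows that resolution in base R only. The helpers resolve_bounds() and str_sub_sketch() are hypothetical names made up for illustration; they are not part of arrow or stringr, and this is not how the binding is actually implemented.

```r
# Hypothetical helper (not part of arrow or stringr): resolve stringr-style
# start/end arguments to 1-based inclusive positions for a string of length n.
resolve_bounds <- function(n, start = 1L, end = -1L) {
  # Negative values count back from the end: -1 is the last character.
  if (start < 0) start <- n + start + 1L
  if (end < 0) end <- n + end + 1L
  # Clamp to the valid range of the string.
  c(start = max(start, 1L), end = min(end, n))
}

# Base-R emulation of str_sub() for a single string, for illustration only.
str_sub_sketch <- function(x, start = 1L, end = -1L) {
  b <- resolve_bounds(nchar(x), start, end)
  substr(x, b[["start"]], b[["end"]])
}

foo <- "Hadley Wickham"
str_sub_sketch(foo, 1, 6)      # "Hadley"
str_sub_sketch(foo, 8)         # "Wickham"
str_sub_sketch(foo, -1)        # "m"
str_sub_sketch(foo, -7)        # "Wickham"
str_sub_sketch(foo, end = -7)  # "Hadley W"
```

The results match the expected values in the comments of the snippet above, which suggests the remaining cases are a matter of translating negative and missing arguments rather than of the slicing itself.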
> [R] bindings for substr(), substring(), str_sub()
> -------------------------------------------------
>
> Key: ARROW-12992
> URL: https://issues.apache.org/jira/browse/ARROW-12992
> Project: Apache Arrow
> Issue Type: New Feature
> Components: R
> Reporter: Neal Richardson
> Assignee: Nic Crane
> Priority: Major
> Labels: pull-request-available
> Fix For: 5.0.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Followup to ARROW-10557, which implemented the C++
> current state:
> {code:r}
> library(arrow)
> library(dplyr)
> library(stringr)
> # get animal products, year 2019
> open_dataset(
> "../cepii-datasets-arrow/parquet/baci_hs92",
> partitioning = c("year", "reporter_iso")
> ) %>%
> filter(
> year == 2019,
> str_sub(product_code, 1, 2) == "01"
> ) %>%
> collect()
> Error: Filter expression not supported for Arrow Datasets:
> str_sub(product_code, 1, 2) == "01"
> Call collect() first to pull data into R.
> {code}