[ 
https://issues.apache.org/jira/browse/ARROW-18241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lucas Mation updated ARROW-18241:
---------------------------------
    Description: 
I am importing a dataset with arrow and then converting variable types. I got 
an error because the `arrow` implementation of `as.integer` can't handle empty 
strings, which base R accepts (it returns NA). Is this a bug?
{code:r}
library(arrow)
library(dplyr)
library(data.table)

# In base R, an empty string converts to NA
'' %>% as.integer()
# [1] NA

# In arrow, the same conversion fails
q <- data.table(x=c('','1','2'))
q %>% write_dataset('q')
q2 <- 'q' %>% open_dataset %>% mutate(x=as.integer(x)) %>% collect
# Error in `collect()`:
# ! Invalid: Failed to parse string: '' as a scalar of type int32
# Run `rlang::last_error()` to see where the error occurred.
{code}
Update: tried to preprocess x with `ifelse`, but that did not work either.
{code:r}
# Same dataset 'q' as written above
'q' %>% open_dataset %>% mutate(x= ifelse(x=='',NA,x)) %>% 
mutate(x=as.integer(x)) %>% collect
# Error in `collect()`:
# ! NotImplemented: Function 'if_else' has no kernel matching input types (bool, bool, string)
# Run `rlang::last_error()` to see where the error occurred.
{code}
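A possible workaround, offered only as a sketch: the input types in the error above (bool, bool, string) suggest that the bare `NA`, which is logical in R, is what fails to match the string column. Passing a typed missing value, `NA_character_`, may let `if_else` find a matching kernel. The scratch path under `tempdir()` and the use of a plain `data.frame` are assumptions made to keep the example self-contained, and this has not been checked against every arrow version.
{code:r}
library(arrow)
library(dplyr)

# Hypothetical scratch location; the report above writes to 'q' instead.
path <- file.path(tempdir(), "q")

q <- data.frame(x = c("", "1", "2"))
write_dataset(q, path)

# A bare NA is logical, while NA_character_ is a character (string) value.
class(NA)             # "logical"
class(NA_character_)  # "character"

# Replace empty strings with a string-typed NA, then cast to integer.
q2 <- open_dataset(path) %>%
  mutate(x = ifelse(x == "", NA_character_, x)) %>%
  mutate(x = as.integer(x)) %>%
  collect()
{code}
If this works as assumed, the empty string ends up as a missing value in `q2$x`, matching what base R's `as.integer('')` returns.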

  was:
I am importing a dataset with arrow, and then converting variable types. But I 
got an error message because the `arrow` implementation of `as.integer` can't 
handle empty strings (which is legal in base R). Is this a bug?


{code:r}
#In R
'' %>% as.integer()

[1] NA

 

#in arrow

q <- data.table(x=c('','1','2'))
q %>% write_dataset(paste0(p2,'/q'))
q2 <- paste0(p2,'/q') %>% open_dataset %>% mutate(x=as.integer(x)) %>% collect

Error in `collect()`:
! Invalid: Failed to parse string: '' as a scalar of type int32
Run `rlang::last_error()` to see where the error occurred.
{code}

Update: tried to preprocess x with `ifelse` but it also did not work.


{code:r}
paste0(p2,'/q') %>% open_dataset %>% mutate(x= ifelse(x=='',NA,x)) %>% 
mutate(x=as.integer(x)) %>% collect
Error in `collect()`:
! NotImplemented: Function 'if_else' has no kernel matching input types (bool, 
bool, string)
Run `rlang::last_error()` to see where the error occurred.
{code}


> [R] as.integer can't handle empty character cells (e.g. c(''))
> ------------------------------------------------------------
>
>                 Key: ARROW-18241
>                 URL: https://issues.apache.org/jira/browse/ARROW-18241
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>            Reporter: Lucas Mation
>            Priority: Major


