[jira] [Commented] (ARROW-18242) [R] arrow implementation of lubridate::dmy parses invalid date "00001976" as date

Nicola Crane (Jira) Fri, 04 Nov 2022 06:07:04 -0700


    [ https://issues.apache.org/jira/browse/ARROW-18242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17628963#comment-17628963 ]


Nicola Crane commented on ARROW-18242:
--------------------------------------

I can't replicate this with 10.0.0 either. FWIW, I reorganised the code a bit 
below, as the {{mutate()}} call in the original report gave me an error.


{code:r}
library(dplyr)
library(data.table)
library(lubridate)
library(arrow)

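# In R (lubridate): the original report shows NA with a parse warning
'00001976' %>% dmy()

# In arrow: write the data out, then parse the column inside mutate()
q <- data.table(x = c('00001976', '30111976', '01011976'))
q %>% write_dataset('q')

q2 <- open_dataset('q') %>%
  mutate(x2 = dmy(x)) %>%
  collect()
q2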
{code}

[~lucasmation] If you run the code as I've rewritten it above, do you get a 
different result? Also, can you confirm which version of Arrow you are using? 
You can run {{arrow::arrow_info()}} to check if you're not sure.

> [R] arrow implementation of lubridate::dmy parses invalid date "00001976" as 
> date
> ---------------------------------------------------------------------------------
>
>                 Key: ARROW-18242
>                 URL: https://issues.apache.org/jira/browse/ARROW-18242
>             Project: Apache Arrow
>          Issue Type: Bug
>            Reporter: Lucas Mation
>            Priority: Major
>
> Sorry for so many issues, but I think this is another bug.
> Wrong behavior of the arrow implementation of the  `lubridate::dmy`.
> An invalid date such as '00001976' is being parsed as a valid (and completely 
> unrelated) date.
> #in R
> '00001976' %>% dmy
> [1] NA
> Warning message:
>   All formats failed to parse. No formats found. 
> #In arrow
> q <- data.table(x=c('00001976','30111976','01011976'))
> q %>% write_dataset('q')
> q2 <- 'q' %>% open_dataset %>% mutate(x2=dmy) %>% collect
> q2
> x
> 1: 1975-11-30
> 2: 1976-11-30
> 3: 1976-01-01
> #notice '00001976' is an invalid date. First row of x2 should be NA!!!
>  



