[ 
https://issues.apache.org/jira/browse/ARROW-18242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17628973#comment-17628973
 ] 

Lucas Mation commented on ARROW-18242:
--------------------------------------

Weird. I can replicate the error on both arrow 10.0.0 (on the server) and the
arrow dev nightly build (on my PC). Both are Windows machines.

 

On the server:

``` 

#For the server:

arrow::arrow_info()
Arrow package version: 10.0.0

Capabilities:
               
dataset    TRUE
substrait FALSE
parquet    TRUE
json       TRUE
s3         TRUE
gcs        TRUE
utf8proc   TRUE
re2        TRUE
snappy     TRUE
gzip       TRUE
brotli     TRUE
zstd       TRUE
lz4        TRUE
lz4_frame  TRUE
lzo       FALSE
bz2        TRUE
jemalloc  FALSE
mimalloc   TRUE

Arrow options():
                       
arrow.use_threads FALSE

Memory:
                  
Allocator mimalloc
Current    1.12 Kb
Max       25.77 Kb

Runtime:
                        
SIMD Level          avx2
Detected SIMD Level avx2

Build:
                                                             
C++ Library Version                                    10.0.0
C++ Compiler                                              GNU
C++ Compiler Version                                   10.3.0
Git ID               aa7118b6e5f49b354fa8a93d9cf363c9ebe9a3f0

``` 

On my PC:

``` 

#For the PC:

arrow_info()
Arrow package version: 10.0.0.100000050

Capabilities:
               
dataset    TRUE
substrait FALSE
parquet    TRUE
json       TRUE
s3         TRUE
gcs        TRUE
utf8proc   TRUE
re2        TRUE
snappy     TRUE
gzip       TRUE
brotli     TRUE
zstd       TRUE
lz4        TRUE
lz4_frame  TRUE
lzo       FALSE
bz2        TRUE
jemalloc  FALSE
mimalloc   TRUE

Arrow options():
                       
arrow.use_threads FALSE

Memory:
                   
Allocator  mimalloc
Current   128 bytes
Max        25.52 Kb

Runtime:
                        
SIMD Level          avx2
Detected SIMD Level avx2

Build:
                                                             
C++ Library Version                           11.0.0-SNAPSHOT
C++ Compiler                                              GNU
C++ Compiler Version                                   10.3.0
Git ID               5e53978b56aa13f9c033f2e849cc22f2aed6e2d3

 

``` 

 

 


 

 

> [R] arrow implementation of lubridate::dmy parses invalid date "00001976" as 
> date
> ---------------------------------------------------------------------------------
>
>                 Key: ARROW-18242
>                 URL: https://issues.apache.org/jira/browse/ARROW-18242
>             Project: Apache Arrow
>          Issue Type: Bug
>            Reporter: Lucas Mation
>            Priority: Major
>
> Sorry for so many issues, but I think this is another bug.
> The arrow implementation of `lubridate::dmy` behaves incorrectly:
> an invalid date such as '00001976' is parsed as a valid (and completely
> unrelated) date.
> #in R
> '00001976' %>% dmy
> [1] NA
> Warning message:
>   All formats failed to parse. No formats found. 
> #In arrow
> q <- data.table(x=c('00001976','30111976','01011976'))
> q %>% write_dataset('q')
> q2 <- 'q' %>% open_dataset %>% mutate(x2=dmy(x)) %>% collect
> q2
> x
> 1: 1975-11-30
> 2: 1976-11-30
> 3: 1976-01-01
> #notice '00001976' is an invalid date. First row of x2 should be NA!!!
>  
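One possible reading of the first row, offered as an assumption and not a confirmed diagnosis of the Arrow parser: if day "00" and month "00" are rolled backwards instead of being rejected, then "month 00 of 1976" becomes December 1975 and "day 00" of that month becomes the last day of November 1975, which matches the reported 1975-11-30. The base R sketch below only reproduces that arithmetic; the variable names are illustrative and nothing here calls Arrow.

```r
# Assumption: "00" for month and day each roll back by one unit
# instead of failing to parse. Not verified against the Arrow source.
jan_first  <- as.Date("1976-01-01")  # what "01011976" parses to

# "Month 00" of 1976: one month before January 1976.
month_zero <- seq(jan_first, by = "-1 month", length.out = 2)[2]

# "Day 00" of that month: one day before its first day.
day_zero   <- month_zero - 1

print(month_zero)  # [1] "1975-12-01"
print(day_zero)    # [1] "1975-11-30"
```

If this reading is right, the result is an out-of-range day and month being normalized rather than validated, which would explain why lubridate returns NA while arrow returns a date.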



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
