Marc Colosimo created ARROW-14509:
-------------------------------------

             Summary: as_vector() downgrades int64 even when 
arrow.int64_downcast = TRUE
                 Key: ARROW-14509
                 URL: https://issues.apache.org/jira/browse/ARROW-14509
             Project: Apache Arrow
          Issue Type: Bug
          Components: R
    Affects Versions: 5.0.0
         Environment: linux
            Reporter: Marc Colosimo


Calling as_vector() on a Table or Array of type Int64 with 
arrow.int64_downcast = TRUE still downcasts the result to a base R integer, 
unless a value is larger than Int32 can store (the switch actually seems to 
happen at a somewhat lower value; I am guessing at R's integer-to-numeric 
switch-over).
{quote}library(arrow)
 options(arrow.int64_downcast = TRUE)
 int64s <- bit64::as.integer64(c(1:10, 101:110, 201:205))
 y <- Array$create(int64s)
 y$type
 yv <- y$as_vector()
 class(yv)
 int64s <- c(int64s, bit64::as.integer64("68719476735")) # 0xF FFFFFFFF
 y <- Array$create(int64s)
 y$type
 yv <- y$as_vector()
 class(yv)
{quote}
Outputs:
{quote}Int64
 int64
 [1] "integer"
 Int64
 int64
 [1] "integer64"
{quote}
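The outputs above look consistent with a simple range check: the result stays integer64 only when some value falls outside the 32-bit integer range. A minimal base R sketch of such a check follows. Note that fits_in_int32 is a hypothetical helper written for illustration, not an arrow function, and the cutoff arrow actually uses may differ.

```r
# Hypothetical helper (not part of arrow): TRUE when every non-missing value
# is representable as a base R (32-bit) integer.
fits_in_int32 <- function(x) {
  x <- as.numeric(x)
  all(is.na(x) | abs(x) <= .Machine$integer.max)
}

small <- c(1:10, 101:110, 201:205)   # values from the first example
large <- c(small, 68719476735)       # adds 0xF FFFFFFFF, as in the report

print(fits_in_int32(small))
print(fits_in_int32(large))
```

Under this assumption, the first vector would qualify for downcasting and the second would not, which matches the "integer" and "integer64" classes reported.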
This can cause an unexpected integer overflow:
{quote}int64s <- bit64::as.integer64(c(1:10, 101:110, 201:205, 268435454, 
2147483632)) # 0xFFFFFFE, 0x7FFFFFF0
 cumsum(int64s)
 y <- Array$create(int64s)
 y$type
 yv <- y$as_vector()
 class(yv)
 cumsum(yv)
{quote}
as shown by the NA and the warning in the second cumsum output:
{quote}integer64
 [1] 1 3 6 10 15 21 
 [7] 28 36 45 55 156 258 
 [13] 361 465 570 676 783 891 
 [19] 1000 1110 1311 1513 1716 1920 
 [25] 2125 268437579 2415921211
 Int64
 int64
 [1] "integer"
 [1] 1 3 6 10 15 21 28
 [8] 36 45 55 156 258 361 465
 [15] 570 676 783 891 1000 1110 1311
 [22] 1513 1716 1920 2125 268437579 NA
 Warning message:
 integer overflow in 'cumsum'; use 'cumsum(as.numeric(.))'
{quote}
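The overflow itself can be reproduced in base R without arrow. The sketch below takes the last running total and the final value from the output above and sums them once as 32-bit integers and once as doubles, as the warning message suggests. The variable names are illustrative only.

```r
# Base R integers are 32-bit, so a running sum that passes
# .Machine$integer.max (2147483647) becomes NA.
x <- c(268437579L, 2147483632L)   # running total and final value from the report

int_sum <- suppressWarnings(cumsum(x))   # integer arithmetic: second element overflows
dbl_sum <- cumsum(as.numeric(x))         # double arithmetic: no overflow

print(.Machine$integer.max)
print(int_sum)
print(dbl_sum)
```

The integer sum ends in NA, while the double sum reaches 2415921211, the same total that the integer64 cumsum reports.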
The actual arrow version is 5.0.0.9000, running under R 3.6.3.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
