Marc Colosimo created ARROW-14509:
-------------------------------------
Summary: as_vector() downgrades int64 even when arrow.int64_downcast = TRUE
Key: ARROW-14509
URL: https://issues.apache.org/jira/browse/ARROW-14509
Project: Apache Arrow
Issue Type: Bug
Components: R
Affects Versions: 5.0.0
Environment: linux
Reporter: Marc Colosimo
Calling as_vector() on a Table or Array whose type is Int64, with
arrow.int64_downcast = TRUE, still downgrades the result to a 32-bit R integer
unless the data contains a value larger than Int32 can store. (It actually seems
to switch over at some lower value; my guess is that this is the
integer-to-numeric switch-over in R.)
{quote}library(arrow)
options(arrow.int64_downcast = TRUE)
int64s <- bit64::as.integer64(c(1:10, 101:110, 201:205))
y <- Array$create(int64s)
y$type
yv <- y$as_vector()
class(yv)
int64s <- c(int64s, bit64::as.integer64("68719476735")) # 0xF FFFFFFFF
y <- Array$create(int64s)
y$type
yv <- y$as_vector()
class(yv)
{quote}
Outputs:
{quote}Int64
int64
[1] "integer"
Int64
int64
[1] "integer64"
{quote}
This downgrade can cause an unexpected integer overflow:
{quote}int64s <- bit64::as.integer64(c(1:10, 101:110, 201:205, 268435454,
2147483632)) # 0xFFFFFFE, 0x7FFFFFF0
cumsum(int64s)
y <- Array$create(int64s)
y$type
yv <- y$as_vector()
class(yv)
cumsum(yv)
{quote}
The second cumsum overflows: its last element is NA instead of 2415921211.
{quote}integer64
[1] 1 3 6 10 15 21
[7] 28 36 45 55 156 258
[13] 361 465 570 676 783 891
[19] 1000 1110 1311 1513 1716 1920
[25] 2125 268437579 2415921211
Int64
int64
[1] "integer"
[1] 1 3 6 10 15 21 28
[8] 36 45 55 156 258 361 465
[15] 570 676 783 891 1000 1110 1311
[22] 1513 1716 1920 2125 268437579 NA
Warning message:
integer overflow in 'cumsum'; use 'cumsum(as.numeric(.))'
{quote}
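The overflow itself can be reproduced without arrow. The following is a minimal base R sketch, not part of the original report: it takes the running total before the last element (268437579) and the last element (2147483632) from the cumulative sum above as plain R integers. The variable names are illustrative only.

```r
# Base R integers are 32-bit; the largest representable value is 2^31 - 1.
int_max <- .Machine$integer.max

# Running total before the last element, and the last element itself,
# taken from the cumsum output above.
x <- c(268437579L, 2147483632L)

# Integer cumsum overflows on the second element and yields NA.
# suppressWarnings() hides the "integer overflow in 'cumsum'" warning.
overflowed <- suppressWarnings(cumsum(x))

# Converting to double first, as the warning suggests, keeps the total.
as_double <- cumsum(as.numeric(x))

print(int_max)
print(overflowed)
print(as_double)
```

Note that doubles are only exact for integers up to 2^53, so converting to numeric is a workaround for totals in that range and not a general replacement for integer64.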
The exact arrow version is 5.0.0.9000, running under R 3.6.3.
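For comparison, here is a sketch of the same conversion with the option set to FALSE. It rests on two assumptions that this report does not confirm: that FALSE is the setting that disables the downcast, and that the option is read at the time as_vector() is called. The variable names small, yv_keep and old are illustrative.

```r
library(arrow)

# Assumption: FALSE disables the downcast, and the option is read when
# as_vector() is called, so it can be changed between conversions.
old <- options(arrow.int64_downcast = FALSE)

# Same small values as in the first example; all of them fit in Int32.
small <- bit64::as.integer64(c(1:10, 101:110, 201:205))
yv_keep <- Array$create(small)$as_vector()
class(yv_keep)

# Restore the previous setting.
options(old)
```

If the assumptions hold, class(yv_keep) stays "integer64" regardless of the magnitude of the values, so a later cumsum would not hit the 32-bit limit.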