[ https://issues.apache.org/jira/browse/ARROW-17132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569030#comment-17569030 ]
Neal Richardson commented on ARROW-17132:
-----------------------------------------
Right, {{transmute()}} drops the input columns, so that would work around this.
Note that this isn't about {{mutate()}} but rather the R <-> Arrow conversion,
and/or how R deals with timezones, or timezone-naive data, or localized data,
or something.
{code}
> expect_identical(as.data.frame(arrow_table(df)), df)
Error: as.data.frame(arrow_table(df)) (`actual`) not identical to `df`
(`expected`).
`attr(actual$time, 'tzone')` is a character vector ('America/Los_Angeles')
`attr(expected$time, 'tzone')` is absent
{code}
I'm sure if we keep pulling on this, we'll end up back on some issue we've
worked on before about how R treats timestamps with no timezone as local
time while Arrow reads them as UTC, so we have to incorporate time zone
information when converting from R to Arrow.
{code}
> attributes(as.data.frame(arrow_table(df))$time)
$class
[1] "POSIXct" "POSIXt"
$tzone
[1] "America/Los_Angeles"
> attributes(df$time)
$class
[1] "POSIXct" "POSIXt"
{code}
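To make the attribute mismatch concrete without involving Arrow at all, here is a minimal base-R sketch. It assumes nothing beyond base R: the two values are built by hand with {{structure()}} so the result does not depend on the R version or the session time zone, and {{with_explicit_tz()}} is a hypothetical helper written for this illustration, not part of arrow or testthat.

```r
# Two POSIXct values for the same instant (946684800 seconds after the
# epoch, i.e. 2000-01-01 00:00:00 UTC), differing only in the tzone attribute.
naive <- structure(946684800, class = c("POSIXct", "POSIXt"))
aware <- structure(946684800, class = c("POSIXct", "POSIXt"), tzone = "UTC")

# The underlying numbers agree, but the objects are not identical
# because one carries a tzone attribute and the other does not.
same_instant <- as.numeric(naive) == as.numeric(aware)
same_object <- identical(naive, aware)

# Hypothetical helper (an assumption for this sketch): attach an explicit
# time zone to a POSIXct vector without changing the stored instant.
with_explicit_tz <- function(x, tz = "UTC") {
  stopifnot(inherits(x, "POSIXct"))
  attr(x, "tzone") <- tz
  x
}

# Once the time zone is explicit on both sides, the comparison passes.
same_after_fix <- identical(with_explicit_tz(naive), aware)
```

If the test input carried an explicit {{tzone}} like this, the comparison would presumably no longer hinge on how a missing time zone is filled in during conversion, though that is an assumption about the round trip rather than something verified here.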
> [R] Mutate in compare_dplyr_binding returns wrong type
> ------------------------------------------------------
>
> Key: ARROW-17132
> URL: https://issues.apache.org/jira/browse/ARROW-17132
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Reporter: Rok Mihevc
> Priority: Minor
> Labels: test
>
> The following:
> {code:r}
> df <- tibble::tibble(
>   time = as.POSIXct(seq(as.Date("1999-12-31", tz = "UTC"),
>                         as.Date("2001-01-01", tz = "UTC"), by = "day"))
> )
> compare_dplyr_binding(
>   .input %>%
>     mutate(x = yday(time)) %>%
>     collect(),
>   df
> )
> {code}
> Fails with:
> {code:bash}
> Failure (test-dplyr-funcs-datetime.R:574:3): extract wday from timestamp
> `object` (`actual`) not equal to `expected` (`expected`).
> `attr(actual$time, 'tzone')` is a character vector ('UTC')
> `attr(expected$time, 'tzone')` is absent
> Backtrace:
> 1. arrow:::compare_dplyr_binding(...)
> at test-dplyr-funcs-datetime.R:574:2
> 2. arrow:::expect_equal(via_batch, expected, ...)
> at tests/testthat/helper-expectation.R:115:4
> 3. testthat::expect_equal(...)
> at tests/testthat/helper-expectation.R:42:4
> {code}
> This also happens for qday and probably other functions where input is
> temporal and output is numeric.