paleolimbot opened a new pull request, #34489:
URL: https://github.com/apache/arrow/pull/34489
### Rationale for this change
When we attempt to re-use an ALTREP vector that Arrow itself created earlier by
wrapping a ChunkedArray, R crashes if that vector has already been
materialized (i.e., its R values have been accessed and the ChunkedArray
reference deleted). This behaviour changed between 10.0.0 and 11.0.0 because I
redid the ALTREP implementation just after the 10.0.0 release.
The following test crashes R on main and 11.0.0 but passes after this PR:
``` r
library(arrow, warn.conflicts = FALSE)
#> Some features are not enabled in this build of Arrow. Run `arrow_info()` for more information.
library(testthat, warn.conflicts = FALSE)
withr::local_namespace("arrow")
test_that("Materialized ALTREP arrays don't cause arrow to crash when attempting to bypass", {
  a_int <- Array$create(c(1L, 2L, 3L))
  b_int <- a_int$as_vector()
  expect_true(is_arrow_altrep(b_int))
  expect_false(test_arrow_altrep_is_materialized(b_int))

  # Some operations that use altrep bypass
  expect_equal(infer_type(b_int), int32())
  expect_equal(as_arrow_array(b_int), a_int)

  # Still shouldn't have materialized yet
  expect_false(test_arrow_altrep_is_materialized(b_int))

  # Force it to materialize and check again
  test_arrow_altrep_force_materialize(b_int)
  expect_true(test_arrow_altrep_is_materialized(b_int))
  expect_equal(infer_type(b_int), int32())
  expect_equal(as_arrow_array(b_int), a_int)
})
#> Test passed 🎉
```
### What changes are included in this PR?
We used a function called `is_arrow_altrep()` to check whether we could safely
access the ChunkedArray reference; however, it still returns `true` for
*materialized* ALTREP arrays. I added a new function,
`is_unmaterialized_arrow_altrep()`, and switched the call sites that depend on
the ChunkedArray actually existing to use it.
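
To illustrate the distinction, here is a minimal toy model in plain R. It is not the arrow implementation: an environment stands in for the ALTREP wrapper, and every name in it (`new_wrapper`, `materialize`, `is_wrapper`, `is_unmaterialized_wrapper`, `bypass`) is hypothetical. It only shows why a check of the form "is this one of our wrappers?" is not enough before touching the wrapped source.

```r
# Toy model only: an environment stands in for the ALTREP wrapper.
# None of these names are arrow functions; they are made up for illustration.
new_wrapper <- function(values) {
  w <- new.env()
  w$source <- values       # stands in for the ChunkedArray reference
  w$materialized <- NULL   # stands in for the materialized R vector
  class(w) <- "toy_wrapper"
  w
}

# Materializing copies the values out and drops the source reference.
materialize <- function(w) {
  if (is.null(w$materialized)) {
    w$materialized <- w$source
    w$source <- NULL
  }
  invisible(w)
}

# Analogous to the old check: TRUE before *and* after materialization.
is_wrapper <- function(x) inherits(x, "toy_wrapper")

# Analogous to the new check: TRUE only while the source still exists.
is_unmaterialized_wrapper <- function(x) {
  is_wrapper(x) && !is.null(x$source)
}

# A "bypass" may only use the source when the stricter check passes.
bypass <- function(x) {
  if (is_unmaterialized_wrapper(x)) {
    return(x$source)
  }
  if (is_wrapper(x)) {
    return(x$materialized)
  }
  x
}

w <- new_wrapper(c(1L, 2L, 3L))
stopifnot(is_wrapper(w), is_unmaterialized_wrapper(w))

materialize(w)
stopifnot(is_wrapper(w), !is_unmaterialized_wrapper(w))
stopifnot(identical(bypass(w), c(1L, 2L, 3L)))
```

In this sketch, code that guarded access to `w$source` with `is_wrapper()` alone would find `NULL` after materialization, which loosely mirrors the dangling ChunkedArray reference described above.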
### Are these changes tested?
Yes.
### Are there any user-facing changes?
No.