jonkeane commented on pull request #12467:
URL: https://github.com/apache/arrow/pull/12467#issuecomment-1074365289
This is very exciting. Like I mentioned earlier, I wanted to try this out
locally to see what this looks like. The example is a little contrived (and
actually AFAIU, not totally accurate depending on the time of year!)
Is it expected that roundtripping without the {vctrs} class wouldn't work?
(Or did I do something wrong here?)
``` r
library(arrow, warn.conflicts = FALSE)
# Is this the minimal structure to create a custom class like this?
KoreanAge <- R6::R6Class(
  "KoreanAge",
  inherit = ExtensionType,
  public = list(
    .array_as_vector = function(extension_array) {
      extension_array$storage()$as_vector() + 1
    }
  )
)
KoreanAge <- new_extension_type(
  int32(),
  "KoreanAge",
  charToRaw("Korean Age, but stored as the western age value"),
  type_class = KoreanAge
)
arr <- new_extension_array(c(0, 1, 2), KoreanAge)
# What we expect (storage + 1)
as.vector(arr)
#> [1] 1 2 3
# But roundtripping doesn't seem to work?
tf <- tempfile()
write_feather(arrow_table(col = arr), tf)
tab <- read_feather(tf, as_data_frame = FALSE)
type(tab$col)
#> Int32
#> int32
as.vector(tab$col)
#> [1] 0 1 2
```
Also, should we export `ExtensionArray`? It doesn't look like it is exported, but we
do have `Array` etc. exported. The docs additions are really great and
descriptive. But I do wonder if an example (or two) would be nice, even if it
were a pretty trivial extension type like this one (or some of the vctrs examples
with percentages and the like).
Do we have a follow-on for what to do about printing the array? You'll see
here that it prints the underlying storage type, which might be fine, but that has
confused some folks before.