thisisnic commented on code in PR #48634:
URL: https://github.com/apache/arrow/pull/48634#discussion_r2645448843
##########
r/src/altrep.cpp:
##########
@@ -574,11 +575,10 @@ struct AltrepFactor : public
AltrepVectorBase<AltrepFactor> {
// the representation integer vector
SEXP dup = PROTECT(Rf_shallow_duplicate(Materialize(alt)));
- // additional attributes from the altrep
- SEXP atts = PROTECT(Rf_duplicate(ATTRIB(alt)));
- SET_ATTRIB(dup, atts);
+ // copy attributes from the altrep object
+ DUPLICATE_ATTRIB(dup, alt);
- UNPROTECT(2);
+ UNPROTECT(1);
Review Comment:
> 🤖 Performance impact: Negligible. The OBJECT bit and S4 flag are just bit
operations - essentially free compared to the actual attribute list copying
that both approaches do.
I queried these a little more and it looks like these are just metadata -
the `OBJECT` bit determines the result of calling `is.object()` and the `S4`
bit indicates if the object is an S4 object.
I took a look at
https://cran.r-project.org/doc/manuals/r-release/R-ints.html#Attributes-1 and
also queried what this bit of the codebase does (answer: wrapper around
ChunkedArray to store data when we call e.g. `collect()` as an ALTREP vector
until R actually needs it), and whether we definitely need `S4` and `OBJECT`:
apparently we need `OBJECT` but not `S4`:
> 🤖 `OBJECT` bit: Yes, needed. Factors have a class attribute, and the
`OBJECT` bit tells R "this is a classed object, use S3 method dispatch".
Without it, things like `print()`, `[`, etc. might not dispatch to the factor
methods correctly.
I will test this to see if `is.object()` returns different results etc.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]