thisisnic commented on code in PR #48634:
URL: https://github.com/apache/arrow/pull/48634#discussion_r2645448843


##########
r/src/altrep.cpp:
##########
@@ -574,11 +575,10 @@ struct AltrepFactor : public 
AltrepVectorBase<AltrepFactor> {
     // the representation integer vector
     SEXP dup = PROTECT(Rf_shallow_duplicate(Materialize(alt)));
 
-    // additional attributes from the altrep
-    SEXP atts = PROTECT(Rf_duplicate(ATTRIB(alt)));
-    SET_ATTRIB(dup, atts);
+    // copy attributes from the altrep object
+    DUPLICATE_ATTRIB(dup, alt);
 
-    UNPROTECT(2);
+    UNPROTECT(1);

Review Comment:
   > 🤖 Performance impact: Negligible. The OBJECT bit and S4 flag are just bit 
operations - essentially free compared to the actual attribute list copying 
that both approaches do.
   
   I queried these a little more and it looks like these are just metadata - 
the `OBJECT` bit determines the result of calling `is.object()` and the `S4` 
bit indicates if the object is an S4 object.
   
   I took a look at 
https://cran.r-project.org/doc/manuals/r-release/R-ints.html#Attributes-1 and 
also queried what this bit of the codebase does (answer: wrapper around 
ChunkedArray to store data when we call e.g. `collect()` as an ALTREP vector 
until R actually needs it), and whether we definitely need `S4` and `OBJECT`: 
apparently we need `OBJECT` but not `S4`:
   
   > 🤖 `OBJECT` bit: Yes, needed. Factors have a class attribute, and the 
`OBJECT` bit tells R "this is a classed object, use S3 method dispatch". 
Without it, things like `print()`, `[`, etc. might not dispatch to the factor 
methods correctly.
   
   Either way, sounds like no impact.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to