nealrichardson commented on code in PR #43351:
URL: https://github.com/apache/arrow/pull/43351#discussion_r1689716982
##########
r/src/arrow_cpp11.h:
##########
@@ -138,7 +138,12 @@ inline R_xlen_t r_string_size(SEXP s) {
} // namespace unsafe
inline SEXP utf8_strings(SEXP x) {
- return cpp11::unwind_protect([x] {
+ return cpp11::unwind_protect([&] {
+ // ensure that x is not actually altrep first
+ bool was_altrep = ALTREP(x);
+ if (was_altrep) {
+ x = PROTECT(Rf_duplicate(x));
Review Comment:
Add a comment about why we have to duplicate?
##########
r/src/arrow_cpp11.h:
##########
@@ -152,6 +157,9 @@ inline SEXP utf8_strings(SEXP x) {
SET_STRING_ELT(x, i, Rf_mkCharCE(Rf_translateCharUTF8(s), CE_UTF8));
Review Comment:
Do we want to check whether `Rf_translateCharUTF8()` actually changed
anything before writing? Or do we trust that `SET_STRING_ELT` is effectively a
no-op in that case? I would imagine that in most cases we already have
ASCII/UTF-8 strings, so this whole function should be basically free. That
should be easy to verify with a microbenchmark.
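
To make the "skip the write when nothing changed" idea concrete, here is a minimal, self-contained sketch. It does not use R's API: `ModelString`, `translate_utf8_model()`, and `utf8_strings_skip_unchanged()` are hypothetical stand-ins for a `CHARSXP`, `Rf_translateCharUTF8()`, and the loop in `utf8_strings()`. It assumes the translation step hands back the original buffer when no conversion is needed, which would need to be confirmed against R's actual behavior before relying on a pointer comparison in the real code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a string element: raw bytes plus a declared encoding.
enum class Encoding { Utf8, Latin1 };

struct ModelString {
  std::string bytes;
  Encoding enc;
};

// Returns true if every byte is 7-bit ASCII (valid as-is in UTF-8).
inline bool is_ascii(const std::string& s) {
  for (unsigned char c : s) {
    if (c >= 0x80) return false;
  }
  return true;
}

// Hypothetical stand-in for Rf_translateCharUTF8(): returns the original
// buffer when no translation is needed, otherwise a pointer into `scratch`
// holding the Latin-1 bytes re-encoded as UTF-8.
inline const char* translate_utf8_model(const ModelString& s, std::string& scratch) {
  assert(s.bytes.c_str() != nullptr);
  if (s.enc == Encoding::Utf8 || is_ascii(s.bytes)) return s.bytes.c_str();
  scratch.clear();
  for (unsigned char c : s.bytes) {
    if (c < 0x80) {
      scratch.push_back(static_cast<char>(c));
    } else {
      // Latin-1 code points 0x80..0xFF become two UTF-8 bytes.
      scratch.push_back(static_cast<char>(0xC0 | (c >> 6)));
      scratch.push_back(static_cast<char>(0x80 | (c & 0x3F)));
    }
  }
  return scratch.c_str();
}

// Model of the loop in utf8_strings(): only write an element back when the
// translation produced a different buffer. Returns the number of writes.
inline std::size_t utf8_strings_skip_unchanged(std::vector<ModelString>& x) {
  std::size_t writes = 0;
  std::string scratch;
  for (auto& s : x) {
    const char* original = s.bytes.c_str();
    const char* translated = translate_utf8_model(s, scratch);
    if (translated == original) continue;  // already ASCII/UTF-8: no write
    s.bytes = translated;                  // copy the re-encoded bytes
    s.enc = Encoding::Utf8;
    ++writes;
  }
  return writes;
}
```

In this model, a vector that is already ASCII or UTF-8 produces zero writes, so the common case costs one scan and one pointer comparison per element. Whether the equivalent check pays off in the real function is exactly what the microbenchmark would show.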