In the longer term, would it help R developers to adopt a naming convention for protection status? For example, varName_PROT for a protected variable, or the inverse, varName_UNPROT, for an unprotected one. Eventually, a linter could be taught to check it.
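
To make the idea concrete, here is a minimal sketch in plain C. It is a toy model, not the R API: toy_protect(), toy_unprotect() and toy_is_protected() are hypothetical stand-ins for PROTECT(), UNPROTECT() and the is_protected(x) test suggested in the quoted message. The toy check only scans a small pointer stack; it does not do the full search that the garbage collector does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a pointer protection stack. NOT the R API. */
#define TOY_STACK_MAX 64

static const void *toy_stack[TOY_STACK_MAX];
static size_t toy_top = 0;

/* Push p on the toy stack and return it (hypothetical analogue of PROTECT). */
void *toy_protect(void *p)
{
    if (toy_top >= TOY_STACK_MAX)
        abort();                 /* toy stack overflow */
    toy_stack[toy_top++] = p;
    return p;
}

/* Pop the n most recently pushed pointers (hypothetical analogue of UNPROTECT). */
void toy_unprotect(size_t n)
{
    if (n > toy_top)
        abort();                 /* unbalanced unprotect */
    toy_top -= n;
}

/* Debug-only check: linear scan of the toy stack for p. */
bool toy_is_protected(const void *p)
{
    for (size_t i = 0; i < toy_top; i++)
        if (toy_stack[i] == p)
            return true;
    return false;
}

/* Uses the proposed suffixes and verifies that the names match reality. */
bool toy_demo_naming(void)
{
    int a = 0, b = 0;
    int *ans_PROT = toy_protect(&a);   /* on the stack, hence _PROT */
    int *tmp_UNPROT = &b;              /* never pushed, hence _UNPROT */

    bool ok = toy_is_protected(ans_PROT) && !toy_is_protected(tmp_UNPROT);

    toy_unprotect(1);
    /* From here on the _PROT suffix no longer holds for ans_PROT;
       this is the kind of drift a linter would have to track. */
    return ok && !toy_is_protected(ans_PROT);
}
```

In real package code the same pattern would use PROTECT() and UNPROTECT() from Rinternals.h. Whether a linter could track the suffixes reliably across UNPROTECT() calls is an open question; the sketch only shows what the convention would look like.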

On Fri, Apr 11, 2025 at 10:40 AM Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

> On a tangent from the main topic of this thread: sometimes (especially
> to non-experts) it's not obvious whether a variable is protected or not.
>
> I don't think there's any easy way to determine that, but perhaps there
> should be. Would it be possible to add a run-time test you could call
> in C code (e.g. is_protected(x)) that would do the same search the
> garbage collector does in order to determine if a particular pointer is
> protected?
>
> This would be an expensive operation, similar in cost to actually doing
> a garbage collection. You wouldn't want to do it routinely, but it
> would be really helpful in debugging.
>
> Duncan Murdoch
>
> On 2025-04-11 6:05 a.m., Suharto Anggono Suharto Anggono via R-devel wrote:
> > On second thought, I wonder if the caching in my changed
> > 'StringFromLogical' in my previous message is safe. While 'ans' in the C
> > function 'coerceToString' is protected, its element is also protected. If
> > the object corresponding to 'ans' is then no longer protected, is it
> > possible for the cached object 'TrueCh' or 'FalseCh' in 'StringFromLogical'
> > to be garbage collected? If it is, I think of clearing the cache for each
> > first filling. For example, by abusing 'warn' argument, the following is
> > added to my changed 'StringFromLogical'.
> >
> > if (*warn) TrueCh = FalseCh = NULL;
> >
> > Correspondingly, in 'coerceToString',
> >
> > warn = i == 0;
> >
> > is inserted before
> >
> > SET_STRING_ELT(ans, i, StringFromLogical(LOGICAL_ELT(v, i), &warn));
> >
> > for LGLSXP case.
> >
> > ---------------------
> > On Thursday, 10 April 2025 at 10:54:03 pm GMT+7, Martin Maechler <maech...@stat.math.ethz.ch> wrote:
> >
> > >>>>> Suharto Anggono Suharto Anggono via R-devel
> > >>>>>     on Thu, 10 Apr 2025 07:53:04 +0000 (UTC) writes:
> >
> > > Chain of calls of C functions in coerce.c for as.character(<logical>) in R:
> > >
> > > do_asatomic
> > > ascommon
> > > coerceVector
> > > coerceToString
> > > StringFromLogical (for each element)
> > >
> > > The definition of 'StringFromLogical' in coerce.c :
> > >
> > > attribute_hidden SEXP StringFromLogical(int x, int *warn)
> > > {
> > >     int w;
> > >     formatLogical(&x, 1, &w);
> > >     if (x == NA_LOGICAL) return NA_STRING;
> > >     else return mkChar(EncodeLogical(x, w));
> > > }
> > >
> > > The definition of 'EncodeLogical' in printutils.c :
> > >
> > > const char *EncodeLogical(int x, int w)
> > > {
> > >     static char buff[NB];
> > >     if(x == NA_LOGICAL) snprintf(buff, NB, "%*s", min(w, (NB-1)), CHAR(R_print.na_string));
> > >     else if(x) snprintf(buff, NB, "%*s", min(w, (NB-1)), "TRUE");
> > >     else snprintf(buff, NB, "%*s", min(w, (NB-1)), "FALSE");
> > >     buff[NB-1] = '\0';
> > >     return buff;
> > > }
> > >
> > > > L <- sample(c(TRUE, FALSE), 10^7, replace = TRUE)
> > > > system.time(as.character(L))
> > >    user  system elapsed
> > >    2.69    0.02    2.73
> > > > system.time(c("FALSE", "TRUE")[L+1])
> > >    user  system elapsed
> > >    0.15    0.04    0.20
> > > > system.time(c("FALSE", "TRUE")[L+1L])
> > >    user  system elapsed
> > >    0.08    0.05    0.13
> > > > L <- rep(NA, 10^7)
> > > > system.time(as.character(L))
> > >    user  system elapsed
> > >    0.11    0.00    0.11
> > > > system.time(c("FALSE", "TRUE")[L+1])
> > >    user  system elapsed
> > >    0.16    0.06    0.22
> > > > system.time(c("FALSE", "TRUE")[L+1L])
> > >    user  system elapsed
> > >    0.09    0.03    0.12
> > >
> > > `as.character` of a logical vector that is all NA is fast enough.
> > > It appears that the call to 'formatLogical' inside the C function
> > > 'StringFromLogical' does not introduce much slowdown.
> > >
> > > I found that using string literal inside the C function
> > > 'StringFromLogical', by replacing
> > >     EncodeLogical(x, w)
> > > with
> > >     x ? "TRUE" : "FALSE"
> > > (and the call to 'formatLogical' is not needed anymore), make it faster.
> >
> > indeed! ... and we also notice that the 'w' argument is neither
> > needed anymore, and that makes sense: At this point when you
> > know you have an R logical value there are only three
> > possibilities and no reason ever to warn about the conversion.
> >
> > > Alternatively,
> >
> > or in addition !
> >
> > > "fast path" could be introduced in 'EncodeLogical', potentially also
> > > benefits format() in R.
> > > For example, without replacing existing code, the following fragment
> > > could be inserted.
> > >
> > >     if(x == NA_LOGICAL) {if(w == R_print.na_width) return CHAR(R_print.na_string);}
> > >     else if(x) {if(w == 4) return "TRUE";}
> > >     else {if(w == 5) return "FALSE";}
> > >
> > > However, with either of them, c("FALSE", "TRUE")[L+1L] is still faster
> > > than as.character(L) .
> > >
> > > Precomputing or caching possible results of the C function
> > > 'StringFromLogical' allows as.character(L) to be as fast as
> > > c("FALSE", "TRUE")[L+1L] in R. For example, 'StringFromLogical' could
> > > be changed to
> > >
> > > attribute_hidden SEXP StringFromLogical(int x, int *warn)
> > > {
> > >     static SEXP TrueCh, FalseCh;
> > >     if (x == NA_LOGICAL) return NA_STRING;
> > >     else if (x) return TrueCh ? TrueCh : (TrueCh = mkChar("TRUE"));
> > >     else return FalseCh ? FalseCh : (FalseCh = mkChar("FALSE"));
> > > }
> >
> > Indeed, and something along this line (storing the other two constant
> > strings) was also my thought when seeing the
> >     mkChar(x ? "TRUE" : "FALSE")
> > you implicitly proposed above.
> >
> > I'm looking into applying both speedups;
> > thank you very much, Suharto!
> >
> > Martin
> >
> > --
> > Martin Maechler
> > ETH Zurich and R Core team
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel