Chain of calls of C functions in coerce.c for as.character(<logical>) in R:

do_asatomic
ascommon
coerceVector
coerceToString
StringFromLogical (for each element)

The definition of 'StringFromLogical' in coerce.c :

attribute_hidden SEXP StringFromLogical(int x, int *warn)
{
    int w;
    formatLogical(&x, 1, &w);
    if (x == NA_LOGICAL) return NA_STRING;
    else return mkChar(EncodeLogical(x, w));
}

The definition of 'EncodeLogical' in printutils.c :

const char *EncodeLogical(int x, int w)
{
    static char buff[NB];
    if(x == NA_LOGICAL) snprintf(buff, NB, "%*s", min(w, (NB-1)), CHAR(R_print.na_string));
    else if(x) snprintf(buff, NB, "%*s", min(w, (NB-1)), "TRUE");
    else snprintf(buff, NB, "%*s", min(w, (NB-1)), "FALSE");
    buff[NB-1] = '\0';
    return buff;
}

> L <- sample(c(TRUE, FALSE), 10^7, replace = TRUE)
> system.time(as.character(L))
   user  system elapsed
   2.69    0.02    2.73
> system.time(c("FALSE", "TRUE")[L+1])
   user  system elapsed
   0.15    0.04    0.20
> system.time(c("FALSE", "TRUE")[L+1L])
   user  system elapsed
   0.08    0.05    0.13
> L <- rep(NA, 10^7)
> system.time(as.character(L))
   user  system elapsed
   0.11    0.00    0.11
> system.time(c("FALSE", "TRUE")[L+1])
   user  system elapsed
   0.16    0.06    0.22
> system.time(c("FALSE", "TRUE")[L+1L])
   user  system elapsed
   0.09    0.03    0.12

`as.character` of a logical vector that is all NA is fast enough. In that case 'StringFromLogical' still calls 'formatLogical' but returns NA_STRING before reaching 'EncodeLogical' and 'mkChar', so it appears that the call to 'formatLogical' does not introduce much slowdown.

I found that using a string literal inside the C function 'StringFromLogical', by replacing
EncodeLogical(x, w)
with
x ? "TRUE" : "FALSE"
(so that the call to 'formatLogical' is no longer needed), makes it faster.

Alternatively, a "fast path" could be introduced in 'EncodeLogical', potentially also benefiting format() in R. For example, without replacing existing code, the following fragment could be inserted.

    if(x == NA_LOGICAL) {if(w == R_print.na_width) return CHAR(R_print.na_string);}
    else if(x) {if(w == 4) return "TRUE";}
    else {if(w == 5) return "FALSE";}

However, with either of them, c("FALSE", "TRUE")[L+1L] is still faster than as.character(L) .

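To make the "fast path" idea concrete, here is a minimal standalone sketch that compiles without the R headers. Everything R-specific is replaced by a stand-in and is an assumption of the sketch: NA_LOGICAL_SKETCH plays the role of NA_LOGICAL, na_string_sketch and na_width_sketch stand in for CHAR(R_print.na_string) and R_print.na_width, NB is given an illustrative value, and the function names encode_logical_slow and encode_logical_fast are hypothetical. It is not the actual printutils.c code, only an illustration of where the inserted fragment would sit relative to the existing snprintf branch.

```c
#include <limits.h>
#include <stdio.h>

#define NB 1000                       /* illustrative buffer size (stand-in) */
#define NA_LOGICAL_SKETCH INT_MIN     /* stand-in for R's NA_LOGICAL */

/* Stand-ins for CHAR(R_print.na_string) and R_print.na_width. */
static const char *na_string_sketch = "NA";
static const int na_width_sketch = 2;

static int min_int(int a, int b) { return a < b ? a : b; }

/* Existing behaviour: always format into a static buffer,
   right-justified to width w. */
const char *encode_logical_slow(int x, int w)
{
    static char buff[NB];
    const char *s;
    if (x == NA_LOGICAL_SKETCH) s = na_string_sketch;
    else if (x) s = "TRUE";
    else s = "FALSE";
    snprintf(buff, NB, "%*s", min_int(w, NB - 1), s);
    buff[NB - 1] = '\0';
    return buff;
}

/* Proposed variant: when the requested width equals the natural width,
   no padding is needed, so return the string directly and skip snprintf.
   Any other width falls through to the existing formatting code. */
const char *encode_logical_fast(int x, int w)
{
    if (x == NA_LOGICAL_SKETCH) { if (w == na_width_sketch) return na_string_sketch; }
    else if (x) { if (w == 4) return "TRUE"; }
    else { if (w == 5) return "FALSE"; }
    return encode_logical_slow(x, w);
}
```

Since 'StringFromLogical' formats one element at a time, the width passed in there is the natural width of that element, so a variant like this would take the early return for every element; padded output as produced by format() would still go through snprintf.
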
Precomputing or caching possible results of the C function 'StringFromLogical' allows as.character(L) to be as fast as c("FALSE", "TRUE")[L+1L] in R. For example, 'StringFromLogical' could be changed to

attribute_hidden SEXP StringFromLogical(int x, int *warn)
{
    static SEXP TrueCh, FalseCh;
    if (x == NA_LOGICAL) return NA_STRING;
    else if (x) return TrueCh ? TrueCh : (TrueCh = mkChar("TRUE"));
    else return FalseCh ? FalseCh : (FalseCh = mkChar("FALSE"));
}

----------------
On 21 Mar 2025, at 8:26, Karolis Koncevičius wrote:

> I was calling table() on some long logical vectors and noticed that it took a long time.
>
> Out of curiosity I checked the performance of table() on different types, and had some unexpected results:
>
>     C <- sample(c("yes", "no"), 10^7, replace = TRUE)
>     F <- factor(sample(c("yes", "no"), 10^7, replace = TRUE))
>     N <- sample(c(1,0), 10^7, replace = TRUE)
>     I <- sample(c(1L,0L), 10^7, replace = TRUE)
>     L <- sample(c(TRUE, FALSE), 10^7, replace = TRUE)
>
>                            # ordered by execution time
>                            #   user  system elapsed
>     system.time(table(F))  #  0.088   0.006   0.093
>     system.time(table(C))  #  0.208   0.017   0.224
>     system.time(table(I))  #  0.242   0.019   0.261
>     system.time(table(L))  #  0.665   0.015   0.680
>     system.time(table(N))  #  1.771   0.019   1.791
>
>
> The performance for Integers and specially booleans is quite surprising.
> After investigating the source of table, I ended up on the reason being “as.character()”:
>
>     system.time(as.character(L))
>        user  system elapsed
>       0.461   0.002   0.462
>
> Even a manual conversion can achieve a speed-up by a factor of ~7:
>
>     system.time(c("FALSE", "TRUE")[L+1])
>        user  system elapsed
>       0.061   0.006   0.067
>
>
> Tested on 4.4.3 as well as devel trunk.
>
> Just reporting for comments and attention.
> Karolis K.
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel