For reference, fixed in R-devel (80153).
Tomas

On 3/30/21 10:20 AM, Tomas Kalibera wrote:
Thanks for the report. You are probably running into the overhead of the eager creation of the error message. On my system, with your micro-benchmark, that overhead is about 10x. I tested this simply by commenting out the message creation and re-running the benchmark. I'll fix it (this is not a good task for a contributed patch).
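
To illustrate the kind of overhead meant here, below is a minimal standalone sketch in plain C. It is not the R source: the names len_with_msg, total_len_eager and total_len_lazy are hypothetical, and NULL merely stands in for an invalid element. The eager variant formats a message for every element even though it is almost never used; the lazy variant formats it only on the failure path.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: takes a ready-made message name that is only
   used on the (rare) failure path. */
static int len_with_msg(const char *s, const char *msgname)
{
    if (s == NULL) {
        fprintf(stderr, "invalid input: %s\n", msgname);
        return -1;
    }
    return (int) strlen(s);
}

/* Eager: the message is formatted for every element, needed or not. */
long total_len_eager(const char **x, size_t n)
{
    long total = 0;
    char msgname[64];
    for (size_t i = 0; i < n; ++i) {
        snprintf(msgname, sizeof msgname, "element %zu", i + 1);
        int len = len_with_msg(x[i], msgname);
        if (len >= 0)
            total += len;
    }
    return total;
}

/* Lazy: the message is formatted only when an element is invalid. */
long total_len_lazy(const char **x, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        if (x[i] == NULL) {
            char msgname[64];
            snprintf(msgname, sizeof msgname, "element %zu", i + 1);
            fprintf(stderr, "invalid input: %s\n", msgname);
            continue;
        }
        total += (long) strlen(x[i]);
    }
    return total;
}
```

Both variants return the same result; only the amount of work per valid element differs, which matters when the per-element work is otherwise just a length lookup.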

Best,
Tomas

On 3/30/21 8:02 AM, Hugh Parsonage wrote:
While profiling some C code, I rolled my own nchar function which
appears to be much faster than base R's (25 times faster for a 10M
length vector).  Obviously base::nchar provides significantly more
features than my barebones function (C snippet below); however, for
argument type = "bytes" it seems that the R_nchar and do_nchar
functions do not actually do anything more than this function.
My suspicion is that I have overlooked some subtlety in the base R
code, or that my benchmarks are not representative. Alternatively,
the step in `do_nchar` that prepares the potential error message
before each call to `R_nchar` may be quite costly indeed.  Or perhaps
the compiler cannot unswitch the loop from the more complex
type = "width" and type = "chars" branches.
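
For readers unfamiliar with the term, here is a minimal sketch of loop unswitching in plain C. It is an illustration only and not taken from the R sources; count_type, utf8_chars, count_switched and count_unswitched are hypothetical names. In the switched form the loop-invariant type is tested on every iteration; in the unswitched form the test is hoisted out and each branch gets its own tight loop.

```c
#include <stddef.h>
#include <string.h>

enum count_type { COUNT_BYTES, COUNT_CHARS };

/* Count UTF-8 code points: every byte that is not a continuation byte. */
static size_t utf8_chars(const char *s)
{
    size_t n = 0;
    for (; *s; ++s)
        if (((unsigned char) *s & 0xC0) != 0x80)
            ++n;
    return n;
}

/* Switched: `type` never changes, yet it is tested per element. */
void count_switched(const char **x, size_t n, enum count_type type,
                    size_t *out)
{
    for (size_t i = 0; i < n; ++i) {
        if (type == COUNT_BYTES)
            out[i] = strlen(x[i]);
        else
            out[i] = utf8_chars(x[i]);
    }
}

/* Unswitched: the invariant test is hoisted out of the loop. */
void count_unswitched(const char **x, size_t n, enum count_type type,
                      size_t *out)
{
    if (type == COUNT_BYTES) {
        for (size_t i = 0; i < n; ++i)
            out[i] = strlen(x[i]);
    } else {
        for (size_t i = 0; i < n; ++i)
            out[i] = utf8_chars(x[i]);
    }
}
```

An optimizing compiler can often do this transformation itself, but it may decline when the loop body is large or calls functions it cannot inline.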

If I haven't missed something, would a patch be warranted?

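#include <R.h>
#include <Rinternals.h>

// Byte length of each element of a character vector; NA stays NA.
SEXP Cnchar(SEXP x) {
   // STRING_ELT is only valid on character vectors.
   if (!isString(x)) {
     error("'x' must be a character vector");
   }
   R_xlen_t N = xlength(x);
   SEXP ans = PROTECT(allocVector(INTSXP, N));
   int * restrict ansp = INTEGER(ans);

   // Ignoring NA to avoid the branch has a very small
   // impact on performance.
   for (R_xlen_t i = 0; i < N; ++i) {
     SEXP sxi = STRING_ELT(x, i);
     if (sxi == NA_STRING) {
       ansp[i] = NA_INTEGER;
       continue;
     }
     // For a CHARSXP, length() is the number of bytes.
     ansp[i] = length(sxi);
   }
   UNPROTECT(1);
   return ans;
}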

x <- rep_len(c(as.character(c(5L, 1:1e6)), NA_character_, 1e6:15e5), 1e7)
Cnchar(x)
90 ms
nchar(x, type = "bytes")
2500 ms

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


