Thanks for the background and suggestions.

Valerie


On 07/02/2013 08:41 AM, John Chambers wrote:
It's hard to see how repeated dispatch on the same classes can be that
slow, _if_ the function being called each time is itself doing some
substantial work.

The first call (in a session) with a particular signature searches for
inherited methods and stores the method found in a table.  The following
calls with that signature should do a single lookup in a hash table.
Caching the last signature is unlikely to be dramatically faster, but we
can experiment and see.
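
As a rough illustration of the lookup described above, here is a minimal sketch. The generic `describe` and its method are hypothetical names made up for the example, not anything from the methods package or from Valerie's code. It only shows that an inherited method is found on first use and can be inspected afterwards; it does not measure anything.

```r
library(methods)

# Hypothetical generic with a single method, defined for "numeric" only.
setGeneric("describe", function(x) standardGeneric("describe"))
setMethod("describe", "numeric", function(x) "numeric method")

# No method is defined directly for "integer", so the first call with an
# integer argument has to search for an inherited method.
first <- describe(1L)

# A later call with the same signature can reuse the method found above.
second <- describe(2L)

# Listing with inherited = TRUE should show the inherited "integer" entry
# alongside the directly defined "numeric" one.
showMethods("describe", inherited = TRUE)
```

Both calls return the same value; the difference is only in how much searching the first one has to do.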

What is substantially different is calling a generic function vs calling
a primitive or internal.  If the local paste you constructed is the
default, base::paste, that function does nothing but call a .Internal.

Not going through the R generic function several thousand times would
make a difference.
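
A self-contained sketch of that comparison, under stated assumptions: `tag` is a hypothetical generic with a deliberately trivial method, and the timings are left for the reader to run rather than quoted here. Looking the method up once with `selectMethod()` is only safe when every element has the class the method was selected for.

```r
library(methods)

# Hypothetical generic whose only method does trivial work, so that the
# cost of going through the generic dominates.
setGeneric("tag", function(x) standardGeneric("tag"))
setMethod("tag", "character", function(x) x)

xs <- as.list(rep(LETTERS, 100))

# Goes through the R generic function once per element.
via_generic <- function(xs) lapply(xs, tag)

# Looks the method up once, then calls it directly for every element.
# Assumes all elements are of class "character".
via_method <- function(xs) {
    f <- selectMethod("tag", "character")
    lapply(xs, f)
}

# Both routes compute the same result; only the call overhead differs.
stopifnot(identical(via_generic(xs), via_method(xs)))

print(system.time(for (i in 1:20) via_generic(xs)))
print(system.time(for (i in 1:20) via_method(xs)))
```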

It's a fundamental point about R that function calls do enough work that
they add significant time to a "trivial" computation, such as a
primitive call.  There are various efforts going on these days to
provide more efficient alternatives.  They're all helpful; my personal
favorite when the game is worth it is to consider doing key computations
in a seriously faster language, like C++ via Rcpp.

John

On 7/1/13 10:04 PM, Valerie Obenchain wrote:
Hi,

S4 method dispatch can be very slow. Would it be reasonable to cache the
most recent dispatch, anticipating the next invocation will be on the same
type? This would be very helpful in loops.

   fun0 <- function(x)
       sapply(x, paste, collapse="+")
   fun1 <- function(x) {
       paste <- selectMethod(paste, class(x[[1]]))
       sapply(x, paste, collapse="+")
   }
   lst <- split(rep(LETTERS, 100), rep(1:1300, 2))

   library(microbenchmark)
   microbenchmark(fun0(lst), times=10)
   ## Unit: milliseconds
   ##       expr      min       lq   median      uq      max neval
   ##  fun0(lst) 4.153287 4.180659 4.513539 5.19261 5.280481    10

   setGeneric("paste")
   microbenchmark(fun0(lst), fun1(lst), times=10)
   ## Unit: milliseconds
   ##       expr       min       lq    median        uq       max neval
   ##  fun0(lst) 21.093180 21.27616 21.453174 21.833686 24.758791    10
   ##  fun1(lst)  4.517808  4.53067  4.582641  4.682235  5.121856    10

Dispatch seems to be especially slow when packages are involved, e.g.,
with the Bioconductor IRanges package
(http://bioconductor.org/packages/release/bioc/html/IRanges.html):

   removeGeneric("paste")
   library(IRanges)
   showMethods(paste)
   ## Function: paste (package BiocGenerics)
   ## ...="ANY"
   ## ...="Rle"
   selectMethod(paste, "ANY")
   ## Method Definition (Class "derivedDefaultMethod"):
   ##
   ## function (..., sep = " ", collapse = NULL)
   ## .Internal(paste(list(...), sep, collapse))
   ## <environment: namespace:base>
   ##
   ## Signatures:
   ##         ...
   ## target  "ANY"
   ## defined "ANY"

   microbenchmark(fun0(lst), fun1(lst), times=10)
   ## Unit: milliseconds
   ##       expr        min         lq     median         uq        max neval
   ##  fun0(lst) 233.539585 234.592491 236.311209 237.268506 243.181123    10
   ##  fun1(lst)   4.564914   4.592996   4.642898   4.729009   5.492706    10

   sessionInfo()
   ## R version 3.0.0 Patched (2013-04-04 r62492)
   ## Platform: x86_64-unknown-linux-gnu (64-bit)
   ##
   ## locale:
   ##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
   ##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
   ##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
   ##  [7] LC_PAPER=C                 LC_NAME=C
   ##  [9] LC_ADDRESS=C               LC_TELEPHONE=C
   ## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
   ##
   ## attached base packages:
   ## [1] parallel  stats     graphics  grDevices utils     datasets  methods
   ## [8] base
   ##
   ## other attached packages:
   ## [1] IRanges_1.19.15      BiocGenerics_0.7.2   microbenchmark_1.3-0
   ##
   ## loaded via a namespace (and not attached):
   ## [1] stats4_3.0.0


Thanks,
Valerie

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel