Re: [Rd] Profiling: attributing costs to place of invocation (instead of place of evaluation)?

2020-02-26 Thread Lionel Henry
There's a patch under review:
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17595

Best,
Lionel

On 2/26/20, Kirill Müller  wrote:
> Hi
>
>
> Consider the following example:
>
> f <- function(expr) g(expr)
> g <- function(expr) {
>    h(expr)
> }
> h <- function(expr) {
>    expr # evaluation happens here
>    i(expr)
> }
> i <- function(expr) {
>    expr # already evaluated, no costs here
>    invisible()
> }
>
> rprof <- tempfile()
> Rprof(rprof)
> f(replicate(1e2, sample.int(1e4)))
> Rprof(NULL)
> cat(readLines(rprof), sep = "\n")
> #> sample.interval=2
> #> "sample.int" "FUN" "lapply" "sapply" "replicate" "h" "g" "f"
> #> "sample.int" "FUN" "lapply" "sapply" "replicate" "h" "g" "f"
> #> "sample.int" "FUN" "lapply" "sapply" "replicate" "h" "g" "f"
>
> The evaluation of the slow replicate() call is deferred to the execution
> of h(), but there's no replicate() call in h's definition. This makes
> parsing the profile much more difficult than necessary.
>
> I have pasted an experimental patch below (off of 3.6.2) that produces
> the following output:
>
> cat(readLines(rprof), sep = "\n")
> #> sample.interval=2
> #> "sample.int" "FUN" "lapply" "sapply" "replicate" "f"
> #> "sample.int" "FUN" "lapply" "sapply" "replicate" "f"
> #> "sample.int" "FUN" "lapply" "sapply" "replicate" "f"
>
> This attributes the cost to the replicate() call to f(), where the call
> is actually defined. From my experience, this will give a much better
> understanding of the actual costs of each part of the code. The SIGPROF
> handler looks at sysparent and cloenv before deciding if an element of
> the call stack is to be included in the profile.
>
> Is there interest in integrating a variant of this patch, perhaps with
> an optional argument to Rprof()?
>
> Thanks!
>
>
> Best regards
>
> Kirill
>
>
> Index: src/main/eval.c
> ===
> --- src/main/eval.c    (revision 77857)
> +++ src/main/eval.c    (working copy)
> @@ -218,7 +218,10 @@
>   if (R_Line_Profiling)
>   lineprof(buf, R_getCurrentSrcref());
>
> +    SEXP sysparent = NULL;
> +
>   for (cptr = R_GlobalContext; cptr; cptr = cptr->nextcontext) {
> +    if (sysparent != NULL && cptr->cloenv != sysparent &&
> cptr->sysparent != sysparent) continue;
>   if ((cptr->callflag & (CTXT_FUNCTION | CTXT_BUILTIN))
>       && TYPEOF(cptr->call) == LANGSXP) {
>       SEXP fun = CAR(cptr->call);
> @@ -292,6 +295,8 @@
>           else
>           lineprof(buf, cptr->srcref);
>       }
> +
> +        sysparent = cptr->sysparent;
>       }
>   }
>   }
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Profiling: attributing costs to place of invocation (instead of place of evaluation)?

2020-02-26 Thread Kirill Müller

Hi


Consider the following example:

f <- function(expr) g(expr)
g <- function(expr) {
  h(expr)
}
h <- function(expr) {
  expr # evaluation happens here
  i(expr)
}
i <- function(expr) {
  expr # already evaluated, no costs here
  invisible()
}

rprof <- tempfile()
Rprof(rprof)
f(replicate(1e2, sample.int(1e4)))
Rprof(NULL)
cat(readLines(rprof), sep = "\n")
#> sample.interval=2
#> "sample.int" "FUN" "lapply" "sapply" "replicate" "h" "g" "f"
#> "sample.int" "FUN" "lapply" "sapply" "replicate" "h" "g" "f"
#> "sample.int" "FUN" "lapply" "sapply" "replicate" "h" "g" "f"

The evaluation of the slow replicate() call is deferred to the execution 
of h(), but there's no replicate() call in h's definition. This makes 
parsing the profile much more difficult than necessary.


I have pasted an experimental patch below (off of 3.6.2) that produces 
the following output:


cat(readLines(rprof), sep = "\n")
#> sample.interval=2
#> "sample.int" "FUN" "lapply" "sapply" "replicate" "f"
#> "sample.int" "FUN" "lapply" "sapply" "replicate" "f"
#> "sample.int" "FUN" "lapply" "sapply" "replicate" "f"

This attributes the cost to the replicate() call to f(), where the call 
is actually defined. From my experience, this will give a much better 
understanding of the actual costs of each part of the code. The SIGPROF 
handler looks at sysparent and cloenv before deciding if an element of 
the call stack is to be included in the profile.


Is there interest in integrating a variant of this patch, perhaps with 
an optional argument to Rprof()?


Thanks!


Best regards

Kirill


Index: src/main/eval.c
===
--- src/main/eval.c    (revision 77857)
+++ src/main/eval.c    (working copy)
@@ -218,7 +218,10 @@
 if (R_Line_Profiling)
 lineprof(buf, R_getCurrentSrcref());

+    SEXP sysparent = NULL;
+
 for (cptr = R_GlobalContext; cptr; cptr = cptr->nextcontext) {
+    if (sysparent != NULL && cptr->cloenv != sysparent && 
cptr->sysparent != sysparent) continue;

 if ((cptr->callflag & (CTXT_FUNCTION | CTXT_BUILTIN))
     && TYPEOF(cptr->call) == LANGSXP) {
     SEXP fun = CAR(cptr->call);
@@ -292,6 +295,8 @@
         else
         lineprof(buf, cptr->srcref);
     }
+
+        sysparent = cptr->sysparent;
     }
 }
 }

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel