Re: [Rd] RProfmem output format
UPDATE: Tomas Kalibera has fixed this bug (on missing newlines in Rprofmem output) in R-devel r72747 (https://github.com/wch/r-source/commit/ba6665deace4a8dc239aebec977c17d0975fbc27). Using the example of this thread, Rprofmem() of R-devel now outputs with newlines: $ R Under development (unstable) (2017-05-30 r72747) -- "Unsuffered Consequences" Copyright (C) 2017 The R Foundation for Statistical Computing Platform: x86_64-pc-linux-gnu (64-bit) [...] > Rprofmem(); x <- raw(2000); Rprofmem("") > cat(readLines("Rprofmem.out", n=5, warn=FALSE), sep="\n") 232 : 472 : 472 : 1064 : 2040 :"raw" /Henrik On Sat, Jun 4, 2016 at 12:40 PM, Henrik Bengtsson wrote: > I'm picking up this 5-year old thread. > > 1. About the four memory allocations without a stacktrace > > I think the four memory allocations without a stacktrace reported by > Rprofmem(): > >> Rprofmem(); x <- raw(2000); Rprofmem("") >> cat(readLines("Rprofmem.out", n=5, warn=FALSE), sep="\n") > 192 :360 :360 :1064 :2040 :"raw" > > are due to some initialization of R that is independent of Rprofmem(), > because they can be avoided if one allocates some memory before (in a > fresh R session): > >> z <- raw(1000); dummy <- gc() >> Rprofmem(); x <- raw(2000); Rprofmem("") >> cat(readLines("Rprofmem.out", n=5, warn=FALSE), sep="\n") > 2040 :"raw" > > > 2. About missing newlines when stacktrace is empty > > As a refresher, the problem is that memory allocations an empty > stracktrace are reported without newlines, i.e. > > 192 :360 :360 :1064 :2040 :"raw" > > The question is why this is not reported as: > > 192 : > 360 : > 360 : > 1064 : > 2040 :"raw" > > This was/is because C function R_OutputStackTrace() - part of > src/main/memory.c - looks like: > > static void R_OutputStackTrace(FILE *file) > { > int newline = 0; > RCNTXT *cptr; > > for (cptr = R_GlobalContext; cptr; cptr = cptr->nextcontext) { > if ((cptr->callflag & (CTXT_FUNCTION | CTXT_BUILTIN)) >&& TYPEOF(cptr->call) == LANGSXP) { >SEXP fun = CAR(cptr->call); >if (!newline) newline = 1; >fprintf(file, "\"%s\" ", >TYPEOF(fun) == SYMSXP ? CHAR(PRINTNAME(fun)) : >""); > } > } > if (newline) fprintf(file, "\n"); > } > > > Thomas, your last comment was: > >> Yes. It's obviously better to always print a newline, and so clearly >> deliberate not to, that I suspect there may have been a good reason. >> If I can't work it out (after my grant deadline this week) I will just >> assume it's wrong. > > When I search the code and the commit history > (https://github.com/wch/r-source/commit/3d5eb2a09f2d75893efdc8bbf1c72d17603886a0), > it appears that this was there from the very first commit. Also, > searching the code for usages of R_OutputStackTrace(), I only find > R_ReportAllocation() and R_ReportNewPage(), both part of of > src/main/memory.c (see below). > > static void R_ReportAllocation(R_size_t size) > { > if (R_IsMemReporting) { > if(size > R_MemReportingThreshold) { >fprintf(R_MemReportingOutfile, "%lu :", (unsigned long) size); >R_OutputStackTrace(R_MemReportingOutfile); > } > } > return; > } > > static void R_ReportNewPage(void) > { > if (R_IsMemReporting) { > fprintf(R_MemReportingOutfile, "new page:"); > R_OutputStackTrace(R_MemReportingOutfile); > } > return; > } > > > Could it be that when you wrote it you had another usage for > R_OutputStackTrace() in mind as well? If so, it makes sense that > R_OutputStackTrace() shouldn't output a newline if the stack trace was > empty. But if the above is the only usage, to me it looks pretty safe > to always add a newline. > >> sessionInfo() > R version 3.3.0 Patched (2016-05-26 r70682) > Platform: x86_64-w64-mingw32/x64 (64-bit) > Running under: Windows 7 x64 (build 7601) Service Pack 1 > > locale: > [1] LC_COLLATE=English_United States.1252 > [2] LC_CTYPE=English_United States.1252 > [3] LC_MONETARY=English_United States.1252 > [4] LC_NUMERIC=C > [5] LC_TIME=English_United States.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > /Henrik > > > On Sun, May 15, 2011 at 1:16 PM, Thomas Lumley wrote: >> On Mon, May 16, 2011 at 1:02 AM, Hadley Wickham wrote: >>> So what causes allocations when the call stack is empty? Something >>> internal? Does the garbage collector trigger allocations (i.e. could >>> it be caused by moving data to contiguous memory)? >> >> The garbage collector doesn't move anything, it just swaps pointers in >> a linked list. >> >> The lexer, parser, and evaluator all have to do some work before a >> function context is set up for the top-level function, so I assume >> that's where it is happening. >> >>> Any ideas what the correct thing to do with these memory allocations? >>> Ignore them because they're not really related to the function they're >>> attributed to? Sum them up? >>> I don't see why this is done, and I m
Re: [Rd] RProfmem output format
I'm picking up this 5-year old thread. 1. About the four memory allocations without a stacktrace I think the four memory allocations without a stacktrace reported by Rprofmem(): > Rprofmem(); x <- raw(2000); Rprofmem("") > cat(readLines("Rprofmem.out", n=5, warn=FALSE), sep="\n") 192 :360 :360 :1064 :2040 :"raw" are due to some initialization of R that is independent of Rprofmem(), because they can be avoided if one allocates some memory before (in a fresh R session): > z <- raw(1000); dummy <- gc() > Rprofmem(); x <- raw(2000); Rprofmem("") > cat(readLines("Rprofmem.out", n=5, warn=FALSE), sep="\n") 2040 :"raw" 2. About missing newlines when stacktrace is empty As a refresher, the problem is that memory allocations an empty stracktrace are reported without newlines, i.e. 192 :360 :360 :1064 :2040 :"raw" The question is why this is not reported as: 192 : 360 : 360 : 1064 : 2040 :"raw" This was/is because C function R_OutputStackTrace() - part of src/main/memory.c - looks like: static void R_OutputStackTrace(FILE *file) { int newline = 0; RCNTXT *cptr; for (cptr = R_GlobalContext; cptr; cptr = cptr->nextcontext) { if ((cptr->callflag & (CTXT_FUNCTION | CTXT_BUILTIN)) && TYPEOF(cptr->call) == LANGSXP) { SEXP fun = CAR(cptr->call); if (!newline) newline = 1; fprintf(file, "\"%s\" ", TYPEOF(fun) == SYMSXP ? CHAR(PRINTNAME(fun)) : ""); } } if (newline) fprintf(file, "\n"); } Thomas, your last comment was: > Yes. It's obviously better to always print a newline, and so clearly > deliberate not to, that I suspect there may have been a good reason. > If I can't work it out (after my grant deadline this week) I will just > assume it's wrong. When I search the code and the commit history (https://github.com/wch/r-source/commit/3d5eb2a09f2d75893efdc8bbf1c72d17603886a0), it appears that this was there from the very first commit. Also, searching the code for usages of R_OutputStackTrace(), I only find R_ReportAllocation() and R_ReportNewPage(), both part of of src/main/memory.c (see below). static void R_ReportAllocation(R_size_t size) { if (R_IsMemReporting) { if(size > R_MemReportingThreshold) { fprintf(R_MemReportingOutfile, "%lu :", (unsigned long) size); R_OutputStackTrace(R_MemReportingOutfile); } } return; } static void R_ReportNewPage(void) { if (R_IsMemReporting) { fprintf(R_MemReportingOutfile, "new page:"); R_OutputStackTrace(R_MemReportingOutfile); } return; } Could it be that when you wrote it you had another usage for R_OutputStackTrace() in mind as well? If so, it makes sense that R_OutputStackTrace() shouldn't output a newline if the stack trace was empty. But if the above is the only usage, to me it looks pretty safe to always add a newline. > sessionInfo() R version 3.3.0 Patched (2016-05-26 r70682) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 7 x64 (build 7601) Service Pack 1 locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base /Henrik On Sun, May 15, 2011 at 1:16 PM, Thomas Lumley wrote: > On Mon, May 16, 2011 at 1:02 AM, Hadley Wickham wrote: >> So what causes allocations when the call stack is empty? Something >> internal? Does the garbage collector trigger allocations (i.e. could >> it be caused by moving data to contiguous memory)? > > The garbage collector doesn't move anything, it just swaps pointers in > a linked list. > > The lexer, parser, and evaluator all have to do some work before a > function context is set up for the top-level function, so I assume > that's where it is happening. > >> Any ideas what the correct thing to do with these memory allocations? >> Ignore them because they're not really related to the function they're >> attributed to? Sum them up? >> >>> I don't see why this is done, and I may well be the person who did it >>> (I don't have svn on this computer to check), but it is clearly >>> deliberate. >> >> It seems like it would be more consistent to always print a newline, >> and then it would obvious those allocations occurred when the call >> stack was empty. This would make parsing the file a little bit >> easier. > > Yes. It's obviously better to always print a newline, and so clearly > deliberate not to, that I suspect there may have been a good reason. > If I can't work it out (after my grant deadline this week) I will just > assume it's wrong. > > >-thomas > > -- > Thomas Lumley > Professor of Biostatistics > University of Auckland > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailin
Re: [Rd] RProfmem output format
On Mon, May 16, 2011 at 1:02 AM, Hadley Wickham wrote: > So what causes allocations when the call stack is empty? Something > internal? Does the garbage collector trigger allocations (i.e. could > it be caused by moving data to contiguous memory)? The garbage collector doesn't move anything, it just swaps pointers in a linked list. The lexer, parser, and evaluator all have to do some work before a function context is set up for the top-level function, so I assume that's where it is happening. > Any ideas what the correct thing to do with these memory allocations? > Ignore them because they're not really related to the function they're > attributed to? Sum them up? > >> I don't see why this is done, and I may well be the person who did it >> (I don't have svn on this computer to check), but it is clearly >> deliberate. > > It seems like it would be more consistent to always print a newline, > and then it would obvious those allocations occurred when the call > stack was empty. This would make parsing the file a little bit > easier. Yes. It's obviously better to always print a newline, and so clearly deliberate not to, that I suspect there may have been a good reason. If I can't work it out (after my grant deadline this week) I will just assume it's wrong. -thomas -- Thomas Lumley Professor of Biostatistics University of Auckland __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] RProfmem output format
Also, would you mind commenting how RProfmem is misleading? There are three ways to profile memory use over time in R code. ... All can be misleading, for different reasons. --- http://cran.r-project.org/doc/manuals/R-exts.html#Profiling-R-code-for-memory-use The other two ways describe why they are misleading. Hadley On Sun, May 15, 2011 at 8:02 AM, Hadley Wickham wrote: >>> In the subsequence lines I'm assuming the structure is bytes allocated : >>> call. >> >> I think the five numbers come from four memory allocations before >> example() is called. Looking at the code in src/main/memory.c, it >> prints newline only when the call stack is not empty. > > Looking into that example in more detail, here's the distribution of > allocation numbers: > > 1 3 4 > 4621 30 2 > > (with a threshold of 5k) > > So it happens ~30 times in total. > > So what causes allocations when the call stack is empty? Something > internal? Does the garbage collector trigger allocations (i.e. could > it be caused by moving data to contiguous memory)? > > Any ideas what the correct thing to do with these memory allocations? > Ignore them because they're not really related to the function they're > attributed to? Sum them up? > >> I don't see why this is done, and I may well be the person who did it >> (I don't have svn on this computer to check), but it is clearly >> deliberate. > > It seems like it would be more consistent to always print a newline, > and then it would obvious those allocations occurred when the call > stack was empty. This would make parsing the file a little bit > easier. > > Hadley > > -- > Assistant Professor / Dobelman Family Junior Chair > Department of Statistics / Rice University > http://had.co.nz/ > -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] RProfmem output format
>> In the subsequence lines I'm assuming the structure is bytes allocated : >> call. > > I think the five numbers come from four memory allocations before > example() is called. Looking at the code in src/main/memory.c, it > prints newline only when the call stack is not empty. Looking into that example in more detail, here's the distribution of allocation numbers: 134 4621 302 (with a threshold of 5k) So it happens ~30 times in total. So what causes allocations when the call stack is empty? Something internal? Does the garbage collector trigger allocations (i.e. could it be caused by moving data to contiguous memory)? Any ideas what the correct thing to do with these memory allocations? Ignore them because they're not really related to the function they're attributed to? Sum them up? > I don't see why this is done, and I may well be the person who did it > (I don't have svn on this computer to check), but it is clearly > deliberate. It seems like it would be more consistent to always print a newline, and then it would obvious those allocations occurred when the call stack was empty. This would make parsing the file a little bit easier. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] RProfmem output format
On Sat, May 14, 2011 at 5:00 AM, Hadley Wickham wrote: > Hi all, > > When I run the example in RProfmem, I get: > > > Rprofmem("Rprofmem.out", threshold=1000) > example(glm) > Rprofmem(NULL) > noquote(readLines("Rprofmem.out", n=5)) > > ... > > [1] 1384 :5416 :5416 :1064 :1064 :"readRDS" "index.search" "example" > [2] 1064 :"readRDS" "index.search" "example" > [3] 4712 :"readRDS" "index.search" "example" > [4] 4712 :"readRDS" "index.search" "example" > [5] 1064 :"readRDS" "index.search" "example" > > What do 5 the numbers in the first line mean? > > In the subsequence lines I'm assuming the structure is bytes allocated : call. I think the five numbers come from four memory allocations before example() is called. Looking at the code in src/main/memory.c, it prints newline only when the call stack is not empty. I don't see why this is done, and I may well be the person who did it (I don't have svn on this computer to check), but it is clearly deliberate. -thomas -- Thomas Lumley Professor of Biostatistics University of Auckland __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] RProfmem output format
Hi all, When I run the example in RProfmem, I get: Rprofmem("Rprofmem.out", threshold=1000) example(glm) Rprofmem(NULL) noquote(readLines("Rprofmem.out", n=5)) ... [1] 1384 :5416 :5416 :1064 :1064 :"readRDS" "index.search" "example" [2] 1064 :"readRDS" "index.search" "example" [3] 4712 :"readRDS" "index.search" "example" [4] 4712 :"readRDS" "index.search" "example" [5] 1064 :"readRDS" "index.search" "example" What do 5 the numbers in the first line mean? In the subsequence lines I'm assuming the structure is bytes allocated : call. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/ __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel