Re: [Rd] RProfmem output format
>> In the subsequent lines I'm assuming the structure is bytes allocated : call.
>
> I think the five numbers come from four memory allocations before example() is called. Looking at the code in src/main/memory.c, it prints newline only when the call stack is not empty.

Looking into that example in more detail, here's the distribution of allocation numbers:

       1    3    4
    4621   30    2

(with a threshold of 5k)

So it happens ~30 times in total. So what causes allocations when the call stack is empty? Something internal? Does the garbage collector trigger allocations (i.e. could it be caused by moving data to contiguous memory)?

Any ideas what the correct thing to do with these memory allocations is? Ignore them because they're not really related to the function they're attributed to? Sum them up?

> I don't see why this is done, and I may well be the person who did it (I don't have svn on this computer to check), but it is clearly deliberate.

It seems like it would be more consistent to always print a newline, and then it would be obvious those allocations occurred when the call stack was empty. This would make parsing the file a little bit easier.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
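The parsing problem described above can be sketched in R. This is only an illustration under assumptions taken from the message: each line is "bytes :call stack", and allocations made while the call stack was empty are written without a newline, so their byte counts end up fused onto the front of the next line. The function name `parse_rprofmem_lines` and the sample lines are made up for the example; they are not real Rprofmem output.

```r
# Sketch: split Rprofmem-style lines into one record per allocation.
# Assumed layout (from the thread): "<bytes> :<call stack>" per line, with
# several "<bytes> :" prefixes fused together when earlier allocations
# happened while the call stack was empty (no newline was written for them).
parse_rprofmem_lines <- function(lines) {
  records <- lapply(lines, function(line) {
    # Leading run of "<bytes> :" tokens, e.g. "320 :360 :1064 :"
    prefix <- regmatches(line, regexpr("^([0-9]+ :)+", line))
    if (length(prefix) == 0) return(NULL)  # not an allocation line; skip it
    bytes <- as.numeric(strsplit(prefix, " :", fixed = TRUE)[[1]])
    stack <- trimws(substring(line, nchar(prefix) + 1))
    n <- length(bytes)
    # Only the last allocation on a line has a recorded call stack;
    # the earlier ones get an empty string.
    data.frame(
      bytes = bytes,
      stack = c(rep("", n - 1), stack),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, Filter(Negate(is.null), records))
}

# Made-up sample lines in the assumed format.
example_lines <- c(
  '320 :360 :1064 :"rnorm" "example" ',
  '192 :"example" '
)
parsed <- parse_rprofmem_lines(example_lines)
print(parsed)
```

Keeping the empty-stack allocations as separate rows leaves open the choice raised in the message: they can be dropped with `parsed[parsed$stack != "", ]` or summed separately.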
Re: [Rd] Recursively parsing srcrefs
> The bug is now fixed in R-devel and R-patched.

Thanks!

Hadley
Re: [Rd] Recursively parsing srcrefs
>> findLineNum doesn't quite do what I want - it works on the text of the srcref, not on the parse tree.
>
> It searches through the parse tree for the smallest source ref that contains a given line. So for example,
>
>   if(condition) {
>     blah
>     blah
>     blah
>   }
>
> is a single statement, and there will be a srcref stored in its container that goes from line N to line N+4. But it also contains the compound statement
>
>   {
>     blah
>     blah
>     blah
>   }
>
> and there will be srcrefs attached to that for each of the statements in it. (I forget right now whether there are 3 or 4 statements there: R treats braces in a funny way, and I'd have to look at an example to check.) Each of the blah's will get a srcref spanning one line, and it will be stored in the container.

I'm clearly missing something obvious because I don't see how to access these lower-level srcrefs. Would you mind providing a small example?

Thanks!

Hadley
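A small sketch of what accessing the lower-level srcrefs might look like, assuming the source is parsed with keep.source = TRUE and that the srcrefs for the statements of a compound statement are stored as a "srcref" attribute on the `{` call, as the quoted description suggests. The function `f` and the variable names are invented for illustration.

```r
# Hypothetical source text; only the structure matters.
code <- c(
  "f <- function(x) {",
  "  if (x > 0) {",
  "    y <- 1",
  "    z <- 2",
  "  }",
  "  x",
  "}"
)
exprs <- parse(text = code, keep.source = TRUE)

# exprs[[1]] is the call `f <- function(x) {...}`; its third element is the
# `function` call, whose third element is the body, a call to `{`.
fun_call    <- exprs[[1]][[3]]
outer_block <- fun_call[[3]]

# The container (the `{` call) carries a list of srcrefs as an attribute.
outer_refs <- attr(outer_block, "srcref")

# Descend into the `if` statement: element 2 of the outer block is the `if`
# call, and element 3 of that call is the inner `{` block.
if_stmt     <- outer_block[[2]]
inner_block <- if_stmt[[3]]
inner_refs  <- attr(inner_block, "srcref")

# Source text and first line number covered by each inner srcref.
inner_text  <- vapply(inner_refs,
                      function(r) paste(as.character(r), collapse = "\n"),
                      character(1))
inner_first <- vapply(inner_refs, function(r) as.integer(r)[1], integer(1))
print(data.frame(first_line = inner_first, text = inner_text))
```

Printing `length(inner_refs)` next to `length(inner_block)` is a quick way to check how the braces are counted on a given R version.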
Re: [Rd] RProfmem output format
Also, would you mind commenting on how RProfmem is misleading?

  "There are three ways to profile memory use over time in R code. ... All can be misleading, for different reasons."
  --- http://cran.r-project.org/doc/manuals/R-exts.html#Profiling-R-code-for-memory-use

For the other two ways, the manual describes why they are misleading.

Hadley

On Sun, May 15, 2011 at 8:02 AM, Hadley Wickham had...@rice.edu wrote:
>>> In the subsequent lines I'm assuming the structure is bytes allocated : call.
>>
>> I think the five numbers come from four memory allocations before example() is called. Looking at the code in src/main/memory.c, it prints newline only when the call stack is not empty.
>
> Looking into that example in more detail, here's the distribution of allocation numbers:
>
>        1    3    4
>     4621   30    2
>
> (with a threshold of 5k)
>
> So it happens ~30 times in total. So what causes allocations when the call stack is empty? Something internal? Does the garbage collector trigger allocations (i.e. could it be caused by moving data to contiguous memory)?
>
> Any ideas what the correct thing to do with these memory allocations is? Ignore them because they're not really related to the function they're attributed to? Sum them up?
>
>> I don't see why this is done, and I may well be the person who did it (I don't have svn on this computer to check), but it is clearly deliberate.
>
> It seems like it would be more consistent to always print a newline, and then it would be obvious those allocations occurred when the call stack was empty. This would make parsing the file a little bit easier.
>
> Hadley
Re: [Rd] By default, `names<-` alters S4 objects
This is basically a case of a user error that is not being caught:

On 5/14/11 3:47 PM, Hervé Pagès wrote:
> Hi,
>
> I was stumped by this. The two S4 objects below looked exactly the same:
>
>   > a1
>   An object of class "A"
>   Slot "aa":
>   integer(0)
>
>   > a2
>   An object of class "A"
>   Slot "aa":
>   integer(0)
>
>   > str(a1)
>   Formal class 'A' [package ".GlobalEnv"] with 1 slots
>     ..@ aa: int(0)
>   > str(a2)
>   Formal class 'A' [package ".GlobalEnv"] with 1 slots
>     ..@ aa: int(0)
>
> But they were not identical:
>
>   > identical(a1,a2)
>   [1] FALSE
>
> Then I found that one had a "names" attribute but not the other:
>
>   > names(attributes(a1))
>   [1] "aa"    "class" "names"
>   > names(attributes(a2))
>   [1] "aa"    "class"
>
>   > names(a1)
>   NULL
>   > names(a2)
>   NULL
>
> Which explained why they were not reported as identical.
>
> After tracking the history of 'a1', I found that it was created with something like:
>
>   > setClass("A", representation(aa="integer"))
>   [1] "A"
>   > a1 <- new("A")
>   > names(a1) <- "K"
>   > names(a1)
>   NULL
>
> So it seems that, by default (i.e. in the absence of a specialized method), the `names<-` primitive is adding a "names" attribute to the object. Could this behaviour be modified so it doesn't alter the object?

Eh? But you did alter the object. Not only that, you altered it in what is technically an invalid way: adding a names attribute to a class that has no names slot.

The modification that would make sense would be to give you an error in the above code. Not a bad idea, but it's likely to generate more complaints in other contexts, particularly where people don't distinguish the "list" class from lists with names (the "namedList" class).

A plausible strategy:
 1. If the class has a vector data slot and no names slot, assign the names but with a warning.
 2. Otherwise, throw an error.

(I.e., I would prefer an error throughout, but discretion )

Comments?
  John

> Thanks,
> H.
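The two-step strategy proposed above could be sketched roughly as follows. This is not how R implements `names<-`; `names_assignment_policy` is a hypothetical helper that only reports which branch a class would fall into, and it assumes that "a vector data slot" can be approximated by the presence of a `.Data` slot (which also matches some non-vector data parts). The classes "A" and "NumVec" are defined just for the example.

```r
library(methods)

# Hypothetical helper, not part of R: classify what `names<-` would do for a
# class under the two-step strategy proposed in the message above.
names_assignment_policy <- function(class_name) {
  slots <- slotNames(class_name)
  if ("names" %in% slots) {
    "assign"                 # the class declares a names slot
  } else if (".Data" %in% slots) {
    "assign with warning"    # data part present, but no names slot
  } else {
    "error"                  # nowhere valid to store the names
  }
}

setClass("A", representation(aa = "integer"))  # slots only, no data part
setClass("NumVec", contains = "numeric")       # gets a .Data slot

print(names_assignment_policy("A"))
print(names_assignment_policy("NumVec"))
print(names_assignment_policy("namedList"))    # class mentioned in the thread
```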
Re: [Rd] RProfmem output format
On Mon, May 16, 2011 at 1:02 AM, Hadley Wickham had...@rice.edu wrote:
> So what causes allocations when the call stack is empty? Something internal? Does the garbage collector trigger allocations (i.e. could it be caused by moving data to contiguous memory)?

The garbage collector doesn't move anything; it just swaps pointers in a linked list.

The lexer, parser, and evaluator all have to do some work before a function context is set up for the top-level function, so I assume that's where it is happening.

> Any ideas what the correct thing to do with these memory allocations is? Ignore them because they're not really related to the function they're attributed to? Sum them up?
>
>> I don't see why this is done, and I may well be the person who did it (I don't have svn on this computer to check), but it is clearly deliberate.
>
> It seems like it would be more consistent to always print a newline, and then it would be obvious those allocations occurred when the call stack was empty. This would make parsing the file a little bit easier.

Yes. It's obviously better to always print a newline, and so clearly deliberate not to, that I suspect there may have been a good reason. If I can't work it out (after my grant deadline this week) I will just assume it's wrong.

   -thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
Re: [Rd] By default, `names<-` alters S4 objects
On 11-05-15 11:33 AM, John Chambers wrote:
> This is basically a case of a user error that is not being caught:

Sure!

https://stat.ethz.ch/pipermail/r-devel/2009-March/052386.html

> On 5/14/11 3:47 PM, Hervé Pagès wrote:
>> Hi,
>>
>> I was stumped by this. The two S4 objects below looked exactly the same:
>>
>>   > a1
>>   An object of class "A"
>>   Slot "aa":
>>   integer(0)
>>
>>   > a2
>>   An object of class "A"
>>   Slot "aa":
>>   integer(0)
>>
>>   > str(a1)
>>   Formal class 'A' [package ".GlobalEnv"] with 1 slots
>>     ..@ aa: int(0)
>>   > str(a2)
>>   Formal class 'A' [package ".GlobalEnv"] with 1 slots
>>     ..@ aa: int(0)
>>
>> But they were not identical:
>>
>>   > identical(a1,a2)
>>   [1] FALSE
>>
>> Then I found that one had a "names" attribute but not the other:
>>
>>   > names(attributes(a1))
>>   [1] "aa"    "class" "names"
>>   > names(attributes(a2))
>>   [1] "aa"    "class"
>>
>>   > names(a1)
>>   NULL
>>   > names(a2)
>>   NULL
>>
>> Which explained why they were not reported as identical.
>>
>> After tracking the history of 'a1', I found that it was created with something like:
>>
>>   > setClass("A", representation(aa="integer"))
>>   [1] "A"
>>   > a1 <- new("A")
>>   > names(a1) <- "K"
>>   > names(a1)
>>   NULL
>>
>> So it seems that, by default (i.e. in the absence of a specialized method), the `names<-` primitive is adding a "names" attribute to the object. Could this behaviour be modified so it doesn't alter the object?
>
> Eh? But you did alter the object. Not only that, you altered it in what is technically an invalid way: adding a names attribute to a class that has no names slot.

Ah, that's interesting. I didn't know I could put a names slot in my class. Last time I tried was at least 3 years ago and that was causing problems (don't remember the exact details) so I ended up using NAMES instead. Trying again with R-2.14:

  > setClass("A", representation(names="character"))

  > a <- new("A")

  > attributes(a)
  $names
  character(0)

  $class
  [1] "A"
  attr(,"package")
  [1] ".GlobalEnv"

  > names(a)
  NULL

  > names(a) <- "K"

  > attributes(a)
  $names
  [1] "K"

  $class
  [1] "A"
  attr(,"package")
  [1] ".GlobalEnv"

  > names(a)
  NULL

Surprise! But that's another story...

> The modification that would make sense would be to give you an error in the above code. Not a bad idea, but it's likely to generate more complaints in other contexts, particularly where people don't distinguish the "list" class from lists with names (the "namedList" class).
>
> A plausible strategy:
>  1. If the class has a vector data slot and no names slot, assign the names but with a warning.
>  2. Otherwise, throw an error.
>
> (I.e., I would prefer an error throughout, but discretion )

Or, at a minimum (if no consensus can be reached about the above strategy), not add a "names" attribute set to NULL. My original post was more about keeping the internal representation of objects normalized, in general, so identical() is more likely to be meaningful.

Thanks,
H.

> Comments?
>   John
>
>> Thanks,
>> H.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
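One way to look for the kind of non-normalized state discussed above is to compare an object's attributes with its declared slots. The sketch below is only an illustration: `stray_attributes` is a hypothetical helper, and the stray attribute is attached with `attr<-` under the made-up name "extra" as a stand-in, because what `names<-` does to an S4 object may differ between R versions.

```r
library(methods)

setClass("A", representation(aa = "integer"))

# Hypothetical helper, not part of R: attributes on an S4 instance that are
# neither declared slots nor the class attribute.
stray_attributes <- function(obj) {
  setdiff(names(attributes(obj)), c(slotNames(class(obj)), "class"))
}

a1 <- new("A")
a2 <- new("A")

# Stand-in for the stray "names" attribute in the report: attach an
# attribute that the class does not declare as a slot.
attr(a1, "extra") <- "K"

print(stray_attributes(a1))
print(stray_attributes(a2))
print(identical(a1, a2))
```

A check like this only flags the difference; it does not decide whether the extra attribute should be an error, a warning, or silently dropped.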