Re: [R] memory management

2012-02-29 Thread Sam Steingold
* William Dunlap jqha...@gvopb.pbz [2012-02-28 23:06:54 +]: You need to walk through the objects, checking for environments on each component or attribute of an object. so why doesn't object.size do that? f <- function(n) { + d <- data.frame(y = rnorm(n), x = rnorm(n)) + lm(y

Re: [R] memory management

2012-02-29 Thread William Dunlap
I do a lot of strsplit, unlist, subsetting, so I could imagine why the RSS is triple the total size of my data if all the intermediate results are not released. I can only give some generalities about that. Using lots of small chunks of memory (like short strings) may cause fragmentation

Re: [R] memory management

2012-02-29 Thread Milan Bouchet-Valat
On Wednesday 29 February 2012 at 11:42 -0500, Sam Steingold wrote: * William Dunlap jqha...@gvopb.pbz [2012-02-28 23:06:54 +]: You need to walk through the objects, checking for environments on each component or attribute of an object. so why doesn't object.size do that? f <-

Re: [R] memory management

2012-02-29 Thread Sam Steingold
* Milan Bouchet-Valat anyvzv...@pyho.se [2012-02-29 18:18:50 +0100]: I think you're simply hitting a (terrible) OS limitation. Linux is very often not able to reclaim the memory R has used because it's fragmented. The OS can only get the pages back if nothing is above them, and most of the

Re: [R] memory management

2012-02-29 Thread luke-tierney
On Wed, 29 Feb 2012, Sam Steingold wrote: * Milan Bouchet-Valat anyvzv...@pyho.se [2012-02-29 18:18:50 +0100]: I think you're simply hitting a (terrible) OS limitation. Linux is very often not able to reclaim the memory R has used because it's fragmented. The OS can only get the pages back if

Re: [R] memory management

2012-02-29 Thread Sam Steingold
* yhxr-gvre...@hvbjn.rqh [2012-02-29 13:55:25 -0600]: On Wed, 29 Feb 2012, Sam Steingold wrote: compacting garbage collector is our best friend! Which R does not use because of the problems it would create for external C/Fortran code on which R heavily relies. Well, you know better, of

Re: [R] memory management

2012-02-28 Thread Sam Steingold
My basic worry is that the GC does not work properly, i.e., the unreachable data is never collected. * Bert Gunter thagre.ore...@trar.pbz [2012-02-27 14:35:14 -0800]: This appears to be the sort of query that (with apologies to other R gurus) only Brian Ripley or Luke Tierney could figure

Re: [R] memory management

2012-02-28 Thread Bert Gunter
On Tue, Feb 28, 2012 at 11:57 AM, Sam Steingold s...@gnu.org wrote: My basic worry is that the GC does not work properly, i.e., the unreachable data is never collected. Highly unlikely. Such basic inner R code has been well tested over 20 years. I believe that you merely don't understand the

Re: [R] memory management

2012-02-28 Thread William Dunlap
-project.org] On Behalf Of Sam Steingold Sent: Tuesday, February 28, 2012 11:58 AM To: r-help@r-project.org; Bert Gunter Subject: Re: [R] memory management My basic worry is that the GC does not work properly, i.e., the unreachable data is never collected. * Bert Gunter thagre.ore...@trar.pbz

Re: [R] memory management

2012-02-28 Thread Sam Steingold
* William Dunlap jqha...@gvopb.pbz [2012-02-28 20:19:06 +]: Look into environments that may be stored with your data. thanks, but I see nothing like that: for (n in ls(all.names = TRUE)) { o <- get(n) print(object.size(o), units="Kb") e <- environment(o) if (!identical(e,NULL)

Re: [R] memory management

2012-02-28 Thread William Dunlap
You need to walk through the objects, checking for environments on each component or attribute of an object. You also have to look at the parent.env of each environment found. E.g., f <- function(n) { + d <- data.frame(y = rnorm(n), x = rnorm(n)) + lm(y ~ poly(x, 4), data=d) + } z
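A minimal sketch of the point made in this message, assuming a plain base R session with the stats package attached (the default). The names `f` and `z` follow the post; `env` and `captured` are illustrative additions. The formula created inside `f` keeps a reference to the function's evaluation frame, so the data frame `d` stays reachable through the fitted model, while `object.size(z)` does not follow that reference.

```r
f <- function(n) {
  d <- data.frame(y = rnorm(n), x = rnorm(n))
  lm(y ~ poly(x, 4), data = d)
}
z <- f(1000)

# The terms component carries the environment in which the formula was
# created, i.e. the evaluation frame of f().
env <- environment(z$terms)
captured <- ls(env)          # objects still alive in that frame
print(captured)

# object.size() does not follow environments, so anything captured this way
# has to be measured separately.
print(object.size(z), units = "Kb")
print(object.size(get("d", envir = env)), units = "Kb")
```

The same check can be repeated on `parent.env(env)` when the enclosing environment is not the global one.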

Re: [R] memory management

2012-02-27 Thread Sam Steingold
It appears that the intermediate data in functions is never GCed even after the return from the function call. R's RSS is 4 Gb (after a gc()) and sum(unlist(lapply(lapply(ls(),get),object.size))) [1] 1009496520 (less than 1 GB) how do I figure out where the 3GB of uncollected garbage is hiding?
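A rough sketch of the comparison described in this message: per-object sizes of the workspace next to what `gc()` reports. The objects `x` and `y` are hypothetical stand-ins for real data. Note that `object.size()` is only an estimate and does not count memory reachable through environments, so the two figures are not expected to match exactly.

```r
# Hypothetical workspace objects, used only for illustration.
x <- rnorm(1e5)
y <- as.character(seq_len(1e4))

# Size in bytes of every object visible in the current environment.
sizes <- sapply(ls(), function(nm) as.numeric(object.size(get(nm))))
print(sort(sizes, decreasing = TRUE))

total_mb <- sum(sizes) / 2^20
cat(sprintf("sum of object sizes: %.2f Mb\n", total_mb))

# gc() forces a collection and returns R's own view of heap usage; the
# resident set size reported by the OS can be larger than either figure.
print(gc())
```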

Re: [R] memory management

2012-02-27 Thread Bert Gunter
This appears to be the sort of query that (with apologies to other R gurus) only Brian Ripley or Luke Tierney could figure out. R generally passes by value into function calls (but not *always*), so often multiple copies of objects are made during the course of calls. I would speculate that this

[R] memory management

2012-02-09 Thread Sam Steingold
zz <- data.frame(a=c(1,2,3),b=c(4,5,6)) zz a b 1 1 4 2 2 5 3 3 6 a <- zz$a a [1] 1 2 3 a[2] <- 100 a [1] 1 100 3 zz a b 1 1 4 2 2 5 3 3 6 clearly a is a _copy_ of its namesake column in zz. when was the copy made? when a was modified? at assignment? is there a way to find out how
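A small sketch of the behaviour asked about here. R uses copy-on-modify semantics, so the duplication is expected at the modification rather than at the assignment. `tracemem()` is only available when R was built with memory profiling, hence the `capabilities("profmem")` guard; whether and how the copy is reported depends on the R build, so no output is shown.

```r
zz <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

a <- zz$a                      # assignment alone does not need a copy
if (capabilities("profmem")) {
  invisible(tracemem(a))       # ask R to report when 'a' is duplicated
}

a[2] <- 100                    # modifying 'a' is what forces the copy
print(a)
print(zz$a)                    # the column in zz is unchanged

if (capabilities("profmem")) untracemem(a)
```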

Re: [R] memory management

2012-02-09 Thread Florent D.
This should help: invisible(gc()) m0 <- memory.size() mem.usage <- function(){invisible(gc()); memory.size() - m0} Mb.size <- function(x)print(object.size(x), units="Mb") zz <- data.frame(a=runif(100), b=runif(100)) mem.usage() [1] 15.26 Mb.size(zz) 15.3 Mb a <- zz$a mem.usage() [1]
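`memory.size()` is Windows-specific, so the helpers above will not work as written on Linux or macOS. The following is a sketch of the same idea built on `gc()` instead; the names `mem_used_mb`, `mem_usage` and `mb_size` are made up for this example, and the vector length of 1e6 is an assumption chosen to give a data frame of roughly 15 Mb.

```r
# Memory currently used by R, in Mb, according to gc().
mem_used_mb <- function() {
  g <- gc()          # runs a collection and returns the usage table
  sum(g[, 2])        # second column: Mb in use (Ncells row + Vcells row)
}

m0 <- mem_used_mb()                           # baseline
mem_usage <- function() mem_used_mb() - m0    # growth since the baseline
mb_size <- function(x) print(object.size(x), units = "Mb")

zz <- data.frame(a = runif(1e6), b = runif(1e6))
print(mem_usage())
mb_size(zz)

a <- zz$a            # no extra memory expected until 'a' is modified
print(mem_usage())
```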

Re: [R] memory management

2012-02-09 Thread Sam Steingold
* Florent D. syb...@tznvy.pbz [2012-02-09 19:26:59 -0500]: m0 <- memory.size() Mb.size <- function(x)print(object.size(x), units="Mb") indeed, these are very useful, thanks. ls reports these objects larger than 100k: behavior : 390.1 Mb mydf : 115.3 Mb nb : 0.2 Mb pl : 1.2 Mb however, top

[R] Memory management

2011-06-01 Thread Michael Conklin
I am trying to run a very large Bradley-Terry model using the BradleyTerry2 package. (There are 288 players in the BT model). My problem is that I ran the model below successfully. WLMat is a win-loss matrix that is 288 by 288 WLdf <- countsToBinomial(WLMat)

Re: [R] Memory Management under Linux

2010-11-05 Thread jim holtman
It would be very useful if you would post some information about what exactly you are doing. There is something with the size of the data object you are processing ('str' would help us understand it) and then a portion of the script (both before and after the error message) so we can understand

Re: [R] Memory Management under Linux

2010-11-05 Thread ricardo souza
jholt...@gmail.com Subject: Re: [R] Memory Management under Linux To: ricardo souza ricsouz...@yahoo.com.br Cc: r-help@r-project.org Date: Friday, November 5, 2010, 10:21 It would be very useful if you would post some information about what exactly you are doing. There is something

Re: [R] Memory Management under Linux

2010-11-05 Thread jim holtman
, Ricardo From: jim holtman jholt...@gmail.com Subject: Re: [R] Memory Management under Linux To: ricardo souza ricsouz...@yahoo.com.br Cc: r-help@r-project.org Date: Friday, November 5, 2010, 10:21 It would be very useful if you would post some information about what exactly

[R] Memory Management under Linux

2010-11-04 Thread ricardo souza
Dear all, I am using 32-bit Ubuntu Linux with 4 GB of RAM. I am running a very small script and I always get the same error message: CAN NOT ALLOCATE A VECTOR OF SIZE 231.8 Mb. I have carefully read the instructions in ?Memory. Using the function gc() I get very low memory numbers (please see

Re: [R] Memory management in R

2010-10-10 Thread Lorenzo Isella
I already offered the Biostrings package. It provides more robust methods for string matching than does grepl. Is there a reason that you choose not to? Indeed, that is the way I should go, and I have installed the package after some struggling. Since Biostrings is a fairly complex package

Re: [R] Memory management in R

2010-10-10 Thread Mike Marchywka
Date: Sun, 10 Oct 2010 15:27:11 +0200 From: lorenzo.ise...@gmail.com To: dwinsem...@comcast.net CC: r-help@r-project.org Subject: Re: [R] Memory management in R I already offered the Biostrings package. It provides more robust methods

Re: [R] Memory management in R

2010-10-09 Thread Lorenzo Isella
Hi David, I am replying to you and to the other people who provided some insight into my problems with grepl. Well, at least we now know that the bug is reproducible. Indeed it is a strange sequence the one I am postprocessing, probably pathological to some extent, nevertheless the problem is

Re: [R] Memory management in R

2010-10-09 Thread David Winsemius
On Oct 9, 2010, at 9:45 AM, Lorenzo Isella wrote: Hi David, I am replying to you and to the other people who provided some insight into my problems with grepl. Well, at least we now know that the bug is reproducible. Indeed it is a strange sequence the one I am postprocessing, probably

Re: [R] Memory management in R

2010-10-09 Thread Lorenzo Isella
My suggestion is to explore other alternatives. (I will admit that I don't yet fully understand the test that you are applying.) Hi, I am trying to partially implement the Lempel Ziv compression algorithm. The point is that compressibility and entropy of a time series are related, hence my

Re: [R] Memory management in R

2010-10-09 Thread David Winsemius
On Oct 9, 2010, at 4:23 PM, Lorenzo Isella wrote: My suggestion is to explore other alternatives. (I will admit that I don't yet fully understand the test that you are applying.) Hi, I am trying to partially implement the Lempel Ziv compression algorithm. The point is that

[R] Memory management in R

2010-10-08 Thread Lorenzo Isella
is some limitation of grepl or R memory management. Any idea about how I could tackle this problem or how I can profile my code to fix it (though it really seems to me that I have to find a way to allow R to process longer strings). Any suggestion is appreciated. Cheers Lorenzo

Re: [R] Memory management in R

2010-10-08 Thread Lorenzo Isella
-boun...@r-project.org] Sent: Friday, October 08, 2010 1:12 PM To: r-help Subject: [R] Memory management in R Dear All, I am experiencing some problems with a script of mine. It crashes with this message Error in grepl(fut_string, past_string) : invalid regular expression '12653a6#12653a6
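One option worth trying for this kind of error, sketched under the assumption that the search string is meant to be matched literally rather than as a regular expression: `grepl()` accepts `fixed = TRUE`, which skips regex compilation altogether. The variable names `fut_string` and `past_string` come from the error message; their values below are made up, and whether this resolves the poster's actual case is not confirmed by the thread excerpt.

```r
# Made-up data in the spirit of the error message above.
past_string <- paste(rep("12653a6#", 5000), collapse = "")
fut_string  <- "12653a6#12653a6"

# Literal substring search: the pattern is never compiled as a regex.
found <- grepl(fut_string, past_string, fixed = TRUE)
print(found)

# With fixed = TRUE, characters such as "." lose their special meaning.
print(grepl("a.b", "axb"))                 # regex: "." matches any character
print(grepl("a.b", "axb", fixed = TRUE))   # literal: no "a.b" in "axb"
```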

Re: [R] Memory management in R

2010-10-08 Thread Doran, Harold
the problem is some limitation of grepl or R memory management. Any idea about how I could tackle this problem or how I can profile my code to fix it (though it really seems to me that I have to find a way to allow R to process longer strings). Any suggestion is appreciated. Cheers Lorenzo

Re: [R] Memory management in R

2010-10-08 Thread jim holtman
) and the problem does not seem to be RAM memory (I have several GB of RAM on my machine and its consumption never shoots up so my machine never resorts to swap memory). So (though I am not an expert) it looks like the problem is some limitation of grepl or R memory management. Any idea about how I could

Re: [R] Memory management in R

2010-10-08 Thread Mike Marchywka
Date: Fri, 8 Oct 2010 13:30:59 -0400 From: jholt...@gmail.com To: lorenzo.ise...@gmail.com CC: r-help@r-project.org Subject: Re: [R] Memory management in R More specificity: how long is the string, what is the pattern you are matching against

Re: [R] Memory management in R

2010-10-08 Thread Lorenzo Isella
like the problem is some limitation of grepl or R memory management. Any idea about how I could tackle this problem or how I can profile my code to fix it (though it really seems to me that I have to find a way to allow R to process longer strings). Any suggestion is appreciated. Cheers Lorenzo

Re: [R] Memory management in R

2010-10-08 Thread David Winsemius
and its consumption never shoots up so my machine never resorts to swap memory). So (though I am not an expert) it looks like the problem is some limitation of grepl or R memory management. Any idea about how I could tackle this problem or how I can profile my code to fix it (though

Re: [R] Memory management in R

2010-10-08 Thread Mike Marchywka
From: dwinsem...@comcast.net To: lorenzo.ise...@gmail.com Date: Fri, 8 Oct 2010 19:30:45 -0400 CC: r-help@r-project.org Subject: Re: [R] Memory management in R On Oct 8, 2010, at 6:42 PM, Lorenzo Isella wrote: Please find below the R

Re: [R] Memory management in R

2010-10-08 Thread David Winsemius
On Oct 8, 2010, at 9:19 PM, Mike Marchywka wrote: From: dwinsem...@comcast.net To: lorenzo.ise...@gmail.com Date: Fri, 8 Oct 2010 19:30:45 -0400 CC: r-help@r-project.org Subject: Re: [R] Memory management in R On Oct 8, 2010, at 6:42 PM, Lorenzo

[R] memory management in R

2010-06-16 Thread john
I have volunteered to give a short talk on memory management in R to my local R user group, mainly to motivate myself to learn about it. The focus will be on what a typical R coder might want to know (e.g. how objects are created, call by value, basics of garbage collection) but I want

Re: [R] memory management in R

2010-06-16 Thread Jens Oehlschlägel
, short integers etc.). Jens Oehlschlägel -Original Message- From: john mull...@fastmail.fm Sent: Jun 16, 2010 12:20:17 PM To: r-help@r-project.org Subject: [R] memory management in R I have volunteered to give a short talk on memory management in R to my local R user group

[R] About R memory management?

2009-12-10 Thread Peng Yu
I'm wondering where I can find a detailed description of R memory management. Understanding this could help me understand the runtime of R programs. For example, depending on how memory is allocated (either allocate a chunk of memory that is more than necessary for the current use, or allocate

Re: [R] About R memory management?

2009-12-10 Thread Henrik Bengtsson
Related... Rule of thumb: Pre-allocate your object of the *correct* data type, if you know the final dimensions. /Henrik On Thu, Dec 10, 2009 at 8:26 AM, Peng Yu pengyu...@gmail.com wrote: I'm wondering where I can find the detailed descriptions on R memory management. Understanding
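A short illustration of this rule of thumb; the variable names are made up for the example. Pre-allocating with the constructor that matches the final type (`numeric()`, `integer()`, `character()`, ...) avoids both repeated growth and a type conversion of the whole vector on the first assignment.

```r
n <- 1e4

# Pre-allocated with the correct type: each assignment fills a slot in place.
res <- numeric(n)
for (i in seq_len(n)) res[i] <- sqrt(i)

# Pre-allocated with the wrong type: the first assignment of a double
# forces the whole logical vector to be converted.
res2 <- logical(n)
print(typeof(res2))
res2[1] <- 0.5
print(typeof(res2))
```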

Re: [R] About R memory management?

2009-12-10 Thread hadley wickham
pengyu...@gmail.com wrote: I'm wondering where I can find a detailed description of R memory management. Understanding this could help me understand the runtime of R programs. For example, depending on how memory is allocated (either allocate a chunk of memory that is more than necessary

Re: [R] About R memory management?

2009-12-10 Thread Peng Yu
: Related... Rule of thumb: Pre-allocate your object of the *correct* data type, if you know the final dimensions. /Henrik On Thu, Dec 10, 2009 at 8:26 AM, Peng Yu pengyu...@gmail.com wrote: I'm wondering where I can find the detailed descriptions on R memory management. Understanding this could

Re: [R] About R memory management?

2009-12-10 Thread jim holtman
Yu pengyu...@gmail.com wrote: I'm wondering where I can find a detailed description of R memory management. Understanding this could help me understand the runtime of R programs. For example, depending on how memory is allocated (either allocate a chunk of memory that is more than

Re: [R] About R memory management?

2009-12-10 Thread Peng Yu
wrote: Related... Rule of thumb: Pre-allocate your object of the *correct* data type, if you know the final dimensions. /Henrik On Thu, Dec 10, 2009 at 8:26 AM, Peng Yu pengyu...@gmail.com wrote: I'm wondering where I can find the detailed descriptions on R memory management

[R] FW: R memory management

2007-12-08 Thread Yuri Volchik
Hi, I'm using R to collect data for a number of exchanges through a socket connection and constantly running into memory problems, even though the task, I believe, is not that memory-consuming. I guess there is a miscommunication between R and WinXP about freeing up memory. So this is the code:

Re: [R] FW: R memory management

2007-12-08 Thread Patrick Burns
The line: data. - c(data., new.data) will eat both memory and time voraciously. You should change it by creating 'data.' to be the final size it will be and then subscript into it. If you don't know the final size, then you can grow it a lot a few times instead of growing it a little lots of
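A sketch of the "grow it a lot a few times" strategy for the case where the final size is unknown. `make_buffer` and its `add`/`get` functions are hypothetical helpers written for this example, not part of any package; the buffer doubles its capacity whenever it runs out of room.

```r
# Hypothetical growable buffer: doubles capacity when full.
make_buffer <- function(initial = 1024L) {
  buf  <- numeric(initial)   # storage, usually larger than what is in use
  used <- 0L                 # number of slots actually filled
  list(
    add = function(new.data) {
      need <- used + length(new.data)
      if (need > length(buf)) {
        # Few large reallocations instead of one per incoming chunk.
        new_len <- max(need, 2L * length(buf))
        buf <<- c(buf, numeric(new_len - length(buf)))
      }
      if (length(new.data) > 0L) buf[(used + 1L):need] <<- new.data
      used <<- need
      invisible(NULL)
    },
    get = function() buf[seq_len(used)]   # only the filled part
  )
}

b <- make_buffer(4L)
for (i in 1:10) b$add(c(i, i))   # stands in for data arriving in chunks
out <- b$get()
print(length(out))
```

If the final size is known in advance, creating the vector at that size once and subscripting into it, as suggested above, is simpler still.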

Re: [R] Memory management

2007-09-15 Thread Takatsugu Kobayashi
Hi, I apologize again for posting something not suitable on this list. Basically, it sounds like I should go put this large dataset into a database... The dataset I have had trouble with is the transportation network of Chicago Consolidated Metropolitan Statistical Area. The number of samples

Re: [R] Memory management

2007-09-15 Thread jim holtman
If your data file has 49M rows and 249 columns, then if each column had 5 characters, you are looking at a text file of about 60GB. If these were all numerics (8 bytes per number), then you are looking at an R object that would be almost 100GB. If this is your data, then this is definitely a
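The arithmetic behind these estimates, written out as a quick check. The figures use decimal gigabytes (1 GB = 1e9 bytes) and ignore field separators, line endings and R's per-object overhead, which is an assumption of this sketch.

```r
rows <- 49e6    # 49M rows
cols <- 249

cells         <- rows * cols
text_bytes    <- cells * 5   # about 5 characters per field
numeric_bytes <- cells * 8   # 8 bytes per double

cat(sprintf("cells:        %.0f\n", cells))
cat(sprintf("text file:    %.1f GB\n", text_bytes / 1e9))      # 61.0 GB
cat(sprintf("numeric data: %.1f GB\n", numeric_bytes / 1e9))   # 97.6 GB
```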