On Sat, 19 Feb 2005, Nawaaz Ahmed wrote:
Thanks Brian. I looked at the code (memory.c) after I sent out the first
email and noticed the malloc() call that you mention in your reply.
Looking into this code suggested a possible scenario where R would fail in
malloc() even if it had enough free heap address space.
I noticed that if there is enough heap address space (memory.c:1796,
VHEAP_FREE() > alloc_size) then the garbage collector is not run. So malloc
could fail (since there is no more address space to use), even though R
itself has enough free space it can reclaim. A simple fix is for R to try
doing garbage collection if malloc() fails.
I hacked memory.c to look in R_GenHeap[LARGE_NODE_CLASS].New if malloc()
fails (in a very similar fashion to ReleaseLargeFreeVectors()).
I did a "best-fit" steal from this list and returned the chosen vector to
allocVector().
This seemed to fix my particular problem - the large vectors that I had
allocated in the previous round were still sitting in this list. Of course,
the right thing to do is to check if there are any free vectors of the right
size before calling malloc(), but it was simpler to do it my way (because I
did not have to worry about how efficient my best-fit search was; memory
allocation was going to fail anyway).
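
For readers following along, the best-fit idea described above can be sketched roughly as below. This is only an illustration under stated assumptions: FreeBlock and best_fit_take() are hypothetical names, and the node layout is not the one memory.c uses for R_GenHeap[LARGE_NODE_CLASS].New. The search walks a singly linked free list and unlinks the smallest block that is still large enough.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-list node; R's real large-vector nodes look different. */
typedef struct FreeBlock {
    size_t size;              /* usable bytes in this block */
    struct FreeBlock *next;   /* next block on the free list */
} FreeBlock;

/* Unlink and return the smallest block with size >= need, or NULL if
 * no block on the list is large enough. */
FreeBlock *best_fit_take(FreeBlock **head, size_t need)
{
    assert(head != NULL);

    FreeBlock **best = NULL;  /* link that points at the best candidate */
    for (FreeBlock **link = head; *link != NULL; link = &(*link)->next) {
        if ((*link)->size >= need &&
            (best == NULL || (*link)->size < (*best)->size))
            best = link;
    }
    if (best == NULL)
        return NULL;

    FreeBlock *found = *best;
    *best = found->next;      /* splice the block out of the list */
    found->next = NULL;
    return found;
}
```

The scan is linear in the length of the list, which is acceptable when it only runs on a path where the allocation would otherwise fail.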
I can look deeper into this and provide more details if needed.
Thanks. It looks like it would be a good idea to modify the malloc at
that point to try a GC if the malloc fails, then retry the malloc and
only bail if the second malloc fails. I want to think this through a
bit more before going ahead, but I think it will be the right thing to
do.
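
The retry pattern proposed here might look roughly like the sketch below. It is not a patch against memory.c: alloc_retry_after_gc(), the function-pointer types, and the sim_* stand-ins are all hypothetical names introduced so the control flow can be exercised in isolation. The allocator and collector are passed in as parameters for that reason; in R they would be malloc() and the collector itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t);   /* malloc-like allocator */
typedef void (*gc_fn)(void);         /* runs a garbage collection */

/* Try to allocate; on failure run the collector once and try again.
 * Returns NULL only if the second attempt also fails. */
void *alloc_retry_after_gc(size_t size, alloc_fn do_alloc, gc_fn run_gc)
{
    assert(do_alloc != NULL && run_gc != NULL);

    void *p = do_alloc(size);
    if (p == NULL) {
        run_gc();             /* let the collector release what it can */
        p = do_alloc(size);   /* second and final attempt */
    }
    return p;                 /* NULL here means: report the error */
}

/* Stand-ins (assumptions for illustration): the simulated allocator
 * fails until the simulated collector has run at least once. */
static int sim_gc_runs = 0;

static void sim_gc(void)
{
    sim_gc_runs++;
}

static void *sim_malloc(size_t size)
{
    return sim_gc_runs > 0 ? malloc(size) : NULL;
}
```

With these stand-ins, alloc_retry_after_gc(64, sim_malloc, sim_gc) fails on the first attempt, runs the simulated collector once, and succeeds on the retry.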
Best,
luke
Nawaaz
Prof Brian Ripley wrote:
BTW, I think this is really an R-devel question, and if you want to pursue
this please use that list. (See the posting guide as to why I think so.)
This looks like fragmentation of the address space: many of us are using
64-bit OSes with 2-4Gb of RAM precisely to avoid such fragmentation.
Notice (memory.c line 1829 in the current sources) that large vectors are
malloc-ed separately, so this is a malloc failure, and there is not a lot R
can do about how malloc fragments the process address space (presumably
32-bit in your case, as you did not say).
The message
1101.7 Mbytes of heap free (51%)
is a legacy of an earlier gc() and is not really `free': I believe it means
something like `may be allocated before garbage collection is triggered':
see memory.c.
On Sat, 19 Feb 2005, Nawaaz Ahmed wrote:
I have a data set of roughly 700MB which during processing grows up to 2G
(I'm using a 4G Linux box). After the work is done I clean up (rm()) and
the state is returned to 700MB. Yet I find I cannot run the same routine
again as it claims to not be able to allocate memory even though gcinfo()
claims there is 1.1G left.
At the start of the second time
===============================
           used  (Mb) gc trigger   (Mb)
Ncells  2261001  60.4    3493455   93.3
Vcells 98828592 754.1  279952797 2135.9
Before Failing
==============
Garbage collection 459 = 312+51+96 (level 0) ...
1222596 cons cells free (34%)
1101.7 Mbytes of heap free (51%)
Error: cannot allocate vector of size 559481 Kb
This looks like a fragmentation problem. Anyone have a handle on this
situation? (i.e., any workaround?) Anyone working on improving R's
fragmentation problems?
On the other hand, is it possible there is a memory leak? In order to make
my functions work on this dataset I tried to eliminate copies by coding
with references (basic new.env() tricks). I presume that my cleaning up
returned the temporary data (as evidenced by the gc output at the start of
the second round of processing). Is it possible that it was not really
cleaned up and is sitting around somewhere even though gc() thinks it has
been returned?
Thanks - any clues to follow up will be very helpful.
Nawaaz
______________________________________________
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
--
Luke Tierney
University of Iowa Phone: 319-335-3386
Department of Statistics and Fax: 319-335-3017
Actuarial Science
241 Schaeffer Hall email: [EMAIL PROTECTED]
Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu