On Wed, 1 Nov 2006, Vladimir Dergachev wrote:

>
> Hi all,
>
>   I was looking at the following piece of code in src/main/memory.c, function
> allocVector :
>
>     if (size <= NodeClassSize[1]) {
>       node_class = 1;
>       alloc_size = NodeClassSize[1];
>     }
>     else {
>       node_class = LARGE_NODE_CLASS;
>       alloc_size = size;
>       for (i = 2; i < NUM_SMALL_NODE_CLASSES; i++) {
>         if (size <= NodeClassSize[i]) {
>           node_class = i;
>           alloc_size = NodeClassSize[i];
>           break;
>         }
>       }
>     }
>
>
> It appears that for LARGE_NODE_CLASS the variable alloc_size should not be
> size, but something far less as we are not using vector heap, but rather
> calling malloc directly in the code below (and from discussions I read on
> this mailing list I think that these two are different - please let me know
> if I am wrong).
>
> So when allocate a large vector the garbage collector goes nuts trying to find
> all that space which is not going to be needed after all.

This is as intended, not a bug. The garbage collector does not "go nuts"
-- it is doing a garbage collection that may release memory in advance
of making a large allocation. The size of the current allocation request
is used as part of the process of deciding when to satisfy an allocation
by malloc (of a single large node or a page) and when to first do a gc.
It is essential to do this for large allocations as well to keep the
memory footprint down and help reduce fragmentation.

The strategy for deciding when to allocate and when to gc is by
necessity heuristic. It tries to keep overall memory footprint low but
at the same time tries to adapt to usage so that gc happens less often
once a pattern of using larger amounts of memory emerges. The current
strategy seems quite robust across a large range of architectures,
memory configurations, and applications. That said, when I wrote the
manager I kept in mind that we might eventually want to try more
sophisticated schemes and/or allow some user control over the schemes
used. It may be time to revisit this soon.

luke

>
> I made an experiment and replaced the line alloc_size=size with alloc_size=0.
>
> R compiled fine (both 2.4.0 and 2.3.1) and passed make check with no issues
> (it all printed OK).
>
> Furthermore, all allocVector calls completed in no time and my test case run
> very fast (22 seconds, as opposed to minutes).
>
> In addition, attach() was instantaneous which was wonderful.
>
> Could anyone with deeper knowledge of R internals comment on whether this
> makes any sense ?
>
> thank you very much !
>
>                        Vladimir Dergachev
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E.
Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:      [EMAIL PROTECTED]
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
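
The reply above describes the request size feeding into the decision between allocating directly and running a gc first, with a threshold that adapts to usage. A minimal sketch of that general idea follows. It is a toy model, not the code in src/main/memory.c: the names toy_heap, toy_gc and toy_alloc, the "units" of accounting, and the growth rule (twice the request) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy heap accounting. Every name here is hypothetical, not R's. */
typedef struct {
    size_t in_use;    /* units currently counted as allocated */
    size_t garbage;   /* part of in_use that a collection would free */
    size_t limit;     /* current threshold that triggers a gc */
    int    gc_count;  /* number of collections run so far */
} toy_heap;

/* Pretend collection: release everything that is unreachable. */
void toy_gc(toy_heap *h)
{
    h->in_use -= h->garbage;
    h->garbage = 0;
    h->gc_count++;
}

/* Account for a request of `size` units.
   Returns 1 if a collection ran before the allocation, else 0. */
int toy_alloc(toy_heap *h, size_t size)
{
    int collected = 0;

    /* The request size counts against the budget even when the memory
       itself would come straight from malloc. */
    if (size > h->limit - h->in_use) {
        toy_gc(h);
        collected = 1;

        /* Still not enough room: raise the threshold so that later
           requests of similar size trigger fewer collections.
           The factor of 2 is an arbitrary choice for this sketch. */
        if (size > h->limit - h->in_use)
            h->limit = h->in_use + 2 * size;
    }

    h->in_use += size;
    assert(h->in_use <= h->limit);  /* accounting invariant */
    return collected;
}
```

In this model, setting the counted size of large requests to zero would mean toy_alloc never collects or adapts on their behalf, so the footprint could grow without the threshold ever noticing. That is the trade-off the reply points to, under the assumptions stated here.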