On May 30, 2007, at 08:04, Matt Mower wrote:

> On 30/05/07, Eric Hodel <[EMAIL PROTECTED]> wrote:
>> On May 30, 2007, at 03:28, Matt Mower wrote:
>>> I did a quick trawl back through the last few months of archives and
>>> couldn't find any discussions on changes to the current 'bulk update'
>>> approach of downloading and parsing a single, ever-growing YAML file.
>>>
>>> Has there been discussion of alternative approaches? Anything decided
>>> on? Anything definitely to be avoided? Any work planned or in
>>> progress?
>>
>> Read back two threads.
>
> When I read that thread I took it to mean that people were looking at
> ways to avoid transferring the entire YAML file, but I assumed that
> the data would still be held in a single in-memory data structure (at
> a glance, the same structure loaded from the YAML file, but using
> Marshal instead).
>
> Possibly the problem is only related to trying to build such a big
> structure using YAML, but the problems I (and a couple of others) have
> seen appear to be related to the memory usage of the gem source index.
>
> I have little experience with the RubyGems codebase at this point.
> Have I misunderstood the thrust of that thread, or how RubyGems
> operates?
Working with the source cache creates a giant object tree. Loading the
YAML file creates two very large Strings and a giant object tree.
Loading a handful of updates creates a handful or two of very small
object trees and Strings. While it's hard to measure the latter, it
takes much less memory than the former.

The problem is being worked on right now.

_______________________________________________
Rubygems-developers mailing list
Rubygems-developers@rubyforge.org
http://rubyforge.org/mailman/listinfo/rubygems-developers