[ruote:2604] Re: Ruote couch memory bloat, StorageHistory

I. E. Smith-Heisters Thu, 09 Sep 2010 10:20:35 -0700

On Sep 8, 5:08 pm, John Mettraux <[email protected]> wrote:
> On Wed, Sep 08, 2010 at 12:49:15PM -0700, I. E. Smith-Heisters wrote:
>
> > I had a problem where calls to the Ruote::StorageHistory were bloating
> > our Ruby process's memory footprint by ~370MB, bringing our servers to
> > screeching halts. Upon investigation, I found that
> > Ruote::StorageHistory#by_process pulls all the documents from the
> > history database and iterates over them. While it might only return
> > 100 documents, it causes the Ruby interpreter to grab enough memory
> > for all 16.5k records (and the Ruby interpreter doesn't like to let go
> > of memory it's gotten its greedy little hands on).
>
> > My first attempt at a fix was to paginate the fetching of the records.
> > This keeps the memory footprint constant as the database grows, but
> > adds significant I/O time. It's a good fallback because I prefer that
> > things take a long time, rather than crashing.
>
> Hello Ian,
>
> that's excellent feedback.
>
> I've been working during the last weeks on pagination for all the storages, 
> ruote-couch included.
>
>  http://github.com/jmettraux/ruote/blob/ruote2.1/CHANGELOG.txt
>  http://github.com/jmettraux/ruote-couch/blob/ruote2.1/CHANGELOG.txt
>
> I have to find the time to release.
>
> Would this work help you ?


Sounds great! I imagine it would be helpful. We're not actually using
pagination at all at the moment, but IMO any time one fetches more
than, say, 100 records, pagination should be used to iterate over them
transparently so as to avoid memory bloat. Performance will of course
become an issue, and some kind of non-blocking I/O would help with
that.
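
A minimal sketch of the transparent pagination idea, to make the
trade-off concrete. Everything here is an assumption for illustration:
FakeStore, its get_many(skip:, limit:) method and PagedDocuments are
hypothetical names, not part of ruote or ruote-couch. The point is only
that the caller iterates as usual while a single page is held in memory
at a time.

```ruby
# Hypothetical stand-in for a storage; NOT the ruote or ruote-couch API.
class FakeStore
  attr_reader :fetches

  def initialize(docs)
    @docs = docs
    @fetches = 0 # counts round trips, to make the I/O cost visible
  end

  # Returns at most `limit` documents starting at offset `skip`.
  def get_many(skip:, limit:)
    @fetches += 1
    @docs[skip, limit] || []
  end
end

# Wraps a store so callers can iterate over every document while only
# one page is materialized at a time.
class PagedDocuments
  include Enumerable

  def initialize(store, page_size: 100)
    @store = store
    @page_size = page_size
  end

  def each
    return enum_for(:each) unless block_given?

    skip = 0
    loop do
      page = @store.get_many(skip: skip, limit: @page_size)
      page.each { |doc| yield doc }
      break if page.size < @page_size # a short page means we hit the end

      skip += @page_size
    end
    self
  end
end

docs = (1..250).map { |i| { '_id' => "doc-#{i}", 'wfid' => (i.even? ? 'a' : 'b') } }
store = FakeStore.new(docs)

# `lazy` stops fetching pages as soon as enough matches are found.
wanted = PagedDocuments.new(store).lazy.select { |d| d['wfid'] == 'a' }.first(100)
puts "#{wanted.size} documents in #{store.fetches} fetches"
```

Memory stays bounded by the page size, at the cost of one round trip
per page, which is the extra I/O time mentioned above.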

>
> > My second attempt was to make StorageHistory leverage
> > WfidIndexedDatabase, and it was actually much easier to implement. It
> > reduced the memory footprint while also increasing performance by a
> > factor of a little less than 4.
>
> Very nice.
>
> > Here's the gist with my monkey patches: http://gist.github.com/570681.
> > I thought some portion of them might be suitable for incorporation
> > into mainline. I haven't done extensive testing or field-use of these
> > yet--that will be coming in the next few weeks. I'll keep the gists
> > and the list updated with any significant patches.
>
> I'm very interested to merge your work into the mainline, I'm wondering if my 
> pagination (it's for all storages) and your patches could get along. I will 
> study carefully your gists.

I'd be happy to refactor my code to fit into yours. For instance, my
modification to StorageHistory should really be done by subclassing
StorageHistory (perhaps as Ruote::Couch::StorageHistory) so that
Ruote::StorageHistory remains compatible with the usual storage
interfaces. Unless you intend to add some mechanism for selecting a
special database class (WfidIndexedDatabase) to all storages?
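
A rough sketch of the subclassing approach, under stated assumptions.
The classes below live in a hypothetical Sketch module and only stand
in for Ruote::StorageHistory and a possible Ruote::Couch::StorageHistory;
the in-memory Hash index stands in for what WfidIndexedDatabase would
do on the CouchDB side. None of this is the actual ruote code.

```ruby
module Sketch
  # Stand-in for the generic history: scans every record on each call.
  class StorageHistory
    def initialize(records)
      @records = records
    end

    def by_process(wfid)
      @records.select { |r| r['wfid'] == wfid }
    end
  end

  module Couch
    # Couch-specific subclass: same public interface, but lookups go
    # through an index keyed by wfid instead of a full scan.
    class StorageHistory < Sketch::StorageHistory
      def initialize(records)
        super
        # Hypothetical in-memory index; a real version would query an
        # indexed database rather than build a Hash.
        @by_wfid = records.group_by { |r| r['wfid'] }
      end

      def by_process(wfid)
        @by_wfid.fetch(wfid, [])
      end
    end
  end
end

records = [
  { 'wfid' => 'a', 'n' => 1 },
  { 'wfid' => 'b', 'n' => 2 },
  { 'wfid' => 'a', 'n' => 3 }
]
history = Sketch::Couch::StorageHistory.new(records)
puts history.by_process('a').size
```

Callers that only know the base interface keep working, and the
generic class stays untouched for the other storages.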

>
> Feel free to indicate your requirements and point at weaknesses like you just 
> did. I hope we can work together to make the whole ruote system better, at 
> least for you.

We also ran into some issues because Ruby's HTTP implementations suck
(e.g. http://apocryph.org/2008/10/04/analysis_ruby_18x_http_client_performance/).
We moved to Patron, which helped significantly, but we're working on a
Memcached cache layer for Ruote::Couch that helps even more. It's not
baked enough for release, but we'll let you know when it is. Of
course, getting Ruby's HTTP up to speed would obviate the need for
this.

>
> Many thanks,
>
> --
> John Mettraux - http://jmettraux.wordpress.com

-- 
you received this message because you are subscribed to the "ruote users" group.
to post : send email to [email protected]
to unsubscribe : send email to [email protected]
more options : http://groups.google.com/group/openwferu-users?hl=en
