On Tue, Mar 25, 2008 at 4:40 AM, James Tucker <[EMAIL PROTECTED]> wrote:
> Forgive me for not having read the whole thread, however, there is one thing
> that seems to be really important, and that is, ruby hardly ever runs the
> damned GC. It certainly doesn't do full runs nearly often enough (IMO).
There's only one kind of garbage collection sweep. And yeah, depending
on what's happening, GC may not run very often. That's not generally a
problem.

> Also, implicit OOMEs or GC runs quite often DO NOT affect the extensions
> correctly. I don't know what rmagick is doing under the hood in this area,
> but having been generating large portions of country maps with it (and
> moving away from it very rapidly), I know the GC doesn't do "The Right
> Thing".

There should be no difference between a GC run that is initiated by the
interpreter and one that is initiated by one's code. Both end up calling
the same thing in gc.c. Extensions can easily mismanage memory, though,
and I have a hunch about what's happening with rmagick.

> First call of address is GC_MALLOC_LIMIT and friends. For any small script
> that doesn't breach that value, the GC simply doesn't run. More than this,
> RMagick, in its apparent 'wisdom' never frees memory if the GC never runs.
> Seriously, check it out. Make a tiny script, and make a huge image with it.
> Hell, make 20, get an OOME, and watch for a run of the GC. The OOME will
> reach your code before the GC calls on RMagick to free.
>
> Now, add a call to GC.start, and no OOME. Despite the limitations of it
> (ruby performance only IMO), most of the above experience was built up on
> windows, and last usage was about 6 months ago, FYI.

My hunch is that rmagick is allocating large amounts of RAM outside of
Ruby. It registers its objects with the interpreter, but the RAM usage
in rmagick itself doesn't count against GC_MALLOC_LIMIT because Ruby
didn't allocate it, so Ruby doesn't know about it. So, it uses huge
amounts of RAM, but doesn't use huge numbers of objects. Thus you never
trigger a GC cycle by exceeding GC_MALLOC_LIMIT, nor by running out of
object slots in the heap. I'd have to go look at the code to be sure,
but the theory fits the behavior that is described very well.
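One way to check whether a given stretch of code triggers the collector
at all is to compare GC.count before and after it. This is only a rough
sketch: it assumes a Ruby new enough to have GC.count (1.9 and later),
and gc_runs_during is a made-up helper name, not something from rmagick
or Mongrel.

```ruby
# Sketch only. Assumes Ruby 1.9 or later, where GC.count is available.
# gc_runs_during is a hypothetical helper name used for illustration.

# Returns how many GC runs happened while the block executed.
def gc_runs_during
  before = GC.count
  yield
  GC.count - before
end

# A small script that allocates very little; the interpreter may well
# decide that no collection is needed here.
small = gc_runs_during { 100.times { "x" * 10 } }

# An explicit GC.start forces a collection regardless of thresholds.
forced = gc_runs_during { GC.start }

puts "GC runs during small script:  #{small}"
puts "GC runs during GC.start call: #{forced}"
```

Wrapping the image-heavy part of a script in the same helper would show
whether the interpreter ever decided to collect on its own there.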
I don't think this is a case for building GC.foo memory management into
Mongrel, though. As I think you are suggesting, just call GC.start
yourself in your code when necessary. In a typical Rails app doing big
things with rmagick, the extra time to do GC.start at the end of the
image manipulation, in the request handling, isn't going to be
noticeable.

> But that's not really the overall point. My overall point is how to
> properly handle a rails app that uses a great deal of memory during each
> request. I'm pretty sure this happens in other rails applications that
> don't happen to use 'RMagick'.
>
> Personally, I'll simply say call the GC more often. Seriously. I mean it.
> It's not *that* slow, not at all. In fact, I call GC.start explicitly inside
> of my ubygems.rb due to stuff I have observed before:

I completely concur with this. If there are issues with huge memory use
(most likely caused by extensions making RAM allocations outside of
Ruby's accounting, so implicit GC isn't triggered), just call GC.start
in one's own code.

> Now, by my reckoning (and a few production apps seem to be showing
> empirically (purely empirical, sorry)) we should be calling on the GC whilst
> loading up the apps. I mean come on, when are a really serious number of
> temporary objects being created? Actually, it's when rubygems loads, and
> that's the first thing that happens in, hmm, probably over 90% of ruby
> processes out there.

Just as a tangent, I do this in Swiftiply. I make an explicit call to
GC.start after everything is loaded and all configs are parsed, just to
make sure execution is going into the main event loop with as much junk
cleaned out as possible.

> Or whatever. It doesn't really matter that much where you do this, or when,
> it just needs to happen every now and then. More importantly, add a GC.start
> to the end of environment.rb, and you will have literally half the number of
> objects in ObjectSpace.

This makes sense to me.
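For anyone who wants to see the effect on their own app, something
like the following would report the object count on either side of an
explicit GC.start. It is a sketch, not a benchmark: simulate_boot is a
hypothetical stand-in for whatever loading environment.rb actually does,
and the numbers will vary by Ruby version and app.

```ruby
# Sketch only. simulate_boot and measure_gc_effect are hypothetical
# names used for illustration.

# Number of live, non-immediate objects Ruby currently tracks.
# ObjectSpace.each_object returns the count of objects it visited.
def live_object_count
  ObjectSpace.each_object { }
end

# Stand-in for start-up work that leaves temporary objects behind.
def simulate_boot
  20_000.times { |i| "temporary string #{i}" }
  nil
end

# Runs the stand-in boot, then counts objects before and after GC.start.
def measure_gc_effect
  simulate_boot
  before = live_object_count
  GC.start
  after = live_object_count
  { :before => before, :after => after }
end

report = measure_gc_effect
puts "objects before GC.start: #{report[:before]}"
puts "objects after GC.start:  #{report[:after]}"
```

In a real app the same two counts could be taken around a GC.start at
the end of environment.rb, in place of the simulate_boot call.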
I could also see providing a second Rails handler that had some GC
management stuff in it, along with some documentation on what it
actually does or does not do, so people can make an explicit choice to
use it if they need it. I'm still completely against throwing code into
Mongrel itself for this sort of thing. I just prefer not to throw more
things into Mongrel than we really _need_ to, when there is no strong
argument for them being inside of Mongrel itself.

GC.start stuff is simple enough to put into one's own code at
appropriate locations, or to put into a customized Mongrel handler if
one needs it. Maybe this simply needs to be documented in the body of
Mongrel documentation?


Kirk Haines