On Thu, Jul 2, 2009 at 4:08 AM, Carl Johnstone <[email protected]> wrote:
> I think that the mod_perl mailing list would also be interested in this -
> there are very few people on that list with practical examples of
> multi-thread. As far as I'm aware pre-fork is still pretty much the only
> model recommended.
>

Hmmm. Well, I think that many people are still stuck on the paradigms of
mod_perl prior to 2.0 and Apache 2.0.x. I didn't try this by mere chance:
after some research we found two interesting references that led me in this
direction:

1) "Scaling Apache 2.x beyond 20,000 concurrent downloads" by Colm
MacCárthaigh <[email protected]>, 21st July 2005. Look at 3.1 "Choosing an
MPM". The results in 2005 were promising, and I figured that by 2008 many of
the issues it mentions had been overcome. This paper also helped me in
fine-tuning the Linux VM and led me to try FreeBSD.

2) "Practical mod_perl", O'Reilly, May 2003 edition. Look at 24.3 "What's new
in mod_perl 2.0", specifically 24.3.1 "Thread Support". When I read this
section, I decided to give this a try, as it was perfect for our blocking
processes problem (i.e. having many light-weight threads blocking for
several seconds each).

> Alejandro Imass wrote:
>> Ok. What would you have done? - not meant as a defensive question but
>> really, we would like to hear options for this application!
>
> I would've probably pushed for a change in the architecture, so that the
> browser makes a request then polls for results. Don't under-estimate the
> ability of users to hammer the F5 button because the page has taken 2
> seconds longer to come back than they expected!
>

There was no change of architecture possible. The other service takes 1 to 7
seconds to respond. We are a B2B server: there is no final user here. We are
just an intermediate server that needs to hold the load to the slow service.

> However I do find your choice of solution interesting, as you've essentially
> managed to get a fairly out-of-the-box solution working. There are a bunch
> of things that could be done to process this type of workload quicker, but
> with the disadvantage that you've got a bigger custom code-base to maintain.
>

Not at all the case.
We chose Catalyst for its design pattern and multiple deployment options,
plus the fact that we do a lot of work in Catalyst and in Perl in general.
Our own performance was not an issue. The issue was the blocking calls to
the other service.

> I'm curious about the memory differences between pre-fork and threaded in
> mod_perl from your testing. General mod_perl advice is to pre-load as much
> perl code and data as possible and take advantage of the copy-on-write
> aspects of VM. Did you push this? How much difference was there between the
> models?

Oh, of course. We did some profiling to find and reduce memory problems,
using valgrind and the like, and we stripped our Catalyst app to the bare
essentials. For example, our API has HTML form-based as well as XML
capabilities, and FormBuilder was used for form validations, etc.; this
proved particularly costly, so FormBuilder had to be removed, as well as
several other plugins for the same reason.

For your reference, the app was about 30MB in the stand-alone Catalyst
server. Curiously, it was not very different in 64 bits, and in pre-fork the
average size of each Apache process was not very different either. I don't
have the exact numbers with me, but they were around 40MB RSS each. With 64
threads the base Apache process grew to 115MB (170MB VSZ) and each actual
serving process flattened out at 200MB RSS (750 VSZ) after several thousand
hits. So if you have enough CPU power, the RAM benefits are great. What I
have seen happen many times, including in our case, is that the CPU was
either under-utilized or kept busy waiting for swap.

Just FYI: a similar prototype in POE was about 20MB, so no major savings
there! That's when we looked at AnyEvent with EV (from Perl). This last
prototype did prove a viable option, but the TCO was worrying and we had
already invested in and completed the Catalyst app. If I were to re-write
from scratch I would probably revisit the AnyEvent/EV path.
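
To put the memory figures quoted above side by side, here is a rough
back-of-the-envelope sketch in Python. It is not part of the original
measurements. It uses only the approximate numbers from this message (about
40MB RSS per pre-fork process, about 200MB RSS per threaded serving process
with 64 threads) and assumes one blocked request per process or thread. The
figure of 1024 concurrent requests is hypothetical. RSS counts pages shared
through copy-on-write once per process, so the pre-fork total is an upper
bound, and the 115MB base process of the threaded setup is left out.

```python
import math

# Approximate figures taken from the message above.
PREFORK_RSS_MB = 40        # RSS of one pre-fork Apache process
THREADED_RSS_MB = 200      # RSS of one threaded serving process
THREADS_PER_PROCESS = 64   # threads in each serving process


def prefork_total_mb(concurrent_requests: int) -> int:
    """Upper bound: one pre-fork process per blocked request."""
    return concurrent_requests * PREFORK_RSS_MB


def threaded_total_mb(concurrent_requests: int) -> int:
    """One thread per blocked request, 64 threads per serving process."""
    processes = math.ceil(concurrent_requests / THREADS_PER_PROCESS)
    return processes * THREADED_RSS_MB


if __name__ == "__main__":
    n = 1024  # hypothetical number of simultaneously blocked requests
    print(f"pre-fork: {prefork_total_mb(n)} MB (upper bound)")
    print(f"threaded: {threaded_total_mb(n)} MB")
```

Under these assumptions the threaded model needs roughly 3MB per blocked
request against up to 40MB for pre-fork, which is the saving described above.
How much of the pre-fork figure is really shared between processes would have
to be measured on the actual system.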

For an app that has to block thousands of incoming HTTP requests for several
seconds there are really very few options, and at some point you have to
dedicate a process or thread (an LWP on Linux and the like) to each one. When
we realized this, we started looking for references and just decided to try
threaded mod_perl. The results, as you can see, were really amazing.

Furthermore, the ability to easily develop the app logic with the stand-alone
server (the MVC design pattern, an ORM like CDBI, etc.) and to deal with the
deployment problems with little or no change in the code was just
phenomenal, proving that Catalyst is a very good choice (if not the best) for
developing any high-end application.

IMHO a lot of the thread paranoia was justified before Perl 5.8, mod_perl 2.0
and Apache 2.0.x. Can't say for sure, as I am no expert by any means, but for
us it was just compile, configure and rock-and-roll.

Best,
Alejandro

>
> Carl
>
>
> _______________________________________________
> List: [email protected]
> Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
> Searchable archive: http://www.mail-archive.com/[email protected]/
> Dev site: http://dev.catalyst.perl.org/
>
