I think I may be a bit dense on this list so forgive me if I try to clarify 
(at least for myself to make sure I have this right)...

I think what you are proposing is not that much different from the proxy 
front-end model. mod_proxy adds overhead, but it solves your memory 
problem: you can have 50 Apache processes on the front end dealing 
with images and the like, and then only 2 or 5 or however many 
Apache/Perl processes on the back end.
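A minimal sketch of that front-end/back-end split might look like the 
following (the hostname, port, and /perl/ path are placeholders, not a 
recommendation):

```apache
# Front-end httpd.conf (lightweight, no mod_perl): serve static files
# directly, and proxy only the dynamic URLs to the mod_perl back-end.
ProxyPass        /perl/ http://localhost:8081/perl/
ProxyPassReverse /perl/ http://localhost:8081/perl/
```

The back-end server then listens on port 8081 with mod_perl loaded, while 
the front end stays small.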

The only inefficiency with this is that the front-end HTTPD daemon talks 
to Perl over HTTP instead of over a direct socket using a 
binary/compressed data protocol.

By the way, if you really prefer this out-of-process model with a pool of 
Perl interpreters, you could always consider purchasing Binary 
Evolution's Velocigen product for Netscape on UNIX. I believe they have a 
mode that allows the Perl engine to run out of process, with a lightweight 
NSAPI wrapper talking to Perl.

It turns out that this is probably the best way to deal with a buggy 
product like Netscape anyway... NSAPI is such a flaky beast that it's no 
wonder a company would want to separate the application processes out 
(but now I am getting off topic).

It's likely that this is a faster solution than the mod_proxy approach 
commonly used with mod_perl, because mod_proxy and HTTP are both relatively 
complex and designed to do more than provide back-end application server 
communications.

Here's the relevant Velocigen URL:

http://www.binaryevolution.com/velocigen/arch.vet

However, I would caution that mod_perl speeds things up so much as it is 
that this architectural improvement over front-end/back-end Apache servers 
probably won't make that big a difference unless you are writing something 
that will be under really heavy stress. And, of course, you should do your 
own benchmarking to see whether this is the case.

While you are at it, you might consider PerlEx from ActiveState, which 
provides in-process, thread-pooled Perl engines that run in the IIS memory 
space.

But again, I would stress that speed isn't the only thing. Think about 
reliability. I think the mod_perl model (in the front-end/back-end 
scenario) tends to be more reliable, because the Apache servers can be 
monitored and killed off independently when they spin out of control, and 
they can't pollute each other's memory space.  Using some mod_rewrite 
rules, you can also very easily control which applications are partitioned 
from each other onto which back-end servers.
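For example, partitioning by URL on the front end might look like this 
(the ports and /app1, /app2 paths are made-up placeholders):

```apache
# Front-end sketch: route each application to its own back-end server,
# so a runaway app can only take down its own pool of processes.
RewriteEngine On
RewriteRule ^/app1/(.*)$ http://localhost:8081/app1/$1 [P]
RewriteRule ^/app2/(.*)$ http://localhost:8082/app2/$1 [P]
```

The [P] flag hands the rewritten request to mod_proxy, so mod_proxy must 
be compiled in for this to work.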

I don't know how easily you can specify what I would term application 
affinities in the Velocigen or PerlEx model based on URL alone.

Anyway, good luck with your search for information...

Thanks,
     Gunther

At 10:46 PM 4/15/00 +0000, [EMAIL PROTECTED] wrote:
>Perrin-
>On Sat, Apr 15, 2000 at 11:33:15AM -0700, Perrin Harkins wrote:
> > > Each process of apache has
> > > its registry, which holds the compiled perl scripts in..., a copy of
> > > each for each process.  This has become an issue for one of the
> > > companies that I work for, and I noted from monitoring the list that
> > > some people have apache processes that are upwards of 25Megs, which is
> > > frankly ridiculous.
> >
> > I have processes that large, but more than 50% of that is shared through
> > copy-on-write.
> >
> > > I wrote a very small perl engine
> > > for phhttpd that worked within its threaded paradigm, sucked up a
> > > negligible amount of memory, and used a very basic version of
> > > Apache's registry.
> >
> > Can you explain how this uses less memory than mod_perl doing the same
> > thing?  Was it just that you were using fewer perl interpreters?  If 
> so, you
> > need to improve your use of apache with a multi-server setup.  The only way
> > I could see phhttpd really using less memory to do the same work is if you
> > somehow managed to get perl to share more of its internals in memory.  Did
> > you?
>
>Yep, very handily I might add ;-).  Basically phhttpd is not process-
>based, it's thread-based.  Which means that everything is running
>inside of the same address space.  Which means 100% sharing except for
>the present local stack of variables... which is very minimal.  In
>terms of the perl thing... when you look at your processes and see all
>that non-shared memory, most of that is stack variables.  Now most
>webservers are running on single processor machines, so they get no
>benefit from having 10s or even 100s of copies of these perl stack
>variables.  It's much more efficient to have a single process handle
>all the perl requests.  On a multiprocessor box that single process
>could have multiple threads in order to take advantage of the
>processors.  See..., mod_perl stores the stack state of every script
>it runs in the apache process... for every script... copies of it,
>many many copies of it.  This is not efficient.  What would be
>efficient is to have as many threads/processes as you have processors
>for the mod_perl engine.  In other words, separate the engine from the
>apache process so that there are never unnecessary stack variables
>being tracked.
>
>Hmm... can I explain this better?  Let me try.  Okay, for every apache
>process there is an entire perl engine with all the stack variables
>for every script you run recorded there.  What I'm proposing is a
>system whereby there would be a separate process that would have only
>a perl engine in it... you would make as many of these processes as
>you have processors.  (Or multithread them... it doesn't really
>matter)  Now your apache processes would not have a bunch of junk
>memory in them.  Your apache processes would be the size of a stock
>apache process, like 4-6M or so, and you would have 1 process that
>would be 25MB or so that would have all your registry in it.  For a
>high capacity box this would be an incredible boon to increasing
>capacity.  (I'm trying to explain clearly, but I'd be the first to
>admit this isn't one of my strong points)
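The split being proposed here can be sketched in miniature. This is Python 
purely for illustration (not the C/Perl machinery an actual engine would 
use), and the "script cache" and one-line socket protocol are invented for 
the example: a small front-end process relays requests to a single 
persistent engine process, which holds the heavy interpreter state exactly 
once.

```python
import os
import socket

# Sketch of the proposed split: a lightweight front-end hands requests
# to one persistent "engine" process over a Unix socket pair, so only
# the engine carries the (hypothetically large) compiled-script state.
parent_sock, child_sock = socket.socketpair()

pid = os.fork()
if pid == 0:
    # Engine process: holds the compiled-script cache once,
    # instead of once per front-end process.
    parent_sock.close()
    script_cache = {"hello.pl": lambda: "Hello from the engine"}
    while True:
        name = child_sock.recv(1024).decode()
        if not name:
            break  # front end closed its socket; shut down
        handler = script_cache.get(name)
        reply = handler() if handler else "404"
        child_sock.sendall(reply.encode())
    os._exit(0)

# Front-end process: stays small, just relays requests and responses.
child_sock.close()
parent_sock.sendall(b"hello.pl")
response = parent_sock.recv(1024).decode()
parent_sock.close()
os.wait()
print(response)
```

With one engine per processor (or per thread), the front-end processes 
stay at stock-Apache size no matter how many scripts are registered.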
>
>As to how the multithreaded phhttpd can handle tons of load, well...
>that's a separate issue and frankly a question much better handled by
>Zach.  I understand it very well, but I don't feel that I could
>adequately explain it.  It's based on real-time sigqueue software
>technology... for a "decent" reference on this you can take a look at
>a book by O'Reilly called "POSIX.4: Programming for the Real World".  I
>should say that this book doesn't go into enough depth... but it's the
>only book that goes into any depth that I could find.
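The queued real-time signals referred to here can be demonstrated with a 
small Linux-only sketch, in Python rather than the C interfaces phhttpd 
itself uses; the choice of SIGRTMIN and the count of three are arbitrary 
for the example:

```python
import os
import signal

# Block a real-time signal so deliveries queue instead of running a
# handler (or the default action) immediately.
sig = signal.SIGRTMIN
signal.pthread_sigmask(signal.SIG_BLOCK, {sig})

# Send several signals to ourselves; unlike classic UNIX signals,
# real-time signals queue rather than coalesce into one.
for _ in range(3):
    os.kill(os.getpid(), sig)

# Drain the queue synchronously, the way an event loop would.
received = 0
while received < 3:
    info = signal.sigwaitinfo({sig})
    assert info.si_signo == sig
    received += 1
print(received)
```

All three deliveries are seen, which is what makes real-time signals 
usable as a kernel-driven event queue.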
>
> >
> > > What I'm
> > > thinking is essentially we take the perl engine, which has the apache
> > > registry and all the perl symbols etc., and separate it into its own
> > > process, which could be multithreaded (via pthreads) for multiple
> > > processor boxes.  (Above 2 this would be beneficial, probably.)  On the
> > > front side the apache module API would just connect into this other
> > > process via shared memory pages (shmget et al.), or Unix pipes, or
> > > something like that.
> >
> > This is how FastCGI, and all the Java servlet runners (JServ, Resin, etc.)
> > work.  The thing is, even if you run the perl interpreters in a
> > multi-threaded process, it still needs one interpreter per perl thread 
> and I
> > don't know how much you'd be able to share between them.  It might not be
> > any smaller at all.
>
>But there is no need to have more than one perl thread per processor.
>Right now we have a perl "thread" (er.. engine is a better term) per
>process.  Since most boxes start up 10 processes or so of Apache we'd
>be talking about a memory savings something like this:
>6MB stock apache process
>25MB (we'll say that's average) mod_perl apache process, 50% shared,
>leaving 12.5MB non-shared
>The way it works now: 12.5MB * 10 = 125MB, plus 12.5MB (the shared bit,
>one instance) = 137.5MB total.
>Suggested way:
>6MB stock with about 3MB shared or so: 3MB * 10 = 30MB, plus 3MB shared,
>plus the 25MB mod_perl process = 58MB total.
>
>That would be an overall difference of 137.5 - 58... nearly 80MB of
>memory.  I have no idea how accurate this is, but I'd put my money on
>not too far from the expected result in a high-load environment with lots
>of apache scripts.
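The arithmetic above can be sanity-checked in a few lines. The inputs are 
the poster's rough estimates (10 processes, 6MB stock, 25MB mod_perl, 50% 
shared), not measurements:

```python
# Rough memory comparison using the poster's estimates, not measurements.
procs = 10            # Apache processes
stock = 6.0           # MB, stock Apache process
modperl = 25.0        # MB, mod_perl Apache process
shared_frac = 0.5     # fraction of the mod_perl process that is shared

# Current model: every Apache process carries its own unshared half,
# plus one copy of the shared portion.
unshared = modperl * (1 - shared_frac)              # 12.5 MB each
current = unshared * procs + modperl * shared_frac  # 137.5 MB

# Proposed model: small stock Apache processes (about 3MB of each
# shared) plus one separate 25MB engine process holding the registry.
stock_shared = 3.0
proposed = (stock - stock_shared) * procs + stock_shared + modperl  # 58 MB

print(current, proposed, current - proposed)
```

Under these assumptions the saving is about 80MB, which is the order of 
magnitude the poster is arguing for.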
>
> >
> > My suggestion would be to look at the two-server approach for mod_perl, and
> > if that doesn't work for you look at FastCGI, and if that doesn't work for
> > you join the effort to get mod_perl working on Apache 2.0 with a
> > multi-threaded model.  Or just skip the preliminaries and go straight for
> > the hack value...
>
>Well... the second option certainly has a lot of merit.  Maybe I
>should get involved in that... actually that has a lot of appeal to
>me.  Hmm... I guess it's time to pick up the apache 2.0 stuff and do
>some tinkering! :)  As far as the present problem... I'm not all that
>concerned about it.  It actually falls outside the area of my
>responsibilities at our site..., I'm thinking of the other people in
>the community mostly.
>
>Thanks!
>Shane
> >
> > - Perrin
> >

__________________________________________________
Gunther Birznieks ([EMAIL PROTECTED])
Extropia - The Web Technology Company
http://www.extropia.com/
