[rfc/revised] Do Not Run Everything on One mod_perl Server

Stas Bekman Sat, 22 Apr 2000 11:26:49 -0700

Here is a complete rewrite which incorporates your responses to my initial
post. Joshua, notice that this setup with smaller memory requirements and
lots of shared memory gives even better results when it comes to the goal of
having more servers using the same memory. In my hypothetical example below
I've more than doubled the number of servers, reaching 90!

I hope I didn't make any math mistakes, otherwise Ken will not accept me
into his class (Hi, Ken :) A few more rewrites and I will use integrals to
calculate mod_perl configuration ;)

Enjoy!

=head1 Do Not Run Everything on One mod_perl Server

Let's assume that you have two different sets of scripts/code which
have little or nothing in common (different modules, no code
sharing). Typical numbers can be four megabytes of unshared and four
megabytes of shared memory for each code set, plus three megabytes of
shared basic mod_perl stuff. This makes each process 19MB in size
when the two code sets are loaded: 3MB (server) + 4MB (shared, 1st
code set) + 4MB (unshared, 1st code set) + 4MB (shared, 2nd code set)
+ 4MB (unshared, 2nd code set). Of these, eleven megabytes are shared
and eight megabytes are not.

We assume that four megabytes is the size of each code set's unshared
memory. This is a pretty typical size for unshared memory, especially
when connecting to databases, as database connections cannot be
shared, and databases like Oracle in particular take lots of RAM per
connection.

Let's assume that we have 260 megabytes of RAM dedicated to the
webserver.

According to the equation developed in the section "L<Choosing
MaxClients|performance/Choosing_MaxClients>":

                    Total_RAM - Max_Process_Size
  MaxClients = ---------------------------------------
               Max_Process_Size - Shared_RAM_per_Child


  MaxClients = (260 - 19)/(19-11) = 30

We see that we can run 30 processes, using the given memory and the
two code sets in the same server.

Now consider this practical decision. Since we have recognized that
the code sets are very distinct in nature and there is no significant
memory sharing between them, the wise thing to do is to split the two
code sets between two mod_perl servers (a single mod_perl server is
actually a set of the parent process and a number of child
processes). So instead of running everything on one server, we now
move the second code set onto another mod_perl server. At this point
we are still talking about a single machine.

Let's look at the figures again. After the split we will have 15
servers of eleven megabytes (4MB unshared + 7MB shared) and another 15
servers of eleven megabytes.

How much memory do we need now? From the above equation we derive:

  Total_RAM = MaxClients * (Max_Process_Size - Shared_RAM_per_Child)
              + Max_Process_Size

And using the numbers:

  Total_RAM = 2 * (15 * (11-7) + 11) = 142

A total of 142 megabytes of memory is required. But, hey, we have
260MB of memory. We've got 118MB of memory freed up. If we recalculate
C<MaxClients> we will see that we can run almost 60 servers:

  MaxClients = (260 - 11*2)/(11-7) = 59.5

So we can run about 30 more servers using the same amount of memory:
30 servers for each code set. We have doubled the server pool without
changing the machine's hardware.
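
As a cross-check on the arithmetic in this section, here is a minimal
Python sketch of the two formulas. The function and variable names are
only illustrative, and the inputs are the per-component sizes assumed
above (3MB of shared base server, plus 4MB shared and 4MB unshared for
each code set, with 260MB of RAM). It assumes the split keeps the same
number of processes at first, half on each server.

```python
import math

def max_clients(total_ram, max_process_size, shared_per_child):
    """MaxClients formula from this section (may return a fraction)."""
    return (total_ram - max_process_size) / (max_process_size - shared_per_child)

def total_ram_needed(clients, max_process_size, shared_per_child):
    """The same formula solved for Total_RAM."""
    return clients * (max_process_size - shared_per_child) + max_process_size

RAM = 260          # MB dedicated to the webserver
BASE_SHARED = 3    # MB, shared basic mod_perl stuff
SET_SHARED = 4     # MB, shared memory of one code set
SET_UNSHARED = 4   # MB, unshared memory of one code set

# One server with both code sets loaded.
combined_size = BASE_SHARED + 2 * (SET_SHARED + SET_UNSHARED)
combined_shared = BASE_SHARED + 2 * SET_SHARED
combined_clients = math.floor(max_clients(RAM, combined_size, combined_shared))

# Two servers, one code set each, same number of processes overall.
split_size = BASE_SHARED + SET_SHARED + SET_UNSHARED
split_shared = BASE_SHARED + SET_SHARED
per_server = combined_clients // 2
ram_after_split = 2 * total_ram_needed(per_server, split_size, split_shared)

# Processes that fit in all the RAM once each of the two servers
# has paid for one full-size process.
split_clients = (RAM - 2 * split_size) / (split_size - split_shared)

print("combined server:", combined_clients, "processes")
print("RAM needed after split:", ram_after_split, "MB")
print("processes after split:", split_clients)
```

Changing the constants at the top is a quick way to try the same
reasoning with your own measured process sizes.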

Moreover, this new setup allows us to fine tune the two code sets,
since in reality the smaller code base might have a higher hit rate,
so we can benefit even more.

Let's assume that, based on usage statistics, we know that the first
code set is called in 70% of requests and the second in the other
30%. Now we assume that the first code set requires only 5MB of RAM
(3MB shared plus 2MB unshared) over the basic mod_perl server size,
and the second set needs 11MB (7MB shared and 4MB unshared).

Let's compare this new requirement with our original 50:50 setup.

So now the first mod_perl server, running the first code set, will
have all its processes at 8MB (3MB (server shared) + 3MB (code shared)
+ 2MB (code unshared)), and the second at 14MB (3+7+4). Given that we
have a 70:30 hit ratio and 260MB of available memory, we have to solve
these two equations:

  X/Y = 7/3

  X*(8-6) + 8 + Y*(14-10) + 14 = 260

where X is the total number of processes the first code set can use
and Y the number for the second. The first equation reflects the 70:30
hit ratio, and the second uses the equation for the total memory
requirements for the given number of servers and the shared and
unshared memory sizes.

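When we solve these equations (substituting X = 7Y/3 into the second
one gives 26Y/3 = 238), we get Y of about 27.5 and X of about 64.
Rounding down to whole processes that keep the 7:3 ratio, X equals 63
and Y equals 27. So we have a total of 90 servers, well over twice as
many as in the original single-server setup, using the same amount of
memory.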
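
The two equations can also be solved mechanically. The following
Python sketch (the function name and its parameters are hypothetical,
chosen only for this illustration) substitutes X = ratio * Y into the
memory equation, solves for Y exactly, and then rounds down to whole
processes that keep the ratio intact.

```python
import math
from fractions import Fraction

def split_processes(total_ram, size_x, shared_x, size_y, shared_y, ratio):
    """Solve  X/Y = ratio  and
       X*(size_x - shared_x) + size_x + Y*(size_y - shared_y) + size_y = total_ram
    exactly, returning (X, Y) as fractions."""
    unshared_x = size_x - shared_x
    unshared_y = size_y - shared_y
    # RAM left once each server has paid for one full-size process.
    budget = total_ram - size_x - size_y
    y = Fraction(budget) / (ratio * unshared_x + unshared_y)
    x = ratio * y
    return x, y

ratio = Fraction(7, 3)  # 70:30 hit ratio
x, y = split_processes(260, 8, 6, 14, 10, ratio)

# Round down to whole processes in the exact 7:3 proportion:
# X = 7k and Y = 3k for the largest k that still fits.
k = math.floor(y / ratio.denominator)
X = ratio.numerator * k
Y = ratio.denominator * k

ram_used = X * (8 - 6) + 8 + Y * (14 - 10) + 14

print("exact solution: X = %.2f, Y = %.2f" % (float(x), float(y)))
print("whole processes: X = %d, Y = %d, total = %d" % (X, Y, X + Y))
print("RAM used: %d MB" % ram_used)
```

Rounding down rather than to the nearest integer keeps the total
within the available memory.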

The hit-rate-optimized solution, together with the fact that the code
sets can differ in their memory requirements, allowed us to run 30
more servers in total and gave us 33 more servers (63 versus 30) for
the most wanted code base, relative to the simple 50:50 split in the
first example.

Of course, if you can identify more than two distinct sets of code,
your hit rate statistics may require more complicated decisions. You
may want to make even more splits and run three or more mod_perl
servers.

Remember that having too many running processes doesn't necessarily
mean better performance, because all of them will fight over CPU time
slices. The more processes are running, the less CPU time each gets
and the slower the overall performance will be. Therefore, after
hitting a certain load you might want to start spreading servers over
different machines.

In addition to the obvious memory savings, you gain the power to
troubleshoot problems much more easily when you have different
components running on different servers. It's quite possible that a
small change in the server configuration, made to fix or improve
something in one code set, might completely break the second code
set. For example, upgrading the first code set may require an update
of some modules that both code bases rely on, and there is a chance
that the second code set won't work with the new version of a module
it was relying on.



______________________________________________________________________
Stas Bekman             | JAm_pH    --    Just Another mod_perl Hacker
http://stason.org/      | mod_perl Guide  http://perl.apache.org/guide 
mailto:[EMAIL PROTECTED]  | http://perl.org    http://stason.org/TULARC/
http://singlesheaven.com| http://perlmonth.com http://sourcegarden.org
----------------------------------------------------------------------