I've put your suggestion on the todo list. It certainly wouldn't hurt to
have that feature, though I think memory sharing becomes a much smaller
issue once you switch to MRU scheduling.
At the moment I think SpeedyCGI has more pressing needs though - for
example multiple scripts in a single interpreter, and an NT port.
> I think you could actually make speedycgi even better for shared memory
> usage by creating a special directive which would indicate to speedycgi to
> preload a series of modules, and then tell speedycgi to fork
> that "master" preloaded backend process and hand control over to
> the forked process whenever you need to launch a new process.
>
> Then speedy would potentially have the best of both worlds.
>
> Sorry I cross posted your thing. But I do think it is a problem of mod_perl
> also, and I am happily using speedycgi in production on at least one
> commercial site where mod_perl could not be installed so easily because of
> infrastructure issues.
>
> I believe your mechanism of round robining among MRU perl interpreters is
> actually also accomplished by ActiveState's PerlEx (based on
> Apache::Registry but using multithreaded IIS and pool of Interpreters). A
> method similar to this will be used in Apache 2.0 when Apache is
> multithreaded and therefore can control within program logic which Perl
> interpreter gets called from a pool of Perl interpreters.
>
> It just isn't so feasible right now in Apache 1.0 to do this. And sometimes
> people forget that mod_perl came about primarily for writing handlers in
> Perl, not as an application environment, although it is very good for the
> latter as well.
>
> I think SpeedyCGI needs more advocacy from the mod_perl group because, put
> simply, speedycgi is way easier to set up and use than mod_perl and will
> likely get more PHP people using Perl again. If more people rely on Perl
> for their fast websites, then you will get more people looking for more
> power, and by extension more people using mod_perl.
>
> Whoops... here we go with the advocacy thing again.
>
> Later,
> Gunther
>
> At 02:50 AM 12/21/2000 -0800, Sam Horrocks wrote:
> > > Gunther Birznieks wrote:
> > > > Sam just posted this to the speedycgi list just now.
> > > [...]
> > > > >The underlying problem in mod_perl is that apache likes to spread out
> > > > >web requests to as many httpd's, and therefore as many mod_perl
> > > > >interpreters, as possible, using an LRU selection process for picking
> > > > >httpd's.
> > >
> > > Hmmm... this doesn't sound right. I've never looked at the code in
> > > Apache that does this selection, but I was under the impression that the
> > > choice of which process would handle each request was an OS dependent
> > > thing, based on some sort of mutex.
> > >
> > > Take a look at this: http://httpd.apache.org/docs/misc/perf-tuning.html
> > >
> > > Doesn't that appear to be saying that whichever process gets into the
> > > mutex first will get the new request?
> >
> > I would agree that whichever process gets into the mutex first will get
> > the new request. That's exactly the problem I'm describing. What you
> > are describing here is first-in, first-out behaviour which implies LRU
> > behaviour.
> >
> > Processes 1, 2, 3 are running. 1 finishes and requests the mutex, then
> > 2 finishes and requests the mutex, then 3 finishes and requests the mutex.
> > So when the next three requests come in, they are handled in the same order:
> > 1, then 2, then 3 - this is FIFO or LRU. This is bad for performance.
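> > To make that concrete, here's a toy simulation of the handoff (my own
> > sketch in Python, purely illustrative - not Apache code):

```python
from collections import deque

# Idle httpd's wait for the accept mutex in the order they finished
# (FIFO), so incoming requests rotate round-robin over every process.
idle = deque([1, 2, 3])      # processes queued longest-idle first

served = []
for request in range(6):
    pid = idle.popleft()     # FIFO: the longest-idle process gets the lock
    served.append(pid)
    idle.append(pid)         # after serving, it re-queues at the back

print(served)                # [1, 2, 3, 1, 2, 3] - all processes stay in use
```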
> >
> > > In my experience running
> > > development servers on Linux it always seemed as if the the requests
> > > would continue going to the same process until a request came in when
> > > that process was already busy.
> >
> > No, they don't. They go round-robin (or LRU as I say it).
> >
> > Try this simple test script:
> >
> > use strict;
> > use CGI;
> >
> > my $cgi = CGI->new;
> > print $cgi->header();     # emits the Content-Type header
> > print "mypid=$$\n";       # $$ is the pid of the serving process
> >
> > With mod_perl you constantly get different pids. With mod_speedycgi you
> > usually get the same pid. This is a really good way to see the LRU/MRU
> > difference that I'm talking about.
> >
> > Here's the problem - the mutex in apache is implemented using a lock
> > on a file. It's left up to the kernel to decide which process to give
> > that lock to.
> >
> > Now, if you're writing a unix kernel and implementing this file locking
> > code, what implementation would you use? Well, this is a general purpose
> > thing - you have 100 or so processes all trying to acquire this file lock.
> > You could give out the lock randomly or in some ordered fashion. If I were
> > writing the kernel I would give it out in a round-robin fashion (or to the
> > least-recently-used process, as I referred to it before). Why? Because
> > otherwise one of those processes may starve waiting for this lock - it may
> > never get the lock unless you do it in a fair (round-robin) manner.
> >
> > The kernel doesn't know that all these httpd's are exactly the same.
> > The kernel is implementing a general-purpose file-locking scheme and
> > it doesn't know whether one process is more important than another. If
> > it's not fair about giving out the lock a very important process might
> > starve.
> >
> > Take a look at fs/locks.c (I'm looking at linux 2.3.46). In there is the
> > comment:
> >
> > /* Insert waiter into blocker's block list.
> > * We use a circular list so that processes can be easily woken up in
> > * the order they blocked. The documentation doesn't require this but
> > * it seems like the reasonable thing to do.
> > */
> > static void locks_insert_block(struct file_lock *blocker, struct file_lock *waiter)
> >
> > > As I understand it, the implementation of "wake-one" scheduling in the
> > > 2.4 Linux kernel may affect this as well. It may then be possible to
> > > skip the mutex and use unserialized accept for single socket servers,
> > > which will definitely hand process selection over to the kernel.
> >
> > If the kernel implemented the queueing for multiple accepts using a LIFO
> > instead of a FIFO and apache used this method instead of file locks,
> > then that would probably solve it.
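> > To illustrate the difference (a rough sketch of my own, not kernel or
> > Apache code): under light load, where each request finishes before the
> > next one arrives, a FIFO queue cycles through every worker while a LIFO
> > queue keeps reusing the same hot one.

```python
from collections import deque

def distinct_workers(policy, workers=10, requests=100):
    """Count how many workers serve at least one request when each
    request completes before the next one arrives (light load)."""
    idle = deque(range(workers))
    used = set()
    for _ in range(requests):
        # FIFO = longest-idle worker next (LRU, like the file lock);
        # LIFO = most-recently-idle worker next (MRU, like SpeedyCGI).
        pid = idle.popleft() if policy == "fifo" else idle.pop()
        used.add(pid)
        idle.append(pid)     # worker finishes and goes back on the queue
    return len(used)

print(distinct_workers("fifo"))  # 10 - every worker gets cycled through
print(distinct_workers("lifo"))  # 1  - one hot worker handles everything
```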
> >
> > Just found this on the net on this subject:
> > http://www.uwsg.iu.edu/hypermail/linux/kernel/9704.0/0455.html
> > http://www.uwsg.iu.edu/hypermail/linux/kernel/9704.0/0453.html
> >
> > > > >The problem is that at a high concurrency level, mod_perl is using lots
> > > > >and lots of different perl-interpreters to handle the requests, each
> > > > >with its own un-shared memory. It's doing this due to its LRU design.
> > > > >But with SpeedyCGI's MRU design, only a few speedy_backends are being
> > > > >used, because as much as possible it tries to use the same interpreter
> > > > >over and over and not spread out the requests to lots of different
> > > > >interpreters. Mod_perl is using lots of perl-interpreters, while
> > > > >speedycgi is only using a few. mod_perl is requiring that lots of
> > > > >interpreters be in memory in order to handle the requests, whereas
> > > > >speedy only requires a small number of interpreters to be in memory.
> > >
> > > This test - building up unshared memory in each process - is somewhat
> > > suspect since in most setups I've seen, there is a very significant
> > > amount of memory being shared between mod_perl processes.
> >
> > My message and testing concern un-shared memory only. If all of your
> > memory is shared, then there shouldn't be a problem.
> >
> > But a point I'm making is that with mod_perl you have to go to great
> > lengths to write your code so as to avoid unshared memory. My claim is that
> > with mod_speedycgi you don't have to concern yourself as much with this.
> > You can concentrate more on the application and less on performance tuning.
> >
> > > Regardless,
> > > the explanation here doesn't make sense to me. If we assume that each
> > > approach is equally fast (as Sam seems to say earlier in his message)
> > > then it should take an equal number of speedycgi and mod_perl processes
> > > to handle the same concurrency.
> >
> > I don't assume that each approach is equally fast under all loads. They
> > were about the same at concurrency level 1, but at higher concurrency
> > levels they weren't.
> >
> > I am saying that since SpeedyCGI uses MRU to allocate requests to perl
> > interpreters, it winds up using a lot fewer interpreters to handle the
> > same number of requests.
> >
> > On a single-CPU system, of course, at some point all the concurrency has
> > to be serialized. mod_speedycgi and mod_perl take different approaches
> > before getting to that point. mod_speedycgi tries to use as small a
> > number of unix processes as possible, while mod_perl tries to use a
> > very large number of unix processes.
> >
> > > That leads me to believe that what's really happening here is that
> > > Apache is pre-forking a bit over-zealously in response to a sudden surge
> > > of traffic from ab, and thus has extra unused processes sitting around
> > > waiting, while speedycgi is avoiding this situation by waiting for
> > > someone to try and use the processes before forking them (i.e. no
> > > pre-forking). The speedycgi way causes a brief delay while new
> > > processes fork, but doesn't waste memory. Does this sound like a
> > > plausible explanation to folks?
> >
> > I don't think it's pre-forking. When I ran my tests I would always run
> > them twice, and take the results from the second run. The first run
> > was just to "prime the pump".
> >
> > I tried reducing MinSpareServers, and this did help mod_perl get a higher
> > concurrency number, but it would still run into a wall where speedycgi
> > would not.
> >
> > > This is probably all a moot point on a server with a properly set
> > > MaxClients and Apache::SizeLimit that will not go into swap.
> >
> > Please let me know what you think I should change. So far my
> > benchmarks only show one trend, but if you can tell me specifically
> > what I'm doing wrong (and it's something reasonable), I'll try it.
> >
> > I don't think SizeLimit is the answer - my process isn't growing. It's
> > using the same 50k of un-shared memory over and over.
> >
> > I believe that with speedycgi you don't have to lower the MaxClients
> > setting, because it's able to handle a larger number of clients, at
> > least in this test. In other words, if with mod_perl you had to turn
> > away requests, but with mod_speedycgi you did not, that would just
> > prove that speedycgi is more scalable.
> >
> > Now you could tell me "don't use unshared memory", but that's outside
> > the bounds of the test. The whole test concerns unshared memory.
> >
> > > I would
> > > expect mod_perl to have the advantage when all processes are
> > > fully-utilized because of the shared memory.
> >
> > Maybe. There must be a benchmark somewhere that would show off
> > mod_perl's advantages in shared memory. Maybe a 100,000 line perl
> > program or something like that - it would have to be something where
> > mod_perl is using *lots* of shared memory, because keep in mind that
> > there are still going to be a whole lot fewer SpeedyCGI processes than
> > there are mod_perl processes, so you would really have to go overboard
> > in the shared-memory department.
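> > As a back-of-the-envelope illustration of how far overboard that is
> > (all numbers below are invented for the sketch, not benchmark results):

```python
# Shared pages cost memory once; un-shared pages cost once per process.
def total_memory_kb(procs, shared_kb, unshared_kb):
    return shared_kb + procs * unshared_kb

# Hypothetical figures: mod_perl shares 5 MB of compiled code across 100
# processes; speedy runs 10 backends that share nothing between them.
mod_perl = total_memory_kb(procs=100, shared_kb=5000, unshared_kb=500)
speedy   = total_memory_kb(procs=10,  shared_kb=0,    unshared_kb=5500)

print(mod_perl)  # 55000 KB
print(speedy)    # 55000 KB - even 5 MB of sharing only breaks even here
```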
> >
> > > It would be cool if speedycgi could somehow use a parent process
> > > model and get the shared memory benefits too.
> >
> > > Speedy seems like it
> > > might be more attractive to ISPs, and it would be nice to increase
> > > interoperability between the two projects.
> >
> > Thanks. And please, I'm not trying to start a speedy vs mod_perl war.
> > My original message was only to the speedycgi list, but now that it's
> > on mod_perl I think I have to reply there too.
> >
> > But, there is a need for a little good PR on speedycgi's side, and I
> > was looking for that. I would rather just see mod_perl fixed if that's
> > possible. But the last time I brought up this issue (maybe a year ago)
> > I was unable to convince the people on the mod_perl list that this
> > problem even existed.
> >
> > Sam
>
> __________________________________________________
> Gunther Birznieks ([EMAIL PROTECTED])
> eXtropia - The Web Technology Company
> http://www.extropia.com/