Re: ApacheCon report

2000-11-01 Thread Leslie Mikesell

According to Michael Blakeley:

> >  > I'm not following.  Everyone agrees that we don't want to have big
> >  > mod_perl processes waiting on slow clients.  The question is whether
> >  > tuning your socket buffer can provide the same benefits as a proxy server
> >  > and the conclusion so far is that it can't because of the lingering close
> >  > problem.  Are you saying something different?
> >
> >  A tcp close is supposed to require an acknowledgement from the
> >  other end or a fairly long timeout.  I don't see how a socket buffer
> >  alone can change this.  Likewise for any of the load balancer
> >  front ends that work on the tcp connection level (but I'd like to
> >  be proven wrong about this).
> 
> Solaris lets a user-level application close() a socket immediately 
> and go on to do other work. The sockets layer (the TCP/IP stack) will 
> continue to keep that socket open while it delivers any buffered 
> sends - but the user application doesn't need to know this (and 
> naturally won't be able to read any incoming data if it arrives). 
> When the tcp send buffer is empty, the socket will truly close, with 
> all the usual FIN et. al. dialogue.
> 
> Anyway, since the socket is closed from the mod_perl point of view, 
> the heavyweight mod_perl process is no longer tied up. I don't know 
> if this holds true for Linux as well, but if it doesn't, there's 
> always the source code.

I still like the idea of having mod_rewrite in a lightweight
front end, and if the request turns out to be static at that
point there is no need to involve a proxy at all.  Has
anyone tried putting software load balancing behind the front
end proxy with something like eddieware, balance, or Ultra
Monkey?  In that scheme the front ends might use IP takeover
failover and/or DNS load balancing and would proxy to what they
think is a single back end server - the request would then hit a
TCP-level balancer instead.

  Les Mikesell
[EMAIL PROTECTED]



Re: Connection Pooling / TP Monitor

2000-10-30 Thread Leslie Mikesell

According to Gunther Birznieks:

> I guess part of the question is what is meant by "balanced" with regard to 
> the non-apache back-end servers that was mentioned?

I'd be very happy with either a weighted round-robin or a least-connections
choice.  When the numbers get to the point where it matters, pure
statistics is good enough for me.   But, I love what you can
do with mod_rewrite and would like an easy way to point
the target of a match at an arbitrary set of back end servers.
Mod_jserv has a nice configuration setting for multiple
back ends where you name the set and weight each member.  If
mod_proxy and/or mod_backhand had a similar concept with the
group name being usable as a target for mod_rewrite and 
ProxyPass it would be easy to use.  I think Matt's idea
of creating a Location handler and rewriting to the location
would work as long as the modules are loaded in the right
order, but it would make the configuration somewhat confusing. 
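
For reference, the mod_jserv style looks roughly like this
(directive details quoted from memory of the JServ 1.1 docs, so
check them before copying; hosts, weights and the zone name are
made up):

    ApJServBalance  set1  backend1  2
    ApJServBalance  set1  backend2  1
    ApJServHost     backend1  ajpv12://10.0.0.1:8007
    ApJServHost     backend2  ajpv12://10.0.0.2:8007
    ApJServMount    /servlets  balance://set1/root

Something equivalent, with the set name usable as the target of a
RewriteRule or ProxyPass, is what I'm after.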

> I am also concerned that the original question brings up the notion of 
> failover. mod_backhand is not a failover solution. Backhand does have some 
> facilities to do some failover (eg ByAge weeding) but it's not failover in 
> the traditional sense. Backhand is for load balance not failover.

Does it do something sensible if one of the targets does not
accept the connection, or does it start sending them all
to that one because it isn't busy?  Mod_jserv claims to
mark that connection dead for a while and moves on to
another backend so you have a small delay, not a failure.
After a configurable timeout it will try the failing one
again.

> While Matt is correct that you could probably write your own load balance 
> function, the main interesting function in mod_backhand is ByLoad which as 
> far as I know is Apache specific and relies on the Apache scoreboard (or a 
> patched version of this)

The problem with writing your own is that it needs to live in
the lightweight front-end server - thus it must all be in C.

> Non apache servers won't have this scoreboard file although perhaps you 
> could program your own server(s) to emulate one if it's not mod_backhand.
> 
> The other requirement that non-apache servers may have for optimal use with 
> mod_backhand is that the load balanced servers may need to report 
> themselves to the main backhand server as one of the important functions is 
> ByAge to weed out downed servers (and servers too heavily loaded to report 
> their latest stats).

If a failed connection would set the status as 'down' and periodic
retries checked again, this would take care of itself.

> Otherwise, if you need to load balance a set of non-apache servers evenly 
> and don't need ByLoad, you could always just use mod_rewrite with the 
> reverse_proxy/load balancing recipe from Ralf's guide. This solution would 
> get you up and running fast. But the main immediate downside (other than no 
> true *load* balancing) is the lack of keep-alive upgrading.

I'll accept randomizing as reasonable balancing as long as I
have fine-grained control of the URLs I send to each destination.
The real problem with the rewrite randomizer is the complete
lack of knowledge about dead backend servers.  I want something
that will transparently deal with machines that fail.
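
For anyone who hasn't seen it, the randomizing recipe is roughly
this (from memory of the guide, untested here; hostnames and paths
are made up):

    # httpd.conf on the front end
    RewriteEngine on
    RewriteMap  servers  rnd:/usr/local/apache/conf/servers.map
    RewriteRule ^/app/(.*)$  http://${servers:dynamic}/app/$1  [P,L]

    # servers.map - one key, alternatives separated by '|',
    # one picked at random per request
    dynamic  back1.example.com:8080|back2.example.com:8080

If back1 goes down, roughly half the requests are still sent to it,
which is exactly the problem described above.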

> I am also not sure if mod_log_spread has hooks to work with mod_backhand in 
> particular which would make mod_rewrite load balancing (poor man's load 
> balancing) less desirable. I suspect mod_log_spread is not 
> backhand-specific although made by the same group but having not played 
> with this module yet, I couldn't say for sure.

If you can run everything through a single front end apache
you can use that as the 'real' log.  There is some point
where this scheme would not handle the load and you would
need one of the connection oriented balancers instead of
a proxy, but a fairly ordinary pentium should be able to
saturate an ethernet or two if it is just fielding static
files and proxying the rest.   You would also need a
fail-over mechanism for the front end box, but this
could be a simple IP takeover and there are some programs
available for that.

  Les Mikesell
   [EMAIL PROTECTED]



Re: [OT] Will a cookie traverse ports in the same domain?

2000-10-19 Thread Leslie Mikesell

According to martin langhoff:

> 
>   this HTTP protocol (definition and actual implementation) question is
> making me mad. Will (and should) a cookie be valid within the same
> host/domain/subdirectory when changing PORT numbers?

I think this depends on the browser (and its version number).
However, if you set up your front end proxy correctly it should be
completely invisible: the client browser never sees a different
hostname or port, so cookies should continue to work.

>   All my cookies have stopped working as soon as I've set my mod_perl
> apache on a high port with a proxying apache in port 80  [ see thread
> "AARRRGH! The Apache Proxy is not transparent wrt cookies!" ]

Be sure that you have set a ProxyPassReverse to match anything
that can be proxied, even if you are using RewriteRules to
do the actual proxying.  This makes the proxy server
fix up any redirects that mention the backend port or location
(if different).  You also need to make sure you aren't mentioning
the local port in your own perl code or generating links that
expose it.  If the port number shows up in your browser location
window, something is wrong.
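
A minimal sketch of the pairing (port and paths are made up):

    # front end httpd.conf: the rewrite does the proxying...
    RewriteEngine on
    RewriteRule ^/perl/(.*)$  http://localhost:8080/perl/$1  [P,L]

    # ...and ProxyPassReverse rewrites Location: headers in any
    # redirect the backend sends, so the client never sees port 8080
    ProxyPassReverse  /perl/  http://localhost:8080/perl/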

  Les Mikesell
[EMAIL PROTECTED]



Re: Wild Proposal :)

2000-10-12 Thread Leslie Mikesell

According to David E. Wheeler:
> Perrin Harkins wrote:
> > 
> > My point was that Apache::DBI already gives you persistent connections,
> > and when people say they want actual pooled connections instead they
> > usually don't have a good reason for it.
> 
> Let's say that I have 20 customers, each of whom has a database schema
> for their data. I have one Apache web server serving all of those
> customers. Say that Apache has forked off 20 children. Each of the
> customers who connects has to use their own authentication to their own
> schema. That means that Apache::DBI is caching 20 different connections
> - one per customer. Not only that, but Apache::DBI is caching 20
> different connections in each of the 20 processes. Suddenly you've got
> 400 connections to your database at once! And only 20 can actually be in
> use at any one time (one for each Apache child).
> 
> Start adding new customers and new database schemas, and you'll soon
> find yourself with more connections than you can handle.

Wouldn't this be handled just as well by running an Apache
per customer and letting each manage its own pool of children
which will only connect to its own database?

> And that's why connection pooling makes sense in some cases.

I think you could make a better case for it in a situation where
the reusability  of the connection isn't known ahead of time,
as would be the case if the end user provided a name/password
for the connection.

  Les Mikesell
 [EMAIL PROTECTED]



Re: AuthDBI - semget failed problem

2000-10-03 Thread Leslie Mikesell

According to Pramod Sokke:

> I'm trying to set up Authentication using Apache::AuthDBI. I'm establishing db
> connections at startup.
> I've set $Apache::DBI::DEBUG = 2.
>  
> When I start the server, I get the following message for every child process: 
> Apache::AuthDBI PerlChildInitHandler semget failed
> 
> And whenever I access the server, I get these messages in my error log:
> 
> Apache::AuthDBI PerlChildInitHandler semget failed
> Apache::AuthDBI PerlChildExitHandler shmread failed 
> Apache::AuthDBI PerlChildExitHandler shmwrite failed
> 
> What does all this mean?

Are you running FreeBSD?  You may need to rebuild the kernel with
the SysV semaphores and shared memory enabled.
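
On FreeBSD that means kernel config options along these lines
(names as in the stock LINT config, if memory serves):

    options SYSVSHM   # SysV shared memory (shmread/shmwrite)
    options SYSVSEM   # SysV semaphores (semget)
    options SYSVMSG   # SysV message queues, usually enabled alongside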

  Les Mikesell
[EMAIL PROTECTED]



Re: Poor man's connection pooling

2000-09-06 Thread Leslie Mikesell

According to Michael Peppler:

> The back-end is Sybase. The actual connect time isn't the issue here
> (for me.) It's the sheer number of connections, and the potential
> issue with the number of sockets in CLOSE_WAIT or TIME_WAIT state on
> the database server. We're looking at a farm of 40 front-end servers,
> each running ~150 modperl procs. If each of the modperl procs opens
> one connection that's 6000 connections on the database side.
> 
> Sybase can handle this, but I'd rather use a lower number, hence the
> pooling.

Are you using the lightweight httpd proxy front end setup and still
have 150 modperl httpd's per server?  If not, I'd try that
approach first.  I usually see about a 10:1 ratio of front
to back end servers which really cuts down on the database
connections (and the static images are served by a different set of
machines so most of this effect comes from the proxy releasing
the back end process quickly).   Also, if you have pages that
do not need the database connection you could set up
mod_proxy or mod_rewrite to send those requests to a different
set of back-end servers.

  Les Mikesell
   [EMAIL PROTECTED]



Re: HTML Template Comparison Sheet ETA

2000-09-04 Thread Leslie Mikesell

According to Steve Manes:
> At 11:26 AM 9/4/00 -0300, Nelson Correa de Toledo Ferraz wrote:
> >I agree that one shouldn't put lots of code inside of a template, but
> >variables and loops are better expressed in Perl than in a "little
> >crippled language".
> 
> Your example makes perfect sense to me.  But that's why I'm in "Tech" and 
> not "Creative".  I wrote my own quick 'n nasty templating package a few 
> years ago that allowed Perl code to be embedded inside  
> brackets.  So long as I was coding the pages, it worked great, if not as 
> efficiently as embperl or mason.  But in the real world of NYC new media, 
> Creative typically drives the project.  It's more common for the site to be 
> built by artists and HTML sitebuilders, not programmers.  The first time I 
> see the pages is when they get handed off to Tech to glue it all together. 
> This usually happens sometime past Tech's scheduled hand-off date, i.e. 
> five days to do fifteen budgeted days' work in order to make the launch date.

The real advantage of a 'little crippled language' is that perl
itself makes absolutely no effort to keep you from shooting
both your feet off at once, and you really don't want to let
layout people destroy your server with something as simple
as a loop that doesn't exit under certain obscure circumstances.
Nor do you want to become the only person who can safely make
changes.

> My favorite anecdote with embedded Perl templates: after a 100-page 
> creative update to an existing site, nothing worked.  Turned out that some 
> funky HTML editor had HTML-escaped the Perl code.   That was a fun all-nighter.

HTML::Embperl anticipates this problem and would have kept on
working anyway.

  Les Mikesell
   [EMAIL PROTECTED]



Re: Building mod_perl and mod_jserv into same apache

2000-08-21 Thread Leslie Mikesell

According to Stas Bekman:
> On Mon, 21 Aug 2000, Jeff Warner wrote:
> 
> > We need to have mod_perl and mod_jserv in the same httpd file.  I can build
> > apache 1.3.9 for either mod_perl or mod_jserv using the appropriate make
> > commands from the install docs and they work fine.
> > 
> > I've tried to build a mod_perl httpd and then use
> > ./configure \
> > --prefix=/usr/local/apache \
> > --activate-module=src/modules/jserv/libjserv.a
> > 
> > on a apache build but then I get a mod_jserv httpd and mod_perl is gone.
> > If I try to activate-module for both mod_perl and mod_jserv I get a lot of
> > mod_perl errors.  Suggestions would be appreciated.
> 
> http://perl.apache.org/guide/install.html#Installation_Scenarios_for_mod_p
> 
> I think it should be easy to apply these notes to your case.

It does work to build them together, but if you are using the
two-apache scheme it makes more sense to put jserv in the
front end.  Apache uses a proxy-like mechanism to pass the
requests on to the jserv running in a separate process.  It
doesn't make much sense to have the larger mod_perl httpd
waiting for the responses.
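
The combined build goes roughly like this (untested here, adapted
from the pattern in the guide - adjust versions and paths):

    cd mod_perl-1.xx
    perl Makefile.PL APACHE_SRC=../apache_1.3.9/src \
        DO_HTTPD=1 USE_APACI=1 PREP_HTTPD=1 EVERYTHING=1
    make && make install
    cd ../apache_1.3.9
    ./configure --prefix=/usr/local/apache \
        --activate-module=src/modules/perl/libperl.a \
        --activate-module=src/modules/jserv/libjserv.a
    make && make install

PREP_HTTPD=1 prepares the mod_perl bits without building httpd, so
apache's configure can then activate both modules at once.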

  Les Mikesell
[EMAIL PROTECTED]



Re: problem with mod_proxy/mod_rewrite being used for the front-end proxy

2000-08-21 Thread Leslie Mikesell

According to Greg Stark:
> 
> This isn't entirely on-topic but it's a solution often suggested for mod_perl
> users so I suspect there are other users here being bitten by the same
> problems. In fact the manner in which problems manifest are such that it's
> possible that many mod_perl users who are using mod_rewrite/mod_proxy to run
> a reverse proxy in front of their heavyweight perl servers have a security
> problem and don't even know it.
> 
> The problem is that the solution written in the mod_rewrite guide for a
> reverse proxy doesn't work as advertised to block incoming proxy requests. 
> 
> RewriteRule ^(http|ftp)://.*  -  [F]
> 
> This is supposed to block incoming proxy requests that aren't specifically
> created by the rewrite rules that follow. 
> 
> The problem is that both mod_rewrite and mod_proxy have changed, and this
> seems to no longer catch the incoming proxy requests. Instead mod_rewrite
> seems to see just the path part of the URI, ie, /foo/bar/baz.pl without the
> http://.../. 

Setting
ProxyRequests off
should disable any explicit proxy requests from clients.  It does
not stop ProxyPass or RewriteRule specified proxying.  My server
logs a 302 and sends a redirect to
http://www.goto.com/d/home/p/digimedia/context/
(interesting - I didn't know where it was redirecting before...).

I do see quite a few of these in my logfiles, mostly trying to
bump up the ad counters on some other sites, I think. 

 Les Mikesell
   [EMAIL PROTECTED]



Re: RFC: Apache::Reload

2000-08-12 Thread Leslie Mikesell

According to Matt Sergeant:
> 
> package Apache::Reload;

What I've always wanted along these lines is the ability
to load something in the parent process that would take
a list of directories where modules are always checked
and reloaded (for your local frequently changed scripts)
plus one filename that is checked every time and if it
has been touched then do the full StatINC procedure.  This
would keep the number of stat's down to a reasonable number
for production and still let you notify the server when
you have updated the infrequently changed modules.
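
The touch-file half of that is simple enough to sketch (untested;
the package and file names are made up):

    package My::TouchReload;
    use strict;
    use Apache::Constants qw(DECLINED);
    use Apache::StatINC ();

    my $touchfile = '/usr/local/apache/conf/reload.touch';
    my $last = 0;

    sub handler {
        my $r = shift;
        my $mtime = (stat $touchfile)[9] || 0;
        if ($mtime > $last) {             # someone touched the file
            $last = $mtime;
            Apache::StatINC::handler($r); # full stat/reload pass
        }
        return DECLINED;
    }
    1;

installed with 'PerlInitHandler My::TouchReload' - one stat per
request instead of one per module, and each child catches up the
first time it runs after the touch.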

  Les Mikesell
   [EMAIL PROTECTED]



Re: Templating system

2000-07-29 Thread Leslie Mikesell

According to Gerald Richter:
> >
> > The TT parser uses Perl regexen and an LALR state machine.
> 
> Embperl's parser have used from the startup only C code, maybe that's the
> reason why the time this takes (compared to the rest of the request) never
> was an issue for me...

Is there any way for Embperl to get access to other Apache
modules (like the C++ versions of the Xerces XML parser
or the Xalan XSLT processor)?   It would be nice to be able
to reuse the same code in or out of Embperl.

   Les Mikesell
[EMAIL PROTECTED]



Re: Idea of an apache module

2000-07-12 Thread Leslie Mikesell

According to Ken Williams:

> >Another option is to set up whatever handler you want, on a development
> >or staging server (i.e., not the live one), and grab the pages with
> >lynx -dump or GET or an LWP script, and write them to the proper places
> >in the filesystem where the live server can access them. With a little
> >planning, this can be incorporated into a cron job that runs nightly
> >(or hourly, whatever) for stuff that is updated regularly but is
> >composed of discernable chunks.
> 
> I've used this before and it works well.  One disadvantage is that Luis
> would have to move all his existing scripts to different places, and fix
> all the file-path things that might break as a result.  Seems like a
> front-end cache like squid is a better solution when Luis says he wants
> a cache on the front end.
> 
> Putting squid in front of an Apache server used to be very popular - has
> it fallen out of favor?  Most of the answers given in this thread seem
> to be more of the roll-your-own-cache variety.

It really depends on what you are doing. The real problem with
letting a front-end decide when a cache needs to be refreshed
is that it is usually wrong.   If the back end can generate
predictably correct Expires: or Cache-Control headers, then
squid can mostly get it right.  This will also make remote
caches work correctly.  The trouble is that you generally
don't know when a dynamically generated page is going to
change.  Also, squid will pull a fresh copy from the back
end whenever the user hits the 'reload' button, which tends
to be pretty often on dynamic pages that change frequently.
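
When the lifetime is predictable, the headers are cheap to send
from mod_perl (a sketch; the 10-minute figure is just an example):

    package My::Cached;
    use strict;
    use Apache::Constants qw(OK);
    use HTTP::Date ();

    sub handler {
        my $r = shift;
        $r->content_type('text/html');
        # fresh for 10 minutes; squid and remote caches can
        # serve it without asking us again
        $r->header_out('Expires' => HTTP::Date::time2str(time + 600));
        $r->header_out('Cache-Control' => 'max-age=600');
        $r->send_http_header;
        $r->print("<html><body>cacheable page</body></html>\n");
        return OK;
    }
    1;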

If you just want to control the frequency of doing some
expensive operation you might be able to do scheduled runs
that generate html snippets that are #included into *.shtml
pages, turning it into a cheap operation.
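
That is, something like (paths and schedule made up):

    # crontab: rebuild the expensive fragment every 15 minutes;
    # write to a temp file and rename so a half-written snippet
    # is never served
    */15 * * * * /usr/local/bin/build-top10 > /www/inc/top10.new && mv /www/inc/top10.new /www/inc/top10.html

and in the *.shtml page:

    <!--#include virtual="/inc/top10.html" -->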

  Les Mikesell
   [EMAIL PROTECTED]



Re: mod_perl and FastCGI, again

2000-06-26 Thread Leslie Mikesell

According to Kenneth Lee:
> 
> Not performance. Not preference.
> 
> The question is, will mod_fastcgi and mod_perl conflict when both are 
> compiled into Apache? Theoretically not I think. And what would the 
> consequences be? Please comment.

I can't see any reason why they should conflict, but I also
don't see why you would want a large mod_perl'd httpd waiting
for a fastcgi backend to complete.   I'd recommend using
the two-apache model with the non-mod_perl'd front end
handling static pages, fastcgi, and java servlets if
you use them, proxying only mod_perl requests to the
mod_perl'd httpd back end.

  Les Mikesell
   [EMAIL PROTECTED]



Re: moving GIF's constantly reloading

2000-06-16 Thread Leslie Mikesell

According to Paul:
> Moving GIF files on some of our pages seem to *keep* reloading the
> whole time I stay on the page.  My browser is set to only compare a
> document in its cache to the network version once per session.  What
> gives?
> 
> I don't see anything in the configs that looks very closely related
> The one place I *ever* use a no_cache is on the user's registration. 
> Would that last for a whole session? (I commented it out, just in
> case?)
> 
> I'd seen this in the logs, but only had it do it to me today (after
> testing the registration... maybe related.)

There was an old Netscape bug that caused animated GIFs with
Expires: headers (regardless of the value) to reload on
every cycle.  I'm not sure which version(s) did this, but
I stopped sending Expires: headers on GIFs because there are
still some of them around.
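
With mod_expires compiled in, that policy is just (types and
lifetime here are only illustrative):

    ExpiresActive On
    ExpiresByType text/html "access plus 2 hours"
    # deliberately no ExpiresByType image/gif line, so GIFs
    # go out with no Expires: header at all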

  Les Mikesell
   [EMAIL PROTECTED]



Re: speed up/load balancing of session-based sites

2000-05-11 Thread Leslie Mikesell

According to Mark Imbriaco:
> > > "Perrin" == Perrin Harkins <[EMAIL PROTECTED]> writes:
> > Perrin> I think every RDBMS I've seen, including MySQL, guarantees
> > Perrin> atomicity at this level.
> > 
> > Look, Mummy, the funny man said MySQL and RDBMS in the same sentence :)
> 
> Please don't start on this.  I'm really sick of hearing Phil Greenspun
> harp on the evils of MySQL, and I don't think this is the place to relive
> that discussion all over again.  Yes, I see the smiley, but this topic is
> so inflammatory that I felt a response in an attempt to prematurely stop
> the insanity was in order. :-)  All databases suck.  Pick the one that
> sucks the least for what you're trying to accomplish and move on.

Right, we don't need flames with no content, but there is always
the problem of knowing what is going to suck the least in any
new situation.  For example I have one mysql database that typically
fields 200 concurrent connections, 10 million queries daily (mostly
concentrated in a 4 hour period) and is probably faster than anything
else that could be done on that hardware.  However I recently inherited
another system that is falling on its face at a much lighter load.  It
appears to be using tmp files to sort some ORDER BY clauses that
I haven't had time to fix yet.   Is there any efficient way to pull
the newest N items from a large/growing table as you do a join
with another table?  As a quick fix I've gone to a static snapshot
of some popular lists so I can control how often they are rebuilt. 

  Les Mikesell
[EMAIL PROTECTED]



Re: speed up/load balancing of session-based sites

2000-05-09 Thread Leslie Mikesell

According to G.W. Haywood:
> Hi there,
> 
> On Tue, 9 May 2000, Leslie Mikesell wrote:
> 
> > I'm more concerned about dealing with large numbers of simultaneous
> > clients (say 20,000 who all hit at 10 AM) and I've run into problems
> > with both dbm and mysql where at a certain point of write activity
> > you basically can't keep up.  These problems may be solvable but
> > timings just below the problem threshold don't give you much warning
> > about what is going to happen when your locks begin to overlap. 
> 
> Can you use a RAMdisk?

Not for everything - the service is spread over several machines.

  Les Mikesell
   [EMAIL PROTECTED]



Re: speed up/load balancing of session-based sites

2000-05-09 Thread Leslie Mikesell

According to Tom Mornini:

> > There must be some size where
> > the data values are as easy to pass as the session key, and some
> > size where it becomes slower and more cumbersome.  Has anyone
> > pinned down the size where a server-side lookup starts to win?
> 
> I can't imagine why anyone would pin a website's future to a session
> system that has a maximum of 1k or 2k of session storage potential!

Using cookies where they work doesn't prevent you from using
another mechanism where you need it.  Conceptually, I think
things like user preferences 'belong' on the user's machine
and should be allowed to be different from one machine/browser
to another for the same user.  Things like a shopping cart
in progress might belong on the server.

> We use a custom written session handler that uses Storable for
> serialization. We're storing complete results for complex select
> statements on pages that require "paging" so that the complex select only
> happens once. We store user objects complete, and many multi-level complex
> data structures at whim.

What kind of traffic can you support with this?

> Limiting yourself to cookie size limitation would be a real drag.

I'm more concerned about dealing with large numbers of simultaneous
clients (say 20,000 who all hit at 10 AM) and I've run into problems
with both dbm and mysql where at a certain point of write activity
you basically can't keep up.  These problems may be solvable but
timings just below the problem threshold don't give you much warning
about what is going to happen when your locks begin to overlap. 

  Les Mikesell
   [EMAIL PROTECTED]



Re: speed up/load balancing of session-based sites

2000-05-08 Thread Leslie Mikesell

According to Jeffrey W. Baker:

> > I keep meaning to write this up as an Apache:: module, but it's pretty trivial
> > to cons up an application-specific version. The only thing this doesn't
> > provide is a way to deal with large data structures. But generally if the
> > application is big enough to need such data structures you have a real
> > database from which you can reconstruct the data on each request, just store
> > the state information in the cookie.
> 
> Your post does a significant amount of hand waving regarding people's
> requirements for their websites.  I try to keep an open mind when giving
> advice and realize that people all have different needs.  That's why I
> prefixed my advice with "On my sites..."

Can anyone quantify this a bit?

> On my sites, I use the session as a general purpose data sink.  I find
> that I can significantly improve user experience by keeping things in the
> session related to the user-site interaction.  These session object
> contain way more information than could be stuffed into a cookie, even if
> I assumed that all of my users had cookies turned on.  Note also that
> sending a large cookie can significantly increase the size of the
> request.  That's bad for modem users.
> 
> Your site may be different.  In fact, it had better be! :)

Have you timed your session object retrieval and the cleanup code
that becomes necessary with server-session data compared to
letting the client send back (via cookies or URL) everything you
need to reconstruct the necessary state without keeping temporary
session variables on the server?  There must be some size where
the data values are as easy to pass as the session key, and some
size where it becomes slower and more cumbersome.  Has anyone
pinned down the size where a server-side lookup starts to win?
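
For the client-side half of the comparison, packing a small hash
into a cookie is only a few lines (a sketch; add a MAC of some
kind if the values must be tamper-proof, and remember the result
may still need URL-escaping):

    use Storable qw(freeze thaw);
    use MIME::Base64 qw(encode_base64 decode_base64);

    sub state_to_cookie {   # hash ref -> cookie-safe string
        my $state = shift;
        return encode_base64(freeze($state), '');  # '' = no newlines
    }

    sub cookie_to_state {   # cookie string -> hash ref
        my $val = shift;
        return thaw(decode_base64($val));
    }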

  Les Mikesell
   [EMAIL PROTECTED]



Re: mod_proxy and Name based Virtual Hosts

2000-04-27 Thread Leslie Mikesell

According to Matt Sergeant:
> OK, just to get this onto a different subject line... I can't seem to get
> mod_proxy to work on the front end with name based virtual hosts on the
> backend, I can only get it to work if I have name based virtual hosts on
> both ends.

You should be able to use IP-based vhosts on the front end
proxying to the same backend IP on different ports (each
backend instance just needs a different global Port setting),
or name-based vhosts on the front end going to port-based
vhosts on the backend, again each with a different port number.
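
The front end half of the second variant would look something
like this (addresses, names and ports are made up):

    NameVirtualHost 10.0.0.1

    <VirtualHost 10.0.0.1>
        ServerName www.site-a.com
        ProxyPass        /  http://127.0.0.1:8001/
        ProxyPassReverse /  http://127.0.0.1:8001/
    </VirtualHost>

    <VirtualHost 10.0.0.1>
        ServerName www.site-b.com
        ProxyPass        /  http://127.0.0.1:8002/
        ProxyPassReverse /  http://127.0.0.1:8002/
    </VirtualHost>

with one backend httpd per port, each carrying its own Port
setting in its config.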

  Les Mikesell
   [EMAIL PROTECTED]



Re: [RFC] modproxy:modperl ratios...

2000-04-27 Thread Leslie Mikesell

According to Matt Sergeant:

> Is there any benefit of mod_proxy over a real proxy front end like "Oops"?

I've run squid as an alternative and did not see any serious
differences except that the caching was defeated about 10% of the
time even on images, apparently because the clients were hitting
the 'reload' button.   Apache gives you (a) the already-familiar
config file, (b) mod_rewrite to short-circuit image requests and
direct others to different backends, (c) all the other modules you
might want - ssl, jserv, custom logging, authentication, etc.
The main improvement I'd like to see would be load balancing and
failover on the client side of mod_proxy and some sort of IP takeover
mechanism on the front end side so a pair of machines would act as
hot spares for each other on the same IP address.  I know some work
has been done on this but nothing seems like a complete solution
yet.



Re: [RFC] modproxy:modperl ratios...

2000-04-26 Thread Leslie Mikesell

According to [EMAIL PROTECTED]:
> 
> So, overall..., I think that you should consider how many modperl
> processes you want completely separately from how many modproxy
> processes you want.

Apache takes care of these details for you.  All you need to
do is configure MaxClients around the absolute top number of
mod_perls you can handle before you start pushing memory
to swap, some small MinSpareServers and a bigger MaxSpareServers
and the rest takes care of itself.  On the front-end side
you really don't want any process limits.  If you can't
run enough, buy more memory or turn keepalives down.  Apache
will keep the right number running for the work you are
doing - and the TCP listen queue will hold a few more
connections if you are slightly short of backends.
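
In config terms that's just the usual knobs (numbers below are
only illustrative - size MaxClients to your memory):

    # backend (mod_perl) httpd.conf
    MaxClients       30
    StartServers      5
    MinSpareServers   3
    MaxSpareServers  10

    # front end httpd.conf: roomy limits, short keepalive
    MaxClients      256
    KeepAliveTimeout  5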

> But rather on a ratio of how many CPUs you have
> considering primarily what their "bound" by.

Note that when you get down to fine-tuning, you can use 
mod_rewrite to direct different queries to different
back-ends on the same or different machines.  For example
by sending all the database-related URLs to a certain
instance of mod_perl (on a particular port/IP) and others
to a different instance you can reduce the number of
database connections you need. 
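
E.g. (paths and ports made up):

    # front end: database-backed URLs to one backend instance,
    # other dynamic URLs to another
    RewriteEngine on
    RewriteRule ^/db/(.*)$    http://127.0.0.1:8001/db/$1    [P,L]
    RewriteRule ^/perl/(.*)$  http://127.0.0.1:8002/perl/$1  [P,L]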

  Les Mikesell
   [EMAIL PROTECTED]



Re: Modperl/Apache deficiencies... Memory usage.

2000-04-26 Thread Leslie Mikesell

According to Perrin Harkins:
> On Tue, 25 Apr 2000 [EMAIL PROTECTED] wrote:
> > With mod_proxy you really only need a few mod_perl processes because
> > no longer is the mod_perl ("heavy") apache process i/o bound.  It's
> > now CPU bound.  (or should be under heavy load)
> 
> I think for most of us this is usually not the case, since most web apps
> involve using some kind of external data source like a database or search
> engine.  They spend most of their time waiting on that resource rather
> than using the CPU.

If you have tried it and it didn't work for you, please post the
details to help us understand your real bottleneck.  Most of my hits
involve both another datasource and a database and I still
see a 10-1 reduction of mod_perl processes with the proxy model.
The real problem is slow client connections over the internet.
If you are only serving a local LAN a proxy won't help but
you won't have slow clients either.

> Isn't is common wisdom that parallel processing is better for servers than
> sequenential anyway, since it means most people don't have to wait as long
> for a response?

Only up to the point where the processes continue to run in parallel.
If you are CPU bound, this will be the number of CPUs.  If you
are doing disk access it will be the number of heads that work
independently.  Going to a database server you will have the
same constraints plus any transaction processing that forces
serialization.

> The sequential model is great if you're the next in line,
> but terrible if there are 50 big requests in front of you and yours is
> very small.  Parallelism evens things out.

Or it just adds more overhead.  If you have enough parallelism to
keep your bottleneck busy, the 50th request can only come out slower
by switching among jobs more often.  Anyway, with the proxy model
the cheap way to increase parallelism is to spread jobs across
different backend machines.

  Les Mikesell
   [EMAIL PROTECTED] 



Re: [RFC] XML/Apache Templating with mod_perl

2000-04-25 Thread Leslie Mikesell

According to Matt Sergeant:

> In case you missed it - I just announced the Apache XML Delivery Toolkit to
> both the modperl list and the Perl-XML list. With it you can develop an
> XSLT Apache module in 13 lines of code (no caching, but it works).

I saw it, but perhaps misinterpreted the 'not' in the XSLT package.
Is this intended to be fairly compatible with IIS/ASP's 'transformNode'
handling of XML/XSL (i.e. can I use the same XSL files)?

  Les Mikesell
   [EMAIL PROTECTED]



Re: mod_perl 2.x/perl 5.6.x ?

2000-04-22 Thread Leslie Mikesell

According to Eric Cholet:
> > 
> > Does apache 2.0 let you run a prefork model under NT?
> 
> 
> NT has its own MPM which is threaded:
> 
>   prefork ....... Multi-Process Model with Preforking (Apache 1.3)
>   dexter ........ Multi-Process Model with Threading via Pthreads;
>                   constant number of processes, variable number of threads
>   mpmt_pthread .. Multi-Process Model with Threading via Pthreads;
>                   variable number of processes, constant number of
>                   threads/child (= Apache/pthread)
>   spmt_os2 ...... Single Process Model with Threading on OS/2
>   winnt ......... Single Process Model with Threading on Windows NT
> 
> I believe the first 3 run only under Unix.

So, does that still leave mod_perl serializing access until
everything is rewritten to be thread-safe?

   Les Mikesell
[EMAIL PROTECTED]



Re: mod_perl 2.x/perl 5.6.x ?

2000-04-22 Thread Leslie Mikesell

According to Eric Cholet:
> 
> This is for using Apache 2.0's pthread MPM, of course you can build perl
> 5.6 non threaded and use apache 2.0's prefork model but then it's not
> as exciting :-)

Does apache 2.0 let you run a prefork model under NT?

  Les Mikesell
   [EMAIL PROTECTED] 



Re: [OT] Proxy Nice Failure

2000-04-22 Thread Leslie Mikesell

According to Jim Winstead:
> On Apr 21, Michael hall wrote:
> > I'm on the new-httpd list (as a lurker, not a developer :-). Any ideas,
> > patches, help porting, etc. would be more than welcome on the list.
> > Mod-Proxy is actually kind of in limbo, there are some in favor of
> > dropping it and others who want it. I guess the code is difficult and
> not easy to maintain and that's why some would just as soon see it go
> > unless someone steps up to maintain (redesign) it. There are some
> > working on it and apparently it will survive in some form or another.
> > Now would be a perfect time for anybody to get involved in it.
> 
> mod_backhand may also be the solution people are after.
> 
> http://www.backhand.org/

Is anyone using this in production?  It has the disadvantage
of requiring itself to be compiled into both the front and
back ends.  I have some backend data being generated by
custom programs running on NT boxes and would like to have
a fail-over mechanism.  We may end up running Windows load
balancing on them, but that means paying for Advanced Server
(about $3k extra) on each of them when a smart proxy would
work just as well. 

I also didn't see how to access it through mod_rewrite which
is how I control most of my proxy access.  This might be
possible by letting backhand handle certain directories and
RewriteRules to map to those directories - I just didn't get
that far yet.

> (Sorry for the off-topic-ness.)
> 
> I'm also coming around to the idea that caching proxies have some
> very interesting applications in a web-publishing framework outside
> of caching whole pages. All sorts of areas to exploit mod_perl in
> that sort of framework.

This can help with the load on a backend, but after watching squid
logs for a while I decided that a lot of extra traffic is passed
through when users hit the 'refresh' button which will send the
'Pragma: no-cache' header with the request.  For things like
images you may be better off using RewriteRules on the front
end to short-circuit the request, and other popular pages that
should update only at certain intervals can be done with
cron jobs and delivered from the front end as well.   So, from
a mod_perl perspective I don't care much about the caching side
but really need the relationship between mod_rewrite and
mod_proxy.  I haven't found equivalent built-in functionality in any
other server.
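
The short-circuit is just a pass-through rule ahead of the proxy
rule (hostname made up):

    RewriteEngine on
    # images live on this front end; '-' means leave the URL
    # alone, [L] stops rewriting so mod_proxy never sees it
    RewriteRule ^/images/  -  [L]
    RewriteRule ^/(.*)$  http://backend:8080/$1  [P,L]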

  Les Mikesell
[EMAIL PROTECTED]



Re: [OT] Proxy Nice Failure

2000-04-21 Thread Leslie Mikesell

According to Joshua Chamas:

> I like the mod_proxy module in reverse httpd accel mode, but 
> am interested in having some nicer failure capabilities.  I have 
> hacked in this kind of stuff before but was wondering if anyone 
> had any official patch for this kind of stuff.  
> 
> The nicety under consideration is having the mod_proxy module do 
> X retries every Y seconds instead of failing immediately.  This 
> would allow a backend Apache httpd do a stop/start without any
> downtime apparent to the client besides the connection breaks
> from the stop.  Depending on how much preloading is done at the
> parent httpd, a start could take 5-10 seconds, and during this
> time it would be cool if the proxy could just queue up requests.
> 
> Anyone does this with some nice ProxyTimeout ProxyRetry config
> options?  Thanks.

No, but I'd like to add to the wishlist that it should do
load balancing and failover across multiple backends too.
Mod_jserv appears to have a pretty good scheme of letting
you describe the balanced sets and an interface to view
and control the backend status.  The only problem is that
it is restricted to the jserv protocol for the backends.

  Les Mikesell
[EMAIL PROTECTED]



Proxy hijackers?

2000-04-19 Thread Leslie Mikesell

(Off topic again, but lots of people here are using reverse
proxy).

For a while I had 'ProxyRequests On' in my httpd.conf, mistakenly
thinking that it was necessary to make ProxyPass and mod_rewrite
proxying work.  Then I noticed entries in my logfile where
remote clients were sending full http:// requests for other
remote sites.  I've turned the function off, but the requests
keep coming in, mostly appearing to request ads from somewhere,
with referring pages in Russia and China.

Is this a common practice, and what are they trying to accomplish
by bouncing requests through my server?

  Les Mikesell
   [EMAIL PROTECTED]



Re: Modperl/Apache deficiencies... Memory usage.

2000-04-16 Thread Leslie Mikesell

According to Gunther Birznieks:

> If you want the ultimate in clean models, you may want to consider coding 
> in Java Servlets. It tends to be longer to write Java than Perl, but it's 
> much cleaner as all memory is shared and thread-pooling libraries do exist 
> to restrict 1-thread (or few threads) per CPU (or the request is blocked) 
> type of situation.

Do you happen to know of anyone doing xml/xsl processing in
servlets?  A programmer here has written some nice looking stuff
but it appears that the JVM is never garbage-collecting and
will just grow and get slower until someone restarts it.  I
don't know enough java to tell if it is his code or the xslt
classes that are causing it.

Yes, I know this is off-topic for mod_perl except to point out
that the clean java model isn't necessarily trouble free either.

  Les Mikesell
   [EMAIL PROTECTED]



Re: Modperl/Apache deficiencies... Memory usage.y

2000-04-16 Thread Leslie Mikesell

According to [EMAIL PROTECTED]:
> > 
> > This is basically what you get with the 'two-apache' mode.
> 
> To be frank... it's not.  Not even close.

It is the same to the extent that you get a vast reduction in the
number of backend mod_perl processes.  As I mentioned before, I
see a fairly constant ratio of 10-1 but it is really going to depend
on how fast your script can deliver its output back to the front
end (some of mine are slow).  It is difficult to benchmark this on
a LAN because the thing that determines the number of front-end
connections is the speed at which the content can be delivered back
to the client.  On internet connections you will see many slow
links, and letting those clients talk directly to mod_perl is the
only real problem.

> Especially in the case that
> the present site I'm working on where they have certain boxes for
> dynamic, others for static.

This is a perfect setup.  Let the box handling static content
also proxy the dynamic requests to the backend.

> This is useful when you have one box
> running dynamic/static requests..., but it's not a solution, it's a
> work around.  (I should say we're moving to have some boxes static
> some dynamic... at present it's all jumbled up ;-()

Mod_rewrite is your friend when you need to spread things over
an arbitrary mix of boxes.  And it doesn't hurt much to
run an extra front end on your dynamic box either - it will
almost always be a win if clients are hitting it directly.

A fun way to convince yourself that the front/back end setup is
working is to run something called 'lavaps' (at least under Linux,
you can find this at www.freshmeat.net).  This shows your processes
as moving colored blobs floating around with the size related to
memory use and the activity and brightness related to processor
use.  It is pretty dramatic on a box typically running 200 1Meg
frontends, and 20 10Meg backends. You get the idea quickly what
would happen with 200 10Meg processes instead - or trying to
funnel through one perl backend.
  
> Well, now you're discussing threaded perl... a whole separate bag of
> tricks :).  That's not what I'm talking about... I'm talking about
> running a standard perl inside of a threaded enviro.  I've done this,
> and thrown tens of thousands of requests at it with no problems.

You could simulate this by configuring the mod_perl backend to
run only one child and letting the backlog sit in the listen
queue, but you will end up with the same problem.
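
That simulation is four lines of config:

    # backend httpd.conf: exactly one mod_perl child; everything
    # else waits in the TCP listen backlog
    StartServers    1
    MinSpareServers 1
    MaxSpareServers 1
    MaxClients      1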

> I
> believe threaded perl is an attempt to allow multiple simultaneous
> requests going into a single perl engine that is "multi threaded".
> There are problems with this... and it's difficult to accomplish, and
> altogether a slower approach than queueing because of the context
> switching type overhead.  Not to mention the I/O issue of this...
> yikes! makes my head spin.

What happens in your model - or any single threaded, single processing
model - when something takes longer than you expect?  If you are
just doing internal CPU processing and never have an error in your
programs you will be fine, but much of my mod_perl work involves
database connections and network I/O to yet another server for the
data to be displayed.  Some of these are slow and I can't allow
other requests to block until all prior ones have finished.  The
apache/mod_perl model automatically keeps the right number of
processes running to handle the load and since I mostly run
dual-processor machines I want at least a couple running all the
time.

   Les Mikesell
[EMAIL PROTECTED]



Re: Modperl/Apache deficiencies... Memory usage.y

2000-04-15 Thread Leslie Mikesell

According to [EMAIL PROTECTED]:

> Does anyone know of any program which has been developed like this?
> Basically we'd be turning the "module of apache" portion of mod_perl
> into a front end to the "application server" portion of mod_perl that
> would do the actual processing.

This is basically what you get with the 'two-apache' mode.

> It seems quite logical that something
> like this would have been developed, but possibly not.  The seperation
> of the two components seems like it should be done, but there must be
> a reason why no one has done it yet... I'm afraid this reason would be
> the apache module API doesn't lend itself to this.

The reason it hasn't been done in a threaded model is that perl
isn't stable running threaded yet, and based on the history
of making programs thread-safe, I'd expect this to take at
least a few more years.  But, using a non-mod-perl front
end proxy with ProxyPass and RewriteRule directives to hand
off to a mod_perl backend will likely get you a 10-1 reduction
in backend processes and you already know the configuration
syntax for the second instance.

 Les Mikesell
   [EMAIL PROTECTED]



Re: mod_perl virtual web hosting

2000-04-12 Thread Leslie Mikesell

According to Tom Brown:
> 
> strikes me (as an owner of a web hosting service) that DSO is the wrong
> answer. What does DSO buy you? NOTHING except a complete waste of
> memory... 

It doesn't really hurt anything but you still want a proxy.

> it strikes me that you _want_ a frontend proxy to feed your requests to
> the smallest number of backend daemons which are most likely to already
> have compiled your scripts. This saves memory and CPU, while simplifying
> the configuration, and of course, for a dedicated backend daemon, DSO buys
> nothing... even if that daemon uses access handlers, it still always needs
> mod_perl

If someone is ambitious enough to write some C code, what you really
need is a way for mod_proxy to actually start a new backend if
none are already running for a particular vhost.  Then the
backend processes should extend the concept of killing off
excess children to the point of completely exiting after a
certain length of inactivity.  The same approach could also
work for running scripts under different userids.  I think
sometime in the distant past I have seen programs started
by inetd that would continue to listen in standalone mode
for a while to make subsequent connections faster but I
don't recall how it worked.

  Les Mikesell
[EMAIL PROTECTED]



Re: OT: (sort of) AuthDBMUserFile

2000-04-10 Thread Leslie Mikesell

According to Stas Bekman:
> On Mon, 10 Apr 2000, Bill Jones wrote:
> > AuthDBMUserFile
> > 
> > Is there a difference between DBM and GDBM?
> > I always thought they were the same...
> > 
> > I found sleepcat (DB) and GDBM, but where is DBM?

> sleepycat == berkeley db (a product of sleepycat.com)
> 
> gdbm  == gnu dbm library
> 
> dbm   == a global name for all UNIX dbm implementations. This is the
> name of the database type, like RDBMS or flat file are the names of the
> relational DB implementations driven by SQL and simple text file
> with line/record respectively...
> 
> But this should go to the perl newsgroup the best...

It turns out to be a long story, because both the gdbm and db
libraries offer ndbm emulation, which is the interface
that apache actually uses, so when you build mod_perl you
end up with whichever library was linked first by the
perl compile.  If you build apache without mod_perl you
end up with whatever appears to be your ndbm library
which will vary between systems, and especially Linux
distributions.  Some use gdbm - RedHat uses db but seems
to have switched to the file-incompatible 2.x version in
their 6.0 release.  You may have to work a bit to get both
apache and perl to use the same library and you are likely
to have trouble copying the dbm file from one system to
another or even using it after an upgrade.

  Les Mikesell
   [EMAIL PROTECTED]



Re: mod_perl shared perl instances?

2000-04-08 Thread Leslie Mikesell

According to Soulhuntre:

> > My favorite is throwing a proxy in front...
> 
> The issue in question was getting the perl core out of the address space of
> the httpd process entirely :)
> 
> Velocigen does this nicely (though there is a performance penalty) and even
> allows dedicated machines running the perl services that talk to the
> webserver over sockets.

Look at the problem from the opposite direction.  There is not
that much overhead in including http in the perl processes
actually doing the work, and (unsurprisingly...) http turns
out to be a reasonable protocol to forward http requests
instead of inventing something else.  So, consider mod_perl
to be the backend perl process with a non-mod_perl frontend
using ProxyPass and/or RewriteRules to talk to it, and
you end up with the same thing, except more flexible, with
a single config file style, and the ability to test the backend
with only a browser.

  Les Mikesell
   [EMAIL PROTECTED]



Re: mod_perl, Apache and zones?

2000-04-07 Thread Leslie Mikesell

According to John Darrow:

> I need to be able to run the same sets of pages in several different
> environments (basically just different environment variables).  The problem
> is that once a process is initiated in a certain environment it can't be
> changed for the life of the process.  The first stroke to solve this would
> say that I need to run several Apache servers each with a slightly different
> config file.  Then each environment would run on its own port.

What determines the correct values per hit?  You can use SetEnv in
virtualhost context, SetEnvIf on the fly based on several
considerations, or do some real black magic with the E= flag
in a RewriteRule.
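
All three, side by side (the variable names are made up):

    # constant for the whole vhost
    SetEnv    SITE_ENV  production

    # per request, keyed off a request attribute (here the Host header)
    SetEnvIf  Host  ^staging\.  SITE_ENV=staging

    # the black-magic version, from inside a rewrite rule
    RewriteRule  ^/beta/  -  [E=SITE_ENV:beta]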

> The problem with that solution is that maintaining the servers becomes a
> headache.  You have to bounce many different Apache servers everytime
> something changes.

This turns out to be a mixed blessing when you really only want
to change one of the environments but it is probably too much
trouble except for drastically different servers like with/without
mod_perl or a secure proxy.

> With java servlets there is a feature that allows you to specify different
> zones within a single Apache server.  Each zone has a unique config file and
> so it can deal with the environments that way.  I'm wondering if there's
> anything similar for mod_perl?

The above, plus the ability of mod_rewrite or ProxyPass on a front
end server to proxy different requests to different backends.

> The only other thing I can think of is to just have several copies of the
> same scripts, and then depending on the URI they will be smart enough to
> know to set their own environment variables upon initialization.  Apache
> would then keep processes separate depending on where the scripts are using
> the URI.  But that's sort'f ugly.

It is a good idea to make sure the reason for the different
behaviour based on these values is clear within the script,
especially if there is any chance that different people
will make changes separately to the apache config and the
scripts.  If letting the script parse the URI itself makes
it more obvious then it isn't as ugly as the magic to hide
it.

  Les Mikesell
   [EMAIL PROTECTED] 



Re: [RFC] holding a mod_perl conference

2000-04-05 Thread Leslie Mikesell

According to Nathan Torkington:
> Jason Bodnar writes:
> > I guess my big problem with the ORA conference last year was that all the
> > tutorials I attended last year tried to cover the basics and didn't lead
> > enough time for in-depth informaiton.
> 
> Yup, I agree.  The level of the material offered, though, is in the
> hands of the program chair.  So when I put together the Perl
> conference tutorials, I try to make sure that at any one time there's
> something that *I* would like to see, as well as something that a less
> advanced (more intermediate) programmer might want to attend.  So this
> year there's Damian Conway's "making your mind go boom with OO in Perl"
> talks, as well as MjD's hardcore Perl.

Same here, but I'd like to make the point that it is pretty 
difficult to guess what someone else's concept of beginning,
intermediate, and advanced topics really means.  This is
especially true when a program's author is speaking or
personal styles of perl coding are involved.  It would be
nice if some outlines/slides of the material could be online
before the signup deadlines and the actual session could
spend more time in discussion and question/answer than
covering the overview.

  Les Mikesell
   [EMAIL PROTECTED]



Re: external access to intranet

2000-04-05 Thread Leslie Mikesell

According to James Hart:

> No they won't - the browser will strip the URL seen from its perspective
> back to the host and add the path. On the scheme Jona describes, where the
> host the browser sees is 'gateway_server', that would then be retranslated
> by the proxy into a request for the document 'myfile.html' on the intranet
> host 'path' -  the correct intranet host would be lost.

As long as the part of the path that triggered the first
ProxyPass directive remains (and it will for any relative
link in the same directory or lower), the request for it
will also be ProxyPass'd to the same back end server and
the correct relative location.

  Les Mikesell
   [EMAIL PROTECTED] 



Re: mod_perl weaknesses? help me build a case....

2000-04-05 Thread Leslie Mikesell

According to Soulhuntre:
> 
> Well, let me turn that around, has anyone succeeded in getting mod_perl
> running well on Apache on win2k?

Your problem here is going to be that mod_perl is not thread-safe
and will serialize everything when running under the threaded
model that apache uses under windows. If your scripts are fast enough
you might be able to live with this if you use it as a back end
to a lightweight front-end proxy which a busy site needs anyway.

  Les Mikesell
   [EMAIL PROTECTED]



Re: external access to intranet

2000-04-05 Thread Leslie Mikesell

According to Jonas Nordström:
> But doesn't that only pass on the request and then return the HTML-files
> unchanged? I also want to change the links inside the HTML-bodies on the
> fly, so that the users can continue to "surf the intranet". For example, if
> the HTML contains "<a href="myfile.html">" I want to change that to
> "<a href="https://gateway_server/intranet_host/path/myfile.html">"

Relative links like that will work correctly without any changes
because the browser supplies the current protocol/path from
its perspective.  Absolute links that start with a / or
http: will be broken, though.

   Les Mikesell
[EMAIL PROTECTED]



Re: authDBI connect on init?

2000-04-04 Thread Leslie Mikesell

According to Adam Gotheridge:
> Is there any way to get AuthDBI to connect on initialization of the web server
> like you can with "DBI->connect_on_init(...)"?

AuthDBI uses DBI, so all you have to do is use Apache::DBI
and make sure the connect string is exactly the same.
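
I.e., in startup.pl (the DSN, user and password here are
placeholders - they must match the connect arguments AuthDBI
ends up using):

    use Apache::DBI ();
    Apache::DBI->connect_on_init(
        'dbi:mysql:database=www;host=localhost',
        'user', 'secret',
        { PrintError => 1, AutoCommit => 1 },
    );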

  Les Mikesell
   [EMAIL PROTECTED]



Re: Embedded Perl XML Extensions ...

2000-02-14 Thread Leslie Mikesell

According to Joshua Chamas:
> > > > Will you be able to emulate the IIS/ASP 'transformNode'
> > > > method that renders html from xml and xsl components?

> >  http://www.sci.kun.nl/sigma/Persoonlijk/egonw/xslt/
> 
> transformNode & XSLT are not on my short list of TODOs for
> Apache::ASP, though certainly one could be easily made 
> available when the other is too!  

It looks like the one above is usable if not complete.  I don't
think anything exactly follows the standard yet.  The idea would
be to get the calling syntax established and simple XSL documents
working.

> However, transformNode is not an ASP method per se but a method 
> of one of Microsoft's COM XML objects. So far I have stayed away 
> from implementing Microsoft's COM objects like the BrowserCap, the 
> AdRotator, and the like, and if I did provided an the XML object, 
> I would be starting a general perl port of as many Microsoft COM 
> objects as I could do.

The difference here is that XSL transformations *should* follow
vendor-independent standards and I'd like to see it available
on as many platforms as possible.  A developer here was already
complaining about how the MS SDK for XML has MS-specific
extensions in all the examples. 

> What is on my short list is providing a developer interface
> for rendering XML to HTML.  Now I have looked at XSLT, and it is 
> pretty scary, even if supposedly necessary for some managing large 
> document sets.  Its purpose does seem a bit orthogonal to the ASP 
> style of interleaving code & content, with the first goal being 
> simplicity and power, and then maintainability with decomp through 
> includes and modules.

It is more geared toward separating data from presentation, although
in theory you can transform data to other data just as well.  I think
it will become extremely useful in cases where you have machine
generated data that can be extracted in XML format and you want
to present it in a number of customized ways.

> XSLT seems to seriously complicate the XML rendering issue, and 
> perhaps unnecessarily?  Has it occurred to anyone that XSLT is just
> another programming language, and one that looks like an HTML doc?

Yes, that is exactly the point, aside from being platform and
vendor-independent.  Someone who wants to do HTML presentation
now has a language designed for that.  Plus using IE5, you
can push the work out to the browser and let it cache or update
the components separately.

> I'll likely wait this out a bit to see how it shapes up, and would 
> finally need an Apache::ASP user to require this functionality before 
> pursuing it further.  Leslie, are you that user ?

Our situation may be a lost cause.  The data in question (commodity
trading stuff) lives on an NT box that has a nice native interactive
interface but the web side has been done with a unix mod_perl wrapper
to get the speed (maybe 30 price pages a second at peak times) and
reliability we need.  However we provide custom views of this for
brokers and you can't just turn HTML coders loose on production
mod_perl scripts even if you only need a slightly different look.
So, we've added XML output to the NT server and started using XSL
for some of the new variations, so far running under apache jserv.
However since nearly everyone else here is an NT developer, the
push is to put it all in one box, which is probably going to mean
running IIS and MS-asp so we can sell something easier for other
people to manage.  However, since I haven't completely sold my
soul to the dark side (and we aren't sure yet how well it will
work...) I'd really like the xsl to be portable, and usable under
mod_perl.  I've had some problems with memory use running java
servlets, although I like the load-balancing feature in the
current apache jserv.

  Les Mikesell
   [EMAIL PROTECTED] 



Re: Embedded Perl XML Extensions ...

2000-02-11 Thread Leslie Mikesell

According to Gerald Richter:
> >
> > Will you be able to emulate the IIS/ASP 'transformNode'
> > method that renders html from xml and xsl components?
> >
> 
> I don't know what transformNode exactly does, but I hope we find a common
> design, which will allow you to easily plug in a module that does whatever
> transformation on the XML (or HTML) that you like.

The idea is to apply the stylesheet transformations specified by
an XSL document to everything below a node (possibly the root)
of another XML document.  For now it looks like the only XSLT
transformer in perl is at:
 http://www.sci.kun.nl/sigma/Persoonlijk/egonw/xslt/

There are some samples of the kinds of things you can do at
http://msdn.microsoft.com/downloads/samples/internet/xml/multiple_views/default.asp
although they are somewhat Microsoft-centric in that they apply the
transformation inside the browser (and only work with IE).  What we
need is the ability to detect the browser type and render to
HTML on the server if the browser can't do it.  The only things
I've seen so far that can do this besides IIS have been in Java.
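In mod_perl terms the dispatch might look roughly like this sketch
(transform_to_html is a hypothetical stand-in for whatever XSLT
engine ends up being available):

    my $ua = $r->header_in('User-Agent') || '';
    if ($ua =~ /MSIE 5/) {
        # IE5 can apply the stylesheet itself; ship the raw XML,
        # which carries its own <?xml-stylesheet?> reference
        $r->content_type('text/xml');
        $r->send_http_header;
        $r->print($xml);
    }
    else {
        $r->content_type('text/html');
        $r->send_http_header;
        $r->print(transform_to_html($xml, $xsl));  # hypothetical helper
    }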

  Les Mikesell
[EMAIL PROTECTED] 



Re: Embedded Perl XML Extensions ...

2000-02-11 Thread Leslie Mikesell

According to Joshua Chamas:

> I have been thinking about some XML style extensions for
> Apache::ASP, and know that you are looking at the same thing
> with Embperl, and was hoping that we could sync up on the 
> APi, so there might be a common mindset for a developer when 
> using our extensions, even if the underlying implementation 
> were different.

Will you be able to emulate the IIS/ASP 'transformNode'
method that renders html from xml and xsl components?

  Les Mikesell
   [EMAIL PROTECTED]



Re: What's the benefits of using XML ?

2000-02-11 Thread Leslie Mikesell

According to Perrin Harkins:
> On Fri, 11 Feb 2000, Matt Sergeant wrote:
> > XML and XSLT can provide this. Rather than write pages to a
> > specific style with toolbars in the right place, and format things how I
> > want them, I can write in XML, and down transform to HTML using a
> > stylesheet. When I want to change the look of my site I change the
> > stylesheet and the site changes with it. This isn't magic - a lot of
> > template systems exist today - many of them written right here for
> > mod_perl. But XSLT allows me to leverage those XML skills again. And I
> > think authoring XML is easier than most of those template tools (although
> > XSLT isn't trivial).
> 
> Just a small plug for one of my favorite modules: Template Toolkit is very
> easy to use, and a couple of people have written plug-ins for it that
> handle arbitrary XML.  If you're working on a project where you need to
> turn XML into HTML and want non-programmers to write and maintain the HTML
> templates, they may find it easier than XSLT.  Of course, it's a Perl-only
> solution.

One other thing about XML/XSL is that if the browser is IE5, instead
of doing the transformation to HTML on the server you can send
instructions to the browser to get each separately and render to
HTML itself.  This could be a big win if you have rapidly changing
data in XML format because you end up sending essentially static
pages (or perhaps passing through directly from some other source)
and the unchanging XSL will be cached on the browser side.  
IE5 can also let you view raw XML in a fairly intelligent way
even without XSL.
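The browser-side variant hangs on one processing instruction at the
top of the XML (the href is illustrative; IE5 expects
type="text/xsl"):

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="/xsl/prices.xsl"?>
    <prices>...</prices>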

  Les Mikesell
   [EMAIL PROTECTED]



Re: [SITE] possible structure suggestion

2000-02-10 Thread Leslie Mikesell

According to Matt Sergeant:
> >This would be cool. However, in at least a few cases, the PHP docs leave
> > something to be desired. I remember looking up the Oracle connect calls for
> > PHP online once (for 3.0), and having people hold a debate about how a
> > function really worked, because the docs were wrong, but no one really
> > knew what was right--one guy would say, "I think it really returns THIS," 
> > and another would respond with, "No, I think it returns THAT." Gives you a 
> > nice warm and fuzzy feeling about quality of documentation... :)
> 
> Of course they could have just resolved it by looking at the source :)

But when the documentation and source disagree, chances are that
both are wrong.

  Les Mikesell
   [EMAIL PROTECTED]



Re: Perl and SQL, which is the most scalable SQL server to use?

2000-02-09 Thread Leslie Mikesell

According to Ryan, Aaron:

> We found that we are quickly using up the max connections to the MySQL
> database 
> and when we raise the max connection, the performance gets worse. What was
> MySQL designed
> to handle, should it be able to handle 2000 connections, or is that outside
> the scope
> of the design of the server.
> 
> Does anyone have any suggestions or similar experiences with scaling.

Have you already taken the step of setting up a non-mod_perl proxy
front end and avoiding sending any unnecessary requests (images,
static html, etc.) to the database-connected backend?  If not, you
may be able to reduce the number of connections you need by a
factor of 10 or so.

  Les Mikesell
[EMAIL PROTECTED]



Re: Performance advantages to one large, -or many small mod_perl program?

2000-02-04 Thread Leslie Mikesell

According to Ask Bjoern Hansen:
> 
> > Is there any way to make mod_perl reload modified modules in some
> > directories but not check at all in others?  I'd like to avoid
> > the overhead of stat'ing the stable modules every time but still
> > automatically pick up changes in things under development.
> 
> I made that an option for Apache::StatINC. I've made it and lost it a few
> times, but some day I will get it done, tested and commited. :)
> 
> I was going to make the trigger on the module name though. Hmn. Maybe look
> at the directory too would make sense.

We have a lot of local modules, some used both by mod_perl and normal
scripts.  To make it easier to keep them updated across machines and
avoid having to 'use lib' everywhere, I put a symlink under the
normal site_perl to a directory that is physically with other web
related work.  So in this case it really is a component of the
module name (mapped by perl to the symlinked directory...) that
I would want as the trigger, but it would be equally likely that
someone would 'use lib' to pick up their local development.

I'd put a lot more programming into the modules and out of the web
scripts if modifications were always picked up automatically without
having to stat the modules that rarely change.

  Les Mikesell
   [EMAIL PROTECTED]



Re: Performance advantages to one large, -or many small mod_perl program?

2000-02-04 Thread Leslie Mikesell

According to Stas Bekman:

> A module is a package that lives in a file of the same name.  For
> example, the Hello::There module would live in Hello/There.pm.  For
> details, read L.  You'll also find L helpful.  If
> you're writing a C or mixed-language module with both C and Perl, then
> you should study L.
> [snipped]

Is there any way to make mod_perl reload modified modules in some
directories but not check at all in others?  I'd like to avoid
the overhead of stat'ing the stable modules every time but still
automatically pick up changes in things under development.
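For reference, the stock Apache::StatINC setup (which stats
everything in %INC on every request) is roughly:

    # httpd.conf
    PerlModule      Apache::StatINC
    PerlInitHandler Apache::StatINC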

  Les Mikesell
   [EMAIL PROTECTED]



Re: Using network appliance Filer with modperl

2000-02-01 Thread Leslie Mikesell

According to Elizabeth Mattijsen:

> We have been using such a setup for over 2 years now.  The only real issue
> we've found is not so much with mod_perl itself, but with MySQL.  If you
> put your databases on the NetApp, either have a separate central database
> server, or make damn sure you do not use the same database from two
> different front-end servers.  We've seen database corruption that way
> (using Linux front-end servers with NFS 2).

It is probably reasonable for MySQL to assume that only one server
is accessing the files at once since it has its own remote client
access protocol.  Do you happen to know if there is a performance
difference for MySQL between local drives and a NetApp?

  Les Mikesell
   [EMAIL PROTECTED] 



Re: Using network appliance Filer with modperl

2000-01-31 Thread Leslie Mikesell

According to Tim Bunce:
> > > And, just to be balanced, has anyone _not_ found any 'gotchas' and is
> > > enjoying life with a netapp or similar NFS file serving appliance?
> > 
> > I haven't really had any gotchas in terms of performance.  But you do 
> > have to plan things out if you are going to be working in a mixed 
> > NFS+CIFS environment because of permission issues.  Also I had a really 
> > hard time accessing a share with samba.  Supposedly that is fixed now 
> > but I have not had reason to test it.
> 
> We wouldn't be using CIFS or samba. Just plain simple NFS file serving.

I'm using one as a cvs repository and master copy that gets distributed
via rsync to some other hosts but it isn't really serving the production
hosts in real time yet.  One thing I was hoping to do was to let it
serve static content directly from my master image (you can get
an http server also) but it keeps crashing when I try to give
permission to a couple of subdirectories only and deny or issue
a redirect on attempts to access anything else.  This is probably
my configuration error - I just haven't been sitting by a phone
long enough to deal with a call to tech support recently...
The thing doesn't seem especially fast at serving http but it
doesn't slow down much with hundreds of concurrent requests either.

  Les Mikesell
   [EMAIL PROTECTED]



Re: squid performance

2000-01-30 Thread Leslie Mikesell

According to Greg Stark:
> Leslie Mikesell <[EMAIL PROTECTED]> writes:
> 
> > The 'something happens' is the part I don't understand.  On a unix
> > server, nothing one httpd process does should affect another
> > one's ability to serve up a static file quickly, mod_perl or
> > not.  (Well, almost anyway). 
> 
> Welcome to the real world however where "something" can and does happen.
> Developers accidentally put untuned SQL code in a new page that takes too long
> to run. Database backups slow down normal processing. Disks crash slowing down
> the RAID array (if you're lucky). Developers include dependencies on services
> like mail directly in the web server instead of handling mail asynchronously
> and mail servers slow down for no reason at all. etc.

Of course.  I have single httpd processes screw up all the time.  They
don't affect the speed of other httpd processes unless they consume
all of the machine's resources or lock something in common.  I suppose
if you have a small limit on the number of backend programs you
could get to a point where they are all busy doing something wrong. 

> > If you are using squid or a caching proxy, those static requests
> > would not be passed to the backend most of the time anyway. 
> 
> Please reread the analysis more carefully. I explained that. That is
> precisely the scenario I'm describing faults in.

I read it, but just wasn't convinced.  I'd like to understand this
better, though.  What did you do to show that there is a difference
when netscape accesses different hostnames for fast static content
as opposed to the same one where a cache responds quickly but
dynamic content is slow?  I thought Netscape would open 6 or so
separate connections regardless and would only wait if all 6
were used.  That is, it should not make anything wait unless you
have dynamically-generated images (or redirects) tying up the
other connections besides the one supplying the main html.  Do
you have some reason to think it will open fewer connections 
if they are all to the same host? 

  Les Mikesell
   [EMAIL PROTECTED]



Re: squid performance

2000-01-29 Thread Leslie Mikesell

According to Greg Stark:

> > > 1) Netscape/IE won't intermix slow dynamic requests with fast static requests
> > >on the same keep-alive connection
> > 
> > I thought they just opened several connections in parallel without regard
> > for the type of content.
> 
> Right, that's the problem. If the two types of content are coming from the
> same proxy server (as far as NS/IE is concerned) then they will intermix the
> requests and the slow page could hold up several images queued behind it. I
> actually suspect IE5 is cleverer about this, but you still know more than it
> does.

They have a maximum number of connections they will open at once
but I don't think there is any concept of queueing involved. 

> > > 2) static images won't be delayed when the proxy gets bogged down waiting on
> > >the backend dynamic server.
> 
> Picture the following situation: The dynamic server normally generates pages
> in about 500ms or about 2/s; the mod_perl server runs 10 processes so it can
> handle 20 connections per second. The mod_proxy runs 200 processes and it
> handles static requests very quickly, so it can handle some huge number of
> static requests, but it can still only handle 20 proxied requests per second.
> 
> Now something happens to your mod_perl server and it starts taking 2s to
> generate pages.

The 'something happens' is the part I don't understand.  On a unix
server, nothing one httpd process does should affect another
one's ability to serve up a static file quickly, mod_perl or
not.  (Well, almost anyway). 

> The proxy server continues to get up to 20 requests per second
> for proxied pages, for each request it tries to connect to the mod_perl
> server. The mod_perl server can now only handle 5 requests per second though.
> So the proxy server processes quickly end up waiting in the backlog queue. 

If you are using squid or a caching proxy, those static requests
would not be passed to the backend most of the time anyway. 

> Now *all* the mod_proxy processes are in "R" state and handling proxied
> requests. The result is that the static images -- which under normal
> conditions are handled quicly -- become delayed until a proxy process is
> available to handle the request. Eventually the backlog queue will fill up and
> the proxy server will hand out errors.

But only if it doesn't cache or know how to serve static content itself.

> Use a separate hostname for your pictures, it's a pain on the html authors but
> it's worth it in the long run.

That depends on what happens in the long run. If your domain name or
vhost changes, all of those non-relative links will have to be
fixed again.

  Les Mikesell
   [EMAIL PROTECTED]



Re: splitting mod_perl and sql over machines

2000-01-29 Thread Leslie Mikesell

According to Jeffrey W. Baker:

> I will address two points:
> 
> There is a very high degree of parallelism in modern PC architecture. 
> The I/O hardware is helpful here.  The machine can do many things while
> a SCSI subsystem is processing a command, or the network hardware is
> writing a buffer over the wire.

Yes, for performance it is going to boil down to contention for
disk and RAM and (rarely) CPU.  You just have to look at pricing
for your particular scale of machine to see whether it is cheaper
to stuff more in the same box or add another.  However, once you
have multiple web server boxes the backend database becomes a
single point of failure so I consider it a good idea to shield
it from direct internet access.

  Les Mikesell
   [EMAIL PROTECTED]



Re: Running 2 httpd on one machine question.

2000-01-26 Thread Leslie Mikesell

According to Martin A. Langhoff:

> But there's one thing that I can't imagine. When I run top, how do I
> tell memory/cpu consumption from lightweight daemons from the mem/cpu
> consumption from mod_perl daemons?

Sorry for the low mod_perl content, but if you are running Linux
and have X available on the network, it is fun to use 'lavaps'
which you can find with a search at www.freshmeat.net.  It shows
processes as though they were in a lava lamp, with the size
corresponding to memory usage, the color and movement related
to activity.  Besides being fun, it gives you a good feeling
for the relationship of the front/back end servers, database
backends, java servlets, and whatever else you might be running.

  Les Mikesell
   [EMAIL PROTECTED]



Re: Advise is needed...

2000-01-25 Thread Leslie Mikesell

According to BeerBong:
> 
> I need protect directory (/abonents) on server.
> User database lies on Radius Server.
> 
> I have front-end (apache proxy) + back-end apache servers.
> I've heard that authentication process must works on front-end server.

No, if you are using ProxyPass or RewriteRules with the [p] flag
the authentication can happen on the back end.  If the authentication
directives are in .htaccess files, they will not be referenced
before the proxy action.
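A sketch of that split (the Radius handler named here is just one of
the CPAN modules available - check its docs for the exact
configuration variables):

    # front end: proxy the protected area through untouched
    RewriteRule ^/abonents/(.*)$ http://127.0.0.1:8080/abonents/$1 [P]

    # back end .htaccess under /abonents
    AuthType Basic
    AuthName "Abonents"
    PerlAuthenHandler Apache::AuthenRadius
    require valid-user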

  Les Mikesell
   [EMAIL PROTECTED]



Re: squid performance

2000-01-20 Thread Leslie Mikesell

According to Greg Stark:

> I tried to use the minspareservers and maxspareservers and the other similar
> parameters to let apache tune this automatically and found it didn't work out
> well with mod_perl. What happened was that starting up perl processes was the
> single most cpu intensive thing apache could do, so as soon as it decided it
> needed a new process it slowed down the existing processes and put itself into
> a feedback loop. I prefer to force apache to start a fixed number of processes
> and just stick with that number.

I've never noticed that effect, but I thought that apache always
grew in increments of 'StartServers' so I've tried to keep that
small, equal to MinSpareServers, and an even divisor of MaxSpareServers
just on general principles.  Maybe you are starting a large number
as you cross the MinSpareServers boundary.
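Something along these lines, with illustrative numbers only:

    StartServers     5
    MinSpareServers  5
    MaxSpareServers 10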

  Les Mikesell
   [EMAIL PROTECTED]



Re: Using mod_backhand for load balancing and failover.

2000-01-20 Thread Leslie Mikesell

According to Jeffrey W. Baker:
> 
> Is anyone using mod_backhand (http://www.backhand.org/) for load
> balancing?  I've been trying to get it to work but it is really flaky. 
> For example, it doesn't seem to distribute requests for static content. 
> Bah.

I just started to look at it (and note that there was a recent update) but
haven't got it configured yet.  I thought it distributed whatever it
is configured to handle - it shouldn't be aware of the content type.
The parts I don't like just from looking at it are that the backend
servers all have to have the module included as well (I was hoping
to balance some non-apache servers too) and it looks like it may
be difficult or impossible to make it mesh with RewriteRules.

The mod_jserv load balancing looks much nicer at least at first
glance, but of course that doesn't help for mod_perl.   
 
Les Mikesell
 [EMAIL PROTECTED]



Re: squid performance

2000-01-20 Thread Leslie Mikesell

According to Greg Stark:

> > I think if you can avoid hitting a mod_perl server for the images,
> > you've won more than half the battle, especially on a graphically
> > intensive site.
> 
> I've learned the hard way that a proxy does not completely replace the need to
> put images and other other static components on a separate server. There are
> two reasons that you really really want to be serving images from another
> server (possibly running on the same machine of course).

I agree that it is correct to serve images from a lightweight server
but I don't quite understand how these points relate.  A proxy should
avoid the need to hit the backend server for static content if the
cache copy is current unless the user hits the reload button and
the browser sends the request with 'pragma: no-cache'.

> 1) Netscape/IE won't intermix slow dynamic requests with fast static requests
>on the same keep-alive connection

I thought they just opened several connections in parallel without regard
for the type of content.

> 2) static images won't be delayed when the proxy gets bogged down waiting on
>the backend dynamic server.

Is this under NT where mod_perl is single threaded?  Serving a new request
should not have any relationship to delays handling other requests on
unix unless you have hit your child process limit.

> Eg, if the dynamic content generation becomes slow enough to cause a 2s
> backlog of connections for dynamic content, then a proxy will not protect the
> static images from that delay. Netscape or IE may queue those requests after
> another dynamic content request, and even if they don't the proxy server will
> eventually have every slot taken up waiting on the dynamic server. 

A proxy that already has the cached image should deliver it with no
delay, and a request back to the same server should be serviced
immediately anyway.

> So *every* image on the page will have another 2s latency, instead of just a
> 2s latency for the entire page. This is worst in Netscape of course course
> where the page can't draw until all the images sizes are known.

Putting the sizes in the IMG SRC tag is a good idea anyway.
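That is (illustrative):

    <IMG SRC="/images/chart.gif" WIDTH="300" HEIGHT="200" ALT="chart">

so the browser can lay out the page before the image even arrives.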

> This doesn't mean having a proxy is a bad idea. But it doesn't replace putting
> your images on pics.mydomain.foo even if that resolves to the same address and
> run a separate apache instance for them.

This is a good idea because it is easy to move to a different machine
if the load makes it necessary.  However, a simple approach is to
use a non-mod_perl apache as a non-caching proxy front end for the
dynamic content and let it deliver the static pages directly.  A
short stack of RewriteRules can arrange this if you use the 
[L] or [PT] flags on the matches you want the front end to serve
and the [P] flag on the matches to proxy.
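A minimal sketch of that front-end config (paths and port are
assumptions):

    RewriteEngine On
    # deliver static content directly and stop
    RewriteRule ^/(images|static)/ - [L]
    # everything else goes to the mod_perl httpd on port 8080
    RewriteRule ^(.*)$ http://127.0.0.1:8080$1 [P]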

  Les Mikesell
[EMAIL PROTECTED]



Re: splitting mod_perl and sql over machines

2000-01-18 Thread Leslie Mikesell

According to Stas Bekman:

> We all know that mod_perl is quite hungry for memory, but when you have
> lots of SQL requests, the sql engine (mysql in my case) and httpd are
> competing for memory (also I/O and CPU of course). The simplest solution
> is to bump in a stronger server until it gets "outgrown" as the loads
> grow and you need a more sophisticated solution.

In a single box you will have contention for disk i/o, RAM, and CPU.
You can avoid most of the disk contention (the biggest time issue)
by putting the database on its own drive.  I've been running dual
CPU machines, which seems to help with the perl execution although
I haven't really done timing tests against a matching single
CPU box.  RAM may be the real problem when trying to expand a
Linux pentium box.

> My question is a cost-effectiveness of adding another cheap PC vs
> replacing with new expensive machine. The question is what are the
> immediate implications on performance (speed)? Since the 2 machines have to
> interact between them. e.g. when setting the mysql to run on one machine
> and leaving mod_perl/apache/squid on the other. Anyone did that? 

Yes, and a big advantage is that you can then add more web servers
hitting the same database server.

> Most of my requests are served within 0.05-0.2 secs, but I afraid that
> adding a network (even a very fast one) to deliver mysql results, will
> make the response answer go much higher, so I'll need more httpd processes
> and I'll get back to the original situation where I don't have enough
> resources. Hints?

The network just has to match the load.  If you go to a switched 100M
net you won't add much delay.  You'll want to run persistent DBI
connections, of course, and do all you can with front-end proxies
to keep the number of working mod_perl's as low as possible.

> I know that when you have a really big load you need to build a cluster of
> machines or alike, but when the requirement is in the middle - not too
> big, but not small either it's a hard decision to do... especially when
> you don't have the funds :)

The real killer time-wise is virtual memory paging to disk.  Try to 
estimate how much RAM you are going to need at once for the mod_perl
processes and the database and figure out whether it is cheaper to
put it all in one box or two.  If you are just borderline on needing
the 2nd box, you might try a different approach.  You can use a
fairly cheap box as a server for images and static pages, and perhaps
even your front-end proxy server as long as it is reliable.

  Les Mikesell
   [EMAIL PROTECTED]



Re: modperl success story

2000-01-14 Thread Leslie Mikesell

According to Barb and Tim:
> It could really enhance your integrity if you also
> presented honest evaluations of the downsides of Perl.

Perl has two downsides. One is the start-up time for
the program and mod_perl solves this for web pages.

> The promotion of Perl on this site is so ubiquitous and
> one sided, and Perl has such a bad reputation in many ways,
> that somebody like me has a hard time swallowing the sunny
> prognostications and finally diving in, unless I see
> full honesty.  The language itself is hard enough to swallow.
> Just a suggestion.

The other down side is that it is fast and easy to write working
programs that are difficult for someone else to understand.
That is, it accepts an individual's style instead of forcing
something universal.   I guess everyone here is willing to
accept that tradeoff.

  Les Mikesell
   [EMAIL PROTECTED]



Has anyone tried mod_backhand?

2000-01-11 Thread Leslie Mikesell

Has anyone tried mod_backhand (in the apache module registry) in
a front end apache proxying to multiple mod_perl'd back end
servers?   It claims to load balance, keeping track of the
status and load of the back end servers, which also appear
to need the module included.   Is there a better free alternative
that will detect dead backend servers and avoid them?  (I'm
actually having more trouble with java servlets right now, but
the idea is the same with mod_perl...).

  Les Mikesell
   [EMAIL PROTECTED]



Re: Managing session state over multiple servers

1999-12-16 Thread Leslie Mikesell

According to James G Smith:
> 
> Sun (and most other server/disk providers - e.g., SGI, NetAPP) provides a 
> volume manager that can handle RAID volumes.  If you have a spare disk in the 
> system and a disk goes bad in the volume, you can take the bad disk out, put 
> the spare in, and then replace the bad disk at a convenient time.  The machine 

Is anyone running MySQL or PostgreSQL with the drives NFS mounted
from a NetAPP?  If so, does it perform as well or better than
with local drives?

  Les Mikesell
  [EMAIL PROTECTED]



Re: pool of DB connections ?

1999-11-29 Thread Leslie Mikesell

According to Oleg Bartunov:
> > > Currently I have 20 apache servers which
> > > handle 20 connections to database. If I want to work with
> > > another database I have to create another 20 connections
> > > with DB, so I will have 40 postgres backends. This is too much.
> 
> I didn't write all details but of course I already have 2 servers setup.

Are you sure you need 20 concurrent backend servers?  If you have
enabled apache's 'server-status' reporting you can watch the
backend during some busy times to see how many are doing anything.
It is probably better to have too few servers (the front end will wait
as long as the requests don't overflow the listen queue) than
so many that the machine starts paging virtual memory to disk.
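If it isn't enabled yet, the usual incantation is something like this
(the allowed address is an assumption - restrict it to your own
network):

    ExtendedStatus On
    <Location /server-status>
        SetHandler server-status
        order deny,allow
        deny from all
        allow from 127.0.0.1
    </Location>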

  Les Mikesell
   [EMAIL PROTECTED] 



Re: pool of DB connections ?

1999-11-29 Thread Leslie Mikesell

According to Oleg Bartunov:

> I'm using mod_perl, DBI, ApacheDBI and was quite happy
> with persistent connections httpd<->postgres until I used
> just one database. Currently I have 20 apache servers which
> handle 20 connections to database. If I want to work with
> another database I have to create another 20 connections
> with DB. Postgres is not multithreading
> DB, so I will have 40 postgres backends. This is too much.
> Any experience ?

Try the common trick of using a lightweight non-mod_perl apache
as a front end, proxying the program requests to a mod_perl
backend on another port.  If your programs live under directory
boundaries you can use ProxyPass directives. If they don't you
can use RewriteRules with the [p] flag to selectively proxy
(or [L] to not proxy).  This will probably allow you to cut
the mod_perl httpd's at least in half.  If you still have a
problem you could run two back end httpds on different ports
with the front end proxying the requests that need each database
to separate backends.  Or you can throw hardware at the problem
and move the database to a separate machine with enough memory
to handle the connections.
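The directory-bounded and two-backend variants sketched above might
look like this (ports and paths are placeholders):

    ProxyPass /app1/ http://127.0.0.1:8081/app1/
    ProxyPass /app2/ http://127.0.0.1:8082/app2/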

  Les Mikesell
   [EMAIL PROTECTED]



Re: Dynamic rewriting

1999-11-24 Thread Leslie Mikesell

According to Atipat Rojnuckarin:
> Hi,
> 
> I think mod_rewrite (URI Translation) get called
> before Apache::AuthenDBI/AuthzDBI, so mod_rewrite has
> no way of knowing which group a user belongs to. 
> You'll probably need to write your own customized
> handler(s) to do what you want.

Mod_rewrite actually gets called twice.  Once where you
expect the uri->filename translation and later where it 
can try again after the fact.  If you put rules in the .htaccess
file they only have the 2nd chance and should be run after
authorization.  Since AuthzDBI exports the REMOTE_GROUP,
you might have access to this for some black magic in
the rewrite. 
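A speculative sketch, assuming AuthzDBI really has exported
REMOTE_GROUP into the environment by the time the .htaccess rules
run:

    # .htaccess
    RewriteEngine On
    RewriteCond %{ENV:REMOTE_GROUP} =editors
    RewriteRule ^(.*)$ /editors/$1 [PT]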

  Les Mikesell
   [EMAIL PROTECTED]



Re: Performance problems

1999-11-24 Thread Leslie Mikesell

According to Rasmus Lerdorf:
> > I have a site running Apache-1.3.6, PHP-3.0.9, mod_perl-1.20 on Solaris with
> > a Sybase Database. And it has some performance flaws. This site has
> > thousands of hits per day (not sure of how many, though) and it should be
> > faster (few images, no animations whatsoever).
> > 
> > Can anybody tell me what could be done here? EmbPerl instead of PHP?
> > mod_perl/apache tuning? Should database access be done through PHP or
> > mod_perl? When should I use PHP? and mod_perl? do I need both?
> 
> Well, which one are you using for talking to Sybase with?  Choosing one or
> the other would reduce your memory requirements a bit.  Performance-wise
> they really are quite similar.  Choosing one over the other is more of a
> personal preference thing.

If the hits are coming over the internet, you could reduce memory
usage with a lightweight front-end apache proxying to your
heavyweight backends.  The front end can deliver any static
content directly.  If you currently have both mod_perl and php
scripts, you could try serving from separate backend servers
each with only one interpreter included running on different ports.
The front end can use RewriteRules with the [p] flag to transparently
pass requests to the right server.  The mod_perl connections  to
Sybase should be using Apache::DBI if possible.

  Les Mikesell
   [EMAIL PROTECTED]



Re: Problems with mod_perl 1.2.1 and apache 1.3.9 - newbie - Please help!

1999-11-20 Thread Leslie Mikesell

According to Scott Chapman:
> I'm new to compiling my own software and attempting to get mod_perl
> and apache to work together.  I have Redhat 6.0.

Most Redhat versions have problems that go away if you compile
and install your own perl.

>  + doing sanity check on compiler and options
> ** A test compilation with your Makefile configuration
> ** failed. This is most likely because your C compiler
> ** is not ANSI. Apache requires an ANSI C Compiler, such
> ** as gcc. The above error message from your compiler
> ** will also provide a clue.
>  Aborting!

I think it is picking up the perl compiler options from the stock
version on your system, and it doesn't match the compiler that
is currently installed.  There may be an easier fix, but building
perl yourself should take care of it.  If you end up with perl
in /usr/local/bin, be sure to kill the old ones in /usr/bin and
replace them with symlinks to keep everything else happy.
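Roughly, assuming the stock install locations:

    mv /usr/bin/perl /usr/bin/perl.orig
    ln -s /usr/local/bin/perl /usr/bin/perl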

  Les Mikesell
   [EMAIL PROTECTED]



Re: Stonehenge::Throttle, round 2 - CPU used by an IP

1999-11-17 Thread Leslie Mikesell

According to Randal L. Schwartz:
> >>>>> "Leslie" == Leslie Mikesell <[EMAIL PROTECTED]> writes:
> 
> Leslie> How about an option to redirect to a different machine instead?  I've
> Leslie> considered digging out an old, slow 386 to handle greedy clients
> Leslie> without obviously denying service to them.
> 
> Most evil spiders I've seen don't really pay attention to a redirect,
> in any way other than "this page wasn't ready".  In fact, that
> redirect probably would put the scope of interest outside the spiders
> suck zone.

That would serve the purpose just as well.  Most of my spiders are
actually programs written to harvest our commodities trading data
and since we don't have any stated policy to prohibit that, I don't
really want to refuse the requests or fail but I do want to limit
the impact on other activity on the server. I suspect they would
adapt to follow but it wouldn't matter to me either way.  So far
most of this activity has been after the exchange closings, so it
hasn't had a serious impact on our mid-day peak usage but it may
reach a point where I have to try something.  Are you sure that
keeping track of the client addresses doesn't turn it into an
overall loss compared to just completing the requests?

  Les Mikesell
   [EMAIL PROTECTED]



Re: Stonehenge::Throttle, round 2 - CPU used by an IP

1999-11-16 Thread Leslie Mikesell

According to Randal L. Schwartz:
> 
> So, I modified my throttler to look at the recent CPU usage over a
> window for a given IP.  If the percentage exceeds a threshold, BOOM
> they get a 503 error and a correct "Retry-After:" to tell them how
> long they're banned.

How about an option to redirect to a different machine instead?  I've
considered digging out an old, slow 386 to handle greedy clients
without obviously denying service to them.

  Les Mikesell
   [EMAIL PROTECTED]



Re: Ye Ol' Template System Thread

1999-11-15 Thread Leslie Mikesell

According to Sam Tregar:
> 
> > I think a lot of unnecessary complexity comes from the fact that
> > most of the template systems (and apache modules in general) want
> > to output the html as a side effect instead of accumulating the
> > page in a buffer or just returning a string containing the html plus
> > a status value to the caller.  
> 
> That's a very strange analysis.  HTML::Template (my small contribution to
> the genre) does no printing to the user - it returns a chunk of HTML ready
> for the consumer or an error if something went wrong.  I don't really see
> that this significantly reduces the complexity of using a templating
> system!
> 
> Rather, I think that most of the simplicity of HTML::Template comes from
> its strictly "one-way" interface.  The template file contains only
> output-oriented structures.  Input can only come from the perl side.  I
> think that much of the "slippery slope" referred to previously comes from
> allowing the template file to perform processing of its own - to set
> variables and call procedures, for example.

Right. You don't see the problem until you add conditionals and
flow control - and perhaps not even until you try to reuse some
existing pages as sub-elements of another.  Apache is moderately
good at handling:
<!--#include virtual="just about anything..." -->
within mod_include, even mixing different handlers and proxied
elements, but it is very awkward to wrap any kind of conditional
execution or parameter passing in the mod_include language, and
impossible to do anything where the condition depends on a
sub-element.

> H... Shouldn't someone be suggesting a grand-unified templating system
> right about now?  Or maybe we're finally beyond that?  I hope so!  The
> truth of the matter is that there is no one ultimate way to tackle
> generating HTML from Perl.

What I'm looking for is a 'nestable' way of handling the logic
flow and HTML construction that will allow a page to be used
as a stand-alone item (perhaps displayed in a frameset) or
included in another page, but when it is included I'd like to
have the option of letting its execution return a status
that the including page could see before it sends out any
HTML.
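In Perl terms, something like this sketch (fetch_quote and error_page
are made-up names):

    sub quote_box {
        my ($symbol) = @_;
        my $data = fetch_quote($symbol)          # hypothetical lookup
            or return (0, '');                   # caller sees the failure
        return (1, "<td>$symbol</td><td>$data</td>");
    }

    my ($ok, $html) = quote_box('ZC');
    if ($ok) {
        print "<table><tr>$html</tr></table>";   # embed the sub-page
    }
    else {
        print error_page();                      # hypothetical fallback
    }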

  Les Mikesell
   [EMAIL PROTECTED]



Re: Trying not to re-invent the wheel

1999-11-14 Thread Leslie Mikesell

According to Rasmus Lerdorf:

> > > Those introduce more complex problems.
> > 
> > And they are, of course, inevitable with almost any templating
> > system.
> 
> You know, PHP was once just a templating system. 
[...]
> Then I figured it would be a good idea to add stuff like
> IF/LOOPS/etc so I could manipulate my tags a little bit.
> 
> Now, 5 years later, people are writing template systems that sit on top of
> PHP because they are writing business logic in PHP which means yet another
> template system is needed to separate code from layout.  
> 
> I wonder how many layers of templates we will have 5 years from now.

I think a lot of unnecessary complexity comes from the fact that
most of the template systems (and apache modules in general) want
to output the html as a side effect instead of accumulating the
page in a buffer or just returning a string containg the html plus
a status value to the caller.  This means that you can't easily
make nested sub-pages without knowing ahead of time how they
will be used, and worse, if you get an error in step 3 of generating
a page you can't undo the fact that steps 1 and 2 are probably already
on the user's screen.  If the template language offers some
flow control and logic and the ability for one 'page' to return
a status plus a string containing its html to another page that
includes it then you wouldn't need a different template system
to separate logic from layout, you would just put them in different
pages, letting the 'code' page include the layout elements it wants.

  Les Mikesell
   [EMAIL PROTECTED]



mod_proxy_add_forward and logging

1999-11-01 Thread Leslie Mikesell

The recent message about proxy_add_forward reminded me of a simple
change I made that might help anyone who wants to track the
logs matching the source/destination of proxied requests. 
I also activated mod_unique_id and in mod_proxy_add_forward, after
    if (r->proxyreq) {
        ap_table_set(r->headers_in, "X-Forwarded-For",
                     r->connection->remote_ip);
I added (too lazy to write a whole module for this...):
        ap_table_set(r->headers_in, "X-Parent-Id",
                     ap_table_get(r->subprocess_env, "UNIQUE_ID"));

Then I added elements in the LogFormat for
   %{UNIQUE_ID}e %{X-Parent-Id}i 
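e.g. (a sketch - the leading fields are just the common log format):

    LogFormat "%h %l %u %t \"%r\" %>s %b %{UNIQUE_ID}e %{X-Parent-Id}i" traced
    CustomLog logs/access_log traced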
The result is that the UNIQUE_ID field is different on every hit
and can be used as a database key.  If the first server hit
delivers the content directly, the X-Parent-Id will be logged as
"-".  If it passes the request by proxy to another server it will
be the same as the UNIQUE_ID (I wasn't expecting that, but it is
interesting).  The machine that receives the proxy request will
log the X-Parent-Id containing the same value as the sender's
UNIQUE_ID which can then be used to tie them together.

  Les Mikesell
   [EMAIL PROTECTED]



Re: Generic Server

1999-10-31 Thread Leslie Mikesell

According to Matt Sergeant:

> Well I'll show by example. Take slash (the perl scripts for slashdot.org) -
> it's got a web front end and now available is an NNTP front end. Wouldn't
> it be nice to run both in-process under mod_perl, so you could easily
> communicate between the two, use the same logging code, use the same core
> modules, etc. That's what I'm thinking of.

If the common code is written as perl modules or shared C libraries
wrapped as perl modules, you can easily use the same routines
in different programs.  There is no need to include them all
in places where they aren't needed.

> Besides that, with a mod_perl enabled generic server rather than an inetd
> server there's no loading config files for each request, no starting a
> process, and Apache 2.0 (and I'm assuming mod_perl) will be available as a
> threaded server, so it's only 1 10-20M process, not 100+.

Server start-up time is generally only relevant for protocols that
make a connection-per-request, and HTTP is about the only thing
that does that.  Regardless, it is simple enough to make a dedicated
server listen on each port if you prefer.

Threads may help with the memory problem but I'm not convinced yet.
It has taken about 15 years to get the standard libraries mostly
thread-safe.  I don't think it will happen instantly with perl.
Maybe with java, where they were designed in from the start...

  Les Mikesell
   [EMAIL PROTECTED]



Re: Generic Server

1999-10-29 Thread Leslie Mikesell

According to Matt Sergeant:

> > >Would it be possible to have a generic server, like Apache, but not just
> > >for HTTP - something that could also serve up NNTP connections, FTP
> > >connections, etc. It seems to me at first look this should be possible.
> > >
> > >As I can see it there's a few key components to Apache:
> > >
> > >forking tcp/ip server
> > >file caching/sending
> > >header parsing
> > >logging
> > 
> > Sounds a lot like inetd to me, IMHO.
> 
> Maybe I'm wrong, but inetd is just #1 of those points. And slow too.

Inetd just decides which server to start for which protocol, and
the only slow part is starting up a large program which may need
to read a config file.  However you didn't explain why you would
like to replace these typically small and fast programs with
a 10-20Meg mod_perl process.  I can see where having a common
modular authentication method would be useful, but what else would
they have in common?

  Les Mikesell
   [EMAIL PROTECTED]



Re: modperl in practice

1999-10-29 Thread Leslie Mikesell

According to [EMAIL PROTECTED]:

> I will probably do that next. I am not clear what the difference is
> between running squid and doing a mod_proxy, in the case of all dynamic
> content... a remark in the tuning guide that I saw back in June was vague about
> this, saying it wasn't clear whether squid can cache larger documents or
> if there is some limit.

I don't think there is much difference in practice, although when
I was running a squid front-end I noticed a substantial number of
'client-refresh' hits pulling images that should have been in the cache
from the back-end server anyway.  

> I did read the guide, although I didn't re-read it enough, as Vivek
> so very gently pointed out to me.  I think that there needs to be an
> intro that says, basically, if you expect more than N requests per 
> minute for dynamic content, then start from this config, and the rest
> of the tuning guide is all about tweaking that. 

It really isn't hits/second that matters - it is how fast each server
process can move on to the next request.  If all of your clients were on a
fast local network a proxy would just be extra overhead but on the
internet you will have a certain number of slow connections that tie
up the servers.

  Les Mikesell
   [EMAIL PROTECTED]



Re: modperl in practice

1999-10-29 Thread Leslie Mikesell

According to [EMAIL PROTECTED]:
> 
> I still have resisted the squid layer (dumb
> stubbornness I think), and instead got myself another IP address on the
> same interface card, bound the smallest most light weight separate
> apache to it that I could make, and prefixed all image requests with 
> http://1.2.3.4/.. voila. that was the single biggest jump in throughput
> that I discovered.

You still have another easy jump, using either squid or the two-apache
approach.  Include mod_proxy and mod_rewrite in your lightweight
front end, and use something like:
RewriteRule ^/images/(.*)$ - [L]
to make the front-end deliver static files directly, and
at the end:
RewriteRule ^(.*)$ http://127.0.0.1:8080$1 [P]
to pass the rest to your mod_perl httpd, moved to port 8080.
If possible with your content turn off authentication in
the front end server.

>.. people were connecting to the site via this link, and packet loss
> was such that retransmits and tcp stalls were keeping httpd heavies
> around for much longer than normal..

Note that with a proxy, this only keeps a lightweight httpd tied up,
assuming the page is small enough to fit in the buffers.  If you
are a busy internet site you always have some slow clients.  This
is a difficult thing to simulate in benchmark testing, though.
 
> comments or corrections most welcome... I freely admit to not having
> enough time to read the archives of this group before posting.

I probably won't be the only one to mention this, but you might have
a lot more time if you had, or at least gone through the guide
at http://perl.apache.org/guide/ which covers most of the problems.

  Les Mikesell
   [EMAIL PROTECTED] 



Re: Rewrite to handler

1999-10-16 Thread Leslie Mikesell

According to Eric Cholet:
> > What is the most straightforward way to make a RewriteRule
> > map an arbitrary URL directly to a handler? 

> Do you really need to rewrite, I mean can't you just use
> a <Location> container?

Yes, that will work, but putting all of the special cases into
RewriteRules makes it easier to see what is going on (for me
at least).  Also, especially on the front-end side it makes
it easier to tune the config file where you might want to
alternate between proxying a program off to another server or
running it locally as a CGI.   I guess on the back end mod_perl
side I never proxy again so it doesn't matter so much.

  Les Mikesell
   [EMAIL PROTECTED]



Rewrite to handler

1999-10-15 Thread Leslie Mikesell

What is the most straightforward way to make a RewriteRule
map an arbitrary URL directly to a handler?  I can do it
by setting a handler for a directory, putting the file there
and rewriting to that location, or by setting a handler for
a mime-type and specifying -T for that type in the rewriterule.
Have I missed a more direct way?  (I want to mix-n-match PerlRun,
Registry, and CGI for some old programs without changing the
visible URL's.) 
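For reference, the directory-handler variant looks roughly like this
sketch (names and paths are placeholders):

    Alias /perl-run/ /var/www/perl/
    <Location /perl-run>
        SetHandler perl-script
        PerlHandler Apache::PerlRun
        Options +ExecCGI
    </Location>

    RewriteEngine On
    # keep the visible URL, run the old program under PerlRun
    RewriteRule ^/old/(.*)$ /perl-run/$1 [PT]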

   Les Mikesell
   [EMAIL PROTECTED]



Re: 2 servers setup: mod_proxy cacheing

1999-10-08 Thread Leslie Mikesell

According to Oleg Bartunov:

> Hmm, very interesting solution, but I can't tell every user to
> configure their browser in a special way to browse my site.

I didn't mean that was a solution - just that I set everything
up according to directions and it did cache when it had an
explict proxy request from a client, but I still didn't get
the internal proxy requests to cache.  Not much of my dynamic
content could be meaningfully cached anyway so it is working
out pretty well to have a strict split between static files
and uncached proxy passthrough.  There are a couple of exceptions
where I rebuild static files every few minutes.  If I had a
lot of those I would probably work harder on controlling a
cache.

> The problem becomes more complex if we'll take into account
> not only proxy cacheing feature but also clients browser
> cache. 

It turns out to be hard to get this exactly right.  Most clients
won't cache anything with a '?' at all and intermediate caches
have differing policies about /cgi-bin/ and other well-known
hints about dynamic content.  Also the original Expires: header
uses the client's concept of time, which may not match yours
(and there is a bug in an old version of Netscape that causes
it to reload animated gifs for each animation step if an
Expires: header is present).  There are now Cache-Control: headers
that give a finer grained control but not everything uses them.
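For static content, mod_expires can emit both headers, e.g. (the
times are illustrative):

    ExpiresActive On
    ExpiresByType image/gif "access plus 2 hours"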

  Les Mikesell
   [EMAIL PROTECTED]



Re: 2 servers setup: mod_proxy cacheing

1999-10-08 Thread Leslie Mikesell

According to Stas Bekman:
> > I have 2 servers setup and would like to configure
> > frontend to cache output from backend using mod_proxy.
> > I tried to add various headers but never seen any files
> > in proxy directory. I didn't find any recommendation in your
> > guide how to configure servers to get Apache cacheing works.
> 
> A good question... Any of the mod_proxy users care to give Oleg (and the
> guide :) a hand here? I use squid as a front end, that's why I don't have
> any examples from mod_proxy...

I tried it several versions of apache ago and never got it to 
cache anything as a reverse proxy, but with the same setup I
could configure a browser to use the box as a normal proxy and
it would cache those pages.  I used squid for a while, then
switched to an apache that serves the images and static pages
directly and proxies everything else to mod_perl (using mod_rewrite
to decide). 

  Les Mikesell
   [EMAIL PROTECTED]



Re: DB connection pooling

1999-10-04 Thread Leslie Mikesell

According to Stefan Reitshamer:
> 
> Sorry if this is in a FAQ, but I dug through a lot of FAQs and archives and
> couldn't find it:
> 
> Is there any database connection pooling built into mod_perl, or DBI?

Not exactly, but you can use Apache::DBI to make the connections
persistent, and you can greatly reduce the number of httpd's holding
connections by using a non-mod_perl front end httpd that uses
ProxyPass or RewriteRules to direct the requests that need database
service to a mod_perl backend.   

  Les Mikesell
   [EMAIL PROTECTED]



Re: Comparison PHP,clean mod_perl script,Apache::ASP,HTML::Embperl

1999-10-04 Thread Leslie Mikesell

According to BeerBong:
> Huh, another test
>
> test.epl
> --
> use DBI;
> $dbh  = DBI->connect("DBI:Oracle:SIMain","test","test");
> PHP3 and mod_perl script with DBD::Oracle - 24 Requests per second

Is this with or without Apache::DBI loaded?

  Les Mikesell
  [EMAIL PROTECTED]



Re: mod_perl + mod_jserv: what's your milage?

1999-10-03 Thread Leslie Mikesell

According to Nick Bauman:
> The servlet zones concept also has no parallel in
> mod_perl. That is one of the most compelling reasons
> for load distribution.

Actually there is not much practical difference between
the jserv interface and using mod_proxy (with or
without mod_rewrite) to pass requests to a mod_perl
enabled httpd.  Neither one provides quite everything
you want for load balancing and dead host detection,
but they are better than nothing.  There is a module
that does claim to proxy with good load balancing
(mod_backhand) but I have not had time to see if it
will mesh with mod_rewrite.

> So, the reason you don't allow direct connections to
> your mod_perl system is because of security? You
> didn't explicitly say...

No, just memory use.  The mod_perl httpd is likely to
use 20 megs of memory and may be able to serve hundreds
of requests per second.  However, if you let it talk
directly to a client browser you are at the mercy of
every overloaded router and modem on the internet as to
how long each of those requests actually take to complete.
It is difficult to model a slow client in a benchmark
test, but this is a real problem in production.  The 
jserv interface acts about the same way, but the threaded
java server might not have as much impact anyway.

> I have both loaded as DSOs. I haven't yet encountered
> the ApJServMount and RewriteRules you speak of, as
> this only when you are mucking directly with the
> Apache API, which I haven't needed to yet. In theory
> it should give you added flexibility (at the tradeoff
> of complexity)

My scheme is for the front end httpd to accept and log
everything but deliver only static and unprotected files
itself.  Things that require any processing, including
*.shtml files are proxied through to mod_perl httpds
spread over several machines.  The programs were
written without this scheme in mind (and we encouraged
people to bookmark everything) so the RewriteRules
that force the proxy to the right backends are pretty
ugly and arbitrary.  But, the ApJServMount fits into
this model pretty well and with new development it is
easy enough to map the servlets into a directory.

  Les Mikesell
   [EMAIL PROTECTED]



Re: mod_perl + mod_jserv: what's your milage?

1999-10-03 Thread Leslie Mikesell

According to Nick Bauman:
> Anyone out there using mod_perl and mod_jserv together
> on a production system? What are your results? I'm
> playing around with a combined mod_perl, mod_jserv
> (both as DSO's) apache and it seems to work pretty
> slick, but I'm wondering if I'm building a
> Frankenstein that will bring me grief later...

I have them in a production setup, but not using mod_jserv
heavily yet.  I took the approach of compiling mod_jserv
statically into the front end httpd with a separate
mod_perl version as a back end.  The only real ugliness
is the stack of ApJServMount and RewriteRules that
sort out where everything goes.

The one quirk I noticed is that if you try to include
both mod_perl and mod_java statically, the mod_perl
tests will not run because the test config file does
not set any jserv authentication.  But, I only tried
that for a low-usage test machine - I don't think I
would run a non-proxied mod_perl in production with
internet connections anyway.

> Personal benchtests seem to be awesome, with Perl
> slightly leading Java in performance (using ibm's JDK
> brought the figures much closer together than
> Blackdown's) but real world is different as we all
> know.

Mostly I am investigating java because we have data
that can be obtained in xml and formatted using
various xsl stylesheets and there is no support
for this in perl yet (and I'm too lazy to write my
own).  I am very impressed by the ability to develop
the java servlets on windows or unix and copy the
servlet bytecode to the other and run it unchanged.
Likewise you can transparently run the apache on
one machine and the jserv on another without regard
to the operating system.

   Les Mikesell
[EMAIL PROTECTED]



Re: Performance problems

1999-10-01 Thread Leslie Mikesell

According to ricarDo oliveiRa:

> I have a site running Apache-1.3.6, PHP-3.0.9, mod_perl-1.20 on Solaris with
> a Sybase Database. And it has some performance flaws. This site has
> thousands of hits per day (not sure of how many, though) and it should be
> faster (few images, no animations whatsoever).
> 
> Can anybody tell me what could be done here? EmbPerl instead of PHP?
> mod_perl/apache tuning? Should database access be done through PHP or
> mod_perl? When should I use PHP? and mod_perl? do I need both?

Are you using Apache::DBI to hold persistent connections to the
backend database?  It may be the connect time that is the problem.

  Les Mikesell
   [EMAIL PROTECTED]



Re: external module and mod_perl

1999-10-01 Thread Leslie Mikesell

According to Dustin Tenney:
> > mod_perl ties the Perl STDOUT filehandle to Apache i/o, but this is
> > different from C stdout, which cannot easily hooked to Apache i/o without 
> > a subprocess pipe (as mod_cgi does).  I don't know of a decent workaround.
> > do you have access to the library sources?
> 
> Yea I wrote the library.  I was hoping to get this to work because I
> really need the extra performance that C is giving me.  Basically it
> parses an html file and outputs the results to stdout.  It uses
> open/read/write to do this.  Is there any documentation anywhere on how
> this works?  Thanks for the info.

Have you measured a performance difference here?  I'd be very 
surprised at anything you can do in C being enough faster at
doing something to an html file to make up for the process
startup time compared to letting mod_perl code do it - or
are you doing number crunching?  Anyway if it is your own library
why not make it into a .xs module to get the speed without
having to start another process and hook its output?  Or
if you do want the output, let your perl code read it from a
pipe or execute it in back-ticks.
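A minimal sketch of the pipe route inside a mod_perl handler
(htmlfilter is a stand-in for your C program):

    # assumes: use Apache::Constants qw(OK SERVER_ERROR);
    open(FILTER, "/usr/local/bin/htmlfilter $file |")
        or return SERVER_ERROR;
    {
        local $/;                     # slurp mode
        $r->print(scalar <FILTER>);   # send the program's output
    }
    close(FILTER);
    return OK;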

  Les Mikesell
   [EMAIL PROTECTED]