Re: ApacheCon report
According to Michael Blakeley:
> > > I'm not following. Everyone agrees that we don't want to have big
> > > mod_perl processes waiting on slow clients. The question is whether
> > > tuning your socket buffer can provide the same benefits as a proxy server
> > > and the conclusion so far is that it can't because of the lingering close
> > > problem. Are you saying something different?
> >
> > A tcp close is supposed to require an acknowledgement from the
> > other end or a fairly long timeout. I don't see how a socket buffer
> > alone can change this. Likewise for any of the load balancer
> > front ends that work on the tcp connection level (but I'd like to
> > be proven wrong about this).
>
> Solaris lets a user-level application close() a socket immediately
> and go on to do other work. The sockets layer (the TCP/IP stack) will
> continue to keep that socket open while it delivers any buffered
> sends - but the user application doesn't need to know this (and
> naturally won't be able to read any incoming data if it arrives).
> When the tcp send buffer is empty, the socket will truly close, with
> all the usual FIN et al. dialogue.
>
> Anyway, since the socket is closed from the mod_perl point of view,
> the heavyweight mod_perl process is no longer tied up. I don't know
> if this holds true for Linux as well, but if it doesn't, there's
> always the source code.

I still like the idea of having mod_rewrite in a lightweight front end, and if the request turns out to be static at that point there isn't much point in dealing with proxying. Has anyone tried putting software load balancing behind the front end proxy with something like eddieware, balance or ultra monkey? In that scheme the front ends might use an IP takeover failover and/or DNS load balancing and would proxy to what they think is a single back end server - then this would hit a tcp level balancer instead.

Les Mikesell [EMAIL PROTECTED]
Re: Connection Pooling / TP Monitor
According to Gunther Birznieks:
> I guess part of the question is what is meant by "balanced" with regard to
> the non-apache back-end servers that was mentioned?

I'd be very happy with either a weighted round-robin or a least-connections choice. When the numbers get to the point where it matters, pure statistics is good enough for me. But I love what you can do with mod_rewrite and would like an easy way to point the target of a match at an arbitrary set of back end servers. Mod_jserv has a nice configuration setting for multiple back ends where you name the set and weight each member. If mod_proxy and/or mod_backhand had a similar concept, with the group name being usable as a target for mod_rewrite and ProxyPass, it would be easy to use. I think Matt's idea of creating a Location handler and rewriting to the location would work as long as the modules are loaded in the right order, but it would make the configuration somewhat confusing.

> I am also concerned that the original question brings up the notion of
> failover. mod_backhand is not a failover solution. Backhand does have some
> facilities to do some failover (eg ByAge weeding) but it's not failover in
> the traditional sense. Backhand is for load balance not failover.

Does it do something sensible if one of the targets does not accept the connection, or does it start sending them all to that one because it isn't busy? Mod_jserv claims to mark that connection dead for a while and moves on to another backend, so you have a small delay, not a failure. After a configurable timeout it will try the failing one again.

> While Matt is correct that you could probably write your own load balance
> function, the main interesting function in mod_backhand is ByLoad which as
> far as I know is Apache specific and relies on the Apache scoreboard (or a
> patched version of this)

The problem of writing your own is that it needs to be in the lightweight server - thus all in C.
> Non apache servers won't have this scoreboard file although perhaps you
> could program your own server(s) to emulate one if it's not mod_backhand.
>
> The other requirement that non-apache servers may have for optimal use with
> mod_backhand is that the load balanced servers may need to report
> themselves to the main backhand server as one of the important functions is
> ByAge to weed out downed servers (and servers too heavily loaded to report
> their latest stats).

If a failed connection would set the status as 'down' and periodic retries checked again, this would take care of itself.

> Otherwise, if you need to load balance a set of non-apache servers evenly
> and don't need ByLoad, you could always just use mod_rewrite with the
> reverse_proxy/load balancing recipe from Ralf's guide. This solution would
> get you up and running fast. But the main immediate downside (other than no
> true *load* balancing) is the lack of keep-alive upgrading.

I'll accept randomizing as reasonable balancing as long as I have fine grained control of the URLs I send to each destination. The real problem with the rewrite randomizer is the complete lack of knowledge about dead backend servers. I want something that will transparently deal with machines that fail.

> I am also not sure if mod_log_spread has hooks to work with mod_backhand in
> particular which would make mod_rewrite load balancing (poor man's load
> balancing) less desirable. I suspect mod_log_spread is not
> backhand-specific although made by the same group but having not played
> with this module yet, I couldn't say for sure.

If you can run everything through a single front end apache you can use that as the 'real' log. There is some point where this scheme would not handle the load and you would need one of the connection oriented balancers instead of a proxy, but a fairly ordinary pentium should be able to saturate an ethernet or two if it is just fielding static files and proxying the rest.
You would also need a fail-over mechanism for the front end box, but this could be a simple IP takeover and there are some programs available for that.

Les Mikesell [EMAIL PROTECTED]
Re: [OT] Will a cookie traverse ports in the same domain?
According to martin langhoff:
> this HTTP protocol (definition and actual implementation) question is
> making me mad. Will (and should) a cookie be valid within the same
> host/domain/subdirectory when changing PORT numbers?

I think this depends on the browser (and its version number). However, if you set up your front end proxy correctly it should be completely invisible: the client browser should never see a different hostname or port, and cookies should continue to work.

> All my cookies have stopped working as soon as I've set my mod_perl
> apache on a high port with a proxying apache in port 80 [ see thread
> "AARRRGH! The Apache Proxy is not transparent wrt cookies!" ]

Be sure that you have set a ProxyPassReverse to match anything that can be proxied, even if you are using RewriteRules to do the actual proxy setup. This will make the proxy server fix any redirects that mention the backend port or location (if different). You also need to make sure you aren't mentioning the local port in your own perl code or generating links that show it. If the port number shows up in your browser location window, you have something wrong.

Les Mikesell [EMAIL PROTECTED]
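A minimal sketch of that front-end setup (hostnames, paths and ports here are examples, not from the thread): the rewrite rule does the proxying, and ProxyPassReverse rewrites any Location: headers in redirects so the back-end port never leaks to the browser.

```apache
# Front-end server on port 80 -- all names/ports below are hypothetical
RewriteEngine On

# Proxy dynamic URLs to the mod_perl back end on a high port
RewriteRule ^/app/(.*)$ http://localhost:8080/app/$1 [P,L]

# Fix redirects coming back from the back end so the client only
# ever sees the front-end host and port (keeps cookies working)
ProxyPassReverse /app/ http://localhost:8080/app/
```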
Re: Wild Proposal :)
According to David E. Wheeler:
> Perrin Harkins wrote:
> >
> > My point was that Apache::DBI already gives you persistent connections,
> > and when people say they want actual pooled connections instead they
> > usually don't have a good reason for it.
>
> Let's say that I have 20 customers, each of whom has a database schema
> for their data. I have one Apache web server serving all of those
> customers. Say that Apache has forked off 20 children. Each of the
> customers who connects has to use their own authentication to their own
> schema. That means that Apache::DBI is caching 20 different connections
> - one per customer. Not only that, but Apache::DBI is caching 20
> different connections in each of the 20 processes. Suddenly you've got
> 400 connections to your database at once! And only 20 can actually be in
> use at any one time (one for each Apache child).
>
> Start adding new customers and new database schemas, and you'll soon
> find yourself with more connections than you can handle.

Wouldn't this be handled just as well by running an Apache per customer and letting each manage its own pool of children, which will only connect to its own database?

> And that's why connection pooling makes sense in some cases.

I think you could make a better case for it in a situation where the reusability of the connection isn't known ahead of time, as would be the case if the end user provided a name/password for the connection.

Les Mikesell [EMAIL PROTECTED]
Re: AuthDBI - semget failed problem
According to Pramod Sokke:
> I'm trying to set up Authentication using Apache::AuthDBI. I'm establishing db
> connections at startup.
> I've set $Apache::DBI::DEBUG = 2.
>
> When I start the server, I get the following message for every child process:
> Apache::AuthDBI PerlChildInitHandler semget failed
>
> And whenever I access the server, I get these messages in my error log:
>
> Apache::AuthDBI PerlChildInitHandler semget failed
> Apache::AuthDBI PerlChildExitHandler shmread failed
> Apache::AuthDBI PerlChildExitHandler shmwrite failed
>
> What does all this mean?

Are you running FreeBSD? You may need to rebuild a kernel with the SysV semaphores and shared memory enabled.

Les Mikesell [EMAIL PROTECTED]
Re: Poor man's connection pooling
According to Michael Peppler:
> The back-end is Sybase. The actual connect time isn't the issue here
> (for me.) It's the sheer number of connections, and the potential
> issue with the number of sockets in CLOSE_WAIT or TIME_WAIT state on
> the database server. We're looking at a farm of 40 front-end servers,
> each running ~150 modperl procs. If each of the modperl procs opens
> one connection that's 6000 connections on the database side.
>
> Sybase can handle this, but I'd rather use a lower number, hence the
> pooling.

Are you using the lightweight httpd proxy front end setup and still have 150 modperl httpds per server? If not, I'd try that approach first. I usually see about a 10:1 ratio of front to back end servers, which really cuts down on the database connections (and the static images are served by a different set of machines, so most of this effect comes from the proxy releasing the back end process quickly). Also, if you have pages that do not need the database connection, you could set up mod_proxy or mod_rewrite to send those requests to a different set of back-end servers.

Les Mikesell [EMAIL PROTECTED]
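A sketch of that split on the front end, with hypothetical URL prefixes and ports: requests that need the database go to one mod_perl pool, everything else to another, so only the first pool ever holds Sybase connections.

```apache
RewriteEngine On

# Database-backed pages -> the back-end pool that opens Sybase connections
RewriteRule ^/account/(.*)$ http://127.0.0.1:8001/account/$1 [P,L]

# Pages that never touch the database -> a separate, connectionless pool
RewriteRule ^/(.*)$ http://127.0.0.1:8002/$1 [P,L]
```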
Re: HTML Template Comparison Sheet ETA
According to Steve Manes:
> At 11:26 AM 9/4/00 -0300, Nelson Correa de Toledo Ferraz wrote:
> >I agree that one shouldn't put lots of code inside of a template, but
> >variables and loops are better expressed in Perl than in a "little
> >crippled language".
>
> Your example makes perfect sense to me. But that's why I'm in "Tech" and
> not "Creative". I wrote my own quick 'n nasty templating package a few
> years ago that allowed Perl code to be embedded inside
> brackets. So long as I was coding the pages, it worked great, if not as
> efficiently as embperl or mason. But in the real world of NYC new media,
> Creative typically drives the project. It's more common for the site to be
> built by artists and HTML sitebuilders, not programmers. The first time I
> see the pages is when they get handed off to Tech to glue it all together.
> This usually happens sometime past Tech's scheduled hand-off date, i.e.
> five days to do fifteen budgeted days' work in order to make the launch date.

The real advantage of a 'little crippled language' is that perl itself makes absolutely no effort to keep you from shooting both your feet off at once, and you really don't want to let layout people destroy your server with something as simple as a loop that doesn't exit under certain obscure circumstances. Nor do you want to become the only person who can safely make changes.

> My favorite anecdote with embedded Perl templates: after a 100-page
> creative update to an existing site, nothing worked. Turned out that some
> funky HTML editor had HTML-escaped the Perl code. That was a fun all-nighter.

HTML::Embperl anticipates this problem and would have kept on working anyway.

Les Mikesell [EMAIL PROTECTED]
Re: Building mod_perl and mod_jserv into same apache
According to Stas Bekman:
> On Mon, 21 Aug 2000, Jeff Warner wrote:
>
> > We need to have mod_perl and mod_jserv in the same httpd file. I can build
> > apache 1.3.9 for either mod_perl or mod_jserv using the appropriate make
> > commands from the install docs and they work fine.
> >
> > I've tried to build a mod_perl httpd and then use
> > ./configure \
> > --prefix=/usr/local/apache \
> > --activate-module=src/modules/jserv/libjserv.a
> >
> > on a apache build but then I get a mod_jserv httpd and mod_perl is gone.
> > If I try to activate-module for both mod_perl and mod_jserv I get a lot of
> > mod_perl errors. Suggestions would be appreciated.
>
> http://perl.apache.org/guide/install.html#Installation_Scenarios_for_mod_p
>
> I think it should be easy to apply these notes to your case.

It does work to build them together, but if you are using the two-apache scheme it makes more sense to put jserv in the front end. Apache uses a proxy-like mechanism to pass the requests on to the jserv running in a separate process. It doesn't make much sense to have the larger mod_perl httpd waiting for the responses.

Les Mikesell [EMAIL PROTECTED]
Re: problem with mod_proxy/mod_rewrite being used for the front-end proxy
According to Greg Stark:
> This isn't entirely on-topic but it's a solution often suggested for mod_perl
> users so I suspect there are other users here being bitten by the same
> problems. In fact the manner in which the problems manifest is such that it's
> possible that many mod_perl users who are using mod_rewrite/mod_proxy to run
> a reverse proxy in front of their heavyweight perl servers have a security
> problem and don't even know it.
>
> The problem is that the solution written in the mod_rewrite guide for a
> reverse proxy doesn't work as advertised to block incoming proxy requests.
>
> RewriteRule ^(http|ftp)://.* - [F]
>
> This is supposed to block incoming proxy requests that aren't specifically
> created by the rewrite rules that follow.
>
> The problem is that both mod_rewrite and mod_proxy have changed, and this
> seems to no longer catch the incoming proxy requests. Instead mod_rewrite
> seems to see just the path part of the URI, ie, /foo/bar/baz.pl without the
> http://.../.

Setting ProxyRequests off should disable any explicit proxy requests from clients. It does not stop ProxyPass or RewriteRule specified proxying. My server logs a 302 and sends a redirect to http://www.goto.com/d/home/p/digimedia/context/ (interesting - I didn't know where it was redirecting before...). I do see quite a few of these in my logfiles, mostly trying to bump up the ad counters on some other sites, I think.

Les Mikesell [EMAIL PROTECTED]
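A minimal sketch of the distinction being made here (paths and ports are illustrative): forward proxying is refused globally, while the reverse-proxy rules you configure explicitly keep working, because ProxyPass and the [P] rewrite flag do not depend on ProxyRequests.

```apache
# Refuse client-initiated (forward) proxy requests entirely
ProxyRequests off

# Reverse proxying configured here still works
RewriteEngine On
RewriteRule ^/app/(.*)$ http://127.0.0.1:8080/app/$1 [P,L]
ProxyPassReverse /app/ http://127.0.0.1:8080/app/
```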
Re: RFC: Apache::Reload
According to Matt Sergeant:
> package Apache::Reload;

What I've always wanted along these lines is the ability to load something in the parent process that would take a list of directories where modules are always checked and reloaded (for your local frequently changed scripts), plus one filename that is checked every time; if that file has been touched, then do the full StatINC procedure. This would keep the number of stats down to a reasonable number for production and still let you notify the server when you have updated the infrequently changed modules.

Les Mikesell [EMAIL PROTECTED]
Re: Templating system
According to Gerald Richter:
> > The TT parser uses Perl regexen and an LALR state machine.
>
> Embperl's parser has used only C code from the start; maybe that's the
> reason why the time this takes (compared to the rest of the request) never
> was an issue for me...

Is there any way for Embperl to get access to other apache modules (like the C++ versions of the xerces xml parser or xalan xslt processor)? It would be nice to be able to reuse the same code in or out of embperl.

Les Mikesell [EMAIL PROTECTED]
Re: Idea of an apache module
According to Ken Williams:
> >Another option is to set up whatever handler you want, on a development
> >or staging server (i.e., not the live one), and grab the pages with
> >lynx -dump or GET or an LWP script, and write them to the proper places
> >in the filesystem where the live server can access them. With a little
> >planning, this can be incorporated into a cron job that runs nightly
> >(or hourly, whatever) for stuff that is updated regularly but is
> >composed of discernible chunks.
>
> I've used this before and it works well. One disadvantage is that Luis
> would have to move all his existing scripts to different places, and fix
> all the file-path things that might break as a result. Seems like a
> front-end cache like squid is a better solution when Luis says he wants
> a cache on the front end.
>
> Putting squid in front of an Apache server used to be very popular - has
> it fallen out of favor? Most of the answers given in this thread seem
> to be more of the roll-your-own-cache variety.

It really depends on what you are doing. The real problem with letting a front end decide when a cache needs to be refreshed is that it is usually wrong. If the back end can generate predictably correct Expires: or Cache-Control headers, then squid can mostly get it right. This will also make remote caches work correctly. The trouble is that you generally don't know when a dynamically generated page is going to change. Also, squid will pull a fresh copy from the back end whenever the user hits the 'reload' button, which tends to be pretty often on dynamic pages that change frequently. If you just want to control the frequency of doing some expensive operation, you might be able to do scheduled runs that generate html snippets that are #included into *.shtml pages, turning it into a cheap operation.

Les Mikesell [EMAIL PROTECTED]
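One way to sketch the snippet approach, assuming a cron job rebuilds a fragment such as /snippets/headlines.html on its own schedule (all names here are examples):

```apache
# Apache 1.3-style server-parsed HTML for the *.shtml wrapper pages
Options +Includes
AddHandler server-parsed .shtml
```

The wrapper page then pulls the pre-built fragment in with `<!--#include virtual="/snippets/headlines.html" -->`, so the expensive generation happens on the cron schedule rather than once per request.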
Re: mod_perl and FastCGI, again
According to Kenneth Lee:
> Not performance. Not preference.
>
> The question is, will mod_fastcgi and mod_perl conflict when both are
> compiled into Apache? Theoretically not I think. And what would the
> consequences be? Please comment.

I can't see any reason why they should conflict, but I also don't see why you would want a large mod_perl'd httpd waiting for a fastcgi backend to complete. I'd recommend using the two-apache model with the non-mod_perl'd front end handling static pages, fastcgi, and java servlets if you use them, proxying only mod_perl requests to the mod_perl'd httpd back end.

Les Mikesell [EMAIL PROTECTED]
Re: moving GIF's constantly reloading
According to Paul:
> Moving GIF files on some of our pages seem to *keep* reloading the
> whole time I stay on the page. My browser is set to only compare a
> document in its cache to the network version once per session. What
> gives?
>
> I don't see anything in the configs that looks very closely related.
> The one place I *ever* use a no_cache is on the user's registration.
> Would that last for a whole session? (I commented it out, just in
> case?)
>
> I'd seen this in the logs, but only had it do it to me today (after
> testing the registration... maybe related.)

There was an old Netscape bug that caused animated GIFs with Expires: headers (regardless of the value) to reload on every cycle. I'm not sure which version(s) did this, but I stopped sending Expires: headers on GIFs because there are still some of them around.

Les Mikesell [EMAIL PROTECTED]
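One way to sketch that workaround, assuming mod_expires is compiled in (the content types and ages below are examples, not from the thread): set Expires: per type and simply leave GIFs out.

```apache
ExpiresActive On
# Expires: for most static content...
ExpiresByType text/html "access plus 1 hour"
ExpiresByType image/jpeg "access plus 1 day"
# ...but deliberately no ExpiresByType for image/gif, so GIFs go out
# without an Expires: header and dodge the old Netscape reload bug
```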
Re: speed up/load balancing of session-based sites
According to Mark Imbriaco:
> > "Perrin" == Perrin Harkins <[EMAIL PROTECTED]> writes:
> > Perrin> I think every RDBMS I've seen, including MySQL, guarantees
> > Perrin> atomicity at this level.
> >
> > Look, Mummy, the funny man said MySQL and RDBMS in the same sentence :)
>
> Please don't start on this. I'm really sick of hearing Phil Greenspun
> harp on the evils of MySQL, and I don't think this is the place to relive
> that discussion all over again. Yes, I see the smiley, but this topic is
> so inflammatory that I felt a response in an attempt to prematurely stop
> the insanity was in order. :-) All databases suck. Pick the one that
> sucks the least for what you're trying to accomplish and move on.

Right, we don't need flames with no content, but there is always the problem of knowing what is going to suck the least in any new situation. For example, I have one mysql database that typically fields 200 concurrent connections, 10 million queries daily (mostly concentrated in a 4 hour period), and is probably faster than anything else that could be done on that hardware. However, I recently inherited another system that is falling on its face at a much lighter load. It appears to be using tmp files to sort some ORDER BY clauses that I haven't had time to fix yet. Is there any efficient way to pull the newest N items from a large/growing table as you do a join with another table? As a quick fix I've gone to a static snapshot of some popular lists so I can control how often they are rebuilt.

Les Mikesell [EMAIL PROTECTED]
Re: speed up/load balancing of session-based sites
According to G.W. Haywood:
> Hi there,
>
> On Tue, 9 May 2000, Leslie Mikesell wrote:
>
> > I'm more concerned about dealing with large numbers of simultaneous
> > clients (say 20,000 who all hit at 10 AM) and I've run into problems
> > with both dbm and mysql where at a certain point of write activity
> > you basically can't keep up. These problems may be solvable but
> > timings just below the problem threshold don't give you much warning
> > about what is going to happen when your locks begin to overlap.
>
> Can you use a RAMdisk?

Not for everything - the service is spread over several machines.

Les Mikesell [EMAIL PROTECTED]
Re: speed up/load balancing of session-based sites
According to Tom Mornini:
> > There must be some size where
> > the data values are as easy to pass as the session key, and some
> > size where it becomes slower and more cumbersome. Has anyone
> > pinned down the size where a server-side lookup starts to win?
>
> I can't imagine why anyone would pin a website's future to a session
> system that has a maximum of 1k or 2k of session storage potential!

Using cookies where they work doesn't prevent you from using another mechanism where you need it. Conceptually, I think things like user preferences 'belong' on the user's machine and should be allowed to be different from one machine/browser to another for the same user. Things like a shopping cart in progress might belong on the server.

> We use a custom written session handler that uses Storable for
> serialization. We're storing complete results for complex select
> statements on pages that require "paging" so that the complex select only
> happens once. We store user objects complete, and many multi-level complex
> data structures at whim.

What kind of traffic can you support with this?

> Limiting yourself to cookie size limitation would be a real drag.

I'm more concerned about dealing with large numbers of simultaneous clients (say 20,000 who all hit at 10 AM) and I've run into problems with both dbm and mysql where at a certain point of write activity you basically can't keep up. These problems may be solvable, but timings just below the problem threshold don't give you much warning about what is going to happen when your locks begin to overlap.

Les Mikesell [EMAIL PROTECTED]
Re: speed up/load balancing of session-based sites
According to Jeffrey W. Baker:
> > I keep meaning to write this up as an Apache:: module, but it's pretty trivial
> > to cons up an application-specific version. The only thing this doesn't
> > provide is a way to deal with large data structures. But generally if the
> > application is big enough to need such data structures you have a real
> > database from which you can reconstruct the data on each request, just store
> > the state information in the cookie.
>
> Your post does a significant amount of hand waving regarding people's
> requirements for their websites. I try to keep an open mind when giving
> advice and realize that people all have different needs. That's why I
> prefixed my advice with "On my sites..."

Can anyone quantify this a bit?

> On my sites, I use the session as a general purpose data sink. I find
> that I can significantly improve user experience by keeping things in the
> session related to the user-site interaction. These session objects
> contain way more information than could be stuffed into a cookie, even if
> I assumed that all of my users had cookies turned on. Note also that
> sending a large cookie can significantly increase the size of the
> request. That's bad for modem users.
>
> Your site may be different. In fact, it had better be! :)

Have you timed your session object retrieval and the cleanup code that becomes necessary with server-side session data, compared to letting the client send back (via cookies or URL) everything you need to reconstruct the necessary state without keeping temporary session variables on the server? There must be some size where the data values are as easy to pass as the session key, and some size where it becomes slower and more cumbersome. Has anyone pinned down the size where a server-side lookup starts to win?

Les Mikesell [EMAIL PROTECTED]
Re: mod_proxy and Name based Virtual Hosts
According to Matt Sergeant:
> OK, just to get this onto a different subject line... I can't seem to get
> mod_proxy to work on the front end with name based virtual hosts on the
> backend, I can only get it to work if I have name based virtual hosts on
> both ends.

You should be able to use IP based vhosts on the front end going to the same IP and a different port on the back end (the backend just needs a different global Port setting), or name based vhosts on the front end going to port based vhosts on the backend, each with a different port number.

Les Mikesell [EMAIL PROTECTED]
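A sketch of the second variant, with made-up addresses and ports: the front end distinguishes sites by name, and each front-end vhost proxies to a back end distinguished only by port, so the back end needs no name-based configuration at all.

```apache
# Front end: name-based vhosts on one IP, each proxying to a
# port-distinguished back end on the same box (all values are examples)
NameVirtualHost 192.168.1.10

<VirtualHost 192.168.1.10>
    ServerName www.site-a.example
    ProxyPass / http://127.0.0.1:8001/
    ProxyPassReverse / http://127.0.0.1:8001/
</VirtualHost>

<VirtualHost 192.168.1.10>
    ServerName www.site-b.example
    ProxyPass / http://127.0.0.1:8002/
    ProxyPassReverse / http://127.0.0.1:8002/
</VirtualHost>
```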
Re: [RFC] modproxy:modperl ratios...
According to Matt Sergeant:
> Is there any benefit of mod_proxy over a real proxy front end like "Oops"?

I've run squid as an alternative and did not see any serious differences except that the caching was defeated about 10% of the time even on images, apparently because the clients were hitting the 'reload' button. Apache gives you (a) the already-familiar config file, (b) mod_rewrite to short-circuit image requests and direct others to different backends, (c) all the other modules you might want - ssl, jserv, custom logging, authentication, etc.

The main improvement I'd like to see would be load balancing and failover on the client side of mod_proxy, and some sort of IP takeover mechanism on the front end side so a pair of machines would act as hot spares for each other on the same IP address. I know some work has been done on this but nothing seems like a complete solution yet.
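The image short-circuit in (b) can be sketched in a couple of rules (the URL prefix and back-end port are hypothetical): the front end stops rewriting for images and serves them from its own DocumentRoot, and proxies everything else.

```apache
RewriteEngine On

# Let the lightweight front end serve images itself:
# stop rewriting here and fall through to normal DocumentRoot mapping
RewriteRule ^/images/ - [L]

# Proxy everything else to the mod_perl back end
RewriteRule ^/(.*)$ http://127.0.0.1:8080/$1 [P,L]
```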
Re: [RFC] modproxy:modperl ratios...
According to [EMAIL PROTECTED]:
> So, overall..., I think that you should consider how many modperl
> processes you want completely separately from how many modproxy
> processes you want.

Apache takes care of these details for you. All you need to do is configure MaxClients around the absolute top number of mod_perls you can handle before you start pushing memory to swap, some small MinSpareServers and a bigger MaxSpareServers, and the rest takes care of itself. On the front-end side you really don't want any process limits. If you can't run enough, buy more memory or turn keepalives down. Apache will keep the right number running for the work you are doing - and the TCP listen queue will hold a few more connections if you are slightly short of backends.

> But rather on a ratio of how many CPUs you have
> considering primarily what they're "bound" by.

Note that when you get down to fine-tuning, you can use mod_rewrite to direct different queries to different back-ends on the same or different machines. For example, by sending all the database-related URLs to a certain instance of mod_perl (on a particular port/IP) and others to a different instance, you can reduce the number of database connections you need.

Les Mikesell [EMAIL PROTECTED]
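The tuning described above might look like this on the back-end mod_perl server; the numbers are purely illustrative, and the real constraint is that MaxClients times the per-process memory footprint must fit in RAM without swapping.

```apache
# Back-end (mod_perl) server: cap processes below the swap threshold
MaxClients       30
MinSpareServers   2
MaxSpareServers  10

# Front-end proxy, by contrast: generous limits and short keepalives
# MaxClients       256
# KeepAliveTimeout   5
```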
Re: Modperl/Apache deficiencies... Memory usage.
According to Perrin Harkins:
> On Tue, 25 Apr 2000 [EMAIL PROTECTED] wrote:
>
> > With mod_proxy you really only need a few mod_perl processes because
> > no longer is the mod_perl ("heavy") apache process i/o bound. It's
> > now CPU bound. (or should be under heavy load)
>
> I think for most of us this is usually not the case, since most web apps
> involve using some kind of external data source like a database or search
> engine. They spend most of their time waiting on that resource rather
> than using the CPU.

If you have tried it and it didn't work for you, please post the details to help us understand your real bottleneck. Most of my hits involve both another datasource and a database, and I still see a 10-1 reduction of mod_perl processes with the proxy model. The real problem is slow client connections over the internet. If you are only serving a local LAN a proxy won't help, but you won't have slow clients either.

> Isn't it common wisdom that parallel processing is better for servers than
> sequential anyway, since it means most people don't have to wait as long
> for a response?

Only up to the point where the processes continue to run in parallel. If you are CPU bound, this will be the number of CPUs. If you are doing disk access it will be the number of heads that work independently. Going to a database server you will have the same constraints plus any transaction processing that forces serialization.

> The sequential model is great if you're the next in line,
> but terrible if there are 50 big requests in front of you and yours is
> very small. Parallelism evens things out.

Or it just adds more overhead. If you have enough parallelism to keep your bottleneck busy, the 50th request can only come out slower by switching among jobs more often. Anyway, with the proxy model the cheap way to increase parallelism is to spread jobs across different backend machines.

Les Mikesell [EMAIL PROTECTED]
Re: [RFC] XML/Apache Templating with mod_perl
According to Matt Sergeant:
> In case you missed it - I just announced the Apache XML Delivery Toolkit to
> both the modperl list and the Perl-XML list. With it you can develop an
> XSLT Apache module in 13 lines of code (no caching, but it works).

I saw it, but perhaps misinterpreted the 'not' in the xslt package. Is this intended to be fairly compatible with IIS's 'TransformNode' handling of xml/xsl (i.e. can I use the same xsl files)?

Les Mikesell [EMAIL PROTECTED]
Re: mod_perl 2.x/perl 5.6.x ?
According to Eric Cholet:
> > Does apache 2.0 let you run a prefork model under NT?
>
> NT has its own MPM which is threaded
>
> prefork ....... Multi Process Model with Preforking (Apache 1.3)
> dexter ........ Multi Process Model with Threading via Pthreads
>                 Constant number of processes, variable number of threads
> mpmt_pthread .. Multi Process Model with Threading via Pthreads
>                 Variable number of processes, constant number of
>                 threads/child (= Apache/pthread)
> spmt_os2 ...... Single Process Model with Threading on OS/2
> winnt ......... Single Process Model with Threading on Windows NT
>
> I believe the first 3 run only under Unix.

So, does that still leave mod_perl serializing access until everything is rewritten to be thread-safe?

Les Mikesell [EMAIL PROTECTED]
Re: mod_perl 2.x/perl 5.6.x ?
According to Eric Cholet:
> This is for using Apache 2.0's pthread MPM, of course you can build perl
> 5.6 non threaded and use apache 2.0's prefork model but then it's not
> as exciting :-)

Does apache 2.0 let you run a prefork model under NT?

Les Mikesell [EMAIL PROTECTED]
Re: [OT] Proxy Nice Failure
According to Jim Winstead: > On Apr 21, Michael hall wrote: > > I'm on the new-httpd list (as a lurker, not a developer :-). Any ideas, > > patches, help porting, etc. would be more than welcome on the list. > > Mod-Proxy is actually kind of in limbo, there are some in favor of > > dropping it and others who want it. I guess the code is difficult and > > not easy to maintain and that's why some would just as soon see it go > > unless someone steps up to maintain (redesign) it. There are some > > working on it and apparently it will survive in some form or another. > > Now would be a perfect time for anybody to get involved in it. > > mod_backhand may also be the solution people are after. > > http://www.backhand.org/ Is anyone using this in production? It has the disadvantage of requiring itself to be compiled into both the front and back ends. I have some backend data being generated by custom programs running on NT boxes and would like to have a fail-over mechanism. We may end up running Windows load balancing on them, but that means paying for Advanced Server (about $3k extra) on each of them when a smart proxy would work just as well. I also didn't see how to access it through mod_rewrite which is how I control most of my proxy access. This might be possible by letting backhand handle certain directories and RewriteRules to map to those directories - I just didn't get that far yet. > (Sorry for the off-topic-ness.) > > I'm also coming around to the idea that caching proxies have some > very interesting applications in a web-publishing framework outside > of caching whole pages. All sorts of areas to exploit mod_perl in > that sort of framework. This can help with the load on a backend, but after watching squid logs for a while I decided that a lot of extra traffic is passed through when users hit the 'refresh' button which will send the 'Pragma: no-cache' header with the request. 
For things like images you may be better off using RewriteRules on the front end to short-circuit the request, and other popular pages that should update only at certain intervals can be done with cron jobs and delivered from the front end as well. So, from a mod_perl perspective I don't care much about the caching side but really need the relationship between mod_rewrite and mod_proxy. I haven't found equivalent built-in functionality in any other server. Les Mikesell [EMAIL PROTECTED]
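The short-circuit idea above might look like this on the front end (paths and port are made up):

```apache
RewriteEngine On
# Images are answered straight from the front end's filesystem
# and never reach the mod_perl back end.
RewriteRule ^/images/(.*)$ /var/www/static/images/$1 [L]
# Dynamic requests are proxied ([P]) to the back end.
RewriteRule ^/app/(.*)$ http://localhost:8080/app/$1 [P,L]
```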
Re: [OT] Proxy Nice Failure
According to Joshua Chamas: > I like the mod_proxy module in reverse httpd accel mode, but > am interested in having some nicer failure capabilities. I have > hacked in this kind of stuff before but was wondering if anyone > had any official patch for this kind of stuff. > > The nicety under consideration is having the mod_proxy module do > X retries every Y seconds instead of failing immediately. This > would allow a backend Apache httpd do a stop/start without any > downtime apparent to the client besides the connection breaks > from the stop. Depending on how much preloading is done at the > parent httpd, a start could take 5-10 seconds, and during this > time it would be cool if the proxy could just queue up requests. > > Anyone does this with some nice ProxyTimeout ProxyRetry config > options? Thanks. No, but I'd like to add to the wishlist that it should do load balancing and failover across multiple backends too. Mod_jserv appears to have a pretty good scheme of letting you describe the balanced sets and an interface to view and control the backend status. The only problem is that it is restricted to the jserv protocol for the backends. Les Mikesell [EMAIL PROTECTED]
Proxy hijackers?
(Off topic again, but lots of people here are using reverse proxy). For a while I had 'ProxyRequests On' in my httpd.conf mistakenly thinking that it was necessary to make ProxyPass and mod_rewrite proxying work. Then I noticed entries in my logfile where remote clients were sending requests with full http:// URLs for other remote sites. I've turned the function off, but the requests keep coming in, mostly appearing to request ads from somewhere, with referring pages in Russia and China. Is this a common practice, and what are they trying to accomplish by bouncing them through my server? Les Mikesell [EMAIL PROTECTED]
Re: Modperl/Apache deficiencies... Memory usage.
According to Gunther Birznieks: > If you want the ultimate in clean models, you may want to consider coding > in Java Servlets. It tends to be longer to write Java than Perl, but it's > much cleaner as all memory is shared and thread-pooling libraries do exist > to restrict 1-thread (or few threads) per CPU (or the request is blocked) > type of situation. Do you happen to know of anyone doing xml/xsl processing in servlets? A programmer here has written some nice looking stuff but it appears that the JVM is never garbage-collecting and will just grow and get slower until someone restarts it. I don't know enough java to tell if it is his code or the xslt classes that are causing it. Yes, I know this is off-topic for mod_perl except to point out that the clean java model isn't necessarily trouble free either. Les Mikesell [EMAIL PROTECTED]
Re: Modperl/Apache deficiencies... Memory usage.y
According to [EMAIL PROTECTED]: > > > > This is basically what you get with the 'two-apache' mode. > > To be frank... it's not. Not even close. It is the same to the extent that you get a vast reduction in the number of backend mod_perl processes. As I mentioned before, I see a fairly constant ratio of 10-1 but it is really going to depend on how fast your script can deliver its output back to the front end (some of mine are slow). It is difficult to benchmark this on a LAN because the thing that determines the number of front-end connections is the speed at which the content can be delivered back to the client. On internet connections you will see many slow links, and letting those clients talk directly to mod_perl is the only real problem. > Especially in the case that > the present site I'm working on where they have certain boxes for > dynamic, others for static. This is a perfect setup. Let the box handling static content also proxy the dynamic requests to the backend. > This is useful when you have one box > running dynamic/static requests..., but it's not a solution, it's a > workaround. (I should say we're moving to have some boxes static > some dynamic... at present it's all jumbled up ;-() Mod_rewrite is your friend when you need to spread things over an arbitrary mix of boxes. And it doesn't hurt much to run an extra front end on your dynamic box either - it will almost always be a win if clients are hitting it directly. A fun way to convince yourself that the front/back end setup is working is to run something called 'lavaps' (at least under Linux, you can find this at www.freshmeat.net). This shows your processes as moving colored blobs floating around with the size related to memory use and the activity and brightness related to processor use. It is pretty dramatic on a box typically running 200 1Meg frontends, and 20 10Meg backends. 
You get the idea quickly what would happen with 200 10Meg processes instead - or trying to funnel through one perl backend. > Well, now you're discussing threaded perl... a whole separate bag of > tricks :). That's not what I'm talking about... I'm talking about > running a standard perl inside of a threaded enviro. I've done this, > and thrown tens of thousands of requests at it with no problems. You could simulate this by configuring the mod_perl backend to only run one child and let the backlog sit in the listen queue. But you will end up with the same problem. > I > believe threaded perl is an attempt to allow multiple simultaneous > requests going into a single perl engine that is "multi threaded". > There are problems with this... and it's difficult to accomplish, and > altogether a slower approach than queuing because of the context > switching type overhead. Not to mention the I/O issue of this... > yikes! makes my head spin. What happens in your model - or any single threaded, single processing model - when something takes longer than you expect? If you are just doing internal CPU processing and never have an error in your programs you will be fine, but much of my mod_perl work involves database connections and network I/O to yet another server for the data to be displayed. Some of these are slow and I can't allow other requests to block until all prior ones have finished. The apache/mod_perl model automatically keeps the right number of processes running to handle the load, and since I mostly run dual-processor machines I want at least a couple running all the time. Les Mikesell [EMAIL PROTECTED]
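Spreading requests over an arbitrary mix of boxes with mod_rewrite, as suggested above, could be sketched like this (hostnames and paths are invented):

```apache
RewriteEngine On
# Hand different parts of the URL space to different back ends.
RewriteRule ^/prices/(.*)$ http://nt-box.internal:8080/prices/$1 [P,L]
RewriteRule ^/perl/(.*)$   http://perl-box.internal:8080/perl/$1 [P,L]
```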
Re: Modperl/Apache deficiencies... Memory usage.y
According to [EMAIL PROTECTED]: > Does anyone know of any program which has been developed like this? > Basically we'd be turning the "module of apache" portion of mod_perl > into a front end to the "application server" portion of mod_perl that > would do the actual processing. This is basically what you get with the 'two-apache' mode. > It seems quite logical that something > like this would have been developed, but possibly not. The separation > of the two components seems like it should be done, but there must be > a reason why no one has done it yet... I'm afraid this reason would be > the apache module API doesn't lend itself to this. The reason it hasn't been done in a threaded model is that perl isn't stable running threaded yet, and based on the history of making programs thread-safe, I'd expect this to take at least a few more years. But using a non-mod-perl front end proxy with ProxyPass and RewriteRule directives to hand off to a mod_perl backend will likely get you a 10-1 reduction in backend processes, and you already know the configuration syntax for the second instance. Les Mikesell [EMAIL PROTECTED]
Re: mod_perl virtual web hosting
According to Tom Brown: > > strikes me (as an owner of a web hosting service) that DSO is the wrong > answer. What does DSO buy you? NOTHING except a complete waste of > memory... It doesn't really hurt anything but you still want a proxy. > it strikes me that you _want_ a frontend proxy to feed your requests to > the smallest number of backend daemons which are most likely to already > have compiled your scripts. This saves memory and CPU, while simplifying > the configuration, and of course, for a dedicated backend daemon, DSO buys > nothing... even if that daemon uses access handlers, it still always needs > mod_perl If someone is ambitious enough to write some C code, what you really need is a way for mod_proxy to actually start a new backend if none are already running for a particular vhost. Then the backend processes should extend the concept of killing off excess children to the point of completely exiting after a certain length of inactivity. The same approach could also work for running scripts under different userids. I think sometime in the distant past I have seen programs started by inetd that would continue to listen in standalone mode for a while to make subsequent connections faster but I don't recall how it worked. Les Mikesell [EMAIL PROTECTED]
Re: OT: (sort of) AuthDBMUserFile
According to Stas Bekman: > On Mon, 10 Apr 2000, Bill Jones wrote: > > AuthDBMUserFile > > > > Is there a difference between DBM and GDBM? > > I always thought they were the same... > > > > I found sleepcat (DB) and GDBM, but where is DBM? > sleepycat == berkeley db (a product of sleepycat.com) > > gdbm == gnu dbm library > > dbm == a global name for all UNIX dbm implementations. This is the > name of the database type, like RDBMS or flat file are the names of the > relational DB implementations driven by SQL and simple text file > with line/record respectively... > > But this should go to the perl newsgroup the best... It turns out to be a long story, because both the gdbm and db libraries offer ndbm emulation, which is the interface that apache actually uses, so when you build mod_perl you end up with whichever library was linked first by the perl compile. If you build apache without mod_perl you end up with whatever appears to be your ndbm library, which will vary between systems, and especially Linux distributions. Some use gdbm - RedHat uses db but seems to have switched to the file-incompatible 2.x version in their 6.0 release. You may have to work a bit to get both apache and perl to use the same library, and you are likely to have trouble copying the dbm file from one system to another or even using it after an upgrade. Les Mikesell [EMAIL PROTECTED]
Re: mod_perl shared perl instances?
According to Soulhuntre: > > My favorite is throwing a proxy in front... > > The issue in question was getting the perl core out of the address space of > the httpd process entirely :) > > Velocigen does this nicely (though there is a performance penalty) and even > allows dedicated machines running the perl services that talk to the > webserver over sockets. Look at the problem from the opposite direction. There is not that much overhead in including http in the perl processes actually doing the work, and (unsurprisingly...) http turns out to be a reasonable protocol to forward http requests instead of inventing something else. So, consider mod_perl to be the backend perl process with a non-mod_perl frontend using ProxyPass and/or RewriteRules to talk to it, and you end up with the same thing, except more flexible, with a single config file style, and the ability to test the backend with only a browser. Les Mikesell [EMAIL PROTECTED]
Re: mod_perl, Apache and zones?
According to John Darrow: > I need to be able to run the same sets of pages in several different > environments (basically just different environment variables). The problem > is that once a process is initiated in a certain environment it can't be > changed for the life of the process. The first stroke to solve this would > say that I need to run several Apache servers each with a slightly different > config file. Then each environment would run on its own port. What determines the correct values per hit? You can use SetEnv in virtualhost context, SetEnvIf on the fly based on several considerations, or do some real black magic with the E= flag in a RewriteRule. > The problem with that solution is that maintaining the servers becomes a > headache. You have to bounce many different Apache servers everytime > something changes. This turns out to be a mixed blessing when you really only want to change one of the environments but it is probably too much trouble except for drastically different servers like with/without mod_perl or a secure proxy. > With java servlets there is a feature that allows you to specify different > zones within a single Apache server. Each zone has a unique config file and > so it can deal with the environments that way. I'm wondering if there's > anything similar for mod_perl? The above, plus the ability of mod_rewrite or ProxyPass on a front end server to proxy different requests to different backends. > The only other thing I can think of is to just have several copies of the > same scripts, and then depending on the URI they will be smart enough to > know to set their own environment variables upon initialization. Apache > would then keep processes separate depending on where the scripts are using > the URI. But that's sort'f ugly. 
It is a good idea to make sure the reason for the different behaviour based on these values is clear within the script, especially if there is any chance that different people will make changes separately to the apache config and the scripts. If letting the script parse the URI itself makes it more obvious then it isn't as ugly as the magic to hide it. Les Mikesell [EMAIL PROTECTED]
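The three mechanisms mentioned above (SetEnv per virtual host, SetEnvIf on the fly, and the RewriteRule E= flag) might be sketched like this; the variable name, addresses, and paths are all hypothetical:

```apache
# Per-virtual-host value:
<VirtualHost 10.0.0.1>
    ServerName dev.example.com
    SetEnv APP_ZONE development
</VirtualHost>

# Set on the fly based on the request:
SetEnvIf Request_URI ^/staging/ APP_ZONE=staging

# Or via mod_rewrite's E= flag:
RewriteEngine On
RewriteRule ^/zone2/(.*)$ /scripts/$1 [E=APP_ZONE:zone2]
```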
Re: [RFC] holding a mod_perl conference
According to Nathan Torkington: > Jason Bodnar writes: > > I guess my big problem with the ORA conference last year was that all the > > tutorials I attended last year tried to cover the basics and didn't leave > > enough time for in-depth information. > > Yup, I agree. The level of the material offered, though, is in the > hands of the program chair. So when I put together the Perl > conference tutorials, I try to make sure that at any one time there's > something that *I* would like to see, as well as something that a less > advanced (more intermediate) programmer might want to attend. So this > year there's Damian Conway's "making your mind go boom with OO in Perl" > talks, as well as MjD's hardcore Perl. Same here, but I'd like to make the point that it is pretty difficult to guess what someone else's concept of beginning, intermediate, and advanced topics really means. This is especially true when a program's author is speaking or personal styles of perl coding are involved. It would be nice if some outlines/slides of the material could be online before the signup deadlines so the actual session could spend more time in discussion and question/answer than covering the overview. Les Mikesell [EMAIL PROTECTED]
Re: external access to intranet
According to James Hart: No they won't - the browser will strip the URL seen from its perspective back to the host and add the path. On the scheme Jona describes, where the host the browser sees is 'gateway_server', that would then be retranslated by the proxy into a request for the document 'myfile.html' on the intranet host 'path' - the correct intranet host would be lost. As long as the part of the path that triggered the first ProxyPass directive remains (and it will for any relative link in the same directory or lower), the request for it will also be ProxyPass'd to the same back end server and the correct relative location. Les Mikesell [EMAIL PROTECTED]
Re: mod_perl weaknesses? help me build a case....
According to Soulhuntre: > > Well, let me turn that around, has anyone succeeded in getting mod_perl > running well on Apache on win2k? Your problem here is going to be that mod_perl is not thread-safe and will serialize everything when running under the threaded model that apache uses under windows. If your scripts are fast enough you might be able to live with this if you use it as a back end to a lightweight front-end proxy which a busy site needs anyway. Les Mikesell [EMAIL PROTECTED]
Re: external access to intranet
According to Jonas Nordström: > But doesn't that only pass on the request and then return the HTML-files > unchanged? I also want to change the links inside the HTML-bodies on the > fly, so that the users can continue to "surf the intranet". For example, if > the HTML contains "" I want to change that to > "https://gateway_server/intranet_host/path/myfile.html" Relative links like that will work correctly without any changes because the browser supplies the current protocol/path from its perspective. Absolute links that start with a / or http: will be broken, though. Les Mikesell [EMAIL PROTECTED]
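A sketch of the gateway arrangement under discussion; the internal hostname is invented. Giving each intranet host its own path prefix is what keeps relative links working, as described above:

```apache
# On gateway_server: one path prefix per intranet host.
ProxyPass        /intranet_host/ http://intranet_host.internal/
ProxyPassReverse /intranet_host/ http://intranet_host.internal/
# ProxyPassReverse also rewrites Location: headers on redirects,
# though not links inside HTML bodies.
```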
Re: authDBI connect on init?
According to Adam Gotheridge: > Is there any way to get AuthDBI to connect on initialization of the web server > like you can with "DBI->connect_on_init(...)"? AuthDBI uses DBI, so all you have to do is use Apache::DBI and make sure the connect string is exactly the same. Les Mikesell [EMAIL PROTECTED]
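A minimal sketch of that arrangement in httpd.conf, using a <Perl> section; the DSN and credentials are placeholders and must match AuthDBI's connect string exactly:

```apache
# Load Apache::DBI before anything that uses DBI so it can
# intercept and cache connections.
PerlModule Apache::DBI
<Perl>
# Pre-open the handle in each child as it starts.
# DSN, user, and password here are hypothetical.
Apache::DBI->connect_on_init("dbi:mysql:mydb;host=dbhost",
                             "user", "password");
</Perl>
```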
Re: Embedded Perl XML Extensions ...
According to Joshua Chamas: > > > > Will you be able to emulate the IIS/ASP 'transformNode' > > > > method that renders html from xml and xsl components? > > http://www.sci.kun.nl/sigma/Persoonlijk/egonw/xslt/ > > transformNode & XSLT are not on my short list of TODOs for > Apache::ASP, though certainly one could be easily made > available when the other is too! It looks like the one above is usable if not complete. I don't think anything exactly follows the standard yet. The idea would be to get the calling syntax established and simple xsl documents working. > However, transformNode is not an ASP method per se but a method > of one of Microsoft's COM XML objects. So far I have stayed away > from implementing Microsoft's COM objects like the BrowserCap, the > AdRotator, and the like, and if I did provide an XML object, > I would be starting a general perl port of as many Microsoft COM > objects as I could do. The difference here is that XSL transformations *should* follow vendor-independent standards and I'd like to see it available on as many platforms as possible. A developer here was already complaining about how the MS sdk for xml has MS-specific extensions in all the examples. > What is on my short list is providing a developer interface > for rendering XML to HTML. Now I have looked at XSLT, and it is > pretty scary, even if supposedly necessary for managing some large > document sets. Its purpose does seem a bit orthogonal to the ASP > style of interleaving code & content, with the first goal being > simplicity and power, and then maintainability with decomp through > includes and modules. It is more geared toward separating data from presentation, although in theory you can transform data to other data just as well. I think it will become extremely useful in cases where you have machine-generated data that can be extracted in xml format and you want to present it in a number of customized ways. 
> XSLT seems to seriously complicate the XML rendering issue, and > perhaps unnecessarily? Has it occurred to anyone that XSLT is just > another programming language, and one that looks like an HTML doc? Yes, that is exactly the point, aside from being platform and vendor-independent. Someone who wants to do HTML presentation now has a language designed for that. Plus, using IE5, you can push the work out to the browser and let it cache or update the components separately. > I'll likely wait this out a bit to see how it shapes up, and would > finally need an Apache::ASP user to require this functionality before > pursuing it further. Leslie, are you that user? Our situation may be a lost cause. The data in question (commodity trading stuff) lives on an NT box that has a nice native interactive interface, but the web side has been done with a unix mod_perl wrapper to get the speed (maybe 30 price pages a second at peak times) and reliability we need. However, we provide custom views of this for brokers, and you can't just turn HTML coders loose on production mod_perl scripts even if you only need a slightly different look. So, we've added XML output to the NT server and started using XSL for some of the new variations, so far running under apache jserv. However, since nearly everyone else here is an NT developer, the push is to put it all in one box, which is probably going to mean running IIS and MS-asp so we can sell something easier for other people to manage. However, since I haven't completely sold my soul to the dark side (and we aren't sure yet how well it will work...) I'd really like the xsl to be portable, and usable under mod_perl. I've had some problems with memory use running java servlets, although I like the load-balancing feature in the current apache jserv. Les Mikesell [EMAIL PROTECTED]
Re: Embedded Perl XML Extensions ...
According to Gerald Richter: > > > > Will you be able to emulate the IIS/ASP 'transformNode' > > method that renders html from xml and xsl components? > > > > I don't know what transformNode exactly does, but I hope we find a common > design, which will allow you to easily plug in a module that does whatever > transformation on the XML (or HTML) that you like. The idea is to apply the stylesheet transformations specified by an XSL document to everything below a node (possibly the root) of another XML document. For now it looks like the only XSLT transformer in perl is at: http://www.sci.kun.nl/sigma/Persoonlijk/egonw/xslt/ There are some samples of the kinds of things you can do at http://msdn.microsoft.com/downloads/samples/internet/xml/multiple_views/default.asp although they are somewhat Microsoft-centric in that they apply the transformation inside the browser (and only work with IE). What we need is the ability to detect the browser type and render to HTML on the server if the browser can't do it. The only things I've seen so far that can do this besides IIS have been in Java. Les Mikesell [EMAIL PROTECTED]
Re: Embedded Perl XML Extensions ...
According to Joshua Chamas: > I have been thinking about some XML style extensions for > Apache::ASP, and know that you are looking at the same thing > with Embperl, and was hoping that we could sync up on the > APi, so there might be a common mindset for a developer when > using our extensions, even if the underlying implementation > were different. Will you be able to emulate the IIS/ASP 'transformNode' method that renders html from xml and xsl components? Les Mikesell [EMAIL PROTECTED]
Re: What's the benefits of using XML ?
According to Perrin Harkins: > On Fri, 11 Feb 2000, Matt Sergeant wrote: > > XML and XSLT can provide this. Rather than write pages to a > > specific style with toolbars in the right place, and format things how I > > want them, I can write in XML, and down transform to HTML using a > > stylesheet. When I want to change the look of my site I change the > > stylesheet and the site changes with it. This isn't magic - a lot of > > template systems exist today - many of them written right here for > > mod_perl. But XSLT allows me to leverage those XML skills again. And I > > think authoring XML is easier than most of those template tools (although > > XSLT isn't trivial). > > Just a small plug for one of my favorite modules: Template Toolkit is very > easy to use, and a couple of people have written plug-ins for it that > handle arbitrary XML. If you're working on a project where you need to > turn XML into HTML and want non-programmers to write and maintain the HTML > templates, they may find it easier than XSLT. Of course, it's a Perl-only > solution. One other thing about XML/XSL is that if the browser is IE5, instead of doing the transformation to HTML on the server you can send instructions to the browser to get each separately and render to HTML itself. This could be a big win if you have rapidly changing data in XML format because you end up sending essentially static pages (or perhaps passing through directly from some other source) and the unchanging XSL will be cached on the browser side. IE5 can also let you view raw XML in a fairly intelligent way even without XSL. Les Mikesell [EMAIL PROTECTED]
Re: [SITE] possible structure suggestion
According to Matt Sergeant: > >This would be cool. However, in at least a few cases, the PHP docs leave > > something to be desired. I remember looking up the Oracle connect calls for > > PHP online once (for 3.0), and having people hold a debate about how a > > function really worked, because the docs were wrong, but no one really > > knew what was right--one guy would say, "I think it really returns THIS," > > and another would respond with, "No, I think it returns THAT." Gives you a > > nice warm and fuzzy feeling about quality of documentation... :) > > Of course they could have just resolved it by looking at the source :) But when the documentation and source disagree, chances are that both are wrong. Les Mikesell [EMAIL PROTECTED]
Re: Perl and SQL, which is the most scalable SQL server to use?
According to Ryan, Aaron: > We found that we are quickly using up the max connections to the MySQL > database > and when we raise the max connections, the performance gets worse. What was > MySQL designed > to handle, should it be able to handle 2000 connections, or is that outside > the scope > of the design of the server. > > Does anyone have any suggestions or similar experiences with scaling. Have you already taken the step of setting up a non-mod_perl proxy front end and avoiding sending any unnecessary requests (images, static html, etc.) to the database-connected backend? If not, you may be able to reduce the number of connections you need by a factor of 10 or so. Les Mikesell [EMAIL PROTECTED]
Re: Performance advantages to one large, -or many small mod_perl program?
According to Ask Bjoern Hansen: > > > Is there any way to make mod_perl reload modified modules in some > > directories but not check at all in others? I'd like to avoid > > the overhead of stat'ing the stable modules every time but still > > automatically pick up changes in things under development. > > I made that an option for Apache::StatINC. I've made it and lost it a few > times, but some day I will get it done, tested and committed. :) > > I was going to make the trigger on the module name though. Hmn. Maybe looking > at the directory too would make sense. We have a lot of local modules, some used both by mod_perl and normal scripts. To make it easier to keep them updated across machines and avoid having to 'use lib' everywhere, I put a symlink under the normal site_perl to a directory that is physically with other web related work. So in this case it really is a component of the module name (mapped by perl to the symlinked directory...) that I would want as the trigger, but it would be equally likely that someone would 'use lib' to pick up their local development. I'd put a lot more programming into the modules and out of the web scripts if modifications were always picked up automatically without having to stat the modules that rarely change. Les Mikesell [EMAIL PROTECTED]
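For the simple always-stat case that Apache::StatINC already handles, the configuration is just a couple of httpd.conf lines; note this stats everything in %INC on each request, since the selective per-directory version discussed above was still a wishlist item:

```apache
# Reload any module in %INC whose file has changed on disk.
PerlModule Apache::StatINC
PerlInitHandler Apache::StatINC
```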
Re: Performance advantages to one large, -or many small mod_perl program?
According to Stas Bekman: > A module is a package that lives in a file of the same name. For > example, the Hello::There module would live in Hello/There.pm. For > details, read L. You'll also find L helpful. If > you're writing a C or mixed-language module with both C and Perl, then > you should study L. > [snipped] Is there any way to make mod_perl reload modified modules in some directories but not check at all in others? I'd like to avoid the overhead of stat'ing the stable modules every time but still automatically pick up changes in things under development. Les Mikesell [EMAIL PROTECTED]
Re: Using network appliance Filer with modperl
According to Elizabeth Mattijsen: > We have been using such a setup for over 2 years now. The only real issue > we've found is not so much with mod_perl itself, but with MySQL. If you > put your databases on the NetApp, either have a seperate central database > server, or make damn sure you do not use the same database from two > different front-end servers. We've seen database corruption that way > (using Linux front-end servers with NFS 2). It is probably reasonable for MySQL to assume that only one server is accessing the files at once since it has its own remote client access protocol. Do you happen to know if there is a performance difference for MySQL between local drives and a NetApp? Les Mikesell [EMAIL PROTECTED]
Re: Using network appliance Filer with modperl
According to Tim Bunce: > > > And, just to be balanced, has anyone _not_ found any 'gotchas' and is > > > enjoying life with a netapp or similar NFS file serving appliance? > > > > I haven't really had any gotchas in terms of performance. But you do > > have to plan things out if you are going to be working in a mixed > > NFS+CIFS environment because of permission issues. Also I had a really > > hard time accessing a share with samba. Supposedly that is fixed now > > but I have not had reason to test it. > > We wouldn't be using CIFS or samba. Just plain simple NFS file serving. I'm using one as a cvs repository and master copy that gets distributed via rsync to some other hosts, but it isn't really serving the production hosts in real time yet. One thing I was hoping to do was to let it serve static content directly from my master image (you can get an http server also), but it keeps crashing when I try to give permission to a couple of subdirectories only and deny or issue a redirect on attempts to access anything else. This is probably my configuration error - I just haven't been sitting by a phone long enough to deal with a call to tech support recently... The thing doesn't seem especially fast at serving http, but it doesn't slow down much with hundreds of concurrent requests either. Les Mikesell [EMAIL PROTECTED]
Re: squid performance
According to Greg Stark: > Leslie Mikesell <[EMAIL PROTECTED]> writes: > > > The 'something happens' is the part I don't understand. On a unix > > server, nothing one httpd process does should affect another > > one's ability to serve up a static file quickly, mod_perl or > > not. (Well, almost anyway). > > Welcome to the real world however where "something" can and does happen. > Developers accidentally put untuned SQL code in a new page that takes too long > to run. Database backups slow down normal processing. Disks crash slowing down > the RAID array (if you're lucky). Developers include dependencies on services > like mail directly in the web server instead of handling mail asynchronously > and mail servers slow down for no reason at all. etc. Of course. I have single httpd processes screw up all the time. They don't affect the speed of other httpd processes unless they consume all of the machine's resources or lock something in common. I suppose if you have a small limit on the number of backend programs you could get to a point where they are all busy doing something wrong. > > If you are using squid or a caching proxy, those static requests > > would not be passed to the backend most of the time anyway. > > Please reread the analysis more carefully. I explained that. That is > precisely the scenario I'm describing faults in. I read it, but just wasn't convinced. I'd like to understand this better, though. What did you do to show that there is a difference when netscape accesses different hostnames for fast static content as opposed to the same one where a cache responds quickly but dynamic content is slow? I thought Netscape would open 6 or so separate connections regardless and would only wait if all 6 were used. That is, it should not make anything wait unless you have dynamically-generated images (or redirects) tying up the other connections besides the one supplying the main html. 
Do you have some reason to think it will open fewer connections if they are all to the same host? Les Mikesell [EMAIL PROTECTED]
Re: squid performance
According to Greg Stark: > > > 1) Netscape/IE won't intermix slow dynamic requests with fast static requests > > >on the same keep-alive connection > > > > I thought they just opened several connections in parallel without regard > > for the type of content. > > Right, that's the problem. If the two types of content are coming from the > same proxy server (as far as NS/IE is concerned) then they will intermix the > requests and the slow page could hold up several images queued behind it. I > actually suspect IE5 is cleverer about this, but you still know more than it > does. They have a maximum number of connections they will open at once but I don't think there is any concept of queueing involved. > > > 2) static images won't be delayed when the proxy gets bogged down waiting on > > >the backend dynamic server. > > Picture the following situation: The dynamic server normally generates pages > in about 500ms or about 2/s; the mod_perl server runs 10 processes so it can > handle 20 connections per second. The mod_proxy runs 200 processes and it > handles static requests very quickly, so it can handle some huge number of > static requests, but it can still only handle 20 proxied requests per second. > > Now something happens to your mod_perl server and it starts taking 2s to > generate pages. The 'something happens' is the part I don't understand. On a unix server, nothing one httpd process does should affect another one's ability to serve up a static file quickly, mod_perl or not. (Well, almost anyway). > The proxy server continues to get up to 20 requests per second > for proxied pages, for each request it tries to connect to the mod_perl > server. The mod_perl server can now only handle 5 requests per second though. > So the proxy server processes quickly end up waiting in the backlog queue. If you are using squid or a caching proxy, those static requests would not be passed to the backend most of the time anyway. 
> Now *all* the mod_proxy processes are in "R" state and handling proxied > requests. The result is that the static images -- which under normal > conditions are handled quickly -- become delayed until a proxy process is > available to handle the request. Eventually the backlog queue will fill up and > the proxy server will hand out errors. But only if it doesn't cache or know how to serve static content itself. > Use a separate hostname for your pictures, it's a pain on the html authors but > it's worth it in the long run. That depends on what happens in the long run. If your domain name or vhost changes, all of those non-relative links will have to be fixed again. Les Mikesell [EMAIL PROTECTED]
Re: splitting mod_perl and sql over machines
According to Jeffrey W. Baker: > I will address two points: > > There is a very high degree of parallelism in modern PC architecture. > The I/O hardware is helpful here. The machine can do many things while > a SCSI subsystem is processing a command, or the network hardware is > writing a buffer over the wire. Yes, for performance it is going to boil down to contention for disk and RAM and (rarely) CPU. You just have to look at pricing for your particular scale of machine to see whether it is cheaper to stuff more in the same box or add another. However, once you have multiple web server boxes the backend database becomes a single point of failure so I consider it a good idea to shield it from direct internet access. Les Mikesell [EMAIL PROTECTED]
Re: Running 2 httpd on one mache question.
According to Martin A. Langhoff: > But there's one thing that I can't imagine. When I run top, how do I > tell memory/cpu consumption from lightweight daemons from the mem/cpu > consumption from mod_perl daemons? Sorry for the low mod_perl content, but if you are running Linux and have X available on the network, it is fun to use 'lavaps' which you can find with a search at www.freshmeat.net. It shows processes as though they were in a lava lamp, with the size corresponding to memory usage, the color and movement related to activity. Besides being fun, it gives you a good feeling for the relationship of the front/back end servers, database backends, java servlets, and whatever else you might be running. Les Mikesell [EMAIL PROTECTED]
Re: Advise is needed...
According to BeerBong: > > I need protect directory (/abonents) on server. > User database lies on Radius Server. > > I have front-end (apache proxy) + back-end apache servers. > I've heard that authentication process must works on front-end server. No, if you are using ProxyPass or RewriteRules with the [p] flag the authentication can happen on the back end. If the authentication directives are in .htaccess files, they will not be referenced before the proxy action. Les Mikesell [EMAIL PROTECTED]
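For example, a minimal front-end fragment along these lines (hostname and port are placeholders) passes the protected area through untouched, so the back end's own auth setup sees the request:

```apache
# Front-end (proxy) httpd.conf -- sketch only, adjust names to taste.
# /abonents is proxied as-is; authentication happens on the back end.
ProxyPass        /abonents http://backend.example.com:8080/abonents
ProxyPassReverse /abonents http://backend.example.com:8080/abonents
```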
Re: squid performance
According to Greg Stark: > I tried to use the minspareservers and maxspareservers and the other similar > parameters to let apache tune this automatically and found it didn't work out > well with mod_perl. What happened was that starting up perl processes was the > single most cpu intensive thing apache could do, so as soon as it decided it > needed a new process it slowed down the existing processes and put itself into > a feedback loop. I prefer to force apache to start a fixed number of processes > and just stick with that number. I've never noticed that effect, but I thought that apache always grew in increments of 'StartServers' so I've tried to keep that small, equal to MinSpareServers, and an even divisor of MaxSpareServers just on general principles. Maybe you are starting a large number as you cross the minspareservers boundaries. Les Mikesell [EMAIL PROTECTED]
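A sketch of the fixed-pool approach Greg describes, for a mod_perl httpd.conf (the numbers are illustrative, not recommendations):

```apache
# Pin the process pool so apache never forks new perl interpreters
# under load: start at the ceiling and never reap "spares".
StartServers     25
MinSpareServers  25
MaxSpareServers  25
MaxClients       25
```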
Re: Using mod_backhand for load balancing and failover.
According to Jeffrey W. Baker: > > Is anyone using mod_backhand (http://www.backhand.org/) for load > balancing? I've been trying to get it to work but it is really flaky. > For example, it doesn't seem to distribute requests for static content. > Bah. I just started to look at it (and note that there was a recent update) but haven't got it configured yet. I thought it distributed whatever it is configured to handle - it shouldn't be aware of the content type. The parts I don't like just from looking at it are that the backend servers all have to have the module included as well (I was hoping to balance some non-apache servers too) and it looks like it may be difficult or impossible to make it mesh with RewriteRules. The mod_jserv load balancing looks much nicer at least at first glance, but of course that doesn't help for mod_perl. Les Mikesell [EMAIL PROTECTED]
Re: squid performance
According to Greg Stark: > > I think if you can avoid hitting a mod_perl server for the images, > > you've won more than half the battle, especially on a graphically > > intensive site. > > I've learned the hard way that a proxy does not completely replace the need to > put images and other static components on a separate server. There are > two reasons that you really really want to be serving images from another > server (possibly running on the same machine of course). I agree that it is correct to serve images from a lightweight server but I don't quite understand how these points relate. A proxy should avoid the need to hit the backend server for static content if the cache copy is current unless the user hits the reload button and the browser sends the request with 'pragma: no-cache'. > 1) Netscape/IE won't intermix slow dynamic requests with fast static requests > on the same keep-alive connection I thought they just opened several connections in parallel without regard for the type of content. > 2) static images won't be delayed when the proxy gets bogged down waiting on > the backend dynamic server. Is this under NT where mod_perl is single threaded? Serving a new request should not have any relationship to delays handling other requests on unix unless you have hit your child process limit. > Eg, if the dynamic content generation becomes slow enough to cause a 2s > backlog of connections for dynamic content, then a proxy will not protect the > static images from that delay. Netscape or IE may queue those requests after > another dynamic content request, and even if they don't the proxy server will > eventually have every slot taken up waiting on the dynamic server. A proxy that already has the cached image should deliver it with no delay, and a request back to the same server should be serviced immediately anyway. > So *every* image on the page will have another 2s latency, instead of just a > 2s latency for the entire page. 
This is worst in Netscape of course > where the page can't draw until all the image sizes are known. Putting the sizes in the IMG SRC tag is a good idea anyway. > This doesn't mean having a proxy is a bad idea. But it doesn't replace putting > your images on pics.mydomain.foo even if that resolves to the same address and > run a separate apache instance for them. This is a good idea because it is easy to move to a different machine if the load makes it necessary. However, a simple approach is to use a non-mod_perl apache as a non-caching proxy front end for the dynamic content and let it deliver the static pages directly. A short stack of RewriteRules can arrange this if you use the [L] or [PT] flags on the matches you want the front end to serve and the [P] flag on the matches to proxy. Les Mikesell [EMAIL PROTECTED]
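As a sketch of that short stack of RewriteRules for the front-end server (paths, extensions, and the backend port are placeholders):

```apache
# Front-end httpd.conf: serve static content locally, proxy the rest.
RewriteEngine On
# Static trees and file types stay on the front end ([L] stops here).
RewriteRule ^/images/ - [L]
RewriteRule \.(gif|jpe?g|html?)$ - [L]
# Everything else goes to the mod_perl backend ([P] = proxy).
RewriteRule ^/(.*)$ http://127.0.0.1:8080/$1 [P]
```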
Re: splitting mod_perl and sql over machines
According to Stas Bekman: > We all know that mod_perl is quite hungry for memory, but when you have > lots of SQL requests, the sql engine (mysql in my case) and httpd are > competing for memory (also I/O and CPU of course). The simplest solution > is to bump in a stronger server until it gets "outgrown" as the loads > grow and you need a more sophisticated solution. In a single box you will have contention for disk i/o, RAM, and CPU. You can avoid most of the disk contention (the biggest time issue) by putting the database on its own drive. I've been running dual CPU machines, which seems to help with the perl execution although I haven't really done timing tests against a matching single CPU box. RAM may be the real problem when trying to expand a Linux pentium box. > My question is a cost-effectiveness of adding another cheap PC vs > replacing with new expensive machine. The question is what are the > immediate implications on performance (speed)? Since the 2 machines have to > interact between them. e.g. when setting the mysql to run on one machine > and leaving mod_perl/apache/squid on the other. Anyone did that? Yes, and a big advantage is that you can then add more web servers hitting the same database server. > Most of my requests are served within 0.05-0.2 secs, but I'm afraid that > adding a network (even a very fast one) to deliver mysql results, will > make the response time go much higher, so I'll need more httpd processes > and I'll get back to the original situation where I don't have enough > resources. Hints? The network just has to match the load. If you go to a switched 100M net you won't add much delay. You'll want to run persistent DBI connections, of course, and do all you can with front-end proxies to keep the number of working mod_perl's as low as possible. 
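The persistent-DBI part is mostly a one-line configuration change; a sketch for the mod_perl server (the module name is real, the startup file path is a placeholder):

```apache
# Load Apache::DBI before anything that uses DBI; after that,
# DBI->connect in scripts transparently reuses cached connections.
PerlModule Apache::DBI
# Scripts and handlers are then pulled in as usual, e.g.:
PerlRequire /usr/local/apache/conf/startup.pl
```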
> I know that when you have a really big load you need to build a cluster of > machines or alike, but when the requirement is in the middle - not too > big, but not small either it's a hard decision to do... especially when > you don't have the funds :) The real killer time-wise is virtual memory paging to disk. Try to estimate how much RAM you are going to need at once for the mod_perl processes and the database and figure out whether it is cheaper to put it all in one box or two. If you are just borderline on needing the 2nd box, you might try a different approach. You can use a fairly cheap box as a server for images and static pages, and perhaps even your front-end proxy server as long as it is reliable. Les Mikesell [EMAIL PROTECTED]
Re: modperl success story
According to Barb and Tim: > It could really enhance your integrity if you also > presented honest evaluations of the downsides of Perl. Perl has two downsides. One is the start-up time for the program, and mod_perl solves this for web pages. > The promotion of Perl on this site is so ubiquitous and > one sided, and Perl has such a bad reputation in many ways, > that somebody like me has a hard time swallowing the sunny > prognostications and finally diving in, unless I see > full honesty. The language itself is hard enough to swallow. > Just a suggestion. The other downside is that it is fast and easy to write working programs that are difficult for someone else to understand. That is, it accepts an individual's style instead of forcing something universal. I guess everyone here is willing to accept that tradeoff. Les Mikesell [EMAIL PROTECTED]
Has anyone tried mod_backhand?
Has anyone tried mod_backhand (in the apache module registry) in a front end apache proxying to multiple mod_perl'd back end servers? It claims to load balance, keeping track of the status and load of the back end servers, which also appear to need the module included. Is there a better free alternative that will detect dead backend servers and avoid them? (I'm actually having more trouble with java servlets right now, but the idea is the same with mod_perl...). Les Mikesell [EMAIL PROTECTED]
Re: Managing session state over multiple servers
According to James G Smith: > > Sun (and most other server/disk providers - e.g., SGI, NetAPP) provides a > volume manager that can handle RAID volumes. If you have a spare disk in the > system and a disk goes bad in the volume, you can take the bad disk out, put > the spare in, and then replace the bad disk at a convenient time. The machine Is anyone running MySQL or PostgreSQL with the drives NFS mounted from a NetAPP? If so, does it perform as well or better than with local drives? Les Mikesell [EMAIL PROTECTED]
Re: pool of DB connections ?
According to Oleg Bartunov: > > > Currently I have 20 apache servers which > > > handle 20 connections to database. If I want to work with > > > another database I have to create another 20 connections > > > with DB, so I will have 40 postgres backends. This is too much. > > I didn't write all details but of course I already have 2 servers setup. Are you sure you need 20 concurrent backend servers? If you have enabled apache's 'server-status' reporting you can watch the backend during some busy times to see how many are doing anything. It is probably better to have too few servers (the front end will wait as long as the requests don't overflow the listen queue) than so many that the machine starts paging virtual memory to disk. Les Mikesell [EMAIL PROTECTED]
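Enabling the status report is a small configuration addition; a sketch (the allowed address is a placeholder):

```apache
# Requires mod_status; ExtendedStatus adds per-request detail.
ExtendedStatus On
<Location /server-status>
    SetHandler server-status
    order deny,allow
    deny from all
    allow from 10.0.0.1
</Location>
```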
Re: pool of DB connections ?
According to Oleg Bartunov: > I'm using mod_perl, DBI, ApacheDBI and was quite happy > with persistent connections httpd<->postgres until I used > just one database. Currently I have 20 apache servers which > handle 20 connections to database. If I want to work with > another database I have to create another 20 connections > with DB. Postgres is not multithreading > DB, so I will have 40 postgres backends. This is too much. > Any experience ? Try the common trick of using a lightweight non-mod_perl apache as a front end, proxying the program requests to a mod_perl backend on another port. If your programs live under directory boundaries you can use ProxyPass directives. If they don't you can use RewriteRules with the [p] flag to selectively proxy (or [L] to not proxy). This will probably allow you to cut the mod_perl httpd's at least in half. If you still have a problem you could run two back end httpds on different ports with the front end proxying the requests that need each database to separate backends. Or you can throw hardware at the problem and move the database to a separate machine with enough memory to handle the connections. Les Mikesell [EMAIL PROTECTED]
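A sketch of the two-backend variant (URL prefixes and ports are placeholders):

```apache
# Front end: send each application to the backend that holds its
# database connections; everything else is served here.
RewriteEngine On
RewriteRule ^/app1/(.*)$ http://127.0.0.1:8080/app1/$1 [P]
RewriteRule ^/app2/(.*)$ http://127.0.0.1:8081/app2/$1 [P]
RewriteRule .* - [L]
```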
Re: Dynamic rewriting
According to Atipat Rojnuckarin: > Hi, > > I think mod_rewrite (URI Translation) gets called > before Apache::AuthenDBI/AuthzDBI, so mod_rewrite has > no way of knowing which group a user belongs to. > You'll probably need to write your own customized > handler(s) to do what you want. Mod_rewrite actually gets called twice: once where you expect, at the uri->filename translation, and later where it can try again after the fact. If you put rules in the .htaccess file they only have the 2nd chance and should be run after authorization. Since AuthzDBI exports the REMOTE_GROUP, you might have access to this for some black magic in the rewrite. Les Mikesell [EMAIL PROTECTED]
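If AuthzDBI really does leave REMOTE_GROUP in the environment by the time the per-directory rewrite pass runs, the black magic might look something like this .htaccess sketch (untested; the group name and target path are hypothetical):

```apache
# .htaccess: per-directory rewrites run after authentication.
RewriteEngine On
RewriteCond %{ENV:REMOTE_GROUP} ^editors$
RewriteRule ^(.*)$ editors/$1 [L]
```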
Re: Performance problems
According to Rasmus Lerdorf: > > I have a site running Apache-1.3.6, PHP-3.0.9, mod_perl-1.20 on Solaris with > > a Sybase Database. And it has some performance flaws. This site has > > thousands of hits per day (not sure of how many, though) and it should be > > faster (few images, no animations whatsoever). > > > > Can anybody tell me what could be done here? EmbPerl instead of PHP? > > mod_perl/apache tuning? Should database access be done through PHP or > > mod_perl? When should I use PHP? and mod_perl? do I need both? > > Well, which one are you using for talking to Sybase with? Choosing one or > the other would reduce your memory requirements a bit. Performance-wise > they really are quite similar. Choosing one over the other is more of a > personal preference thing. If the hits are coming over the internet, you could reduce memory usage with a lightweight front-end apache proxying to your heavyweight backends. The front end can deliver any static content directly. If you currently have both mod_perl and php scripts, you could try serving from separate backend servers each with only one interpreter included running on different ports. The front end can use RewriteRules with the [p] flag to transparently pass requests to the right server. The mod_perl connections to Sybase should be using Apache::DBI if possible. Les Mikesell [EMAIL PROTECTED]
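A sketch of that split for the front end (extensions and ports are placeholders):

```apache
# Route each script type to a backend built with only its interpreter.
RewriteEngine On
RewriteRule ^/(.*\.php3?)$ http://127.0.0.1:8081/$1 [P]
RewriteRule ^/(.*\.pl)$    http://127.0.0.1:8080/$1 [P]
```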
Re: Problems with mod_perl 1.2.1 and apache 1.3.9 - newbie - Please help!
According to Scott Chapman: > I'm new to compiling my own software and attempting to get mod_perl > and apache to work together. I have Redhat 6.0. Most Redhat versions have problems that go away if you compile and install your own perl. > + doing sanity check on compiler and options > ** A test compilation with your Makefile configuration > ** failed. This is most likely because your C compiler > ** is not ANSI. Apache requires an ANSI C Compiler, such > ** as gcc. The above error message from your compiler > ** will also provide a clue. > Aborting! I think it is picking up the perl compiler options from the stock version on your system, and it doesn't match the compiler that is currently installed. There may be an easier fix, but building perl yourself should take care of it. If you end up with perl in /usr/local/bin, be sure to kill the old ones in /usr/bin and replace them with symlinks to keep everything else happy. Les Mikesell [EMAIL PROTECTED]
Re: Stonehenge::Throttle, round 2 - CPU used by an IP
According to Randal L. Schwartz: > >>>>> "Leslie" == Leslie Mikesell <[EMAIL PROTECTED]> writes: > > Leslie> How about an option to redirect to a different machine instead? I've > Leslie> considered digging out an old, slow 386 to handle greedy clients > Leslie> without obviously denying service to them. > > Most evil spiders I've seen don't really pay attention to a redirect, > in any way other than "this page wasn't ready". In fact, that > redirect probably would put the scope of interest outside the spider's > suck zone. That would serve the purpose just as well. Most of my spiders are actually programs written to harvest our commodities trading data and since we don't have any stated policy to prohibit that, I don't really want to refuse the requests or fail but I do want to limit the impact on other activity on the server. I suspect they would adapt to follow but it wouldn't matter to me either way. So far most of this activity has been after the exchange closings, so it hasn't had a serious impact on our mid-day peak usage but it may reach a point where I have to try something. Are you sure that keeping track of the client addresses doesn't turn it into an overall loss compared to just completing the requests? Les Mikesell [EMAIL PROTECTED]
Re: Stonehenge::Throttle, round 2 - CPU used by an IP
According to Randal L. Schwartz: > > So, I modified my throttler to look at the recent CPU usage over a > window for a given IP. If the percentage exceeds a threshold, BOOM > they get a 503 error and a correct "Retry-After:" to tell them how > long they're banned. How about an option to redirect to a different machine instead? I've considered digging out an old, slow 386 to handle greedy clients without obviously denying service to them. Les Mikesell [EMAIL PROTECTED]
Re: Ye Ol' Template System Thread
According to Sam Tregar: > > > I think a lot of unnecessary complexity comes from the fact that > > most of the template systems (and apache modules in general) want > > to output the html as a side effect instead of accumulating the > > page in a buffer or just returning a string containing the html plus > > a status value to the caller. > > That's a very strange analysis. HTML::Template (my small contribution to > the genre) does no printing to the user - it returns a chunk of HTML ready > for the consumer or an error if something went wrong. I don't really see > that this significantly reduces the complexity of using a templating > system! > > Rather, I think that most of the simplicity of HTML::Template comes from > its strictly "one-way" interface. The template file contains only > output-oriented structures. Input can only come from the perl side. I > think that much of the "slippery slope" referred to previously comes from > allowing the template file to perform processing of its own - to set > variables and call procedures, for example. Right. You don't see the problem until you add conditionals and flow control - and perhaps not even until you try to reuse some existing pages as sub-elements of another. Apache is moderately good at handling: <!--#include virtual="just about anything..." --> within mod_include, even mixing different handlers and proxied elements, but it is very awkward to wrap any kind of conditional execution or parameter passing in the mod_include language, and impossible to do anything where the condition depends on a sub-element. > Hmm... Shouldn't someone be suggesting a grand-unified templating system > right about now? Or maybe we're finally beyond that? I hope so! The > truth of the matter is that there is no one ultimate way to tackle > generating HTML from Perl. 
What I'm looking for is a 'nestable' way of handling the logic flow and HTML construction that will allow a page to be used as a stand-alone item (perhaps displayed in a frameset) or included in another page, but when it is included I'd like to have the option of letting its execution return a status that the including page could see before it sends out any HTML. Les Mikesell [EMAIL PROTECTED]
Re: Trying not to re-invent the wheel
According to Rasmus Lerdorf: > > > Those introduce more complex problems. > > > > And they are, of course, inevitable with almost any templating > > system. > > You know, PHP was once just a templating system. [...] > Then I figured it would be a good idea to add stuff like > IF/LOOPS/etc so I could manipulate my tags a little bit. > > Now, 5 years later, people are writing template systems that sit on top of > PHP because they are writing business logic in PHP which means yet another > template system is needed to separate code from layout. > > I wonder how many layers of templates we will have 5 years from now. I think a lot of unnecessary complexity comes from the fact that most of the template systems (and apache modules in general) want to output the html as a side effect instead of accumulating the page in a buffer or just returning a string containing the html plus a status value to the caller. This means that you can't easily make nested sub-pages without knowing ahead of time how they will be used, and worse, if you get an error in step 3 of generating a page you can't undo the fact that steps 1 and 2 are probably already on the user's screen. If the template language offers some flow control and logic and the ability for one 'page' to return a status plus a string containing its html to another page that includes it, then you wouldn't need a different template system to separate logic from layout. You would just put them in different pages, letting the 'code' page include the layout elements it wants. Les Mikesell [EMAIL PROTECTED]
mod_proxy_add_forward and logging
The recent message about proxy_add_forward reminded me of a simple change I made that might help anyone who wants to track the logs matching the source/destination of proxied requests. I also activated mod_unique_id, and in mod_proxy_add_forward, after

    if (r->proxyreq) {
        ap_table_set(r->headers_in, "X-Forwarded-For",
                     r->connection->remote_ip);

I added (too lazy to write a whole module for this...):

        ap_table_set(r->headers_in, "X-Parent-Id",
                     ap_table_get(r->subprocess_env, "UNIQUE_ID"));

Then I added elements in the LogFormat for %{UNIQUE_ID}e and %{X-Parent-Id}i. The result is that the UNIQUE_ID field is different on every hit and can be used as a database key. If the first server hit delivers the content directly, the X-Parent-Id will be logged as "-". If it passes the request by proxy to another server it will be the same as the UNIQUE_ID (I wasn't expecting that, but it is interesting). The machine that receives the proxy request will log the X-Parent-Id containing the same value as the sender's UNIQUE_ID, which can then be used to tie them together. Les Mikesell [EMAIL PROTECTED]
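For reference, the corresponding log configuration might look like this (the format name is arbitrary, and the other fields are the usual common-log ones):

```apache
# UNIQUE_ID comes from mod_unique_id; X-Parent-Id is the header set
# by the patched mod_proxy_add_forward described above.
LogFormat "%h %l %u %t \"%r\" %>s %b %{UNIQUE_ID}e %{X-Parent-Id}i" traced
CustomLog logs/access_log traced
```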
Re: Generic Server
According to Matt Sergeant: > Well I'll show by example. Take slash (the perl scripts for slashdot.org) - > it's got a web front end and now available is an NNTP front end. Wouldn't > it be nice to run both in-process under mod_perl, so you could easily > communicate between the two, use the same logging code, use the same core > modules, etc. That's what I'm thinking of. If the common code is written as perl modules or shared C libraries wrapped as perl modules, you can easily use the same routines in different programs. There is no need to include them all in places where they aren't needed. > Besides that, with a mod_perl enabled generic server rather than an inetd > server there's no loading config files for each request, no starting a > process, and Apache 2.0 (and I'm assuming mod_perl) will be available as a > threaded server, so it's only 1 10-20M process, not 100+. Server start-up time is generally only relevant for protocols that make a connection-per-request, and HTTP is about the only thing that does that. Regardless, it is simple enough to make a dedicated server listen on each port if you prefer. Threads may help with the memory problem but I'm not convinced yet. It has taken about 15 years to get the standard libraries mostly thread-safe. I don't think it will happen instantly with perl. Maybe with java, where they were designed in from the start... Les Mikesell [EMAIL PROTECTED]
Re: Generic Server
According to Matt Sergeant: > > >Would it be possible to have a generic server, like Apache, but not just > > >for HTTP - something that could also serve up NNTP connections, FTP > > >connections, etc. It seems to me at first look this should be possible. > > > > > >As I can see it there's a few key components to Apache: > > > > > >forking tcp/ip server > > >file caching/sending > > >header parsing > > >logging > > > > Sounds a lot like inetd to me, IMHO. > > Maybe I'm wrong, but inetd is just #1 of those points. And slow too. Inetd just decides which server to start for which protocol, and the only slow part is starting up a large program which may need to read a config file. However you didn't explain why you would like to replace these typically small and fast programs with a 10-20Meg mod_perl process. I can see where having a common modular authentication method would be useful, but what else would they have in common? Les Mikesell [EMAIL PROTECTED]
Re: modperl in practice
According to [EMAIL PROTECTED]: > I will probably do that next. I am not clear what the difference is > between running squid and doing a mod_proxy, in the case of all dynamic > content... a remark in the tuning guide that i saw back in June was vague about > this, saying it wasn't clear whether squid can cache larger documents or > if there is some limit. I don't think there is much difference in practice, although when I was running a squid front-end I noticed a substantial number of 'client-refresh' hits pulling images that should have been in the cache from the back-end server anyway. > I did read the guide, although I didn't re-read it enough, as Vivek > so very gently pointed out to me. I think that there needs to be an > intro that says, basically, if you expect more than N requests per > minute for dynamic content, then start from this config, and the rest > of the tuning guide is all about tweaking that. It really isn't hits/second that matters - it is how fast each server process can move on to the next request. If all of your clients were on a fast local network a proxy would just be extra overhead, but on the internet you will have a certain number of slow connections that tie up the servers. Les Mikesell [EMAIL PROTECTED]
Re: modperl in practice
According to [EMAIL PROTECTED]: > > I still have resisted the squid layer (dumb > stubbornness I think), and instead got myself another IP address on the > same interface card, bound the smallest most light weight separate > apache to it that I could make, and prefixed all image requests with > http://1.2.3.4/.. voila. that was the single biggest jump in throughput > that I discovered. You still have another easy jump, using either squid or the two-apache approach. Include mod_proxy and mod_rewrite in your lightweight front end, and use something like:

    RewriteRule ^/images/(.*)$ - [L]

to make the front-end deliver static files directly, and at the end:

    RewriteRule ^(.*)$ http://127.0.0.1:8080$1 [P]

to pass the rest to your mod_perl httpd, moved to port 8080. If possible with your content, turn off authentication in the front end server. > .. people were connecting to the site via this link, and packet loss > was such that retransmits and tcp stalls were keeping httpd heavies > around for much longer than normal.. Note that with a proxy, this only keeps a lightweight httpd tied up, assuming the page is small enough to fit in the buffers. If you are a busy internet site you always have some slow clients. This is a difficult thing to simulate in benchmark testing, though. > comments or corrections most welcome.. I freely admit to not having > enough time to read the archives of this group before posting. I probably won't be the only one to mention this, but you might have a lot more time if you had, or at least gone through the guide at http://perl.apache.org/guide/ which covers most of the problems. Les Mikesell [EMAIL PROTECTED]
Re: Rewrite to handler
According to Eric Cholet: > > What is the most straightforward way to make a RewriteRule > > map an arbitrary URL directly to a handler? > Do you really need to rewrite, I mean can't you just use > a <Location> container? Yes, that will work, but putting all of the special cases into RewriteRules makes it easier to see what is going on (for me at least). Also, especially on the front-end side it makes it easier to tune the config file where you might want to alternate between proxying a program off to another server or running it locally as a CGI. I guess on the back end mod_perl side I never proxy again so it doesn't matter so much. Les Mikesell [EMAIL PROTECTED]
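For comparison, the container approach Eric suggests attaches the handler directly, with no mod_rewrite involved. A minimal sketch, assuming an Apache::Registry script at a made-up URL:

```apache
# Attach the mod_perl handler straight to the URL the clients use.
<Location /old-program.cgi>
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options +ExecCGI
</Location>
```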
Rewrite to handler
What is the most straightforward way to make a RewriteRule map an arbitrary URL directly to a handler? I can do it by setting a handler for a directory, putting the file there and rewriting to that location, or by setting a handler for a mime-type and specifying [T=] for that type in the RewriteRule. Have I missed a more direct way? (I want to mix-n-match PerlRun, Registry, and CGI for some old programs without changing the visible URLs.) Les Mikesell [EMAIL PROTECTED]
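The mime-type variant uses mod_rewrite's T= flag to force a type that some module has registered a handler for; the classic documented example hands the rewritten file to mod_cgi via its magic type. A sketch with hypothetical paths:

```apache
# Force the magic CGI type so mod_cgi runs the target,
# while the browser keeps seeing the old URL.
RewriteEngine On
RewriteRule ^/legacy/(.*)$ /real-programs/$1 [T=application/x-httpd-cgi]
```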
Re: 2 servers setup: mod_proxy cacheing
According to Oleg Bartunov: > Hmm, very interesting solution but I can't explain every user > please configure your browser in special way to browse my site. I didn't mean that was a solution - just that I set everything up according to directions and it did cache when it had an explicit proxy request from a client, but I still didn't get the internal proxy requests to cache. Not much of my dynamic content could be meaningfully cached anyway so it is working out pretty well to have a strict split between static files and uncached proxy passthrough. There are a couple of exceptions where I rebuild static files every few minutes. If I had a lot of those I would probably work harder on controlling a cache. > The problem becomes more complex if we'll take into account > not only proxy cacheing feature but also clients browser > cache. It turns out to be hard to get this exactly right. Most clients won't cache anything with a '?' at all, and intermediate caches have differing policies about /cgi-bin/ and other well-known hints about dynamic content. Also, the original Expires: header uses the client's concept of time, which may not match yours (and there is a bug in an old version of Netscape that causes it to reload animated gifs for each animation step if an Expires: header is present). There are now Cache-Control: headers that give finer grained control, but not everything uses them. Les Mikesell [EMAIL PROTECTED]
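mod_expires takes some of the pain out of the clock-skew problem, since it computes Expires: from the server's clock at request time, and recent Apache versions emit a matching Cache-Control: max-age alongside it. A sketch with made-up lifetimes:

```apache
# Static images can be cached for a day; HTML only briefly.
ExpiresActive On
ExpiresByType image/gif "access plus 1 day"
ExpiresByType text/html "access plus 10 minutes"
```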
Re: 2 servers setup: mod_proxy cacheing
According to Stas Bekman: > > I have 2 servers setup and would like to configure > > frontend to cache output from backend using mod_proxy. > > I tried to add various headers but never seen any files > > in proxy directory. I didn't find any recommendation in your > > guide how to configure servers to get Apache cacheing working. > > A good question... Any of the mod_proxy users care to give Oleg (and the > guide :) a hand here? I use squid as a front end, that's why I don't have > any examples from mod_proxy... I tried it several versions of apache ago and never got it to cache anything as a reverse proxy, but with the same setup I could configure a browser to use the box as a normal proxy and it would cache those pages. I used squid for a while, then switched to an apache that serves the images and static pages directly and proxies everything else to mod_perl (using mod_rewrite to decide). Les Mikesell [EMAIL PROTECTED]
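For anyone who wants to experiment, the Apache 1.3 mod_proxy caching directives look like the sketch below; the paths and sizes are assumptions, and, as described above, whether they actually take effect for reverse-proxied requests has been version-dependent.

```apache
# Proxy everything through to the backend...
ProxyPass / http://127.0.0.1:8080/
ProxyPassReverse / http://127.0.0.1:8080/

# ...and ask mod_proxy to cache responses on disk.
# CacheSize is in KB; the expiry and GC values are in hours.
CacheRoot /var/cache/httpd-proxy
CacheSize 102400
CacheGcInterval 4
CacheMaxExpire 24
CacheDefaultExpire 1
```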
Re: DB connection pooling
According to Stefan Reitshamer: > > Sorry if this is in a FAQ, but I dug through a lot of FAQs and archives and > couldn't find it: > > Is there any database connection pooling built into mod_perl, or DBI? Not exactly, but you can use Apache::DBI to make the connections persistent, and you can greatly reduce the number of httpds holding connections by using a non-mod_perl front end httpd that uses ProxyPass or RewriteRules to direct the requests that need database service to a mod_perl backend. Les Mikesell [EMAIL PROTECTED]
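The Apache::DBI setup is just a matter of loading it before anything pulls in DBI, so it can intercept DBI->connect and hand back a cached handle. A minimal httpd.conf sketch for the mod_perl backend:

```apache
# Load Apache::DBI first; scripts keep calling DBI->connect unchanged,
# but each httpd child now reuses one connection per dsn/user/password.
PerlModule Apache::DBI
PerlModule DBI
```

Note that connections are cached per child process, not pooled across processes, so the database still sees one connection per backend httpd.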
Re: Comparison PHP,clean mod_perl script,Apache::ASP,HTML::Embperl
According to BeerBong: > Huh, another test > test.epl > -- > use DBI; > $dbh = DBI->connect("DBI:Oracle:SIMain","test","test"); > PHP3 and mod_perl script with DBD::Oracle - 24 Requests per second Is this with or without Apache::DBI loaded? Les Mikesell [EMAIL PROTECTED]
Re: mod_perl + mod_jserv: what's your milage?
According to Nick Bauman: > The servlet zones concept also has no parallel in > mod_perl. That is one of the most compelling reasons > for load distribution. Actually there is not much practical difference between the jserv interface and using mod_proxy (with or without mod_rewrite) to pass requests to a mod_perl enabled httpd. Neither one provides quite everything you want for load balancing and dead host detection, but they are better than nothing. There is a module that does claim to proxy with good load balancing (mod_backhand) but I have not had time to see if it will mesh with mod_rewrite. > So, the reason you don't allow direct connections to > your mod_perl system is because of security? You > didn't explicitly say... No, just memory use. The mod_perl httpd is likely to use 20 megs of memory and may be able to serve hundreds of requests per second. However, if you let it talk directly to a client browser you are at the mercy of every overloaded router and modem on the internet as to how long each of those requests actually take to complete. It is difficult to model a slow client in a benchmark test, but this is a real problem in production. The jserv interface acts about the same way, but the threaded java server might not have as much impact anyway. > I have both loaded as DSOs. I haven't yet encountered > the ApJServMount and RewriteRules you speak of, as > this only when you are mucking directly with the > Apache API, which I haven't needed to yet. In theory > it should give you added flexibility (at the tradeoff > of complexity) My scheme is for the front end httpd to accept and log everything but deliver only static and unprotected files itself. Things that require any processing, including *.shtml files are proxied through to mod_perl httpds spread over several machines. 
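One crude way to get an "arbitrary set of back ends" out of stock mod_rewrite is its rnd: map type, which picks randomly among alternatives in a map file; it does nothing about dead host detection, matching the caveat above. A sketch, with hypothetical host names and map path:

```apache
# /etc/httpd/backends.map would contain a line like:
#   perl  back1:8080|back2:8080|back3:8080
RewriteEngine On
RewriteMap backends rnd:/etc/httpd/backends.map
RewriteRule ^/app/(.*)$ http://${backends:perl}/app/$1 [P]
```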
The programs were written without this scheme in mind (and we encouraged people to bookmark everything) so the RewriteRules that force the proxy to the right backends are pretty ugly and arbitrary. But, the ApJServMount fits into this model pretty well and with new development it is easy enough to map the servlets into a directory. Les Mikesell [EMAIL PROTECTED]
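The mixed dispatch described here amounts to something like the following in the front end's config; the zone, host, and URL names are all invented for illustration.

```apache
# Servlets: mount a JServ zone at a directory-style URL.
ApJServMount /servlets /myzone

# Perl: an arbitrary legacy URL is forced over to a mod_perl backend.
RewriteEngine On
RewriteRule ^/cgi-bin/report\.cgi$ http://backend1:8080/cgi-bin/report.cgi [P]
```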
Re: mod_perl + mod_jserv: what's your milage?
According to Nick Bauman: > Anyone out there using mod_perl and mod_jserv together > on a production system? What are your results? I'm > playing around with a combined mod_perl, mod_jserv > (both as DSO's) apache and it seems to work pretty > slick, but I'm wondering if I'm building a > Frankenstein that will bring me grief later... I have them in a production setup, but not using mod_jserv heavily yet. I took the approach of compiling mod_jserv statically into the front end httpd with a separate mod_perl version as a back end. The only real ugliness is the stack of ApJServMount and RewriteRules that sort out where everything goes. The one quirk I noticed is that if you try to include both mod_perl and mod_jserv statically, the mod_perl tests will not run because the test config file does not set any jserv authentication. But, I only tried that for a low-usage test machine - I don't think I would run a non-proxied mod_perl in production with internet connections anyway. > Personal benchtests seem to be awesome, with Perl > slightly leading Java in performance (using ibm's JDK > brought the figures much closer together than > Blackdown's) but real world is different as we all > know. Mostly I am investigating java because we have data that can be obtained in xml and formatted using various xsl stylesheets and there is no support for this in perl yet (and I'm too lazy to write my own). I am very impressed by the ability to develop the java servlets on windows or unix and copy the servlet bytecode to the other and run it unchanged. Likewise you can transparently run the apache on one machine and the jserv on another without regard to the operating system. Les Mikesell [EMAIL PROTECTED]
Re: Performance problems
According to ricarDo oliveiRa: > I have a site running Apache-1.3.6, PHP-3.0.9, mod_perl-1.20 on Solaris with > a Sybase Database. And it has some performance flaws. This site has > thousands of hits per day (not sure of how many, though) and it should be > faster (few images, no animations whatsoever). > > Can anybody tell me what could be done here? EmbPerl instead of PHP? > mod_perl/apache tuning? Should database access be done through PHP or > mod_perl? When should I use PHP? and mod_perl? do I need both? Are you using Apache::DBI to hold persistent connections to the backend database? It may be the connect time that is the problem. Les Mikesell [EMAIL PROTECTED]
Re: external module and mod_perl
According to Dustin Tenney: > > mod_perl ties the Perl STDOUT filehandle to Apache i/o, but this is > > different from C stdout, which cannot easily be hooked to Apache i/o without > > a subprocess pipe (as mod_cgi does). I don't know of a decent workaround. > > do you have access to the library sources? > > Yea I wrote the library. I was hoping to get this to work because I > really need the extra performance that C is giving me. Basically it > parses an html file and outputs the results to stdout. It uses > open/read/write to do this. Is there any documentation anywhere on how > this works? Thanks for the info. Have you measured a performance difference here? I'd be very surprised at anything you can do in C being enough faster at doing something to an html file to make up for the process startup time compared to letting mod_perl code do it - or are you doing number crunching? Anyway, if it is your own library why not make it into a .xs module to get the speed without having to start another process and hook its output? Or if you do want the output, let your perl code read it from a pipe or execute it in backticks. Les Mikesell [EMAIL PROTECTED]