Here's a posting from Philip to the mod_perl list that can be an inspirational read for many of us Apache::ASP users!
Philip, if you have a polished form of this doc sometime, I could bundle it in the Apache::ASP docs, perhaps under a section like BEST PRACTICES or SCENARIOS. Actually I think the below is pretty good as is, so let me know if/when you want me to post it.

--Josh

-------- Original Message --------
Subject: RFC: Security/Performance Best Practices (long)
Date: Sun, 11 Nov 2001 05:48:19 -0500 (EST)
From: Philip Mak <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>

Recently, I've been using Apache::ASP to program a new version of an existing website that gets over 5 million page views per month. This website will have to fit on a RaQ4i (450MHz) server, so I'm pretty conscious about performance. Security is also important due to the popularity of the site.

I've read various pieces of documentation and combined them into the following strategy for security and performance on a mod_perl driven website. I haven't seen these combined strategies formally written up anywhere, so I thought I would try to do that and ask you guys for suggestions.

This is a bit unorganized right now, but all the general concepts should be there. The goal is to produce a document that explains all the principles, and shows all the configuration directives required to accomplish this.

This website runs off a MySQL database. Although all the webpages are generated dynamically, they don't change often (unless the webmaster explicitly updates them).

I set up a lightweight frontend httpd (port 80) that proxies to a heavyweight mod_perl backend httpd (port 8001). mod_gzip is installed on the frontend to deliver compressed HTML pages for faster download time. mod_proxy_add_forward is also installed so that the backend logs the true IP address of the request in its logs.

In my account, I have these directories:

    httpd:  apachectl, httpd.conf, logs for the mod_perl httpd
    perl:   DocumentRoot for backend httpd
    web:    DocumentRoot for frontend httpd
    global: contains *.pm, startup.pl, global.asa (for Apache::ASP)

The proxying is configured in the frontend httpd.conf as follows:

    1: RewriteEngine On
    2: RewriteRule ^/(.+)\.asp$ http://127.0.0.1:8001/$1.asp [L,P]
    3: RewriteRule ^/(.+)\.pl$ http://127.0.0.1:8001/$1.pl [L,P]
    4: RewriteCond /home/aw/perl%{REQUEST_URI}index.asp -f
    5: RewriteRule ^(.*)/$ http://127.0.0.1:8001$1/ [L,P]

Line 2 passes any URL with a .asp extension to the backend.
Line 3 passes any URL with a .pl extension to the backend.
Lines 4 and 5 pass any request for a directory to the backend, if there is an index.asp file in that directory.

Notice that to the outside world, the hostname/port of the website is exactly the same whether it's being served by the frontend or backend. I prefer this approach since it lets my <img src> tags refer to images in the same directory, for example. It also doesn't require an extra DNS lookup on the client end (which it would if the mod_perl server and non-mod_perl server were on different hostnames).

I don't have a ProxyPassReverse directive since I haven't thought about it; I wouldn't need it anyway since I don't do any redirecting (at least not right now), but I'll probably end up adding it just in case.

The following users were created on the system:

    aw:       I log in as this user. Group = aw, httpd
    aw_guest: mod_perl httpd runs as this user. Group = aw
    httpd:    lightweight httpd runs as this user. Group = httpd

aw owns all of the files except httpd/logs.

The "web" directory is world readable. It only contains images that everyone can get from the web server anyway. The "httpd" and "global" directories are group readable, so only aw and aw_guest can read them. "perl" is world readable, but the files inside are only group readable (this allows the httpd user to tell what files exist, but nothing more).
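
As a rough illustration of that permission layout, here is a minimal sketch. It builds the four directories inside a throwaway temp directory rather than the real home directory, and the mode numbers (755, 750, 640) are my reading of "world readable" and "group readable", not values taken from the post. On the real system the group would also be set with chgrp, which is left out here because a sandbox cannot assume the aw group exists.

```shell
#!/bin/sh
# Hypothetical sandbox standing in for the "aw" home directory.
demo=$(mktemp -d)
mkdir "$demo/web" "$demo/perl" "$demo/httpd" "$demo/global"
touch "$demo/perl/index.asp" "$demo/global/global.asa"

# "web": world readable, it only holds images.
chmod 755 "$demo/web"

# "httpd" and "global": owner and group only.
chmod 750 "$demo/httpd" "$demo/global"
chmod 640 "$demo/global/global.asa"

# "perl": anyone may list and traverse the directory (so a test such as
# "RewriteCond ... -f" can see that index.asp exists), but only the owner
# and group may read the files themselves.
chmod 755 "$demo/perl"
chmod 640 "$demo/perl/index.asp"

# Show the resulting modes. Remove "$demo" by hand when done.
ls -ld "$demo/web" "$demo/httpd" "$demo/global" "$demo/perl"
ls -l "$demo/perl"
```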
This protects my source code (and the database passwords it contains!) from being browsed by others.

So that I won't accidentally create world readable files, I have this line in ~/.profile for "aw":

    umask 027

This creates files as rw-r----- by default. Files I upload by FTP still default to mode rw-r--r--, but I only upload image files that way (I use vi through ssh to edit the code) so that's perfect.

There is a level of isolation here; in case I write an insecure script that gets hacked, the hacker will only gain access to the aw_guest account. The aw_guest account can read all my site's files, but it can't write to any of them. Also, the MySQL username/password used by the website has read-only access to the database.

Apache::ASP is set so that every page has headers indicating that it can be cached for up to one hour:

    use HTTP::Date;   # assumed source of time2str()
    $Response->AddHeader('Last-Modified', time2str(time));
    $Response->{CacheControl} = 'public';
    $Response->{Expires} = 3600;

I could have set the expiry time higher, but I decided to put it at 3600 so that in case I change content on the website and forget to manually clear the cache, it won't be out of date by more than 1 hour. In terms of performance issues, 1 hour should be long enough such that the backend httpd server doesn't have to do too much work.

In my frontend httpd server, I have a basic cache configuration (comments are on their own lines, since Apache does not accept a trailing comment after a directive):

    ProxyRequests on
    CacheRoot /home/httpd/cache
    # cache size of 10 MB
    CacheSize 10000
    # clean up the cache every hour
    CacheGcInterval 1
    # nothing lives in the cache for > 24 hours
    CacheMaxExpire 24
    # default expiry time is 1 hour
    CacheDefaultExpire 1

I can force the frontend httpd server to reload a specific page from the backend by viewing it in my browser and clicking Reload (when reloading, Opera and Netscape will send a cache-control header specifying that fresh data is to be pulled); this is useful when I'm tweaking a page and have to keep reloading it to see how it comes out.
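
The same forced reload can probably be triggered from the command line. The sketch below assumes curl is installed and that the frontend honors the no-cache request headers the way it does for a browser Reload; the function name refresh_page, the CURL override variable, and the example URL are made up for illustration.

```shell
#!/bin/sh
# refresh_page URL
# Request URL with the "no-cache" headers a browser sends on Reload and
# print only the response headers. Setting CURL=echo (a hypothetical
# override) prints the command's arguments instead of running curl.
refresh_page() {
    ${CURL:-curl} -s -o /dev/null -D - \
        -H 'Pragma: no-cache' \
        -H 'Cache-Control: no-cache' \
        "$1"
}

# Example with a placeholder URL:
#   refresh_page http://www.example.com/index.asp
```

Comparing the Date and Expires headers of two consecutive responses is one way to see whether the page was really fetched again from the backend.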

I also wrote a quick suid script (owned by httpd, mode -rwsr-xr-- so that only users in group "httpd" can execute it) that does "rm -rf /home/httpd/cache/*" to allow the "aw" user to clear the frontend cache manually (which I might want to do if I change a bunch of stuff).

    [aw aw]$ cat /usr/local/bin/clear_cache.sh
    #!/bin/bash
    IFS=' '
    PATH='/bin'
    /bin/echo Clearing httpd cache...
    /bin/rm -rf /home/httpd/cache/*
    /bin/echo Cache cleared.

Since suid shell scripts are unsafe due to race conditions (and Linux doesn't even allow them for that reason), I needed to write a wrapper C program:

    #include <stdio.h>
    #include <unistd.h>

    #define SCRIPT "/usr/local/bin/clear_cache.sh"

    int main(int argc, char **argv)
    {
        (void)argc;
        execv(SCRIPT, argv);   /* returns only if the exec fails */
        perror("execv");
        return 1;
    }

That C program is the one that is set suid.

According to ab (ApacheBench) benchmarks:

- Without frontend caching, the server can do about 20 requests per second.
- With frontend caching, it goes as high as 400 (network bandwidth permitting).

In summary, the strategies I have employed here are:

- principle of minimal permissions and isolation
- using mod_proxy to proxy to mod_perl httpd, with caching
- mod_gzip to speed download times

The performance aspect of this site is still uncertain, as it hasn't gone live yet (I think there's another 1-2 weeks before it goes live). I think it's kind of a challenge to get it all to work on a RaQ4i with this much traffic, but the bandwidth is cheapest on this webhost and they only offer RaQ4is. ab (ApacheBench) tests suggest that the RaQ will be able to handle the load, though. The guys on the RaQ forums often say that the RaQs can't handle this sort of load, but I don't think they have concrete evidence to go on.

From working on this website, I've also learned some nice Apache::ASP coding techniques that make things easier (namely Script_OnStart and XMLSubs), but that's beyond the scope of this article.
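
As a footnote to the cache-clearing script above, here is a slightly more defensive sketch of the same step. It is not the script from the post: the function name clear_cache and its argument checks are my own, and it relies on the -mindepth and -delete options of find, which GNU and BSD find provide but POSIX does not require. Unlike the "cache/*" glob, it also removes dot-files in the cache directory.

```shell
#!/bin/sh
# clear_cache DIR
# Delete everything below DIR but keep DIR itself.
clear_cache() {
    dir=$1
    # Refuse an empty argument, "/", or a relative path, so a typo
    # cannot turn into a recursive delete somewhere unexpected.
    case "$dir" in
        ""|/) echo "refusing to clear '$dir'" >&2; return 1 ;;
        /*)   ;;
        *)    echo "need an absolute path: $dir" >&2; return 1 ;;
    esac
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # -mindepth 1 skips DIR itself; -delete removes entries depth-first.
    find "$dir" -mindepth 1 -delete
}

# Example, using the cache path from the post:
#   clear_cache /home/httpd/cache
```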