I was looking for an Apache module that handles access control by URL/regex and reads
its list of rules from a file.
I find it hard to believe no one has done this yet, so apologies in advance if I just
didn't search properly.
I searched all the usual sources but came up blank, so I adapted the
Apache::BlockAgent handler from the Eagle book (excellent). If anyone has more info on
an existing module/handler I'd be grateful.
The original requirement was to control a client's proxy access so that only a list of
about 30 URLs was accessible from their LAN. I needed an Apache config directive and a
handler that reads its list of names/IPs/regexes from a text file, caches the list at
startup/restart, and stats the text file so that additions/alterations take immediate
effect. The list has to work as an 'allow' list as well as a 'deny' list so that the
overhead is minimised and admin tools have an easier job of controlling access by
editing/validating only one file.
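The cache-and-stat idea can be sketched roughly as follows. This is only an illustration, not the module's actual code: the names load_rules and %cache are made up for the example, and it assumes that comparing the file's modification time is enough to detect edits.

```perl
use strict;
use warnings;

# Hypothetical cache: file path => { mtime => ..., rules => [...] }
my %cache;

# Return the rule lines for $file, re-reading the file only when its
# modification time differs from the one recorded at the last load.
sub load_rules {
    my ($file) = @_;
    my $mtime = (stat $file)[9];
    die "cannot stat $file: $!" unless defined $mtime;

    my $entry = $cache{$file};
    return $entry->{rules} if $entry && $entry->{mtime} == $mtime;

    open my $fh, '<', $file or die "cannot open $file: $!";
    my @rules;
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/^\s+|\s+$//g;                # trim surrounding whitespace
        next if $line eq '' || $line =~ /^#/;   # skip blank lines and comments
        push @rules, $line;
    }
    close $fh;

    $cache{$file} = { mtime => $mtime, rules => \@rules };
    return \@rules;
}
```

Calling load_rules on every request then costs one stat in the common case, and the file is only parsed again after someone has edited it.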
Just in case there really are no such modules out there: Apache::URLControl.pm is
still pretty basic, but it does the following:
Adds an Apache config directive that specifies a text file relative to ServerRoot:
PerlSetVar URLControlFile access_filters/url_control
PerlPostReadRequestHandler Apache::URLControl
URLControl.pm currently handles the request as a PerlPostReadRequestHandler in two
test setups.
Used in this way it is not proxy-specific and blocks/allows requests at the earliest
opportunity.
The control file can contain:
DEFAULT DENY
www.adomain.com ALLOW
anotherdomain.com DENY
http://somewhere.com/.*.asp DENY
https://domain.com/
194.164.46.4/blah/blah
/apath/asubdir/afile.htm
.*microsoft.* DENY
# a comment etc.
If DEFAULT DENY is used then access is allowed only to locations matching an ALLOW
line. Otherwise the list can contain specific DENY rules, and if the action is omitted
from a rule it defaults to DENY.
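A rough sketch of how such a rule list could be evaluated is below. It is not the module's code. The function names are invented, and two things are assumptions on my part: that a file without a DEFAULT line allows anything not matched, and that the first matching rule wins. Patterns are used here as plain Perl regexes.

```perl
use strict;
use warnings;

# Split rule lines into a default action and a list of [regex, action].
sub parse_rules {
    my (@lines) = @_;
    my $default = 'ALLOW';    # assumption: no DEFAULT line means allow
    my @rules;
    for my $line (@lines) {
        next if $line =~ /^\s*(?:#|$)/;    # comments and blank lines
        my ($pattern, $action) = split ' ', $line;
        if ($pattern eq 'DEFAULT') {
            $default = uc($action // 'DENY');
            next;
        }
        # An omitted action defaults to DENY, as described above.
        push @rules, [ qr/$pattern/, uc($action // 'DENY') ];
    }
    return ($default, \@rules);
}

# Assumption: the first rule whose pattern matches decides the outcome.
sub decide {
    my ($default, $rules, $url) = @_;
    for my $rule (@$rules) {
        my ($re, $action) = @$rule;
        return $action if $url =~ $re;
    }
    return $default;
}

my ($default, $rules) = parse_rules(
    'DEFAULT DENY',
    'www\.adomain\.com ALLOW',
    '.*microsoft.* DENY',
);
print decide($default, $rules, 'http://www.adomain.com/index.htm'), "\n";  # ALLOW
print decide($default, $rules, 'http://www.microsoft.com/'), "\n";         # DENY
print decide($default, $rules, 'http://elsewhere.example/'), "\n";         # DENY
```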
If the rule begins with https:// then a CONNECT adomain.com:443 is denied or allowed.
The rule could also be
written as:
adomain.com:443 DENY
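The equivalence between the two ways of writing an HTTPS rule could be handled by normalising the rule before matching, roughly like this. The function name connect_target is hypothetical, and taking 443 as the port when the rule gives none is my assumption.

```perl
use strict;
use warnings;

# Turn an https:// rule into the host:port form seen in a CONNECT request.
# Rules that do not start with https:// are returned unchanged.
sub connect_target {
    my ($rule) = @_;
    if ($rule =~ m{^https://([^/:\s]+)(?::(\d+))?}) {
        my $port = defined $2 ? $2 : 443;   # assumption: default HTTPS port
        return "$1:$port";
    }
    return $rule;
}

print connect_target('https://domain.com/'), "\n";   # domain.com:443
print connect_target('adomain.com:443'), "\n";       # adomain.com:443
```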
The '.' in domain.com and index.htm is escaped in the module, as are '%', '/' and '+'.
This just simplifies writing the file somewhat. Otherwise the Perl regex in a rule is
handled as-is.
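One way the escaping could work is sketched below. This is a guess for illustration, not the module's real code: rule_to_regex is a made-up name, and the choice to leave a '.' alone when it is followed by '*' (so that '.*' keeps its regex meaning) is my assumption.

```perl
use strict;
use warnings;

# Convert a rule pattern from the control file into a compiled regex.
sub rule_to_regex {
    my ($rule) = @_;
    (my $pat = $rule) =~ s{([%/+])}{\\$1}g;   # '%', '/' and '+' become literal
    $pat =~ s{\.(?!\*)}{\\.}g;                # literal '.', but keep '.*' as-is
    return qr/$pat/;
}

my $re = rule_to_regex('http://somewhere.com/.*.asp');
print "matched\n" if 'http://somewhere.com/page.asp' =~ $re;
```

With this sketch 'somewhereXcom' no longer matches the rule, which a bare unescaped '.' would have allowed.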
A 403 is returned if the request is blocked, but the URL from $r->the_request is
substituted for $r->uri so that proxy requests are denied with the full URL as the
reason and not '/'.
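For illustration, pulling the URL out of a request line (the kind of string $r->the_request returns) needs nothing more than a split. The helper name requested_url is made up, and the snippet runs outside Apache on a sample request line.

```perl
use strict;
use warnings;

# Return the request target (second field) from an HTTP request line.
sub requested_url {
    my ($request_line) = @_;
    my ($method, $target) = split ' ', $request_line;
    return $target;
}

# A proxy request carries the full URL in the request line.
print requested_url('GET http://www.adomain.com/index.htm HTTP/1.0'), "\n";
```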
If anyone is interested I will stress-test it and then submit the module to CPAN. If
there is nothing similar I will develop it to allow for cached IP lookups (to convert
the IP->domain name and match on that in the list) and add other refinements.
Mark
Mark Tiramani
FREDO Internet Services
[EMAIL PROTECTED]