I have found what looks like a pretty serious bug in the urllist function
of squidGuard. This manifests in 1.1.4 and 1.2.0, and affects the meaning
of URLs with and (much more importantly) without a trailing slash.
My squidGuard.conf:
-----------------------------------------
logdir /usr/local/squidGuard/log
dbhome /usr/local/squidGuard/db
dest whitelist {
urllist whitelist.url
}
acl {
default {
pass whitelist
pass none
redirect http://127.0.0.1/cgi-bin/whitedenied.cgi?url=%u
}
}
-------- < snip > --------
whitelist.url:
-------------
foobar.baz/
foo.bar
winkle.com/test
-------- < snip > --------
This configuration exhibits the following behaviour:
http://www.foobar.baz/            ALLOW
http://www.foo.bar/               ALLOW
http://www.foobar.baz             DENY  *
http://www.foo.bar.qux/           ALLOW *
http://winkle.com/test            ALLOW
http://www.winkle.com/test/123    ALLOW
http://www.winkle.com/test.php    ALLOW *
http://winkle.com/testing/foo     ALLOW *
The four lines with the * are the ones I'm worried about. It seems
squidGuard isn't doing The Right Thing (tm) with regard to trailing
slashes. This doesn't seem to be a wetware issue, since URLLists in the
documentation and even those distributed from the squidguardRobot all
contain no trailing slash.
Browsers will very rarely, if ever, send a request without the trailing
slash for any of the first four cases (domain only; no
directory/filename), and if they do, the webserver will send back a
redirect pointing them at URL+"/". This behaviour is acceptable only when
the URLList in question is a Whitelist; if it's a Blacklist the user is
just going to experience confusion. The cases with a /path in the URL are
more serious: a lot of sites are set up with URIs like /path.php,
/path.php/doc2.php and /path.php/doc3.php?foo=bar.
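For illustration only, the results listed above are consistent with a plain
string-prefix comparison against each urllist entry, once the scheme and a
leading "www." have been dropped. The Python sketch below is a guess at that
behaviour, not squidGuard's actual code: the names normalize, naive_match and
boundary_match are hypothetical, and the "www." stripping is an assumption
inferred from the results. naive_match reproduces all eight results;
boundary_match shows one possible stricter rule that only matches at a "/"
boundary or at the end of the URL. It deliberately ignores subdomain
matching.

```python
from urllib.parse import urlsplit

# Entries copied from whitelist.url above.
WHITELIST = ["foobar.baz/", "foo.bar", "winkle.com/test"]


def normalize(url):
    """Reduce a URL to 'host/path', dropping a leading 'www.' (assumed)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path


def naive_match(url, entries=WHITELIST):
    """Plain string-prefix match: a hypothesis that fits the reported results."""
    key = normalize(url)
    return any(key.startswith(entry) for entry in entries)


def boundary_match(url, entries=WHITELIST):
    """Prefix match that succeeds only at a '/' boundary or end of string."""
    key = normalize(url).rstrip("/")
    for entry in entries:
        entry = entry.rstrip("/")
        if key == entry or key.startswith(entry + "/"):
            return True
    return False


if __name__ == "__main__":
    urls = [
        "http://www.foobar.baz/",
        "http://www.foo.bar/",
        "http://www.foobar.baz",
        "http://www.foo.bar.qux/",
        "http://winkle.com/test",
        "http://www.winkle.com/test/123",
        "http://www.winkle.com/test.php",
        "http://winkle.com/testing/foo",
    ]
    for url in urls:
        print(url, naive_match(url), boundary_match(url))
```

Under these assumptions, the two functions disagree on exactly the four
starred URLs, which is what a fix for the trailing-slash handling would need
to change.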
Regards,
Doran Moppert