Chuq Von Rospach wrote:
> The two groups I'm specifically trying to lock out are the e-mail
> address harvesters who won't abide by a robots.txt restriction,
We had the same problem with harvesters going through the archives,
in spite of robots.txt plus some filtering based on User-Agent
in Apache (I recommend a good article on the subject:
http://www.csc.ncsu.edu/~brabec/antispam.html).
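For illustration, the Apache-side User-Agent filtering could look like
the fragment below (the pattern names are examples of well-known
harvester signatures, not our actual configuration):

```apache
# Flag requests whose User-Agent matches known harvester signatures
# (example patterns; mod_setenvif + mod_access).
SetEnvIfNoCase User-Agent "EmailSiphon"  bad_bot
SetEnvIfNoCase User-Agent "ExtractorPro" bad_bot

<Directory "/var/www/archives">
    Order Allow,Deny
    Allow from all
    # Refuse the flagged clients.
    Deny from env=bad_bot
</Directory>
```

Of course this only stops harvesters that announce themselves honestly,
which is why we added the form trick described below.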
Harvesters jump from one page to another via anchors (following
<A HREF=""> links) but, as far as we know, they don't submit forms.
So we have inserted a basic FORM at the entry of the archives: the
user needs to submit the form to access them, and there is no direct
link to our archives on our web server. This method has proven
effective over the last 2 years.
Here is an example: http:[EMAIL PROTECTED]/
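The idea can be sketched as a tiny CGI-style handler (hypothetical
names; the real entry page is plain HTML in front of the MHonArc tree).
Since harvesters follow links but don't submit forms, the archive index
is only served in response to a POST of the entry form:

```python
# Minimal sketch of the form-gated archive entry (hypothetical names).
ENTRY_PAGE = """<html><body>
<p>Click below to enter the list archives.</p>
<form method="post" action="/cgi-bin/archives">
  <input type="hidden" name="enter" value="1">
  <input type="submit" value="Enter the archives">
</form>
</body></html>"""

def handle_request(method, form):
    """Return the page to serve: the entry form on GET, the archive
    index only when the form was actually submitted (POST)."""
    if method == "POST" and form.get("enter") == "1":
        return "ARCHIVE_INDEX"   # placeholder for the real MHonArc index
    return ENTRY_PAGE            # a followed link never reaches the archive
```

A crawler that only follows <A HREF=""> links always gets ENTRY_PAGE
and never sees an address-bearing archive page.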
> and the occasional troll that gets kicked off a list and goes looking
> for ways to create havoc, where, by definition, a rule like "don't do
> this" won't work.
We have developed a web interface to the Sympa MLM, which has an
interesting way of managing archives:
The authentication scheme is based on e-mail addresses
and passwords. When subscribing to a list, a user is
allocated an initial password, which he/she can change
later. He/she needs this password to access some private
list functions. Once he/she provides the password, an
HTTP cookie does the auth job.
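One common way to implement such a cookie, sketched here with
hypothetical names (the email does not say how Sympa builds its cookie),
is to sign the e-mail address with a server-side secret so the password
only has to be checked once per session:

```python
import hmac, hashlib

SECRET = b"server-side secret"   # hypothetical; never sent to the client

def make_cookie(email):
    """Issue a signed cookie after the password has been verified once."""
    sig = hmac.new(SECRET, email.encode(), hashlib.sha1).hexdigest()
    return f"{email}|{sig}"

def check_cookie(cookie):
    """Return the authenticated e-mail address, or None if tampered."""
    email, _, sig = cookie.partition("|")
    expected = hmac.new(SECRET, email.encode(), hashlib.sha1).hexdigest()
    return email if hmac.compare_digest(sig, expected) else None
```

Subsequent requests present the cookie instead of the password; a forged
or altered cookie fails the signature check.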
Web archives are managed with MHonArc but they are not
directly accessible through our web server (i.e. not in
the web hierarchy). This job is performed by a CGI,
which therefore has complete control over who has
access to an archive. Depending on the "web_archive_access"
list parameter (public|private|owner|listmaster|closed),
the CGI awaits a password or requires specific privileges.
Subscriber information is stored in a relational database,
so the CGI and the MLM work on the same set of data.
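The CGI's decision can be sketched like this (a hypothetical rendering
of the "web_archive_access" levels listed above, assuming each level is
at least as restrictive as the previous one):

```python
# Hypothetical sketch of the CGI's access check, driven by the
# web_archive_access list parameter. `user` is the authenticated
# e-mail address, or None for an anonymous visitor.
def may_read_archive(access, user, subscribers, owners, listmasters):
    if access == "public":
        return True                        # anyone, no password needed
    if user is None or access == "closed":
        return False                       # auth required / nobody at all
    if access == "private":
        return user in subscribers or user in owners or user in listmasters
    if access == "owner":
        return user in owners or user in listmasters
    if access == "listmaster":
        return user in listmasters
    return False                           # unknown level: deny by default
```

The membership sets come straight from the relational database, so the
CGI and the MLM really do share one source of truth.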
Olivier Salaun
Comite Reseaux des Universites
http://www.cru.fr/