>So the problem is not legitimate lists with Spam, but what looks like
>fake lists that somehow made it to mail-archive.

That's exactly correct. This often happens when spam (with totally
bogus headers) is sent directly from the spammer to Mail-Archive's
inbox.

>[...] it looks like, from my random clicking that the majority of the
>3000 odd lists on the lists.html page belong to this category. [...]
>it might turn off others and reflect badly on mail-archive, not
>realizing that it's a great service.

That's a good point, and archived spam can also artificially inflate
the list count statistic on the front page. I checked, and the current
time before a list is considered inactive (and dropped from the list
of lists) is 150 days. There are ~3800 lists that meet that
qualification. If I change the definition of inactive to 21 days, we
drop down to ~2000 lists. Presumably the difference is mostly spam.
Therefore, I am going to require activity within the last 21 days to
make the list of lists. (The front page statistic is computed from a
different measurement, and I will worry about fixing that later.)
Please let me know if you see a qualitative improvement in the list of
lists.
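
As a rough illustration of what changing the cutoff does (a sketch
only, not the actual Mail-Archive code), the check amounts to comparing
each list's most recent post against a date threshold. The function
name, the list names, and the dates below are all made up for the
example.

```python
from datetime import datetime, timedelta


def active_lists(last_post, now, max_idle_days):
    """Return the names of lists whose latest post is recent enough.

    last_post maps a list name to the datetime of its most recent
    message (hypothetical data layout, assumed for this sketch).
    """
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(name for name, ts in last_post.items() if ts >= cutoff)


# Invented sample data: one busy list, one quiet legitimate list,
# one "list" created by a single spam message, one long-dead list.
now = datetime(2005, 1, 31)
last_post = {
    "gossip": datetime(2005, 1, 29),
    "quiet-but-real": datetime(2004, 12, 1),
    "one-shot-spam": datetime(2004, 9, 20),
    "long-dead": datetime(2004, 6, 1),
}

loose = active_lists(last_post, now, 150)
strict = active_lists(last_post, now, 21)
print(len(loose), "lists at 150 days;", len(strict), "lists at 21 days")
```

In this toy data the stricter cutoff drops the spam entry, but it also
drops the quiet legitimate list, which is the trade-off to keep in
mind when tightening the window.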

I am not sure how else to purge those spam lists. From a basic "ls
-ld" on all the message directories, I can see which lists have very
few messages in a computationally inexpensive way (because the
directory entry itself is small). Those are likely to be either
spam lists or new or low-traffic legitimate lists. I'm not sure how to
make good use of this information in a way that isn't already covered
by tightening the "active" requirement.
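
For anyone who wants to experiment with the directory-size idea, here
is a minimal sketch. It assumes one subdirectory per list under a
common root, which is a guess at the layout and not a description of
the real one. It reads the size of each directory entry itself (the
number "ls -ld" prints) without opening the messages. How that size
relates to the number of files depends on the filesystem, so the
threshold is a hypothetical parameter that would need tuning.

```python
import os


def small_message_dirs(root, max_dir_bytes):
    """Return (name, size) pairs for small subdirectories of root.

    "size" is the size of the directory entry itself, as shown by
    "ls -ld", not the total size of the files inside it.
    """
    result = []
    with os.scandir(root) as entries:
        for entry in entries:
            # Skip plain files and symlinks; only real directories count.
            if entry.is_dir(follow_symlinks=False):
                size = entry.stat(follow_symlinks=False).st_size
                if size <= max_dir_bytes:
                    result.append((entry.name, size))
    return sorted(result)
```

A call such as small_message_dirs("/path/to/archives", 4096) would
list the candidates; both the path and the 4096-byte threshold are
placeholders. The result still mixes spam lists with new or
low-traffic legitimate ones, so it is a list of candidates to review
and not something to delete automatically.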

I'm open to suggestions.

-Jeff

_______________________________________________
Gossip mailing list
[EMAIL PROTECTED]
http://www.mail-archive.com/cgi-bin/mailman/listinfo/gossip
