Hi,

As a first step I would add a manual step to the subscription process. When someone submits an archive, they would need to include *their* email address and do the standard thing: click on a URL that's included in a confirmation email sent to that address.
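Here is a minimal sketch of what I mean, in Python. Everything in it is an assumption for illustration, not how mail-archive works today: the `ListingRequests` class, its method names, and the in-memory storage are all made up. It hands out a token to embed in the confirmation URL, marks a list as "blessed" only once the token comes back, and includes one example roadblock, a limit of one request per address per day.

```python
import secrets
import time

DAY_SECONDS = 24 * 60 * 60


class ListingRequests:
    """Hypothetical in-memory store for pending and confirmed archive requests."""

    def __init__(self, clock=time.time):
        self._clock = clock        # injectable so the daily limit can be tested
        self._pending = {}         # token -> (email, list_name)
        self._last_request = {}    # email -> time of the last accepted request
        self._blessed = set()      # list names whose submitter has confirmed

    def request(self, email, list_name):
        """Record a request and return the token to embed in the confirmation URL."""
        email = email.strip().lower()
        now = self._clock()
        last = self._last_request.get(email)
        if last is not None and now - last < DAY_SECONDS:
            raise ValueError("only one list per address per day")
        token = secrets.token_urlsafe(16)  # unguessable, safe to put in a URL
        self._pending[token] = (email, list_name)
        self._last_request[email] = now
        return token

    def confirm(self, token):
        """Called when the submitter clicks the URL. Returns True if the token was valid."""
        entry = self._pending.pop(token, None)  # each token works only once
        if entry is None:
            return False
        self._blessed.add(entry[1])
        return True

    def is_blessed(self, list_name):
        """The backend would refuse to archive mail for lists that fail this check."""
        return list_name in self._blessed
```

The token would be appended to whatever confirmation URL the front end uses, and the incoming-mail side would call something like `is_blessed` before archiving a message. A real version would need persistent storage and expiry of stale tokens.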
This, of course, involves some coding effort, both on the front end and in the backend, which would have to reject emails from unblessed lists. And of course, spammers could spend the time and do it manually, but we could put up roadblocks to slow them down, like allowing only one list from a given email address per day, etc.

Regards,

Dror

On Tue, Nov 18, 2003 at 11:06:23PM -0800, Jeff Breidenbach wrote:
>
> >So the problem is not legitimate lists with Spam, but what looks like
> >fake lists that somehow made it to mail-archive.
>
> That's exactly correct. This often happens when spam (with totally
> bogus headers) is sent directly from the spammer to Mail-Archive's
> inbox.
>
> >[...] it looks like, from my random clicking that the majority of the
> >3000 odd lists on the lists.html page belong to this category. [...]
> >it might turn off others and reflect badly on mail-archive, not
> >realizing that it's a great service.
>
> That's a good point, and archived spam can also artificially inflate
> the list count statistic on the front page. I checked, and the current
> time before a list is considered inactive (and dropped from the list
> of lists) is 150 days. There are ~3800 lists that meet that
> qualification. If I change the definition of inactive to 21 days, we
> drop down to ~2000 lists. Presumably the difference is mostly spam.
> Therefore, I am going to require activity with the last 21 days to
> make the list of lists. (The front page statistic is computed from a
> different measurement, and I will worry about fixing that later.)
> Please let me know if you see a qualitative improvement in the list of
> lists.
>
> I am not sure how else to purge those spam lists. From a basic "ls
> -ld" on all the message directories, I can see which lists have very
> few messages in a computationally inexpensive way (because the
> directory structure itself is small). Those are likely to be either
> spam lists, or new/low traffic legitimate lists. I'm not sure how to
> make good use of this information, in a way that isn't already covered
> by heightening the "active" requirement.
>
> I'm open to suggestions.
>
> -Jeff
>
> _______________________________________________
> Gossip mailing list
> [EMAIL PROTECTED]
> http://www.mail-archive.com/cgi-bin/mailman/listinfo/gossip

--
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.fastbuzz.com
http://www.zapatec.com
