>-----Original Message-----
>From: Jeff Chan [mailto:[EMAIL PROTECTED]
>Sent: Thursday, September 02, 2004 9:04 PM
>To: [EMAIL PROTECTED]
>Cc: SURBL Discuss
>Subject: Re: Applying SURBL against blog comment spammers
>
>
>On Thursday, September 2, 2004, 5:43:26 PM, Loren Wilton wrote:
>>> Given the lack of commonality, it may not make much sense to
>>> add to the mail spam lists, since it would be an extra 2000+
>>> records that would probably not get hits on mail.
>>>
>>> The MT-Blacklist doesn't seem to update too frequently (the
>>> last new record was from 8/29) and has about 2000 records.
>>> Matthew's list was pretty sparse so far. So I'm still
>>> pondering things.
>
>> Just from a technical/philosophical point, I think a separate list is
>> desirable. Although I agree that making it part of multi would probably be
>> the way to go, and I agree with the basic concept that "spam is spam".
>
>> However, I think the reasons for a separate list are:
>
>> 1. Separate source feed. A new list allows the source feed to be more
>>    easily documented.
>> 2. (As stated) little overlap with email spammers, at least so far.
>> 3. Probably a different update cycle and removal (from old age) cycle
>>    requirement
>
>> The different means of updating and possibly different aging method are high
>> on my list of reasons for suggesting a separate list. On the other hand,
>> having it part of multi would be nice, since (I assume, possibly
>> incorrectly) that one query could check a lot of lists based on the bitmap.
>
>Correct. I'm still wavering if a blog spam list should be part
>of multi. There are programs that use multi but (unadvisedly)
>don't differentiate between the source lists. That kind of
>argues for keeping multi focussed on only mail spam and making
>a blog spam list separate. On the other hand there's much less
>overhead in adding a list internally to multi than setting up
>a whole new list.
>
>> It probably would also be good to devote some thought to how entries will be
>> added to this list and validated. We surely don't want some annoyed blog
>> spammer spamming the list with every valid domain they can find!
>
>Yes, data quality is always an issue. Any of these ventures will
>struggle if spammers are able to poison the data. Keeping
>legitimate domains out of any feed is key and provisions would
>need to be made for that.
>
As always, I agree. I think any new idea should be kept out of multi until more testing is done. Having said that, if a blog spam domain meets all the requirements we would apply to a SURBL entry now, then why not list it? And why not list it in the regular WS list? I'm saying I would only add hand-checked domains like those I found in JM's example. Preemptive listings. Again, only domains that are obvious, like the examples I listed. If there is any question of a blog spam domain being used for legitimate purposes, then it follows the very same rules we have now: don't blacklist. Add to unclassified ;)

--Chris
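
For anyone following the multi discussion quoted above, here is a rough sketch of what "differentiating between the source lists" could look like on the client side. It assumes a multi answer of the form 127.0.0.X where the last octet is a bitmask of source lists. The bit values, the list names, and the "blog" entry in particular are illustrative assumptions, not the real assignments; check the SURBL documentation before relying on any of them.

```python
import ipaddress

# ASSUMPTION: illustrative bit-to-list mapping only. The "blog" bit is
# purely hypothetical; no such list exists in the thread above.
LIST_BITS = {
    2: "sc",
    4: "ws",
    8: "ph",
    16: "ob",
    32: "ab",
    64: "blog",
}

# Lists a mail filter would act on (also an assumption for this sketch).
MAIL_LISTS = frozenset({"sc", "ws", "ph", "ob", "ab"})


def decode_multi(answer):
    """Return the names of the source lists set in a 127.0.0.X answer."""
    octets = ipaddress.IPv4Address(answer).packed
    if octets[0] != 127:
        raise ValueError("not a 127.0.0.X style answer: %s" % answer)
    mask = octets[3]
    # Report lists in ascending bit order.
    return [name for bit, name in sorted(LIST_BITS.items()) if mask & bit]


def is_mail_spam(answer):
    """True only if at least one mail-oriented list matched."""
    return any(name in MAIL_LISTS for name in decode_multi(answer))
```

A client written this way would ignore a hit that comes only from the hypothetical blog bit, which is the behaviour Jeff says some programs using multi currently lack.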