Re: My new method for blocking spam - REVEALED!

John Hardin Wed, 20 Jan 2016 10:36:43 -0800

On Wed, 20 Jan 2016, Marc Perkel wrote:

Maybe I should call it a new plan for spam?

Perhaps FUSSP? (Sorry... You're so rah rah about this I couldn't resist... :) )

So - how do I get a list of words and phrases never used in spam? I create a list of words and phrases that are used in spam and check to see if it's *not on the list*.
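
For illustration only, the "not on the list" check described above might look roughly like the sketch below. The tokenizer, the function names, and the two-message corpus are all assumptions made up for this example, not the poster's actual implementation. It handles single words only; phrases would need n-grams added to the same set.

```python
import re

def tokenize(text):
    """Lowercase word tokens; a deliberately naive, hypothetical tokenizer."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_spam_vocabulary(spam_messages):
    """Collect every token seen in a (manually vetted) spam corpus."""
    seen = set()
    for message in spam_messages:
        seen.update(tokenize(message))
    return seen

def tokens_not_seen_in_spam(text, spam_vocabulary):
    """Return the tokens of `text` that never appeared in the spam corpus."""
    return [t for t in tokenize(text) if t not in spam_vocabulary]

# Toy corpus, invented for the example.
spam_corpus = ["Cheap pills online now", "Win a free prize now"]
vocab = build_spam_vocabulary(spam_corpus)
novel = tokens_not_seen_in_spam("Meeting agenda for the free clinic", vocab)
```

Note that "free" drops out of the result because it occurs in the toy spam corpus, so the quality of the list depends entirely on the corpus used to build it.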

So it still needs to be trained, at least initially, with a manually-vetted corpus. If not, how do you propose to do the initial classification of messages for training?

Do you envision it being self-training past that point? What if it goes off the rails? How would you keep it from going off the rails?

If it's not self-training then you have the same issues with the reliability of the people feeding the training corpus.

So I'm not just tokenizing the subject. Also the first 25 words of the message

OK, good. I was thinking it would be *really* sensitive to "bayes poisoning". Ignoring all but the first part of the body helps.

I assume you're only considering the portion that would render as visible to the recipient. Of course, that brings in all the logic regarding "what is visible to the recipient?" and all the HTML obfuscation we're already seeing to get around Bayes and "only scan the first part of the message".
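
To make the "what is visible?" problem concrete, here is a minimal sketch of taking the subject plus the first 25 words of an HTML body, using Python's standard html.parser. The class and function names are hypothetical, and the visibility rule (skip script, style, head and title, ignore comments) is an assumption chosen for brevity, far short of what a real mail client renders.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect text outside elements that never render (a crude heuristic)."""
    SKIP = {"script", "style", "head", "title"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # >0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)

def leading_words(subject, html_body, limit=25):
    """Subject words plus the first `limit` 'visible' words of the body."""
    parser = VisibleText()
    parser.feed(html_body)
    parser.close()
    body_words = " ".join(parser.parts).split()
    return subject.split() + body_words[:limit]
```

This sketch does not inspect CSS at all, so text hidden with something like style="display:none" still counts as visible here. That gap is exactly the kind of obfuscation mentioned above, and closing it means reimplementing a good part of a renderer.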


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Insofar as the police deter by their presence, they are very, very
  good. Criminals take great pains not to commit a crime in front of
  them.                                             -- Jeffrey Snyder
-----------------------------------------------------------------------
 3 days until John Moses Browning's 161st Birthday
