On Wed, 20 Jan 2016, Marc Perkel wrote:
On 01/20/16 10:36, John Hardin wrote:
On Wed, 20 Jan 2016, Marc Perkel wrote:
So it still needs to be trained, at least initially, with a
manually-vetted corpus. If not, how do you propose to do the initial
classification of messages for training?
Do you envision it being self-training past that point? What if it goes
off the rails? How would you keep it from going off the rails?
If it's not self-training then you have the same issues with the
reliability of the people feeding the training corpus.
On my system I have a long list of good email sources that are 100%
whitelisted, and I also have hackerbot traps that receive 100% spam. I use
these for training to keep it on the rails. Good question though.
And for those who don't have access to something like that?
And given those resources, would you consider becoming a masscheck
participant?
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Je ne suis pas Charlie. Je suis armé.
-----------------------------------------------------------------------
3 days until John Moses Browning's 161st Birthday