> 1) Has anyone tried classifying, or keeping track of, unknown
> HTML tag names? Say for instance, some spam has the following
> html VIGRA
> What if a token was added named "Unknown:Html" or something
> like that (because of the tags ?
At the moment we basically throw all HTML tags away - the
Dear Sam,
> Also, if I want to test some type of technique, what levels of spam
> filtering/fp/fn are people getting? What percentage points should I
> shoot for?
One reason that improving SpamBayes is hard is that it already does
such a good job.
I have mine_received_headers turned on and I've
if I want to test some type of technique, what levels of spam
filtering/fp/fn are people getting? What percentage points should I shoot for?
TIA!
- Original Message -
From: "Tony Meyer" <[EMAIL PROTECTED]>
To: "'Sam Nage'" <[EMAIL PROTECTED]>, spamba
> I'm trying to understand how you implemented the Chi2
> technique.
Do you mean the whole classifier, or just the inverse-chi-squared function
(chi2.chi2Q())?
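
For readers following along without much Python, here is a minimal sketch of both pieces. It is written from the general description of chi-squared combining, not copied from the SpamBayes source. The name chi2Q comes from the message above; chi2_combine and its argument are hypothetical names used only for illustration, and the sketch assumes every word probability lies strictly between 0 and 1.

```python
import math

def chi2Q(x2, v):
    """Return P(chi-squared >= x2) for v degrees of freedom.

    Uses the closed-form series that is valid only when v is even.
    """
    assert v > 0 and v % 2 == 0, "v must be a positive even integer"
    m = x2 / 2.0
    # First term of the series is exp(-m); each later term multiplies by m / i.
    total = term = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    # Round-off can push the sum slightly above 1, so clamp it.
    return min(total, 1.0)

def chi2_combine(probs):
    """Combine per-word spam probabilities into one score in [0, 1].

    Hypothetical helper: probs is a list of floats strictly between 0 and 1.
    """
    n = len(probs)
    if n == 0:
        return 0.5  # no evidence either way
    # -2 * sum(ln(x)) follows a chi-squared distribution with 2n degrees of
    # freedom if the x values are independent and uniform on (0, 1).
    ln_spam = sum(math.log(1.0 - p) for p in probs)
    ln_ham = sum(math.log(p) for p in probs)
    S = 1.0 - chi2Q(-2.0 * ln_spam, 2 * n)  # near 1 when the words look spammy
    H = 1.0 - chi2Q(-2.0 * ln_ham, 2 * n)   # near 1 when the words look hammy
    # Near 1 is spam, near 0 is ham, near 0.5 is unsure.
    return (S - H + 1.0) / 2.0
```

With this sketch, a message whose words all have probability 0.99 scores close to 1, one whose words all have probability 0.01 scores close to 0, and a message of all-0.5 words lands at 0.5.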
> Can someone tell me how the chi2 method is implemented in spambayes?
What exactly do you mean by how it's implemented? Do you want an
ex
Hi,
I'm trying to understand how you implemented the Chi2 technique. Unfortunately
it's been over 10 years since my college stats class. I've done some refresher
reading on chi2, and now sorta remember it. ;-) Unfortunately, I also don't
know Python, but am struggling through it. Can someone tell