Really getting discouraged... when does the learning happen?

Harry Putnam Sun, 15 Sep 2013 19:57:34 -0700

I've been trying to `teach' SA to tell spam from ham in my mail system.

I've made it through two main learning sessions where I ran around 450
msgs (each time) through sa-learn spam/ham, and yet SA still gets it
right no more than about 40% of the time, maybe less.  I'm not sure how
to measure that exactly.
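
One rough way to put a number on it, sketched here in Python: take an
mbox of messages that have been hand-checked as spam and count how many
carry an "X-Spam-Status: Yes" header. This assumes SA's default header
markup is left on and that the mail has already passed through SA; the
function names and the file path are hypothetical.

```python
import mailbox


def spam_status_counts(mbox_path):
    """Return (flagged, total) for the messages in one mbox file.

    A message counts as flagged when its X-Spam-Status header
    starts with "Yes", which is how SA marks mail it scored as spam.
    """
    flagged = 0
    total = 0
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            total += 1
            status = str(msg.get("X-Spam-Status", ""))
            if status.strip().lower().startswith("yes"):
                flagged += 1
    finally:
        box.close()
    return flagged, total


def catch_rate(mbox_path):
    """Fraction of messages in the mbox that SA flagged as spam."""
    flagged, total = spam_status_counts(mbox_path)
    return flagged / total if total else 0.0
```

Run against a hand-sorted spam folder, e.g. catch_rate("known-spam"),
this gives the share of real spam that SA caught. The same count on a
hand-sorted ham folder gives the false-positive rate.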


My incoming mail is probably no more than 10-12% ham, maybe not even
that. So what's coming in is mostly spam.

Now, after the above-mentioned amount of training, I've run 1100
messages through my send box setup... it's the last 11 messages that
have come in.

I'm using only 2 rules in procmailrc, one for spam and one for ham,
following the call to SA.

Look at the (mbox style) files that resulted:
-rw-------[...] 10045521 Sep 15 22:26 ham
-rw-------[...]  6372824 Sep 15 22:26 spam

That is about:
9.6 MB ham
6.1 MB spam

So truly massive amounts of spam are STILL being seen as ham by SA.

That should be something like:
 2.0 MB ham 
13.5 MB spam
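
The arithmetic behind those figures, as a small Python sketch. The byte
counts are the ones from the listing above. The 12% ham share is the
upper end of the estimate given earlier, and file size is only a rough
stand-in for message count, so treat the result as approximate.

```python
MIB = 1024 * 1024  # bytes per MB as used in the figures above

HAM_BYTES = 10045521   # size of the "ham" mbox from the listing
SPAM_BYTES = 6372824   # size of the "spam" mbox from the listing


def to_mib(n_bytes):
    """Convert a byte count to MB (1 MB = 1024 * 1024 bytes)."""
    return n_bytes / MIB


def expected_split(total_bytes, ham_fraction):
    """Split a total size into (ham, spam) bytes for a given ham share."""
    ham = total_bytes * ham_fraction
    return ham, total_bytes - ham


total = HAM_BYTES + SPAM_BYTES
print(round(to_mib(HAM_BYTES), 1))    # 9.6
print(round(to_mib(SPAM_BYTES), 1))   # 6.1

# Assumed ham share of 12%, the upper end of the rough estimate.
ham, spam = expected_split(total, 0.12)
print(round(to_mib(ham), 1), round(to_mib(spam), 1))  # 1.9 13.8
```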

Even more aggravating is that a great many of the spam msgs are just
like the messages that SA was 'trained' on.

Does this seem unreasonable enough that it must mean I'm doing this
all wrong? 

Can anyone post some figures of what to expect with default SA 3.3.2
and what to expect after some specific amount of training?

Really getting discouraged... when does the learning happen?
