From: Stevan Bajić <ste...@bajic.ch>
>To: "dspam-user@lists.sourceforge.net" <dspam-user@lists.sourceforge.net> 
>Sent: Sunday, April 22, 2012 2:28 PM
>Subject: Re: [Dspam-user] Increase Spam Hit Rate
> 
>
>That is correct. As I had mentioned, I did not have a lot of HAM to train with.
At least you have around 33K Ham messages. This is not that bad.
>
>Well, not really. Remember, on your instructions, I essentially trained the 
>HAM many times, so the real count is that number divided by the number of 
>months in the SPAM corpus, around 16 I think. 

I have since redone and added more HAM that I dug up.
I will build it up over time.
Maybe you can get even more if you use the mails from the sent folder of all 
the users.
>
>I got as much sent as I could, it's only retained for 30 days.
Surprisingly, the first day went well.
>>
>>
You mean the first day using the merged group from above? I told you that this 
approach would work. Tell me more. How long did it take you to train with that 
many messages? How long was the downtime? Was the production downtime as low as 
I told you?
>
>Yes, using the merged group. I wanted a master machine holding a Dspam 
>database with only the merged group training data, so that I could copy it to 
>other systems, so I just used my handy Lion server at home. Loaded up Dspam and 
>ran the training on it. I don't recall how long, maybe 8 hours? Production 
>downtime was close to 0 since there wasn't much to do and I had a MySQL script 
>already set up to run the commands. So, a few minutes maybe.

I mysqldump'd the dspam_stats table (all 1 user) and the token table (all 1 
user) on Lion server since I wanted to copy those over to one of the real mail 
servers. So, loading it was trivial of course. And, since the training was done 
on a local machine, it was no big deal. 
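
A rough sketch of that dump and load, for anyone wanting to repeat it. The 
database name (dspam), the table names (dspam_stats and dspam_token_data, as 
in the stock DSPAM MySQL schema) and the dump file name are assumptions 
rather than details given in this thread. The script only prints the commands 
so they can be reviewed before being run by hand:

```shell
#!/bin/sh
# Sketch only. Assumed names (not taken from this thread):
#   DB   - the DSPAM database name
#   DUMP - the dump file name
#   dspam_stats / dspam_token_data - stats and token tables of the
#   stock DSPAM MySQL schema
DB="${DB:-dspam}"
DUMP="${DUMP:-merged-group.sql}"

# On the training machine: "mysqldump DB TABLE..." dumps only the listed tables.
dump_cmd() {
    echo "mysqldump $DB dspam_stats dspam_token_data > $DUMP"
}

# On the real mail server, after copying the dump file over.
load_cmd() {
    echo "mysql $DB < $DUMP"
}

# Print both commands for review; nothing is executed against MySQL here.
dump_cmd
load_cmd
```

With default options mysqldump writes DROP TABLE and CREATE TABLE statements 
into the dump, so loading it replaces any tables of the same name on the 
target. That fits a fresh start with only the merged group data, but it is 
worth keeping a backup of the old tables first.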

At the same time, I updated the Macports Portfile to compile the latest dspam. 
I'll have to check this in on Macports when I get a chance so others can use it.



>I ask because it would be good if your experience in switching from
TEFT to TOE, using a merged group, doing that additional training,
deleting your whole user data, etc. could motivate others to
follow your example.
>
>Well, it wasn't that bad really, as you can tell from my other responses 
>above. Obviously, the only downside is not being able to retrain mail that 
>was received shortly before the conversion. Very small price to pay.

Which was to be expected. TEFT is an evil relic from the past.
>
>But it's the default! It makes no logical sense to me that the devs won't 
>change the default! I am sure I get the reason, but I would tend to disagree 
>with it.