Re: [Dspam-user] Upgrading from 3.8.0 to 3.10.1 greatly detection quality

Stevan Bajić Fri, 30 Mar 2012 07:21:35 -0700

On 30.03.2012 15:44, René Neumann wrote:

On 30.03.2012 15:22, Stevan Bajić wrote:

I upgraded my dspam installation from 3.8.0 to 3.10.1 several weeks ago.
Since then the detection quality has dropped a lot: each day I have to
rescue several mails from Junk because they are wrongly classified as
spam. This didn't happen with the old installation, and it is not
improving over time. Was it a mistake to use the old database? Would it
be wise to drop the old data and retrain?

In your case: yes!
I say that because I see you are using CHAIN and TEFT. If you ask me,
I would start from an empty database and use OSB and TOE.

But if I remember correctly, TOE should only be used if the database is
quite mature. Doesn't this hold anymore?

This does not hold any more.

And I forgot to mention, for most of the accounts this is actually set
to 'TUM' (that I -- from the description -- prefer).

TUM is okay. Anything other than TEFT.

Btw: For re-training: Is there some nice 'junk database' one could use
(for non-junk I can just use the current messages)? I know that when I
first installed DSPAM it took me quite a while to find such a junk
database -- but I forgot where it came from.

http://spamassassin.apache.org/publiccorpus/
http://plg.uwaterloo.ca/~gvcormac/treccorpus/
http://plg.uwaterloo.ca/~gvcormac/treccorpus06/
http://plg.uwaterloo.ca/~gvcormac/treccorpus07/

Let me know if you need more.

If you want my advice: Don't use any pre-training. It is almost useless. Switch to the osb tokenizer and let the engine do the rest. You will see that you will very quickly (waaaay quicker than before) have a score above 95%.

Or if you really want to do training, then do it in conjunction with a merged global group and train that. But I would not train individual users.

I know, I know. It sounds strange. But I have been there. I have trained for weeks (in the old days when the systems were not that fast) and the result of this insane training is: do not pre-train. It will eat a lot of time and bring almost no benefit (often, with something modern like osb in conjunction with TOE/TUM, it will even be a disadvantage to pre-train individual users).
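
For reference, a minimal sketch of that switch in dspam.conf. This assumes the stock 3.10.x config layout; check the comments in your own dspam.conf for the values it accepts:

```
# Tokenizer: osb instead of chain
Tokenizer osb
# Global training mode: toe instead of teft (tum would be fine as well)
TrainingMode toe
```

This goes together with starting from an empty database, as suggested earlier.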

Also I am a bit puzzled about the new configuration: Several options now
appear twice in the conffile: One time as a normal option and one time
as a 'Preference' parameter. It is not clear to me what takes precedence
or what happens if one of them is not set. Perhaps this influences the
problem above, as I might have conflicting options set this way. (Why
are these 'Preference' parameters there anyway?)

The entries without Preference are the globally valid entries. Preferences
are values that each user can have and can change (if you allow him/her
to change them).

So what is the actual effect of setting:

TrainingMode teft
Preference "trainingMode=TUM"

(and assuming no override is done by the user)

The user preference has more weight than the global setting.

And if I understand this correctly, I can drop any Preference thingy
that I don't want to be overridden by a user anyway?

Not really. There are just a bunch of values that are available in both places. The ones that are NOT preferences are used by the DSPAM agent/daemon, while the ones with Preference are used in the DSPAM client. Dropping them is not what you want (I guess).
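
As an illustration only, assuming you simply want both places to agree, the pair from your example could be written like this (TUM is used here because that is what most of your accounts already have):

```
# Global value
TrainingMode tum
# Per-user default
Preference "trainingMode=TUM"
```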

- René



------------------------------------------------------------------------------
This SF email is sponsosred by:
Try Windows Azure free for 90 days Click Here
http://p.sf.net/sfu/sfd2d-msazure


_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user



--
Kind Regards from Switzerland,

Stevan Bajić

