On 14/08/2012 05:05 p.m., Stevan Bajić wrote:
On 14.08.2012 21:05, Cristian wrote:
On 14/08/2012 03:42 p.m., Stevan Bajić wrote:
On 14.08.2012 19:32, Cristian wrote:
[....]
[....]
I run this daily:
dspam_merge user1 user2 user3 -o globaluser
The globaluser is the merged group. Does this command really put
the users' data into the globaluser?
It should. That's at least what the command dspam_merge is
supposed to do.
So when I delete the users' data, does DSPAM read the data
from the globalgroup?
Yes. The data is (should be) in the globalgroup. And it is
read from there. But look at this (as an example).
globalgroup has 1'000'000 tokens and 100'000 processed
messages
user data has 1'000 tokens and 100 processed messages
Now assume an inbound message has 100 tokens and all 100
tokens get a hit in globalgroup; then you have 100 tokens
out of 1 million. Now assume all 100 tokens hit the user
data; then you have 100 tokens out of 1'000 tokens. I
think you don't need much mathematical knowledge to understand
that the user hit has more weight than the same hit on the
globalgroup.
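The arithmetic in the example above can be sketched in a few lines of Python. This only illustrates the simplified "hits out of stored tokens" view with the numbers from the example; it is not how DSPAM actually computes a score, and the function name `hit_ratio` is made up for illustration.

```python
from fractions import Fraction


def hit_ratio(hits: int, total_tokens: int) -> Fraction:
    """Share of the stored tokens that the message's tokens hit (simplified view)."""
    return Fraction(hits, total_tokens)


# Numbers taken from the example: 100 message tokens, all of them hit.
global_ratio = hit_ratio(100, 1_000_000)  # globalgroup holds 1'000'000 tokens
user_ratio = hit_ratio(100, 1_000)        # user data holds 1'000 tokens

print(f"globalgroup: {float(global_ratio):.4%} of stored tokens hit")
print(f"user:        {float(user_ratio):.4%} of stored tokens hit")
print(f"relative weight of the user hit: {user_ratio / global_ratio}x")
```

In this simplified view the same 100 hits cover a much larger share of the small user data set than of the large globalgroup.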
Understood, but does DSPAM work only by comparing the tokens that
get hits with the total number of tokens? Is that the only way?
What I wrote is a simplified view. If you want to understand how
things in DSPAM work, then you should read about Bayes' theorem and
probability. For example:
http://en.wikipedia.org/wiki/Bayes%27_theorem
http://en.wikipedia.org/wiki/Bayesian_probability
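As a rough illustration of the kind of calculation Bayes' theorem describes, here is a minimal Python sketch that estimates the probability that a message is spam given that it contains one particular token. The counts are invented and the function `bayes_spam_probability` is hypothetical; DSPAM's real algorithms combine many tokens and differ in detail.

```python
def bayes_spam_probability(token_in_spam: int, spam_total: int,
                           token_in_ham: int, ham_total: int) -> float:
    """P(spam | token) by Bayes' theorem, for a single token.

    Assumes both totals are positive and the token was seen at least once.
    """
    p_spam = spam_total / (spam_total + ham_total)  # prior P(spam)
    p_ham = 1.0 - p_spam                            # prior P(ham)
    p_token_given_spam = token_in_spam / spam_total
    p_token_given_ham = token_in_ham / ham_total
    # Total probability of seeing the token in any message.
    evidence = p_token_given_spam * p_spam + p_token_given_ham * p_ham
    return p_token_given_spam * p_spam / evidence


# Invented counts: the token appeared in 80 of 400 spam messages
# and in 5 of 600 innocent (ham) messages.
probability = bayes_spam_probability(80, 400, 5, 600)
print(f"P(spam | token) = {probability:.3f}")
```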
Is this OK?
Yes. It is okay.
Is this bad?
No. It is not bad. IMHO it is unusual to do that daily merge
but why not?
How do I train the global merged group?
With dspam_train maybe?
But can dspam_train train from what other users have already learned?
Depends how you present or make available the data to dspam_train.
etc...
Remember that I store the messages in mdbox format, and
dspam_train can't read them directly from the local disk.
dspam_train CAN read directly from the local disk. Unfortunately
dspam_train does not know how to handle multi-dbox format. However...
DSPAM is open source and you can easily extend dspam_train to
handle mdbox format.
I don't know how to extend dspam_train. But isn't the result the same
with dspam_merge? I have 5 users to train the global user; once per
day I run dspam_merge and then clean the users' data in MySQL. What
is the difference between this and reading the emails from the
folders every time with dspam_train?
This is not really a user,
This does not matter for DSPAM.
and the accounts are stored in mdbox format. The idea
is to train the globalgroup from the trained users.
Then maybe using a managed group would be better?
The issue is that I need general rules that give a new user good
spam filtering, but if the user gets false positives, he can fix
them.
How is retraining done on your setup? DSPAM Web-UI? Other ways?
What would that be?
I will need to manage approx. 10,000 users, so I need a solution
that doesn't require a big database for every user.
A global merged group is a good way to reduce the overall database
size. Do those 10K users have +/- the same type of mail? Same
language? Same sort of data?
OSB is btw another way to reduce size. Running the cleaning job
daily or using the dspam_maintenance script is another way of keeping
the data consumption low.
Cristian.
--
Kind Regards from Switzerland,
Stevan Bajić
------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user