-------- Original Message --------
> Date: Fri, 19 Dec 2008 09:56:04 +0000
> From: Matt Galloway <matt.gallo...@senokian.com>
> To: dspam-users@lists.nuclearelephant.com
> Subject: Re: [dspam-users] DSpam global user advice

> Many thanks for the amazingly quick reply!
> 
> Yes I used dspam_train first to train one of the users which will be
> merged into globaluser and then I merged that user into globaluser.
> 
> Basically my setup is that we do mail for a large number of people and
> we want the ability to use a globaluser so that everyone has training
> data, but then the global user will be updated every month with the
> training data of 3 users (who get large amounts of spam). Therefore I
> want to make sure that at each merge step, it's not just adding the
> users, but rather starting again for globaluser.
> 
I don't understand that. Allow me to rephrase:
User A: We call him A
User B: We call him B
User C: We call him C
User D: We call him D
User E: We call him E
User F: We call him F
merged group name: globaluser

Now you have merged A + B + C into globaluser
globaluser is the merged group name and it's used for all users

Now, over some period of time, users A, B and C get a lot of spam and you want 
their training to be included in globaluser. Right?

Or do you just want to take the data from A + B + C and recreate globaluser 
from scratch every month (or so)?


The problem with the second way is:

1) Initial
A has 1000 tokens
B has 2000 tokens
C has 3000 tokens
globaluser has 0 tokens

2) Merging
A + B + C ~ 4500 tokens (just an assumption; fewer than the 6000 in total 
because some tokens are probably found in more than one of A, B and C)
This results in 4500 tokens for globaluser

3) Setting globaluser as merged group for all users
All users now have at least 4500 tokens out of the box

4) Running for one month
A gets 200 new tokens
B gets 100 new tokens
C gets 50 new tokens

5) You do your daily purging and cleaning of the data
A had 1000 (from item 1) + 200 (from item 4) tokens -> after purging he has 
500 tokens
B had 2000 (from item 1) + 100 (from item 4) tokens -> after purging he has 
1000 tokens
C had 3000 (from item 1) + 50 (from item 4) tokens -> after purging he has 
1500 tokens
globaluser still has 4500, and hopefully you have excluded globaluser from 
purging

6) You recreate globaluser from A + B + C
Now do the math. What do you think you will get in globaluser? More or fewer 
tokens than it had before? You will get FEWER, because A + B + C together have 
just 3000 tokens, including the duplicates. And before, globaluser alone had 
4500 tokens.
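
The arithmetic from steps 1 to 6 can be written out as a small Python sketch. 
The token counts are the assumed figures from above, not measurements, and the 
variable names are made up for illustration only.

```python
# Token counts assumed in steps 1-6 above (illustrative, not measured).
after_purge = {"A": 500, "B": 1000, "C": 1500}   # per-user tokens left after purging (step 5)
globaluser_before = 4500                         # assumed result of the first merge (step 2)

# A rebuilt globaluser can hold at most the sum of what survived purging.
# That maximum is only reached if A, B and C share no tokens at all;
# any overlap makes the real number smaller.
rebuilt_upper_bound = sum(after_purge.values())

# Tokens lost (at least) by erasing globaluser and merging again.
lost_at_least = globaluser_before - rebuilt_upper_bound

print(rebuilt_upper_bound)  # 3000
print(lost_at_least)        # 1500
```

So even in the best case the rebuilt globaluser ends up with 1500 fewer tokens 
than it had before the rebuild.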


So it is a very bad idea to just erase globaluser in each run. And you have to 
keep in mind that from the moment you activate the merged group, all user 
tokens are just the delta between their tokens and the tokens from globaluser 
at the time when DSPAM calculated the tokens. So erasing the tokens in 
globaluser in each merge run is NOT making DSPAM stronger. It is making it 
weaker!

It would be better to train the global user on its own. For example, activate 
corpus creation for A, B and C and then regularly train globaluser from the 
data you collect at A, B and C. Or exclude A, B and C from the merged global 
group and add them to a merged and managed group. Then training from A, B and C 
will flow into globaluser, and the other users (all except A, B and C) will 
automatically get the result of the training of A, B and C.

Do you understand what I mean?


> Does that make sense, or is it completely wacky?
> 
It did not make sense to me. But English is not my native language and it could 
be that I did not understand you correctly.


> Also, does anyone have any ideas about my other message regarding the
> signature not appearing anywhere? It suddenly disappeared and I don't
> know what setting made it go away! So confused!
> 
> Thanks again,
> Matt
> 
Steve


> Steve wrote:
> > If you want to merge into the global user, then DO NOT delete the old
> > data in the globaluser. Just merge into it but don't delete.
> >
> > If you want to start from the beginning, then just delete the old data
> > and remerge again from whatever userdata you like.
> >
> > Better would be to use raw mail data and use dspam_train to train the
> > globaluser.
> >
> >
> > -------- Original Message --------
> >   
> >> Date: Fri, 19 Dec 2008 09:27:46 +0000
> >> From: Matt Galloway <matt.gallo...@senokian.com>
> >> To: dspam-users@lists.nuclearelephant.com
> >> Subject: [dspam-users] DSpam global user advice
> >>     
> >
> >   
> >> Hello again,
> >>
> >> I have another question... this one is regarding dspam with a global
> >> user. I am using a merged globaluser like so:
> >> globaluser:merged:*
> >>
> >> This seems to be working correctly (in the logs it states that the
> >> user's data is being merged with globaluser's data) so that's good. And
> >> I created the globaluser by dspam_merge on a few users. Now what I want
> >> to do is "update" this globaluser from another merge of a few users, but
> >> what's the best way to go about this? Should I delete all data for the
> >> globaluser first by doing:
> >>
> >> DELETE FROM dspam_token_data WHERE uid='52';
> >> DELETE FROM dspam_stats WHERE uid='52';
> >>
> >> Or should I just go ahead and do the merge?
> >>
> >> Any advice much appreciated.
> >>
> >> Regards,
> >> Matt
> >>
> >>
> >>
> >>     
> >
> >   
> 
> -- 
> 
> Matt Galloway
> Systems Engineer
> 
> Try our easy CRM system - you can sign up for free
> at http://www.tactilecrm.com or read our blog at
> http://www.senokian.com/barking.
> 
> Senokian Solutions Ltd
> Business Innovation Centre
> Binley Business Park
> Coventry
> CV3 2TX
> 
> Coventry Office: 024 76 233 400
> London Office: 0207 183 6677
> Fax: 024 76 233 401
> 
> e: matt.gallo...@senokian.com
> w: http://www.senokian.com
> 
> Registered in England & Wales: 04415783
> VAT Registered: GB 793 8163 86
> 
> 
> 
> 
> 
