Re: [dspam-users] DSpam global user advice

2008-12-19 Thread Steve
If you want to merge into the global user, then DO NOT delete the old data in 
the globaluser. Just merge into it but don't delete.

If you want to start from the beginning, then just delete the old data and 
remerge again from whatever userdata you like.

A better option would be to train the globaluser directly from raw mail data 
with dspam_train.


 Original Message
 Date: Fri, 19 Dec 2008 09:27:46 +
 From: Matt Galloway matt.gallo...@senokian.com
 To: dspam-users@lists.nuclearelephant.com
 Subject: [dspam-users] DSpam global user advice

 Hello again,
 
 I have another question... this one is regarding dspam with a global
 user. I am using a merged globaluser like so:
 globaluser:merged:*
 
 This seems to be working correctly (in the logs it states that the
 user's data is being merged with globaluser's data) so that's good. And
 I created the globaluser by dspam_merge on a few users. Now what I want
 to do is update this globaluser from another merge of a few users, but
 what's the best way to go about this? Should I delete all data for the
 globaluser first by doing:
 
 DELETE FROM dspam_token_data WHERE uid='52';
 DELETE FROM dspam_stats WHERE uid='52';
 
 Or should I just go ahead and do the merge?
 
 Any advice much appreciated.
 
 Regards,
 Matt
 
 
 





Re: [dspam-users] DSpam global user advice

2008-12-19 Thread Matt Galloway
Many thanks for the amazingly quick reply!

Yes I used dspam_train first to train one of the users which will be
merged into globaluser and then I merged that user into globaluser.

Basically my setup is that we do mail for a large number of people and
we want the ability to use a globaluser so that everyone has training
data, but then the global user will be updated every month with the
training data of 3 users (who get large amounts of spam). Therefore I
want to make sure that at each merge step, it's not just adding the
users, but rather starting again for globaluser.

Does that make sense, or is it completely wacky?

Also, does anyone have any ideas about my other message regarding the
signature not appearing anywhere? It suddenly disappeared and I don't
know what setting made it go away! So confused!

Thanks again,
Matt





Re: [dspam-users] DSpam global user advice

2008-12-19 Thread Steve
 Original Message
 Date: Fri, 19 Dec 2008 09:56:04 +
 From: Matt Galloway matt.gallo...@senokian.com
 To: dspam-users@lists.nuclearelephant.com
 Subject: Re: [dspam-users] DSpam global user advice

 Many thanks for the amazingly quick reply!
 
 Yes I used dspam_train first to train one of the users which will be
 merged into globaluser and then I merged that user into globaluser.
 
 Basically my setup is that we do mail for a large number of people and
 we want the ability to use a globaluser so that everyone has training
 data, but then the global user will be updated every month with the
 training data of 3 users (who get large amounts of spam). Therefore I
 want to make sure that at each merge step, it's not just adding the
 users, but rather starting again for globaluser.
 
I'm not sure I understand, so allow me to rephrase. Say we have six users, 
A, B, C, D, E and F, and a merged group named globaluser.

You have merged A + B + C into globaluser. globaluser is the merged group 
name, and it is used for all users.

Now, over some period, users A, B and C get a lot of spam, and you want their 
training to be included in globaluser. Right?

Or do you just want to take the data from A + B + C and recreate globaluser 
from scratch every month (or so)?


The problem with the second way is:

1) Initial state
A has 1000 tokens
B has 2000 tokens
C has 3000 tokens
globaluser has 0 tokens

2) Merging
A + B + C ~ 4500 tokens (just an assumption; it is less than the 6000 total 
because some tokens are probably found in more than one of A, B and C)
This results in 4500 tokens for globaluser

3) Setting globaluser as the merged group for all users
All users now have at least 4500 tokens out of the box

4) Running for one month
A gets 200 new tokens
B gets 100 new tokens
C gets 50 new tokens

5) You do your daily purging and cleaning of the data
A had 1000 (from item 1) + 200 (from item 4) tokens; after purging he has 
500 tokens
B had 2000 (from item 1) + 100 (from item 4) tokens; after purging he has 
1000 tokens
C had 3000 (from item 1) + 50 (from item 4) tokens; after purging he has 
1500 tokens
globaluser still has 4500 tokens, and you have hopefully excluded globaluser 
from purging

6) You recreate globaluser from A + B + C
Now do the math. Will globaluser end up with more or fewer tokens than it had 
before? It will have FEWER, because A + B + C together have only 3000 tokens, 
including the duplicates, whereas globaluser alone had 4500 tokens before.
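
The arithmetic in steps 1 to 6 can be checked with a few lines of Python. This 
is only a sketch using the illustrative numbers from above; the function name 
rebuilt_upper_bound is made up for this example and is not part of DSPAM.

```python
# Illustrative token counts taken from the walkthrough above (not real data).
merged_initial = 4500  # estimated size of globaluser after the first merge
after_purge = {"A": 500, "B": 1000, "C": 1500}  # per-user counts after purging


def rebuilt_upper_bound(per_user_counts):
    """Upper bound on the size of a globaluser rebuilt from these users.

    A token shared by several users is stored only once after merging, so the
    rebuilt group can never hold more tokens than the plain sum of the users.
    """
    return sum(per_user_counts.values())


bound = rebuilt_upper_bound(after_purge)
print(bound)                   # 3000
print(bound < merged_initial)  # True: the rebuilt group is smaller than before
```

Even in the best case, with no duplicate tokens at all, the rebuilt globaluser 
would hold 3000 tokens, which is fewer than the 4500 it had before the wipe.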


So it is a very bad idea to simply erase globaluser on each run. Keep in mind 
that from the moment you activate the merged group, each user's tokens are 
just the delta between their own tokens and the tokens globaluser held at the 
time DSPAM calculated them. So erasing the globaluser tokens on every merge 
run does NOT make DSPAM stronger. It makes it weaker!
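
To illustrate the delta idea, here is a toy model in Python. It assumes that 
the counts a user effectively sees are the user's own counts plus the 
globaluser counts; that is a simplification for illustration, not DSPAM's 
actual storage code. The token names, the numbers and the function 
effective_counts are all hypothetical.

```python
from collections import Counter


def effective_counts(user_delta, global_counts):
    """Toy model: what a user effectively sees is own delta plus group data."""
    return user_delta + global_counts


# Hypothetical spam-hit counts per token.
global_counts = Counter({"viagra": 40, "invoice": 5})
user_delta = Counter({"invoice": 2})  # the user only stores the difference

with_global = effective_counts(user_delta, global_counts)
after_wipe = effective_counts(user_delta, Counter())  # globaluser erased

print(with_global["viagra"], with_global["invoice"])  # 40 7
print(after_wipe["viagra"], after_wipe["invoice"])    # 0 2
```

In this model, once globaluser is wiped the user is left with nothing but the 
small delta, and everything that lived only in the group is gone until it is 
rebuilt.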

It would be better to train the global user on its own. For example, activate 
corpus creation for A, B and C, and then regularly train globaluser from the 
data you collect at A, B and C. Or exclude A, B and C from the merged global 
group and add them to a merged and managed group. Then training from A, B and 
C will flow into globaluser, and the other users (all except A, B and C) will 
automatically get the result of that training.

Do you understand what I mean?


 Does that make sense, or is it completely wacky?
 
It did not make sense to me. But English is not my native language and it could 
be that I did not understand you right.


 Also, does anyone have any ideas about my other message regarding the
 signature not appearing anywhere? It suddenly disappeared and I don't
 know what setting made it go away! So confused!
 
 Thanks again,
 Matt
 
Steve

