On Thu, 29 Apr 2010 15:33:28 -0400
Ed Szynaka <szyn...@localnet.com> wrote:

> Hello,
>     I'm trying to figure out if a corpus classification is working 
> properly.  I've set up a Global Group as described in the README, but it 
> doesn't appear to be working properly.  I have a user named corpususer 
> and have trained it on 120756 spam and 20581 ham. 
> 
> I've added the following to the group file:
> corpususer:classification:*corpususer
> 
> When sending mail to a test user it does not appear to mark any mail as 
> spam.  Looking at the debug output there is also no indication that the 
> corpususer tokens are being consulted.  (I've included the debug outputs 
> below)
> 
> As a test I set up a merged group with the same corpususer by adding this 
> to the group file:
> corpususer:merged:*
> 
> This appears to classify mail better, and the debug output (listed below) 
> also shows the corpususer being consulted.  At the moment I am using the 
> merged group, but I would much prefer to use the classification group 
> because, with a merged group, my large corpus would override anything 
> the user did to train their own email.
> 
> Trying to combine the 2 ideas above I tried this in the group file:
> corpususer:classification:*
> 
> But unfortunately this causes a double free or corruption error in glibc 
> when trying to classify any message.  (I saw a ticket on that: 
> http://sourceforge.net/tracker/?func=detail&atid=1126467&aid=2990455&group_id=250683
> and will be posting to that ticket right after this email.)
> 
This issue is fixed in the Git repository. Check out the current source and
try again.


> My questions are:
> How should a Global Group be set up to get the results described in the 
> README?
> Is there any way to tell if a Global Group is being used?
> 
The term "global group" covers several things in DSPAM. You mean a
"classification" group, right? So the question should be: how do you check
whether a classification group is working?

Can you post the output of:
dspam_stats -H testu...@testdomain.com
dspam_admin ag pref testu...@testdomain.com
dspam_admin ag pref default
sed "/^[\t ]*#\|^[\t ]*$/d" /path/to/your/dspam.conf
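
While collecting that output, one rough way to tell whether a group user is
being touched at all is to search a saved copy of the --debug output for the
group user's name. In the merged-group log quoted below, corpususer shows up
in the "adding user to merged group" and _pgsql_drv_getpwnam lines, while the
classification log never mentions it. The following is only a sketch under
assumptions: the function name group_consulted and the log file name are
hypothetical, and a plain string match can give false hits if the group
user's name is a substring of another user's name.

```shell
#!/bin/sh
# Hypothetical helper: report whether a saved DSPAM debug log mentions
# the group user. This is a plain text search, not a DSPAM feature.
# Usage: group_consulted LOGFILE GROUPUSER
group_consulted() {
    log=$1
    user=$2
    # -F treats the name as a fixed string; -c counts matching lines.
    hits=$(grep -F -c -- "$user" "$log")
    if [ "$hits" -gt 0 ]; then
        echo "consulted ($hits lines)"
    else
        echo "not consulted"
    fi
}
```

For example, after redirecting the debug output to a file such as
debug-classification.log (a made-up name), run:
group_consulted debug-classification.log corpususer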


> I'm using dspam 3.9.0 with a postgresql backend, compiled from source 
> with the following options:
> ../configure --prefix=/usr/local/dspam --sysconfdir=/usr/local/dspam/etc 
> --with-storage-driver=mysql_drv,pgsql_drv 
> --with-mysql-includes=/usr/include/mysql 
> --with-pgsql-includes=/usr/include/postgresql --enable-daemon 
> --enable-debug --enable-virtual-users --enable-preferences-extension 
> --enable-clamav
> 
> Thanks,
> Ed
> 
> 
> corpususer:classification:*corpususer debug output:
> > 6565: [04/29/2010 14:47:37] No QuarantineAgent option found. Using 
> > standard quarantine.
> > 6565: [04/29/2010 14:47:37] DSPAM Instance Startup
> > 6565: [04/29/2010 14:47:37] input args: /usr/local/dspam/bin/dspam 
> > --stdout --deliver=innocent,spam --user testu...@testdomain.com --debug
> > 6565: [04/29/2010 14:47:37] pass-thru args:
> > 6565: [04/29/2010 14:47:37] processing user testu...@testdomain.com
> > 6565: [04/29/2010 14:47:37] uid = 0, euid = 0, gid = 0, egid = 8
> > 6565: [04/29/2010 14:47:37] loading preferences for user 
> > testu...@testdomain.com
> > 6565: [04/29/2010 14:47:37] _pgsql_drv_getpwnam: successful returning 
> > struct for name: testu...@testdomain.com
> > 6565: [04/29/2010 14:47:37] Loading preferences for uid 3856
> > 6565: [04/29/2010 14:47:37] Loading preferences for uid 0
> > 6565: [04/29/2010 14:47:37] Loading preferences for uid 0
> > 6565: [04/29/2010 14:47:37] default preferences empty. reverting to 
> > dspam.conf preferences.
> > 6565: [04/29/2010 14:47:37] Loading preferences from dspam.conf
> > 6565: [04/29/2010 14:47:37] using 
> > /usr/local/dspam/var/dspam/opt-in/testu...@testdomain.com.dspam as path
> > 6565: [04/29/2010 14:47:37] using 
> > /usr/local/dspam/var/dspam/opt-out/testu...@testdomain.com.nodspam as path
> > 6565: [04/29/2010 14:47:37] sedation level set to: 5
> > 6565: [04/29/2010 14:47:37] _pgsql_drv_getpwnam: successful returning 
> > struct for name: testu...@testdomain.com
> > 6565: [04/29/2010 14:47:37] _pgsql_drv_getpwnam returning cached name 
> > testu...@testdomain.com.
> > 6565: [04/29/2010 14:47:39] Loading 7 BNR patterns
> > 6565: [04/29/2010 14:47:39] _pgsql_drv_getpwnam returning cached name 
> > testu...@testdomain.com.
> > 6565: [04/29/2010 14:47:39] Whitelist threshold: 10
> > <snip tokens>
> > 6565: [04/29/2010 14:47:39] Graham-Bayesian Probability: 0.002278 
> > Samples: 15
> > 6565: [04/29/2010 14:47:39] Burton-Bayesian Probability: 0.000018 
> > Samples: 27
> > 6565: [04/29/2010 14:47:39] no factors specified; using default
> > 6565: [04/29/2010 14:47:39] Result Confidence: 1.00
> > 6565: [04/29/2010 14:47:39] _pgsql_drv_getpwnam returning cached name 
> > testu...@testdomain.com.
> > 6565: [04/29/2010 14:47:39] Control: [10 10] [10 11] Delta: [0 1]
> > 6565: [04/29/2010 14:47:40] total processing time: 3.01203s
> > 6565: [04/29/2010 14:47:40] _pgsql_drv_getpwnam returning cached name 
> > testu...@testdomain.com.
> > 6565: [04/29/2010 14:47:40] _pgsql_drv_getpwnam returning cached name 
> > testu...@testdomain.com.
> > 6565: [04/29/2010 14:47:40] saving signature as 
> > 3856,4bd9d44c65657166613715
> > 6565: [04/29/2010 14:47:40] _pgsql_drv_getpwnam returning cached name 
> > testu...@testdomain.com.
> > 6565: [04/29/2010 14:47:40] libdspam returned probability of 0.002278
> > 6565: [04/29/2010 14:47:40] message result: NOT SPAM
> > 6565: [04/29/2010 14:47:40] _pgsql_drv_getpwnam returning cached name 
> > testu...@testdomain.com.
> > 6565: [04/29/2010 14:47:40] delivering message
> > 6565: [04/29/2010 14:47:40] DSPAM Instance Shutdown.  Exit Code: 0
> 
> corpususer:merged:* output:
> > 21270: [04/29/2010 13:58:49] No QuarantineAgent option found. Using 
> > standard quarantine.
> > 21270: [04/29/2010 13:58:49] DSPAM Instance Startup
> > 21270: [04/29/2010 13:58:49] input args: /usr/local/dspam/bin/dspam 
> > --stdout --deliver=innocent,spam --user testu...@testdomain.com --debug
> > 21270: [04/29/2010 13:58:49] pass-thru args:
> > 21270: [04/29/2010 13:58:49] processing user testu...@testdomain.com
> > 21270: [04/29/2010 13:58:49] uid = 0, euid = 0, gid = 0, egid = 8
> > 21270: [04/29/2010 13:58:49] loading preferences for user 
> > testu...@testdomain.com
> > 21270: [04/29/2010 13:58:49] _pgsql_drv_getpwnam: successful returning 
> > struct for name: testu...@testdomain.com
> > 21270: [04/29/2010 13:58:49] Loading preferences for uid 3856
> > 21270: [04/29/2010 13:58:49] Loading preferences for uid 0
> > 21270: [04/29/2010 13:58:49] Loading preferences for uid 0
> > 21270: [04/29/2010 13:58:49] default preferences empty. reverting to 
> > dspam.conf preferences.
> > 21270: [04/29/2010 13:58:49] Loading preferences from dspam.conf
> > 21270: [04/29/2010 13:58:49] using 
> > /usr/local/dspam/var/dspam/opt-in/testu...@testdomain.com.dspam as path
> > 21270: [04/29/2010 13:58:49] using 
> > /usr/local/dspam/var/dspam/opt-out/testu...@testdomain.com.nodspam as path
> > 21270: [04/29/2010 13:58:49] adding user to merged group corpususer
> > 21270: [04/29/2010 13:58:49] sedation level set to: 5
> > 21270: [04/29/2010 13:58:49] _pgsql_drv_getpwnam: successful returning 
> > struct for name: testu...@testdomain.com
> > 21270: [04/29/2010 13:58:49] _pgsql_drv_getpwnam: successful returning 
> > struct for name: corpususer
> > 21270: [04/29/2010 13:58:49] _pgsql_drv_getpwnam: successful returning 
> > struct for name: testu...@testdomain.com
> > 21270: [04/29/2010 13:58:49] _pgsql_drv_getpwnam: successful returning 
> > struct for name: corpususer
> > 21270: [04/29/2010 13:58:58] Loading 1033 BNR patterns
> > 21270: [04/29/2010 13:58:58] _pgsql_drv_getpwnam: successful returning 
> > struct for name: testu...@testdomain.com
> > 21270: [04/29/2010 13:58:58] _pgsql_drv_getpwnam: successful returning 
> > struct for name: corpususer
> > 21270: [04/29/2010 13:58:58] Whitelist threshold: 10
> > <snip tokens>
> > 21270: [04/29/2010 13:58:58] Graham-Bayesian Probability: 0.000000 
> > Samples: 15
> > 21270: [04/29/2010 13:58:58] Burton-Bayesian Probability: 0.000000 
> > Samples: 27
> > 21270: [04/29/2010 13:58:58] no factors specified; using default
> > 21270: [04/29/2010 13:58:58] Result Confidence: 0.99
> > 21270: [04/29/2010 13:58:58] _pgsql_drv_getpwnam: successful returning 
> > struct for name: testu...@testdomain.com
> > 21270: [04/29/2010 13:58:58] Control: [10 10] [10 11] Delta: [0 1]
> > 21270: [04/29/2010 13:58:58] total processing time: 9.26809s
> > 21270: [04/29/2010 13:58:58] _pgsql_drv_getpwnam returning cached name 
> > testu...@testdomain.com.
> > 21270: [04/29/2010 13:58:58] _pgsql_drv_getpwnam returning cached name 
> > testu...@testdomain.com.
> > 21270: [04/29/2010 13:58:58] saving signature as 
> > 3856,4bd9c8e2212701461562737
> > 21270: [04/29/2010 13:58:58] _pgsql_drv_getpwnam returning cached name 
> > testu...@testdomain.com.
> > 21270: [04/29/2010 13:58:58] libdspam returned probability of 0.000000
> > 21270: [04/29/2010 13:58:58] message result: NOT SPAM
> > 21270: [04/29/2010 13:58:58] _pgsql_drv_getpwnam returning cached name 
> > testu...@testdomain.com.
> > 21270: [04/29/2010 13:58:58] delivering message
> > 21270: [04/29/2010 13:58:58] DSPAM Instance Shutdown.  Exit Code: 0
> 
> -- 
> Ed Szynaka
> Network/Systems Manager
> LocalNet Corp./CoreComm Internet Services
> 
