Re: [Dspam-user] training time?

Stevan Bajić Fri, 09 Apr 2010 19:32:16 -0700

On Fri, 9 Apr 2010 19:00:54 -0700
Terry Barnum <te...@dop.com> wrote:


> I've been running DSPAM for approximately 2 weeks and looking at the output 
> of dspam_stats, I'm curious how long training normally takes.
> 
> A script is run nightly to check .Junk mailboxes for false negatives and 
> .NotJunk mailboxes for false positives and retrains on error. (Richard Valk's 
> http://switch.richard5.net/serverinstall/train.dspam)
> 
> Here's sample output from dspam_stats -H
> 
> x...@dop.com:
>               TP True Positives:                     0
>               TN True Negatives:                    19
>               FP False Positives:                    0
>               FN False Negatives:                  348
>               SC Spam Corpusfed:                     0
>               NC Nonspam Corpusfed:                  0
>               TL Training Left:                   2481
>               SHR Spam Hit Rate                  0.00%
>               HSR Ham Strike Rate:               0.00%
>               PPV Positive predictive value:   100.00%
>               OCA Overall Accuracy:              5.18%
> 
> y...@dop.com:
>               TP True Positives:                     0
>               TN True Negatives:                     0
>               FP False Positives:                    0
>               FN False Negatives:                 3035
>               SC Spam Corpusfed:                     0
>               NC Nonspam Corpusfed:                  0
>               TL Training Left:                   2500
>               SHR Spam Hit Rate                  0.00%
>               HSR Ham Strike Rate:             100.00%
>               PPV Positive predictive value:   100.00%
>               OCA Overall Accuracy:              0.00%
> 
> z...@dop.com:
>               TP True Positives:                     0
>               TN True Negatives:                     0
>               FP False Positives:                    0
>               FN False Negatives:                  358
>               SC Spam Corpusfed:                     0
>               NC Nonspam Corpusfed:                  0
>               TL Training Left:                   2500
>               SHR Spam Hit Rate                  0.00%
>               HSR Ham Strike Rate:             100.00%
>               PPV Positive predictive value:   100.00%
>               OCA Overall Accuracy:              0.00%
> 
> te...@dop.com:
>               TP True Positives:                     0
>               TN True Negatives:                     3
>               FP False Positives:                    0
>               FN False Negatives:                 5108
>               SC Spam Corpusfed:                     0
>               NC Nonspam Corpusfed:                  0
>               TL Training Left:                   2497
>               SHR Spam Hit Rate                  0.00%
>               HSR Ham Strike Rate:               0.00%
>               PPV Positive predictive value:   100.00%
>               OCA Overall Accuracy:              0.09%
> 
From these numbers, it looks to me as if DSPAM is not processing your incoming 
mail at all. It seems that only the script from 
http://switch.richard5.net/serverinstall/train.dspam is feeding DSPAM with data 
in your setup.
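
One way to read the quoted statistics: every account shows TP at 0 while FN is 
in the hundreds or thousands, which is the pattern I would expect if messages 
only ever reach DSPAM through the retraining script. A minimal Python sketch of 
that check, assuming the dspam_stats -H layout quoted above; the function names 
and regular expressions are illustrative and not part of DSPAM:

```python
import re

# Matches counter lines such as ">   FN False Negatives:   348".
# The leading ">" quote marker is optional.
STAT_RE = re.compile(r"^>?\s*(TP|TN|FP|FN)\s+[A-Za-z ]+:\s*(\d+)\s*$")
# Matches the per-user header line, e.g. "x...@dop.com:".
USER_RE = re.compile(r"^>?\s*(\S+@\S+):\s*$")


def parse_stats(text):
    """Return {user: {"TP": int, "TN": int, "FP": int, "FN": int}}."""
    users, current = {}, None
    for line in text.splitlines():
        m = USER_RE.match(line)
        if m:
            current = users.setdefault(m.group(1), {})
            continue
        m = STAT_RE.match(line)
        if m and current is not None:
            current[m.group(1)] = int(m.group(2))
    return users


def never_classified_spam(stats):
    """Users with retrained misses (FN > 0) but no spam ever caught (TP == 0)."""
    return sorted(
        user for user, s in stats.items()
        if s.get("TP", 0) == 0 and s.get("FN", 0) > 0
    )


sample = """\
x...@dop.com:
    TP True Positives:      0
    TN True Negatives:     19
    FP False Positives:     0
    FN False Negatives:   348
"""
print(never_classified_spam(parse_stats(sample)))
```

A user appearing in that list is only a hint that the delivery path should be 
checked; it does not prove how the mail is routed.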


> Is so much "Training Left" normal? Do I have something misconfigured? Will 
> DSPAM start tagging email as SPAM only after 2500 successfully classified 
> emails?
> 
No. DSPAM is fully functional from day one. Tagging can be turned on or off 
inside dspam.conf or with the preference extension. However, turning tagging 
on or off has nothing to do with the Training Left number.
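
To see which tagging-related defaults a dspam.conf carries, the Preference 
lines can be pulled out and inspected. A small Python sketch, assuming the 
Preference "name=value" syntax quoted below; read_preferences is a hypothetical 
helper and not a DSPAM tool:

```python
import re

# Matches lines such as: Preference "spamAction=tag"   # comment
# The leading ">" quote marker is optional.
PREF_RE = re.compile(r'^>?\s*Preference\s+"([^"=]+)=([^"]*)"')


def read_preferences(conf_text):
    """Return {name: value} for every Preference line in the config text."""
    prefs = {}
    for line in conf_text.splitlines():
        m = PREF_RE.match(line)
        if m:
            prefs[m.group(1)] = m.group(2)
    return prefs


conf = '''\
TrainingMode toe
Preference "spamAction=tag"           # { quarantine | tag | deliver }
Preference "spamSubject=[SPAM]"       # { string }
Preference "localStore="              # blank value
'''
prefs = read_preferences(conf)
print(prefs["spamAction"], prefs["spamSubject"])
```

None of these values affects the Training Left counter.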


> $ dspam --version
> 
> DSPAM Anti-Spam Suite 3.9.0 (agent/library)
> 
> Copyright (c) 2002-2009 DSPAM Project
> http://dspam.sourceforge.net.
> 
> DSPAM may be copied only under the terms of the GNU General Public License,
> a copy of which can be found with the DSPAM distribution kit.
> 
> $ cat /usr/local/dspam.conf | grep -v ^# | grep -v ^$
> 
> Home /usr/local/var/dspam
> StorageDriver /usr/local/lib/dspam/libmysql_drv.dylib
> TrustedDeliveryAgent "/usr/bin/procmail"
> DeliveryHost          127.0.0.1
> DeliveryPort          10026
> DeliveryIdent         localhost
> DeliveryProto         SMTP
> OnFail error
> Trust root
> Trust dspam
> Trust apache
> Trust mail
> Trust mailnull 
> Trust smmsp
> Trust daemon
> Trust _dspam
> Trust _postfix
> Trust _www
> TrainingMode toe
> TestConditionalTraining on
> Feature whitelist
> Algorithm graham burton
> Tokenizer osb
> PValue bcr
> WebStats on
> Preference "trainingMode=TOE"         # { TOE | TUM | TEFT | NOTRAIN } -> default:teft
> Preference "spamAction=tag"           # { quarantine | tag | deliver } -> default:quarantine
> Preference "spamSubject=[SPAM]"               # { string } -> default:[SPAM]
> Preference "statisticalSedation=5"    # { 0 - 10 } -> default:0
> Preference "enableBNR=on"             # { on | off } -> default:off
> Preference "enableWhitelist=on"               # { on | off } -> default:on
> Preference "signatureLocation=headers"        # { message | headers } -> default:message
> Preference "tagSpam=off"              # { on | off }
> Preference "tagNonspam=off"           # { on | off }
> Preference "showFactors=on"           # { on | off } -> default:off
> Preference "optIn=off"                        # { on | off }
> Preference "optOut=off"                       # { on | off }
> Preference "whitelistThreshold=10"    # { Integer } -> default:10
> Preference "makeCorpus=off"           # { on | off } -> default:off
> Preference "storeFragments=off"               # { on | off } -> default:off
> Preference "localStore="              # { on | off } -> default:username  <---- ** okay to be blank? **
>
Yes


> Preference "processorBias=on"         # { on | off } -> default:on
> Preference "fallbackDomain=off"               # { on | off } -> default:off
> Preference "trainPristine=off"                # { on | off } -> default:off
> Preference "optOutClamAV=off"         # { on | off } -> default:off
> Preference "ignoreRBLLookups=off"     # { on | off } -> default:off
> Preference "RBLInoculate=off"         # { on | off } -> default:off
> AllowOverride enableBNR
> AllowOverride enableWhitelist
> AllowOverride fallbackDomain
> AllowOverride ignoreGroups
> AllowOverride ignoreRBLLookups
> AllowOverride localStore
> AllowOverride makeCorpus
> AllowOverride optIn
> AllowOverride optOut
> AllowOverride optOutClamAV
> AllowOverride processorBias
> AllowOverride RBLInoculate
> AllowOverride showFactors
> AllowOverride signatureLocation
> AllowOverride spamAction
> AllowOverride spamSubject
> AllowOverride statisticalSedation
> AllowOverride storeFragments
> AllowOverride tagNonspam
> AllowOverride tagSpam
> AllowOverride trainPristine
> AllowOverride trainingMode
> AllowOverride whitelistThreshold
> AllowOverride dailyQuarantineSummary
> MySQLServer           /var/mysql/mysql.sock
> MySQLUser             *
> MySQLPass             *
> MySQLDb                       *
> MySQLCompress         false
> MySQLVirtualTable             dspam_virtual_uids
> MySQLVirtualUIDField          uid
> MySQLVirtualUsernameField     username
> MySQLUIDInSignature   on
> HashRecMax            98317
> HashAutoExtend                on  
> HashMaxExtents                0
> HashExtentSize                49157
> HashPctIncrease               10
> HashMaxSeek           10
> HashConnectionCache   10
> Notifications off
> PurgeSignatures 14    # Stale signatures
> PurgeNeutral  90      # Tokens with neutralish probabilities
> PurgeUnused   90      # Unused tokens
> PurgeHapaxes  30      # Tokens with less than 5 hits (hapaxes)
> PurgeHits1S   15      # Tokens with only 1 spam hit
> PurgeHits1I   15      # Tokens with only 1 innocent hit
> LocalMX 127.0.0.1
> SystemLog     on
> UserLog               on
> Opt out
> ParseToHeaders on
> ChangeModeOnParse on
> ChangeUserOnParse full
> ServerPID             /var/run/dspam.pid
> ServerParameters      "--deliver=innocent,spam"
> ServerIdent           "localhost.local"
> ProcessorURLContext on
> ProcessorBias on
> StripRcptDomain off
> 
What MTA are you using? Postfix? If so, could you post your master.cf and your 
main.cf?
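
Before posting the files, a quick scan can show whether the MTA configuration 
mentions DSPAM anywhere. A rough Python sketch; the sample text is a made-up 
fragment for illustration, lines_mentioning_dspam is a hypothetical helper, and 
a match only shows that DSPAM is referenced, not that the routing is correct:

```python
def lines_mentioning_dspam(config_text):
    """Return (line_number, line) for non-comment lines that mention dspam."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and full-line comments.
        if not stripped or stripped.startswith("#"):
            continue
        if "dspam" in stripped.lower():
            hits.append((number, stripped))
    return hits


# Hypothetical config fragment, not taken from the poster's system.
sample_main_cf = """\
# hand mail to dspam later
myhostname = mail.example.com
mailbox_command = /usr/bin/procmail
"""
print(lines_mentioning_dspam(sample_main_cf))
```

An empty result for both files would fit the suspicion that mail is delivered 
without passing through DSPAM.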


-- 
Kind Regards from Switzerland,

Stevan Bajić

_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user