On Fri, 9 Apr 2010 23:23:16 -0700 Terry Barnum <te...@dop.com> wrote:
>
> On Apr 9, 2010, at 7:21 PM, Stevan Bajić wrote:
>
> > On Fri, 9 Apr 2010 19:00:54 -0700
> > Terry Barnum <te...@dop.com> wrote:
> >
> >> I've been running DSPAM for approximately 2 weeks and looking at the
> >> output of dspam_stats, I'm curious how long training normally takes.
> >>
> >> A script is run nightly to check .Junk mailboxes for false negatives and
> >> .NotJunk mailboxes for false positives and retrains on error. (Richard
> >> Valk's http://switch.richard5.net/serverinstall/train.dspam)
> >>
> >> Here's sample output from dspam_stats -H
> >>
> >> x...@dop.com:
> >>     TP True Positives:              0
> >>     TN True Negatives:              19
> >>     FP False Positives:             0
> >>     FN False Negatives:             348
> >>     SC Spam Corpusfed:              0
> >>     NC Nonspam Corpusfed:           0
> >>     TL Training Left:               2481
> >>     SHR Spam Hit Rate               0.00%
> >>     HSR Ham Strike Rate:            0.00%
> >>     PPV Positive predictive value:  100.00%
> >>     OCA Overall Accuracy:           5.18%
> >>
> >> y...@dop.com:
> >>     TP True Positives:              0
> >>     TN True Negatives:              0
> >>     FP False Positives:             0
> >>     FN False Negatives:             3035
> >>     SC Spam Corpusfed:              0
> >>     NC Nonspam Corpusfed:           0
> >>     TL Training Left:               2500
> >>     SHR Spam Hit Rate               0.00%
> >>     HSR Ham Strike Rate:            100.00%
> >>     PPV Positive predictive value:  100.00%
> >>     OCA Overall Accuracy:           0.00%
> >>
> >> z...@dop.com:
> >>     TP True Positives:              0
> >>     TN True Negatives:              0
> >>     FP False Positives:             0
> >>     FN False Negatives:             358
> >>     SC Spam Corpusfed:              0
> >>     NC Nonspam Corpusfed:           0
> >>     TL Training Left:               2500
> >>     SHR Spam Hit Rate               0.00%
> >>     HSR Ham Strike Rate:            100.00%
> >>     PPV Positive predictive value:  100.00%
> >>     OCA Overall Accuracy:           0.00%
> >>
> >> te...@dop.com:
> >>     TP True Positives:              0
> >>     TN True Negatives:              3
> >>     FP False Positives:             0
> >>     FN False Negatives:             5108
> >>     SC Spam Corpusfed:              0
> >>     NC Nonspam Corpusfed:           0
> >>     TL Training Left:               2497
> >>     SHR Spam Hit Rate               0.00%
> >>     HSR Ham Strike Rate:            0.00%
> >>     PPV Positive predictive value:  100.00%
> >>     OCA Overall Accuracy:           0.09%
> >>
> > This all looks to me that you are not using DSPAM at all. Seems to me that
> > only the script from http://switch.richard5.net/serverinstall/train.dspam
> > is feeding DSPAM with data in your setup.
>
> Thank you for your help Stevan. My understanding of how this is supposed to
> eventually work is DSPAM analyzes and adds a header to email as Innocent or
> Spam and the MUA, which is configured to trust the Spam header, moves mail
> into the Junk mailbox if DSPAM classified it as Spam. The MUA has its own
> Junk filtering and moves mail it considers spam into the Junk mailbox too. So
> the nightly script may run across mail in the Junk mailbox that it
> mis-classified as Innocent but is actually spam and is retrained as a false
> negative. Conversely, if DSPAM incorrectly classifies mail as spam, the user
> moves that email from the Junk mailbox into the NotJunk mailbox so the
> nightly script can retrain as a false positive.
>

So it does basically what the Dovecot anti-spam plugin does. The plugin,
however, does it in real time, while the script you have there runs on a
schedule.

> DSPAM appears to be correctly adding headers but so far I've seen only
> Whitelisted and Innocent.
>

But how is it possible that TP and TN are 0 almost everywhere? If DSPAM were
working properly, TP or TN would increase every time you receive a message.

> >> Is so much "Training Left" normal? Do I have something misconfigured? Will
> >> DSPAM start tagging email as SPAM only after 2500 successfully classified
> >> emails?
> >>
> > No. DSPAM is fully functional from day one. The tagging can be turned
> > on/off inside dspam.conf or with the preference extension. However...
> > turning on/off the tagging has nothing to do with the training left number.
> >
> >
> >> $ dspam --version
> >>
> >> DSPAM Anti-Spam Suite 3.9.0 (agent/library)
> >>
> >> Copyright (c) 2002-2009 DSPAM Project
> >> http://dspam.sourceforge.net.
> >>
> >> DSPAM may be copied only under the terms of the GNU General Public License,
> >> a copy of which can be found with the DSPAM distribution kit.
> >>
> >> $ cat /usr/local/dspam.conf | grep -v ^# | grep -v ^$
> >>
> >> Home /usr/local/var/dspam
> >> StorageDriver /usr/local/lib/dspam/libmysql_drv.dylib
> >> TrustedDeliveryAgent "/usr/bin/procmail"
> >> DeliveryHost 127.0.0.1
> >> DeliveryPort 10026
> >> DeliveryIdent localhost
> >> DeliveryProto SMTP
> >> OnFail error
> >> Trust root
> >> Trust dspam
> >> Trust apache
> >> Trust mail
> >> Trust mailnull
> >> Trust smmsp
> >> Trust daemon
> >> Trust _dspam
> >> Trust _postfix
> >> Trust _www
> >> TrainingMode toe
> >> TestConditionalTraining on
> >> Feature whitelist
> >> Algorithm graham burton
> >> Tokenizer osb
> >> PValue bcr
> >> WebStats on
> >> Preference "trainingMode=TOE" # { TOE | TUM | TEFT | NOTRAIN } -> default:teft
> >> Preference "spamAction=tag" # { quarantine | tag | deliver } -> default:quarantine
> >> Preference "spamSubject=[SPAM]" # { string } -> default:[SPAM]
> >> Preference "statisticalSedation=5" # { 0 - 10 } -> default:0
> >> Preference "enableBNR=on" # { on | off } -> default:off
> >> Preference "enableWhitelist=on" # { on | off } -> default:on
> >> Preference "signatureLocation=headers" # { message | headers } -> default:message
> >> Preference "tagSpam=off" # { on | off }
> >> Preference "tagNonspam=off" # { on | off }
> >> Preference "showFactors=on" # { on | off } -> default:off
> >> Preference "optIn=off" # { on | off }
> >> Preference "optOut=off" # { on | off }
> >> Preference "whitelistThreshold=10" # { Integer } -> default:10
> >> Preference "makeCorpus=off" # { on | off } -> default:off
> >> Preference "storeFragments=off" # { on | off } -> default:off
> >> Preference "localStore=" # { on | off } -> default:username <---- ** okay to be blank? **
> >>
> > Yes
> >
> >
> >> Preference "processorBias=on" # { on | off } -> default:on
> >> Preference "fallbackDomain=off" # { on | off } -> default:off
> >> Preference "trainPristine=off" # { on | off } -> default:off
> >> Preference "optOutClamAV=off" # { on | off } -> default:off
> >> Preference "ignoreRBLLookups=off" # { on | off } -> default:off
> >> Preference "RBLInoculate=off" # { on | off } -> default:off
> >> AllowOverride enableBNR
> >> AllowOverride enableWhitelist
> >> AllowOverride fallbackDomain
> >> AllowOverride ignoreGroups
> >> AllowOverride ignoreRBLLookups
> >> AllowOverride localStore
> >> AllowOverride makeCorpus
> >> AllowOverride optIn
> >> AllowOverride optOut
> >> AllowOverride optOutClamAV
> >> AllowOverride processorBias
> >> AllowOverride RBLInoculate
> >> AllowOverride showFactors
> >> AllowOverride signatureLocation
> >> AllowOverride spamAction
> >> AllowOverride spamSubject
> >> AllowOverride statisticalSedation
> >> AllowOverride storeFragments
> >> AllowOverride tagNonspam
> >> AllowOverride tagSpam
> >> AllowOverride trainPristine
> >> AllowOverride trainingMode
> >> AllowOverride whitelistThreshold
> >> AllowOverride dailyQuarantineSummary
> >> MySQLServer /var/mysql/mysql.sock
> >> MySQLUser *
> >> MySQLPass *
> >> MySQLDb *
> >> MySQLCompress false
> >> MySQLVirtualTable dspam_virtual_uids
> >> MySQLVirtualUIDField uid
> >> MySQLVirtualUsernameField username
> >> MySQLUIDInSignature on
> >> HashRecMax 98317
> >> HashAutoExtend on
> >> HashMaxExtents 0
> >> HashExtentSize 49157
> >> HashPctIncrease 10
> >> HashMaxSeek 10
> >> HashConnectionCache 10
> >> Notifications off
> >> PurgeSignatures 14 # Stale signatures
> >> PurgeNeutral 90 # Tokens with neutralish probabilities
> >> PurgeUnused 90 # Unused tokens
> >> PurgeHapaxes 30 # Tokens with less than 5 hits (hapaxes)
> >> PurgeHits1S 15 # Tokens with only 1 spam hit
> >> PurgeHits1I 15 # Tokens with only 1 innocent hit
> >> LocalMX 127.0.0.1
> >> SystemLog on
> >> UserLog on
> >> Opt out
> >> ParseToHeaders on
> >> ChangeModeOnParse on
> >> ChangeUserOnParse full
> >> ServerPID /var/run/dspam.pid
> >> ServerParameters "--deliver=innocent,spam"
> >> ServerIdent "localhost.local"
> >> ProcessorURLContext on
> >> ProcessorBias on
> >> StripRcptDomain off
> >>
> > What MTA are you using? Postfix? If so could you post your master.conf and
> > your main.conf?
>
> Yes, postfix/dovecot/mysql with virtual users, postgrey, dspam and vacation.
>
> $ postconf -n
>
> broken_sasl_auth_clients = yes
> command_directory = /opt/local/sbin
> config_directory = /opt/local/etc/postfix
> daemon_directory = /opt/local/libexec/postfix
> data_directory = /opt/local/var/lib/postfix
> debug_peer_level = 2
> default_privs = nobody
> delay_warning_time = 4h
> home_mailbox = Maildir/
> html_directory = no
> mail_owner = _postfix
> mailq_path = /opt/local/bin/mailq
> manpage_directory = /opt/local/share/man
> mydestination = $myhostname, localhost.$mydomain, localhost
> myhostname = mailbox.dop.com
> mynetworks = 192.168.0.0/23, 127.0.0.0/8
> myorigin = $mydomain
> newaliases_path = /opt/local/bin/newaliases
> proxy_interfaces = 70.167.15.114
> queue_directory = /opt/local/var/spool/postfix
> readme_directory = /opt/local/share/postfix/readme
> sample_directory = /opt/local/share/postfix/sample
> sendmail_path = /opt/local/sbin/sendmail
> setgid_group = _postdrop
> smtpd_banner = $myhostname ESMTP $mail_name
> smtpd_helo_required = yes
> smtpd_helo_restrictions = permit_mynetworks, reject_non_fqdn_helo_hostname
> smtpd_recipient_restrictions = permit_mynetworks, permit_sasl_authenticated,
>     reject_non_fqdn_sender, reject_non_fqdn_recipient,
>     reject_unknown_sender_domain, reject_unknown_recipient_domain,
>     reject_unauth_pipelining, reject_unauth_destination,
>     reject_unlisted_recipient, check_helo_access
>     hash:/opt/local/etc/postfix/helo_checks, check_sender_access
>     hash:/opt/local/etc/postfix/access_sender, reject_rbl_client
>     zen.spamhaus.org, reject_rbl_client bl.spamcop.net, check_policy_service
>     inet:127.0.0.1:60000, check_client_access
>     pcre:/opt/local/etc/postfix/dspam_filter_access
>

Could you post the contents of that /opt/local/etc/postfix/dspam_filter_access
file?

> smtpd_reject_unlisted_sender = yes
> smtpd_sasl_auth_enable = yes
> smtpd_sasl_local_domain = $myhostname
> smtpd_sasl_path = private/auth
> smtpd_sasl_security_options = noanonymous
> smtpd_sasl_type = dovecot
> smtpd_sender_restrictions = permit_mynetworks, reject_unknown_address
> smtpd_tls_cert_file = /opt/local/etc/postfix/ssl/certs/postfix.cert
> smtpd_tls_key_file = /opt/local/etc/postfix/ssl/private/postfix.key
> smtpd_tls_loglevel = 1
> smtpd_tls_security_level = may
> tls_random_source = dev:/dev/urandom
> transport_maps = hash:/opt/local/etc/postfix/transport
> unknown_local_recipient_reject_code = 550
> virtual_alias_maps = proxy:mysql:/opt/local/etc/postfix/mysql_virtual_alias_maps.cf
> virtual_gid_maps = static:102
> virtual_mailbox_base = /xxxx/xxxx/xxxx/
> virtual_mailbox_domains = mysql:/opt/local/etc/postfix/mysql_virtual_mailbox_domains.cf
> virtual_mailbox_maps = proxy:mysql:/opt/local/etc/postfix/mysql_virtual_mailbox_maps.cf
> virtual_minimum_uid = 102
> virtual_transport = dovecot
> virtual_uid_maps = static:102
>
> $ cat master.cf | grep -v ^#
>
> smtp            inet  n  -  n  -      -   smtpd
> dspam           unix  -  n  n  -      10  pipe
>   flags=Ru user=_dspam argv=/usr/local/bin/dspam --deliver=innocent --user
>   ${recipient} -i -f $sender -- $recipient
> submission      inet  n  -  n  -      -   smtpd
>   -o smtpd_enforce_tls=yes
>   -o smtpd_tls_security_level=encrypt
>   -o smtpd_sasl_auth_enable=yes
>   -o smtpd_client_restrictions=permit_sasl_authenticated,reject
>   -o milter_macro_daemon_name=ORIGINATING
> pickup          fifo  n  -  n  60     1   pickup
> cleanup         unix  n  -  n  -      0   cleanup
> qmgr            fifo  n  -  n  300    1   qmgr
> tlsmgr          unix  -  -  n  1000?  1   tlsmgr
> rewrite         unix  -  -  n  -      -   trivial-rewrite
> bounce          unix  -  -  n  -      0   bounce
> defer           unix  -  -  n  -      0   bounce
> trace           unix  -  -  n  -      0   bounce
> verify          unix  -  -  n  -      1   verify
> flush           unix  n  -  n  1000?  0   flush
> proxymap        unix  -  -  n  -      -   proxymap
> proxywrite      unix  -  -  n  -      1   proxymap
> smtp            unix  -  -  n  -      -   smtp
> relay           unix  -  -  n  -      -   smtp
>   -o smtp_fallback_relay=
> showq           unix  n  -  n  -      -   showq
> error           unix  -  -  n  -      -   error
> retry           unix  -  -  n  -      -   error
> discard         unix  -  -  n  -      -   discard
> local           unix  -  n  n  -      -   local
> virtual         unix  -  n  n  -      -   virtual
> lmtp            unix  -  -  n  -      -   lmtp
> anvil           unix  -  -  n  -      1   anvil
> scache          unix  -  -  n  -      1   scache
> dovecot         unix  -  n  n  -      -   pipe
>   flags=DRhu user=_vmail argv=/opt/local/libexec/dovecot/deliver -f ${sender}
>   -d ${recipient}
> localhost:10026 inet  n  -  n  -      -   smtpd
>   -o content_filter=
>   -o receive_override_options=no_unknown_recipient_checks,no_header_body_checks,no_address_mappings
>   -o smtpd_helo_restrictions=
>   -o smtpd_client_restrictions=
>   -o smtpd_sender_restrictions=
>   -o smtpd_recipient_restrictions=permit_mynetworks,reject
>   -o mynetworks=127.0.0.0/8
>   -o smtpd_authorized_xforward_hosts=127.0.0.0/8
> vacation        unix  -  n  n  -      -   pipe
>   flags=Rq user=_vacation argv=/opt/local/var/spool/vacation/vacation.pl -f
>   ${sender} -- ${recipient}
>

Hmm... it looks to me like you are using FILTER to pass messages to DSPAM.
Right?

> Thanks,
> -Terry
>
>

-- 
Kind Regards from Switzerland,

Stevan Bajić

_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user