On 16/08/2012 21:49, Stevan Bajić wrote:
Hello Christophe,
Hello Stevan, thanks for answering, I really appreciate it.
Spam AND Ham? Really?
Oh yes. And now I did the same for another user:

tmp # dspam_train t...@garault.org spam ham
dspam # dspam_stats t...@garault.org
TP: 0 TN: 939 FP: 0 FN: 441 SC: 0 NC: 0

dspam # dspam_stats christo...@garault.org
TP: 0 TN: 5139 FP: 0 FN: 4868 SC: 0 NC: 0

So you have here 5'139 messages that got classified as HAM ([T]rue [N]egative) and 4'868 messages that got falsely classified as HAM ([F]alse [N]egative). Somehow this is very, very strange. How can DSPAM end up with only TN and FN counts after processing about 10K messages, without a single TP or FP?

Can I make a guess? You are using sbph as Tokenizer.
Nice try but it's osb. ;)
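Going back to the counters quoted above, the arithmetic can be checked with a few lines of Python. This is only a sketch: `parse_stats` and `summarize` are made-up helper names, not part of DSPAM, and the code assumes the counters appear as `KEY: number` pairs exactly as pasted in this mail.

```python
import re


def parse_stats(line):
    """Extract the integer counters (TP, TN, FP, FN, ...) from one stats line."""
    return {key: int(val) for key, val in re.findall(r"\b([A-Z]{2}):\s*(\d+)", line)}


def summarize(counters):
    """Derive totals from the four classification counters."""
    total = counters["TP"] + counters["TN"] + counters["FP"] + counters["FN"]
    spam_seen = counters["TP"] + counters["FN"]  # messages that really were spam
    return {
        "total": total,
        "spam_seen": spam_seen,
        # Share of the real spam that was flagged as spam (None if no spam seen).
        "spam_caught_ratio": counters["TP"] / spam_seen if spam_seen else None,
    }


if __name__ == "__main__":
    for line in (
        "TP: 0 TN: 939 FP: 0 FN: 441 SC: 0 NC: 0",
        "TP: 0 TN: 5139 FP: 0 FN: 4868 SC: 0 NC: 0",
    ):
        print(summarize(parse_stats(line)))
```

For the two accounts this gives totals of 1380 and 10007 messages, and in both cases none of the spam fed to the trainer was counted as caught.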
Something is fishy with your setup. Can you please post your dspam.conf?
Yeah sure, here it is:

dspam # egrep -v "^#.*|^$" /etc/dspam/dspam.conf
Home /var/spool/dspam
StorageDriver /usr/lib/x86_64-linux-gnu/dspam/libpgsql_drv.so
TrustedDeliveryAgent "/usr/bin/procmail" # Linux
UntrustedDeliveryAgent "/usr/bin/procmail -d %u"
DeliveryHost 127.0.0.1
DeliveryPort 10034
DeliveryIdent localhost
DeliveryProto SMTP
EnablePlusedDetail on
OnFail error
Trust root
Trust dspam
Trusr postfix
Trust www-data
Trust mail
Trust daemon
Trust amavis
TrainingMode teft
TestConditionalTraining on
Feature noise
Feature whitelist
Feature tb=5
Algorithm graham burton
Tokenizer osb
PValue bcr
WebStats on
ImprobabilityDrive on
Preference "trainingMode=TEFT" # { TOE | TUM | TEFT | NOTRAIN } -> default:teft
Preference "spamAction=tag" # { quarantine | tag | deliver } -> default:quarantine
Preference "spamSubject=[SPAM]" # { string } -> default:[SPAM]
Preference "statisticalSedation=5" # { 0 - 10 } -> default:0
Preference "enableBNR=on" # { on | off } -> default:off
Preference "enableWhitelist=on" # { on | off } -> default:on
Preference "signatureLocation=header" # { message | headers } -> default:message
Preference "tagSpam=on" # { on | off }
Preference "tagNonspam=off" # { on | off }
Preference "showFactors=on" # { on | off } -> default:off
Preference "optIn=off" # { on | off }
Preference "optOut=on" # { on | off }
Preference "whitelistThreshold=20" # { Integer } -> default:10
Preference "makeCorpus=off" # { on | off } -> default:off
Preference "storeFragments=off" # { on | off } -> default:off
Preference "localStore=" # { on | off } -> default:username
Preference "processorBias=on" # { on | off } -> default:on
Preference "fallbackDomain=off" # { on | off } -> default:off
Preference "trainPristine=off" # { on | off } -> default:off
Preference "optOutClamAV=off" # { on | off } -> default:off
Preference "ignoreRBLLookups=off" # { on | off } -> default:off
Preference "RBLInoculate=off" # { on | off } -> default:off
Preference "notifications=off" # { on | off } -> default:off
AllowOverride enableBNR
AllowOverride enableWhitelist
AllowOverride fallbackDomain
AllowOverride ignoreGroups
AllowOverride ignoreRBLLookups
AllowOverride localStore
AllowOverride makeCorpus
AllowOverride optIn
AllowOverride optOut
AllowOverride optOutClamAV
AllowOverride processorBias
AllowOverride RBLInoculate
AllowOverride showFactors
AllowOverride signatureLocation
AllowOverride spamAction
AllowOverride spamSubject
AllowOverride statisticalSedation
AllowOverride storeFragments
AllowOverride tagNonspam
AllowOverride tagSpam
AllowOverride trainPristine
AllowOverride trainingMode
AllowOverride whitelistThreshold
AllowOverride dailyQuarantineSummary
AllowOverride notifications
IgnoreHeader Accept-Language
IgnoreHeader Authentication-Results
IgnoreHeader Content-Type
IgnoreHeader DKIM-Signature
IgnoreHeader Date
IgnoreHeader DomainKey-Signature
IgnoreHeader Importance
IgnoreHeader In-Reply-To
IgnoreHeader List-Archive
IgnoreHeader List-Help
IgnoreHeader List-Id
IgnoreHeader List-Post
IgnoreHeader List-Subscribe
IgnoreHeader List-Unsubscribe
IgnoreHeader Message-ID
IgnoreHeader Message-Id
IgnoreHeader Organization
IgnoreHeader Received
IgnoreHeader Received-SPF
IgnoreHeader References
IgnoreHeader Reply-To
IgnoreHeader Resent-Date
IgnoreHeader Resent-From
IgnoreHeader Thread-Index
IgnoreHeader Thread-Topic
IgnoreHeader User-Agent
IgnoreHeader X-policyd-weight
IgnoreHeader thread-index
PurgeSignature off # Specified in purge.sql
PurgeNeutral 90
PurgeUnused off # Specified in purge.sql
PurgeHapaxes off # Specified in purge.sql
PurgeHits1S off # Specified in purge.sql
PurgeHits1I off # Specified in purge.sql
LocalMX 127.0.0.1
SystemLog on
UserLog on
Opt in
ParseToHeaders on
ChangeModeOnParse on
ChangeUserOnParse full
MaxMessageSize 26214400
ServerHost 127.0.0.1
ServerPort 10033
ServerQueueSize 32
ServerPID /var/run/dspam/dspam.pid
ServerMode auto
ServerParameters "--deliver=innocent -d %u"
ServerIdent "dspam.garault"
ProcessorURLContext on
ProcessorBias on
StripRcptDomain off
Include /etc/dspam/dspam.d/
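With a listing this long it is easy to miss a mistyped directive name, so here is a rough way to scan for one. It is a sketch, not a DSPAM tool: `suspicious_directives` is a hypothetical helper, and the heuristic (a name used only once that closely resembles a name used several times) is my own assumption about what a typo looks like. The sample lines are copied from the listing above, where it would flag `Trusr`. Whether that spelling is really in the file or only a slip in this mail would need checking.

```python
from collections import Counter
import difflib


def suspicious_directives(lines, cutoff=0.75):
    """Return (odd_name, likely_intended) pairs for possible directive typos."""
    names = [
        ln.split()[0]
        for ln in lines
        if ln.strip() and not ln.lstrip().startswith("#")
    ]
    counts = Counter(names)
    # Names that occur several times are treated as known-good spellings.
    frequent = [name for name, n in counts.items() if n > 1]
    findings = []
    for name, n in counts.items():
        if n != 1:
            continue
        close = difflib.get_close_matches(name, frequent, n=1, cutoff=cutoff)
        if close:
            findings.append((name, close[0]))
    return findings


if __name__ == "__main__":
    sample = [
        "Trust root",
        "Trust dspam",
        "Trusr postfix",
        "Trust www-data",
        "TrainingMode teft",
        "Tokenizer osb",
    ]
    for odd, intended in suspicious_directives(sample):
        print(f"{odd!r} looks like a misspelling of {intended!r}")
```

The check only compares names inside the file itself, so it needs no list of valid DSPAM directives, but it also cannot catch a misspelled directive that has no repeated sibling.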
I have now more than 4 million lines in dspam_token_data for this user (me).

This is a lot. Just for slightly more than 10K messages?

The strange thing is that I don't seem to have 10K messages, despite the fact that they were all given to dspam_train:
dspam=# select count(*) from dspam_signature_data;
 count
-------
  5860

dspam=# select count(*) from dspam_token_data;
  count
---------
 4594613
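To put those two counts in proportion to the dspam_stats counters, here is a small calculation. It is only a sketch: `ratios` is a made-up helper, and it assumes both row counts belong to the one account, which the queries as shown do not guarantee since they have no filter on the user.

```python
def ratios(tokens, signatures, messages):
    """Relate table row counts to the number of messages counted by dspam_stats."""
    return {
        "tokens_per_signature": tokens / signatures,
        "tokens_per_message": tokens / messages,
        "signatures_per_message": signatures / messages,
    }


if __name__ == "__main__":
    result = ratios(
        tokens=4594613,          # rows in dspam_token_data
        signatures=5860,         # rows in dspam_signature_data
        messages=5139 + 4868,    # TN + FN reported by dspam_stats
    )
    for name, value in result.items():
        print(f"{name}: {value:.2f}")
```

That works out to roughly 784 token rows per stored signature, and there are stored signatures for only a little over half of the counted messages.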
What version of DSPAM is that?
dspam # dspam --version

DSPAM Anti-Spam Suite 3.10.1 (agent/library)

Copyright (C) 2002-2011 DSPAM Project
http://dspam.sourceforge.net.

DSPAM may be copied only under the terms of the GNU Affero General Public License, a copy of which can be found with the DSPAM distribution kit.

Configuration parameters: '--prefix=/usr' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--libexecdir=${prefix}/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--build=x86_64-linux-gnu' '--host=x86_64-linux-gnu' '--sysconfdir=/etc/dspam' '--disable-dependency-tracking' '--enable-split-configuration' '--enable-static' '--enable-external-lookup' '--enable-syslog' '--with-logdir=/var/log/dspam/' '--with-dspam-home=/var/spool/dspam' '--enable-domain-scale' '--with-delivery-agent=/usr/bin/procmail' '--enable-daemon' '--with-mysql-includes=/usr/include/mysql' '--with-pgsql-includes=/usr/include/postgresql' '--with-storage-driver=hash_drv,mysql_drv,pgsql_drv,sqlite3_drv' '--enable-debug' '--enable-virtual-users' '--enable-preferences-extension' '--enable-clamav' 'build_alias=x86_64-linux-gnu' 'host_alias=x86_64-linux-gnu' 'CFLAGS=-g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security' 'LDFLAGS=-Wl,-z,relro -Wl,-z,defs -Wl,--as-needed' 'CPPFLAGS=-D_FORTIFY_SOURCE=2'
And again, thanks for your help, Stevan.

-- 
"The trouble with quotations on the Internet is that it is hard to know whether they are authentic." -- Napoléon Bonaparte.
_______________________________________________ Dspam-user mailing list Dspam-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dspam-user