On 16/08/2012 21:49, Stevan Bajić wrote:
Hello Christophe,
Hello Stevan, thanks for answering, I really appreciate it.
Spam AND Ham? Really?
Oh yes. And now I did the same for another user:

tmp # dspam_train [email protected] spam ham
dspam # dspam_stats [email protected]
TP: 0 TN: 939 FP: 0 FN: 441 SC: 0 NC: 0
dspam # dspam_stats [email protected]
TP: 0 TN: 5139 FP: 0 FN: 4868 SC: 0 NC: 0

So you have here 5'139 messages that got classified as HAM ([T]rue [N]egative) and 4'868 messages that got falsely classified as HAM ([F]alse [N]egative). This is very, very strange. How can DSPAM end up with just TN and FN counts after processing about 10K messages, and not a single TP or FP?

Can I make a guess? You are using sbph as Tokenizer.
Nice try but it's osb. ;)
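For readers following the numbers, here is a minimal sketch of how the counters quoted above add up. It assumes the usual confusion-matrix reading that Stevan describes (TN = ham classified as ham, FN = spam classified as ham); the helper names `parse_dspam_stats` and `summarize` are hypothetical and not part of DSPAM.

```python
import re


def parse_dspam_stats(line):
    """Turn a 'TP: 0 TN: 5139 ...' line into a dict of integer counters."""
    return {key: int(value) for key, value in re.findall(r"\b([A-Z]{2}):\s*(\d+)", line)}


def summarize(stats):
    """Derive totals, assuming spam seen = TP + FN and ham seen = TN + FP."""
    spam_seen = stats["TP"] + stats["FN"]
    ham_seen = stats["TN"] + stats["FP"]
    return {
        "total": spam_seen + ham_seen,
        "spam_seen": spam_seen,
        "ham_seen": ham_seen,
        # Share of spam that was actually caught; None if no spam was seen.
        "spam_caught_ratio": stats["TP"] / spam_seen if spam_seen else None,
    }


# Counter line copied from the dspam_stats output in this thread.
line = "TP: 0 TN: 5139 FP: 0 FN: 4868 SC: 0 NC: 0"
stats = parse_dspam_stats(line)
summary = summarize(stats)
print(summary)
```

Under that reading, every one of the 4868 spam messages was classified as ham, which is why the result looks so odd.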
Something is fishy on your setup. Can you please post your dspam.conf?
Yeah sure, here it is:

dspam # egrep -v "^#.*|^$" /etc/dspam/dspam.conf
Home /var/spool/dspam
StorageDriver /usr/lib/x86_64-linux-gnu/dspam/libpgsql_drv.so
TrustedDeliveryAgent "/usr/bin/procmail" # Linux
UntrustedDeliveryAgent "/usr/bin/procmail -d %u"
DeliveryHost 127.0.0.1
DeliveryPort 10034
DeliveryIdent localhost
DeliveryProto SMTP
EnablePlusedDetail on
OnFail error
Trust root
Trust dspam
Trusr postfix
Trust www-data
Trust mail
Trust daemon
Trust amavis
TrainingMode teft
TestConditionalTraining on
Feature noise
Feature whitelist
Feature tb=5
Algorithm graham burton
Tokenizer osb
PValue bcr
WebStats on
ImprobabilityDrive on
Preference "trainingMode=TEFT" # { TOE | TUM | TEFT | NOTRAIN } -> default:teft
Preference "spamAction=tag" # { quarantine | tag | deliver } -> default:quarantine
Preference "spamSubject=[SPAM]" # { string } -> default:[SPAM]
Preference "statisticalSedation=5" # { 0 - 10 } -> default:0
Preference "enableBNR=on" # { on | off } -> default:off
Preference "enableWhitelist=on" # { on | off } -> default:on
Preference "signatureLocation=header" # { message | headers } -> default:message
Preference "tagSpam=on" # { on | off }
Preference "tagNonspam=off" # { on | off }
Preference "showFactors=on" # { on | off } -> default:off
Preference "optIn=off" # { on | off }
Preference "optOut=on" # { on | off }
Preference "whitelistThreshold=20" # { Integer } -> default:10
Preference "makeCorpus=off" # { on | off } -> default:off
Preference "storeFragments=off" # { on | off } -> default:off
Preference "localStore=" # { on | off } -> default:username
Preference "processorBias=on" # { on | off } -> default:on
Preference "fallbackDomain=off" # { on | off } -> default:off
Preference "trainPristine=off" # { on | off } -> default:off
Preference "optOutClamAV=off" # { on | off } -> default:off
Preference "ignoreRBLLookups=off" # { on | off } -> default:off
Preference "RBLInoculate=off" # { on | off } -> default:off
Preference "notifications=off" # { on | off } -> default:off
AllowOverride enableBNR
AllowOverride enableWhitelist
AllowOverride fallbackDomain
AllowOverride ignoreGroups
AllowOverride ignoreRBLLookups
AllowOverride localStore
AllowOverride makeCorpus
AllowOverride optIn
AllowOverride optOut
AllowOverride optOutClamAV
AllowOverride processorBias
AllowOverride RBLInoculate
AllowOverride showFactors
AllowOverride signatureLocation
AllowOverride spamAction
AllowOverride spamSubject
AllowOverride statisticalSedation
AllowOverride storeFragments
AllowOverride tagNonspam
AllowOverride tagSpam
AllowOverride trainPristine
AllowOverride trainingMode
AllowOverride whitelistThreshold
AllowOverride dailyQuarantineSummary
AllowOverride notifications
IgnoreHeader Accept-Language
IgnoreHeader Authentication-Results
IgnoreHeader Content-Type
IgnoreHeader DKIM-Signature
IgnoreHeader Date
IgnoreHeader DomainKey-Signature
IgnoreHeader Importance
IgnoreHeader In-Reply-To
IgnoreHeader List-Archive
IgnoreHeader List-Help
IgnoreHeader List-Id
IgnoreHeader List-Post
IgnoreHeader List-Subscribe
IgnoreHeader List-Unsubscribe
IgnoreHeader Message-ID
IgnoreHeader Message-Id
IgnoreHeader Organization
IgnoreHeader Received
IgnoreHeader Received-SPF
IgnoreHeader References
IgnoreHeader Reply-To
IgnoreHeader Resent-Date
IgnoreHeader Resent-From
IgnoreHeader Thread-Index
IgnoreHeader Thread-Topic
IgnoreHeader User-Agent
IgnoreHeader X-policyd-weight
IgnoreHeader thread-index
PurgeSignature off # Specified in purge.sql
PurgeNeutral 90
PurgeUnused off # Specified in purge.sql
PurgeHapaxes off # Specified in purge.sql
PurgeHits1S off # Specified in purge.sql
PurgeHits1I off # Specified in purge.sql
LocalMX 127.0.0.1
SystemLog on
UserLog on
Opt in
ParseToHeaders on
ChangeModeOnParse on
ChangeUserOnParse full
MaxMessageSize 26214400
ServerHost 127.0.0.1
ServerPort 10033
ServerQueueSize 32
ServerPID /var/run/dspam/dspam.pid
ServerMode auto
ServerParameters "--deliver=innocent -d %u"
ServerIdent "dspam.garault"
ProcessorURLContext on
ProcessorBias on
StripRcptDomain off
Include /etc/dspam/dspam.d/
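One line in the listing above reads `Trusr postfix`, which may be a typo for `Trust`. A minimal sketch of how such near-miss directive names could be spotted follows. It only compares names that occur once against names that occur several times in the same file; it does not know DSPAM's real list of directives, and the function names are hypothetical.

```python
import difflib
from collections import Counter


def directive_names(conf_text):
    """Return the first word of every non-empty, non-comment line."""
    names = []
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line:
            names.append(line.split()[0])
    return names


def suspicious_directives(conf_text, cutoff=0.75):
    """Map one-off directive names to a similar name that is used repeatedly."""
    counts = Counter(directive_names(conf_text))
    repeated = [name for name, count in counts.items() if count > 1]
    flagged = {}
    for name, count in counts.items():
        if count == 1:
            close = difflib.get_close_matches(name, repeated, n=1, cutoff=cutoff)
            if close:
                flagged[name] = close[0]
    return flagged


# A few lines copied from the configuration posted in this thread.
sample = """\
Trust root
Trust dspam
Trusr postfix
Trust www-data
TrainingMode teft
Tokenizer osb
"""
print(suspicious_directives(sample))
```

A hit from this heuristic is only a hint to look at the line by hand; it cannot tell whether the directive is actually invalid.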
I have now more than 4 million lines in dspam_token_data for this user (me).

This is a lot. Just for slightly more than 10K messages?

Strange thing is I don't seem to have 10K messages despite the fact they were given to dspam_train:
dspam=# select count(*) from dspam_signature_data;
 count
-------
  5860

dspam=# select count(*) from dspam_token_data;
  count
---------
 4594613
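To put "this is a lot" into numbers, here is a small worked calculation using only the figures posted in this thread. Note that the two queries as posted have no per-user filter, so the ratios below assume all rows belong to the one user; the function name is hypothetical.

```python
def rows_per_unit(token_rows, units):
    """Average number of token rows per unit (signature or message)."""
    return token_rows / units


token_rows = 4_594_613   # count(*) from dspam_token_data
signatures = 5_860       # count(*) from dspam_signature_data
messages = 5_139 + 4_868  # TN + FN from dspam_stats

tokens_per_signature = rows_per_unit(token_rows, signatures)
tokens_per_message = rows_per_unit(token_rows, messages)

print(f"token rows per stored signature: {tokens_per_signature:.1f}")
print(f"token rows per processed message: {tokens_per_message:.1f}")
```

Either way, that is several hundred distinct token rows for each message, and the signature count is well below the number of messages dspam_stats reports.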
What version of DSPAM is that?
dspam # dspam --version
DSPAM Anti-Spam Suite 3.10.1 (agent/library)
Copyright (C) 2002-2011 DSPAM Project
http://dspam.sourceforge.net.
DSPAM may be copied only under the terms of the GNU Affero General Public License, a copy of which can be found with the DSPAM distribution kit.

Configuration parameters: '--prefix=/usr' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--libexecdir=${prefix}/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--build=x86_64-linux-gnu' '--host=x86_64-linux-gnu' '--sysconfdir=/etc/dspam' '--disable-dependency-tracking' '--enable-split-configuration' '--enable-static' '--enable-external-lookup' '--enable-syslog' '--with-logdir=/var/log/dspam/' '--with-dspam-home=/var/spool/dspam' '--enable-domain-scale' '--with-delivery-agent=/usr/bin/procmail' '--enable-daemon' '--with-mysql-includes=/usr/include/mysql' '--with-pgsql-includes=/usr/include/postgresql' '--with-storage-driver=hash_drv,mysql_drv,pgsql_drv,sqlite3_drv' '--enable-debug' '--enable-virtual-users' '--enable-preferences-extension' '--enable-clamav' 'build_alias=x86_64-linux-gnu' 'host_alias=x86_64-linux-gnu' 'CFLAGS=-g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security' 'LDFLAGS=-Wl,-z,relro -Wl,-z,defs -Wl,--as-needed' 'CPPFLAGS=-D_FORTIFY_SOURCE=2'
And again, thanks for your help Stevan.
--
"The trouble with quotes on the Internet is that it is hard to know whether they are authentic." -- Napoléon Bonaparte.
_______________________________________________ Dspam-user mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/dspam-user
