Hello again,

and thank you all very much for helping me with my problem. I'm really
happy to have support on this one. But sadly my configuration still
doesn't work as planned. I'm going to provide more information now so
hopefully someone can find the mistake I made here. Sorry in advance
for this long e-mail!

On 2012/9/6, Stevan Bajić wrote:
> uhhh... TEFT is probably the problem. Switching to TOE (or TUM) would be
> better. But just switching will not solve your issue you have. You would
> really need to switch the mode and start from scratch again (cleaning
> all tokens and all statistical data too).

I just cleared all the tokens and statistical data. I started from
scratch as you said with an empty database.

> Instruct the anti-spam plugin from dovecot to not retrain by using the
> source 'error' but by using the source 'inoculation'. Inoculations are
> heavier weighted than normal trainings.

Thank you for the advice. I now have the following configuration
entries (listing only the core ones that I changed):

Dspam configuration:
----- dspam.conf -----

TrainingMode toe
Feature whitelist noise
Algorithm graham burton
Tokenizer osb
PValue bcr

----------------------

Dovecot configuration:
----- 90-plugin.conf -----

plugin {
antispam_spam = SPAM
antispam_trash_pattern_ignorecase = TRASH;DELETED
antispam_backend = dspam
antispam_dspam_binary = /services/dspam/bin/dspamc
antispam_dspam_args = --source=inoculation;--signature=%%s
antispam_dspam_spam = --class=spam
antispam_dspam_notspam = --class=innocent
antispam_signature = X-DSPAM-Signature
}

--------------------------
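
As a rough illustration of what these settings should produce on a
retrain, here is a minimal Python sketch. It assumes the plugin splits
antispam_dspam_args on ';' and substitutes the message signature for
the placeholder (written as %%s in the Dovecot config); the helper
name build_dspam_argv is hypothetical and not part of the plugin.

```python
def build_dspam_argv(binary, args, class_arg, signature):
    """Hypothetical helper: expand antispam_dspam_args for one retrain call."""
    argv = [binary]
    for part in args.split(";"):
        # Assumption: the placeholder may appear as %%s (config) or %s.
        part = part.replace("%%s", signature).replace("%s", signature)
        argv.append(part)
    # The class argument comes from antispam_dspam_spam / antispam_dspam_notspam.
    argv.append(class_arg)
    return argv


argv = build_dspam_argv(
    "/services/dspam/bin/dspamc",
    "--source=inoculation;--signature=%%s",
    "--class=spam",
    "5049f0ac205073109318227",
)
print(" ".join(argv[1:]))
```

The printed arguments can be compared with the 'process mode' line
that DSPAM logs when a message is moved into the SPAM folder.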

But now every mail gets 'X-DSPAM-Result: Innocent', even though I'm
moving the messages into the SPAM folder and DSPAM is processing the
retraining. Maybe you can help me out here if I post some logging. This
is the DSPAM log after I sent a message containing a little spam text
to my account, with an empty DSPAM database:


20507: [09/07/2012 15:03:39] connection id 5 from 123.112.211.12.
20507: [09/07/2012 15:03:39] checking trusted user list for root(0)
20507: [09/07/2012 15:03:39] No QuarantineAgent option found. Using
standard quarantine.
20507: [09/07/2012 15:03:40] using database handle id 5
20507: [09/07/2012 15:03:40] handle locked
20507: [09/07/2012 15:03:40] DSPAM Instance Startup
20507: [09/07/2012 15:03:40] input args: dspam --deliver=innocent,spam
20507: [09/07/2012 15:03:40] pass-thru args:
20507: [09/07/2012 15:03:40] processing user hanno
20507: [09/07/2012 15:03:40] uid = 0, euid = 0, gid = 0, egid = 6
20507: [09/07/2012 15:03:40] loading preferences for user hanno
20507: [09/07/2012 15:03:40] default preferences empty. reverting to
dspam.conf preferences.
20507: [09/07/2012 15:03:40] Loading preferences from dspam.conf
20507: [09/07/2012 15:03:40] using /mail/.dspam/opt-in/hanno.dspam as path
20507: [09/07/2012 15:03:40] using /mail/.dspam/opt-out/hanno.nodspam as path
20507: [09/07/2012 15:03:40] sedation level set to: 0
20507: [09/07/2012 15:03:40] Loading 4 BNR patterns
20507: [09/07/2012 15:03:40] [graham] [0.400000] Staaten+und (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] Staaten+und (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [graham] [0.400000] zu+#+#+werden (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] zu+#+#+werden (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [graham] [0.400000] Jahre+#+auf (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] Jahre+#+auf (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [graham] [0.400000]
Subject*Geld+verdienen (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000]
Subject*Geld+verdienen (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [graham] [0.400000] in+#+#+#+dank (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] in+#+#+#+dank (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [graham] [0.400000] 6+#+#+pro (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] 6+#+#+pro (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [graham] [0.400000] Kunden+#+#+#+Welt
(1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] Kunden+#+#+#+Welt
(1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [graham] [0.400000] die+#+#+mit (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] die+#+#+mit (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [graham] [0.400000] From*Hanno+#+hanno
(1frq, 2s, 1i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] From*Hanno+#+hanno
(1frq, 2s, 1i)
20507: [09/07/2012 15:03:40] [graham] [0.400000] Bremen+#+#+#+218 (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] Bremen+#+#+#+218 (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [graham] [0.400000] Agenten+#+Ihrem (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] Agenten+#+Ihrem (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [graham] [0.400000] Unsere+russische (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] Unsere+russische (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [graham] [0.400000] zu+verdienen (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] zu+verdienen (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [graham] [0.400000] pro+#+Hanno (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] pro+#+Hanno (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [graham] [0.400000] werden+#+#+ist (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] werden+#+#+ist (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] Weltboersen+#+als
(1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] bringt+#+#+zaehlt
(1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] mehr+#+7000 (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] die+#+auf (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] Gewinn+#+#+bringt
(1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] Laender+#+#+nach (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] Alle+#+#+das (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] verdienen+#+#+sagen
(1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] Gewinn+#+#+#+Zur (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] unseren+#+Das (1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] bestimmten+Kreisen
(1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] [burton] [0.400000] Hanno+#+Universitaet
(1frq, 0s, 0i)
20507: [09/07/2012 15:03:40] Graham-Bayesian Probability: 0.002278 Samples: 15
20507: [09/07/2012 15:03:40] Burton-Bayesian Probability: 0.000018 Samples: 27
20507: [09/07/2012 15:03:40] no factors specified; using default
20507: [09/07/2012 15:03:40] Result Confidence: 1.00
20507: [09/07/2012 15:03:40] total processing time: 0.03494s
20507: [09/07/2012 15:03:40] saving signature as 5049f0ac205073109318227
20507: [09/07/2012 15:03:40] libdspam returned probability of 0.002278
20507: [09/07/2012 15:03:40] message result: NOT SPAM
20507: [09/07/2012 15:03:40] delivering message
20507: [09/07/2012 15:03:40] Establishing connection to /var/run/dovecot/lmtp:0
20507: [09/07/2012 15:03:40] Connection established
20507: [09/07/2012 15:03:40] DSPAM Instance Shutdown.  Exit Code: 0
20507: [09/07/2012 15:03:40] checking trusted user list for root(0)


Okay, it is obvious that DSPAM returns the result 'NOT SPAM' because
my database is empty. Let's have a look at the header information that
DSPAM has added so that we can compare it later:


X-DSPAM-Result: Innocent
X-DSPAM-Processed: Fri Sep  7 15:03:40 2012
X-DSPAM-Confidence: 1.0000
X-DSPAM-Improbability: 1 in 98689409 chance of being spam
X-DSPAM-Probability: 0.0023
X-DSPAM-Signature: 5049f0ac205073109318227


So now I'm moving that message out of my 'Inbox' right into the
folder 'SPAM' to tell DSPAM that it made a mistake and should learn
from it. Future e-mails like this should be identified as spam.
Let's have a look at the DSPAM log again to see what happened:


20507: [09/07/2012 15:04:14] connection id 5 from 127.0.0.1.
20507: [09/07/2012 15:04:14] checking trusted user list for root(0)
20507: [09/07/2012 15:04:14] process mode: '--source=inoculation
--signature=5049f0ac205073109318227 --class=spam '
20507: [09/07/2012 15:04:14] No QuarantineAgent option found. Using
standard quarantine.
20507: [09/07/2012 15:04:14] using database handle id 5
20507: [09/07/2012 15:04:14] handle locked
20507: [09/07/2012 15:04:14] DSPAM Instance Startup
20507: [09/07/2012 15:04:14] input args: dspam --source=inoculation
--signature=5049f0ac205073109318227 --class=spam
20507: [09/07/2012 15:04:14] pass-thru args:
20507: [09/07/2012 15:04:14] processing user hanno
20507: [09/07/2012 15:04:14] uid = 0, euid = 0, gid = 0, egid = 6
20507: [09/07/2012 15:04:14] loading preferences for user hanno
20507: [09/07/2012 15:04:14] default preferences empty. reverting to
dspam.conf preferences.
20507: [09/07/2012 15:04:14] Loading preferences from dspam.conf
20507: [09/07/2012 15:04:14] using /mail/.dspam/opt-in/hanno.dspam as path
20507: [09/07/2012 15:04:14] using /mail/.dspam/opt-out/hanno.nodspam as path
20507: [09/07/2012 15:04:14] sedation level set to: 0
20507: [09/07/2012 15:04:14] loading preferences for user hanno
20507: [09/07/2012 15:04:14] default preferences empty. reverting to
dspam.conf preferences.
20507: [09/07/2012 15:04:14] Loading preferences from dspam.conf
20507: [09/07/2012 15:04:14] processing signature.  length: 6144
20507: [09/07/2012 15:04:14] Reversing 512 tokens
20507: [09/07/2012 15:04:14] Message classification/result: SPAM
20507: [09/07/2012 15:04:14] reclassifying iteration 1 result: 0
20507: [09/07/2012 15:04:14] libdspam returned probability of 1.000000
20507: [09/07/2012 15:04:14] message result: SPAM
20507: [09/07/2012 15:04:14] DSPAM Instance Shutdown.  Exit Code: 0
20507: [09/07/2012 15:04:14] checking trusted user list for root(0)


Okay, to me it looks like DSPAM recognized what I wanted. The mail has
the same signature and is now classified as spam. Am I right here? Now
I'm sending the exact same message again. DSPAM should then classify
it as spam or at least report a lower 'X-DSPAM-Confidence' than before.
But nope - it's going right into the Inbox. Let's have a look at the
log one more time:


20507: [09/07/2012 15:05:28] connection id 5 from 123.112.211.12.
20507: [09/07/2012 15:05:28] checking trusted user list for root(0)
20507: [09/07/2012 15:05:28] No QuarantineAgent option found. Using
standard quarantine.
20507: [09/07/2012 15:05:28] using database handle id 5
20507: [09/07/2012 15:05:28] handle locked
20507: [09/07/2012 15:05:28] DSPAM Instance Startup
20507: [09/07/2012 15:05:28] input args: dspam --deliver=innocent,spam
20507: [09/07/2012 15:05:28] pass-thru args:
20507: [09/07/2012 15:05:28] processing user hanno
20507: [09/07/2012 15:05:28] uid = 0, euid = 0, gid = 0, egid = 6
20507: [09/07/2012 15:05:28] loading preferences for user hanno
20507: [09/07/2012 15:05:28] default preferences empty. reverting to
dspam.conf preferences.
20507: [09/07/2012 15:05:28] Loading preferences from dspam.conf
20507: [09/07/2012 15:05:28] using /mail/.dspam/opt-in/hanno.dspam as path
20507: [09/07/2012 15:05:28] using /mail/.dspam/opt-out/hanno.nodspam as path
20507: [09/07/2012 15:05:28] sedation level set to: 0
20507: [09/07/2012 15:05:28] Loading 7 BNR patterns
20507: [09/07/2012 15:05:28] [graham] [0.333333] From*Hanno+#+hanno
(1frq, 3s, 2i)
20507: [09/07/2012 15:05:28] [burton] [0.333333] From*Hanno+#+hanno
(1frq, 3s, 2i)
20507: [09/07/2012 15:05:28] [graham] [0.333333]
From*Hanno+#+gmail.com (1frq, 3s, 2i)
20507: [09/07/2012 15:05:28] [burton] [0.333333]
From*Hanno+#+gmail.com (1frq, 3s, 2i)
20507: [09/07/2012 15:05:28] [graham] [0.333333] From*hanno+gmail.com
(1frq, 3s, 2i)
20507: [09/07/2012 15:05:28] [burton] [0.333333] From*hanno+gmail.com
(1frq, 3s, 2i)
20507: [09/07/2012 15:05:28] [graham] [0.333333]
From*Hanno+Hirschberger (1frq, 3s, 2i)
20507: [09/07/2012 15:05:28] [burton] [0.333333]
From*Hanno+Hirschberger (1frq, 3s, 2i)
20507: [09/07/2012 15:05:28] [graham] [0.333333]
From*Hirschberger+hanno (1frq, 3s, 2i)
20507: [09/07/2012 15:05:28] [burton] [0.333333]
From*Hirschberger+hanno (1frq, 3s, 2i)
20507: [09/07/2012 15:05:28] [graham] [0.333333]
From*Hanno+#+#+gmail.com (1frq, 3s, 2i)
20507: [09/07/2012 15:05:28] [burton] [0.333333]
From*Hanno+#+#+gmail.com (1frq, 3s, 2i)
20507: [09/07/2012 15:05:28] [graham] [0.333333]
To*hanno+mail-test.no-ip.de (1frq, 3s, 2i)
20507: [09/07/2012 15:05:28] [burton] [0.333333]
To*hanno+mail-test.no-ip.de (1frq, 3s, 2i)
20507: [09/07/2012 15:05:28] [graham] [0.500000] unseren+Kunden (2frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] unseren+Kunden (2frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] unseren+Kunden (2frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [graham] [0.500000] mehr+als (2frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] mehr+als (2frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] mehr+als (2frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [graham] [0.500000] in+#+#+#+dank (1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] in+#+#+#+dank (1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [graham] [0.500000] Kunden+#+#+#+Welt
(1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] Kunden+#+#+#+Welt
(1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [graham] [0.500000] Bremen+#+#+#+218 (1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] Bremen+#+#+#+218 (1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [graham] [0.500000] Gewinn+#+#+#+Zur (1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] Gewinn+#+#+#+Zur (1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [graham] [0.500000] Monat+#+#+#+Bremen
(1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] Monat+#+#+#+Bremen
(1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [graham] [0.500000] unsere+#+#+#+7000
(1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] unsere+#+#+#+7000
(1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] Firma+#+#+#+Agenten
(1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] Staaten+#+#+#+in (1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000]
spuerbaren+#+#+#+bringt (1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] 7000+#+#+#+ganzen
(1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000]
frueher+#+#+#+spuerbaren (1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] Zeit+#+#+#+mehr (1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] Unsere+#+#+#+spuert
(1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] Raum+#+#+#+28359 (1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000]
Kundenschaft+#+#+#+Kunden (1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] [burton] [0.500000] Unsere+#+#+#+Business
(1frq, 3s, 1i)
20507: [09/07/2012 15:05:28] Graham-Bayesian Probability: 0.007752 Samples: 15
20507: [09/07/2012 15:05:28] Burton-Bayesian Probability: 0.007752 Samples: 27
20507: [09/07/2012 15:05:28] no factors specified; using default
20507: [09/07/2012 15:05:28] Result Confidence: 1.00
20507: [09/07/2012 15:05:28] total processing time: 0.00832s
20507: [09/07/2012 15:05:28] saving signature as 5049f1182050724841763
20507: [09/07/2012 15:05:28] libdspam returned probability of 0.007752
20507: [09/07/2012 15:05:28] message result: NOT SPAM
20507: [09/07/2012 15:05:28] delivering message
20507: [09/07/2012 15:05:28] Establishing connection to /var/run/dovecot/lmtp:0
20507: [09/07/2012 15:05:28] Connection established
20507: [09/07/2012 15:05:28] DSPAM Instance Shutdown.  Exit Code: 0
20507: [09/07/2012 15:05:28] checking trusted user list for root(0)


And again I get the 'NOT SPAM' result as if nothing had happened. Even
the headers of that message are basically the same as those of the
first e-mail. I'm going to post them here for the record:


X-DSPAM-Result: Innocent
X-DSPAM-Processed: Fri Sep  7 15:05:28 2012
X-DSPAM-Confidence: 1.0000
X-DSPAM-Improbability: 1 in 98689409 chance of being spam
X-DSPAM-Probability: 0.0078
X-DSPAM-Signature: 5049f1182050724841763
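
As a sanity check on the numbers in the two delivery logs, here is a
minimal Python sketch. It assumes the logged 'Bayesian Probability' is
the usual Graham-style combination prod(p) / (prod(p) + prod(1 - p))
over the listed token values; this is an assumption about the formula,
not something taken from the DSPAM source. The token counts and values
are read off the logs above.

```python
from math import prod


def combined_probability(token_probs):
    """Graham-style naive Bayes combination of per-token spam probabilities."""
    spam = prod(token_probs)
    ham = prod(1.0 - p for p in token_probs)
    return spam / (spam + ham)


# First delivery: all 15 graham / 27 burton tokens are logged at 0.4.
first_graham = combined_probability([0.4] * 15)
first_burton = combined_probability([0.4] * 27)

# Second delivery: 7 From/To header tokens at 0.333333, the rest at 0.5.
second_graham = combined_probability([0.333333] * 7 + [0.5] * 8)
second_burton = combined_probability([0.333333] * 7 + [0.5] * 20)

print(f"{first_graham:.6f} {first_burton:.6f}")
print(f"{second_graham:.6f} {second_burton:.6f}")
```

Under this assumption the sketch reproduces the logged values
(0.002278 and 0.000018 for the first delivery, 0.007752 for both
algorithms on the second). Tokens at 0.5 do not move the result in
either direction, so in the second run the outcome would be decided
entirely by the seven From/To header tokens at 0.333333.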


Can someone see what I'm doing wrong here? I'm running into a dead
end. Help would be very much appreciated.

Kind regards,

Hanno

_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user
