On Dec 28, 2006, at 8:22 AM, Rob Rosenfeld wrote:

Chris Ryland wrote:
On Dec 27, 2006, at 7:37 PM, Rob Rosenfeld wrote:
Stefan M. Huber wrote:
Hi!
I have been searching the archives but this particular problem doesn't seem to be covered anywhere. I have DSpam 2.6.8 (the latest stable release) on a Debian stable system. It works well apart from a certain type of spam. These are multipart messages with one HTML part and a JPEG attachment. No matter how often I retrain such messages as spam (and I've had lots of them), DSpam always labels them as
  X-DSPAM-Confidence: 0.9362
  X-DSPAM-Probability: 0.0000

Are you using "Feature noise"? I had this disabled and these always came through and always just over the threshold. After enabling it and allowing enough time for training to overcome my corpus, it's started catching it all.
Boy, if this works (I just turned it on), this should be shouted from the rooftops. What training mode are you using? (I'm using teft.) The bnr.nuclearelephant.com site suggests that "Feature noise" (BNR) may not work as well with teft as with other modes.

I'm using TEFT, but I haven't had BNR enabled that long.

In our case, in just a few days, with BNR turned on (in TEFT mode), DSPAM has gone from letting through a dozen or two (out of 1,000 spams per day) of these graphics/dictionary spams to letting through *none*, with no new false positives!

Now this could be related to the broken Asian internet cable (and thus greatly reduced spamming), but I'm pretty sure BNR has solved the problem.

Since DSPAM configuration is such a black art (at least to someone like me who hasn't done the theoretical study of how it all works), this is the kind of critical information we should be sharing on this list: how to fight certain spam trends with DSPAM configuration tuning.

Cheers!
--Chris Ryland / Em Software, Inc. / www.emsoftware.com
