Hi Tom,

Which DSPAM version are you using? How do you train? Which tokenizer did you use during training, and which one do you use now? DSPAM is very sensitive to training: if you don't train well, or if you train too much, you may run into trouble.

Also, there are many headers you should ignore. You can get the list from:
http://sourceforge.net/apps/mediawiki/dspam/index.php?title=Working_DSPAM%2BPOSTFIX%2BMYSQL%2BCLAMAV_Setup_by_PaulC
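
As a rough sketch of what ignoring headers looks like in dspam.conf, assuming the IgnoreHeader directive with one header name per line: X-AntiAbuse is the header from your message, and X-Spam-Status is only an illustrative example, so take the real list from the wiki page above.

```
# Sketch only: header names other than X-AntiAbuse are illustrative assumptions.
# Each IgnoreHeader line names one header that should not be tokenized.
IgnoreHeader X-AntiAbuse
IgnoreHeader X-Spam-Status
```

Tokens that were already learned from such a header stay in the database, so the change only affects messages processed afterwards.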

Also, if you uploaded your spam/ham corpus from Windows to Unix/Linux, you should strip the carriage returns (^M) from the messages by adding the following line to dspam.conf. I had this problem before; in that case DSPAM was only looking at the headers for classification.

#Specifying 'lineStripping' causes DSPAM to strip ^M's from messages passed
# in.
Broken lineStripping

If you have the same problem, you may have to retrain your DSPAM data.

Thanks.

On Fri, Apr 22, 2011 at 9:17 AM, Tom Hendrikx <t...@whyscream.net> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi,
>
> In my current setup I just received my first FP. Dspam is setup to add
> the dspam-factors header to classified e-mails, but after reviewing the
> data, I don't understand why dspam decided to classify the message as
> spam. Also the X-DSPAM-Improbability header has weird contents.
>
> Does the dspam_factors header contain all of the tokens used to classify
> the message, or only a subset of them? Because the headers in the FP
> message do not explain why it happens:
>
> X-DSPAM-Result: Spam
> X-DSPAM-Processed: Fri Apr 22 01:01:29 2011
> X-DSPAM-Confidence: 0.9963
> X-DSPAM-Improbability: 1 in 26939 chance of being ham
> X-DSPAM-Probability: 1.0000
> X-DSPAM-Signature: 1,4db0b74991741873512032
> X-DSPAM-Factors: 15,
>     X-AntiAbuse*Original+#+-, 0.99649,
>     X-AntiAbuse*Caller+#+GID, 0.99649,
>     X-AntiAbuse*Sender+#+Domain, 0.99649,
>     X-AntiAbuse*please+#+it, 0.99649,
>     X-AntiAbuse*with+#+#+report, 0.99649,
>     X-AntiAbuse*to+#+abuse, 0.99649,
>     X-AntiAbuse*Primary+#+-, 0.99649,
>     X-AntiAbuse*Original+Domain, 0.99649,
>     X-AntiAbuse*GID+-, 0.99649,
>     X-AntiAbuse*Sender+#+#+-, 0.99649,
>     X-AntiAbuse*track+abuse, 0.99649,
>     X-AntiAbuse*header+was, 0.99649,
>     X-AntiAbuse*header+#+#+#+track, 0.99649,
>     X-AntiAbuse*was+#+to, 0.99649,
>     X-AntiAbuse*Originator+Caller, 0.99649
>
> According to the scoring of the listed tokens, I think this message
> should be marked as ham, not as spam.
> Relevant values from dspam.conf:
>
> TrainingMode teft
> ImprobabilityDrive on
> Algorithm graham burton
> Tokenizer osb
> PValue bcr
>
> All of the above with a git tip checkout from 2011-03-01.
>
> Kind regards,
>
> Tom
>
>
> FWIW: I added the X-AntiAbuse header to the IgnoreHeader list after
> reviewing this message, because I concluded that the header is pretty
> useless for classification.
>
>
> - --
> New PGP key: 7D54EFF5
> Fingerprint: C26F 374F 5E13 157B 5B42 7A1B 93DF 319D 7D54 EFF5
> http://www.whyscream.net/key-transition-2011-03-30.txt.asc
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.10 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iQIcBAEBAgAGBQJNsTmNAAoJEJPfMZ19VO/1GPkP/RRPmcjm+GodpcVhTQH2HzX2
> nVJlZKpVedc6O+NHd79++wFD6xQ4O/+58r4KmV3w1IuVp+VJ105sAiaslnYZDNzq
> i4/6gZgUZtb2UOTyQCFsJekiXWjsPc2mTLvHFDuDtHEPNlKB2XKexfSP1wAiq3Xx
> DE/Uxp9OjrmVa3pB9632l+YOOmzno/x6P975hr34ToULBlm2Vsqq0Z7x8OjZfMD3
> 78MlKo5YiY9yNnJoY8OZPj8MXu5EtRRHcotkc3vZ4QfofCLIKFWzC8YXQ9arzhJy
> HEdSdcHR7s91z+/tSfiDfXy3cSff7Qwanvi7HBm4+zWT9+EAX2Y3nGvb097ymmhz
> 3lLPYlgDWDfxXIkmScGINHyXrTr91tp7YgsnrV8/GbVoW2HLoa83cS/im/GfkDoZ
> Kmy0OmFc65Apv8S4kl5FYdA4bWemIHlcLaLZjX2zNVm3JYzg5Eatb8N63j//4nO7
> 9fAZjpY5/j9oLTs60L/uPwhqgqFZWJebCf1rQcPDMSAjzO9kBrXG0v4bT/dbAd5E
> KXuoVhxY1VsIh+agc+92dsufdeVO344hZpUtPqwWsfhb6/OvI9gyRuSiqyAznZD3
> 5KPGuO05yVmwvrBAdNiTah3uHsLh5UAf3Dk12TE3LKQfx443Fh5gZg1P9XWj5xfO
> kE3slZqPktWcL6EKfZPS
> =hra9
> -----END PGP SIGNATURE-----
>
> ------------------------------------------------------------------------------
> Fulfilling the Lean Software Promise
> Lean software platforms are now widely adopted and the benefits have been
> demonstrated beyond question. Learn why your peers are replacing JEE
> containers with lightweight application servers - and what you can gain
> from the move.
> http://p.sf.net/sfu/vmware-sfemails
> _______________________________________________
> Dspam-user mailing list
> Dspam-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dspam-user