On Fri, March 20, 2009 12:42 pm, Chris Ryland wrote:
> But this might also make DSPAM too dependent on what SA thinks, no?
>
> I.e., maybe you'd be better off letting SA filter the truly obvious
> spam, and then letting DSPAM independently decide for the remaining
> emails.

I started out doing that, but I ended up with a lot of false positives.
This setup is more cumbersome, but it works better for me: I get about
99% accuracy with maybe 0.01% false positives. On balance I'd rather let
more spam through than get false positives.

--Yan

> On Mar 20, 2009, at 2:58 PM, Yan Seiner wrote:
>
>> On Fri, March 20, 2009 10:43 am, Chris Ryland wrote:
>>> Interesting--can you elaborate just a bit?
>>
>> OK, first mail passes through SA. I have it configured to only add
>> info in the X- headers.
>>
>> Then the mail passes through dspam. dspam uses the info in SA's X
>> headers as tokens in its decision. So your email has the following
>> headers:
>>
>> X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
>>   selene.seiner.lan
>> X-Spam-Level:
>> X-Spam-Status: No, score=-2.8 required=5.0 tests=AWL,BAYES_00,
>>   DNS_FROM_RFC_BOGUSMX autolearn=no version=3.2.5
>> X-DSPAM-Check: by www.seiner.com on Fri, 20 Mar 2009 11:08:38 -0700
>> X-DSPAM-Result: Innocent
>> X-DSPAM-Processed: Fri Mar 20 11:08:39 2009
>> X-DSPAM-Confidence: 0.9995
>> X-DSPAM-Probability: 0.0000
>> X-DSPAM-Signature: 49c3dba742621804284693
>> X-DSPAM-Factors: 27,
>>   Cc*lists.sourceforge.net, 0.00010,
>>   wrote+>>, 0.00010,
>>   On+Fri, 0.00010,
>>   Subject*user], 0.00010,
>>   as+>, 0.00011,
>>   >>+>>, 0.00013,
>>   wrote+>, 0.00015,
>>   >+On, 0.00017,
>>   Cc*user, 0.00021,
>>   the+>, 0.00022,
>>   References*mail.gmail.com>, 0.00023,
>>   References*mail.gmail.com>, 0.00023,
>>   same+>, 0.00024,
>>   Cc*user+lists.sourceforge.net, 0.00024,
>>   >+I, 0.00026,
>>   >+>, 0.00026,
>>   >+>, 0.00026,
>>   X-Mailer*Mail+(2.930.3), 0.00048,
>>   X-Mailer*(2.930.3), 0.00048,
>>   Mime-Version*v930.3), 0.00049,
>>   Mime-Version*framework+v930.3), 0.00049,
>>   References*www.datavault.us>, 0.00052,
>>   >+Yan, 0.00053,
>>   >>+Can, 0.00058,
>>   >+the, 0.00061,
>>   38+PM, 0.00067,
>>   From*Chris, 0.00092
>>
>> Now let's look at a piece of junk:
>>
>> X-Spam-Flag: YES
>> X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
>>   selene.seiner.lan
>> X-Spam-Level: ***********
>> X-Spam-Status: Yes, score=11.2 required=5.0 tests=AWL,BAYES_99,
>>   HTML_IMAGE_RATIO_04,HTML_MESSAGE,MIME_HTML_ONLY,RCVD_IN_XBL,URIBL_JP_SURBL,
>>   URIBL_RHS_DOB autolearn=no version=3.2.5
>> X-Spam-Report:
>>   * 1.5 URIBL_JP_SURBL Contains an URL listed in the JP SURBL blocklist
>>   *     [URIs: batiaceo.org]
>>   * 3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
>>   *     [score: 1.0000]
>>   * 0.2 HTML_IMAGE_RATIO_04 BODY: HTML has a low ratio of text to image area
>>   * 0.0 HTML_MESSAGE BODY: HTML included in message
>>   * 1.5 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
>>   * 3.0 RCVD_IN_XBL RBL: Received via a relay in Spamhaus XBL
>>   *     [64.18.137.4 listed in zen.spamhaus.org]
>>   * 1.1 URIBL_RHS_DOB Contains an URI of a new domain (Day Old Bread)
>>   *     [URIs: batiaceo.org]
>>   * 0.4 AWL AWL: From: address is in the auto white-list
>> X-DSPAM-Check: by www.seiner.com on Fri, 20 Mar 2009 11:42:38 -0700
>> X-DSPAM-Result: Spam
>> X-DSPAM-Processed: Fri Mar 20 11:42:39 2009
>> X-DSPAM-Confidence: 0.9997
>> X-DSPAM-Probability: 1.0000
>> X-DSPAM-Signature: 49c3e39f74883847820380
>> X-DSPAM-Factors: 15,
>>   X-Spam-Report*[URIs, 0.99990,
>>   X-Spam-Report*URL, 0.99990,
>>   X-Spam-Report*URL+listed, 0.99990,
>>   X-Spam-Report*1.5+URIBL_JP_SURBL, 0.99990,
>>   X-Spam-Report*URIBL_JP_SURBL, 0.99990,
>>   X-Spam-Report*URI+of, 0.99990,
>>   X-Spam-Report*an+URI, 0.99990,
>>   jpg"/>, 0.99990,
>>   X-Spam-Report*the, 0.99990,
>>   X-Spam-Report*3.5, 0.99990,
>>   X-Spam-Report*the+JP, 0.99990,
>>   X-Spam-Report*URIBL_JP_SURBL+Contains, 0.99990,
>>   X-Spam-Report*3.5+BAYES_99, 0.99990,
>>   X-Spam-Report*MIME_HTML_ONLY, 0.99990,
>>   X-Spam-Report*RCVD_IN_XBL+RBL, 0.99990
>>
>> You can see that almost all the tokens dspam used came from the X-Spam
>> headers.
>>
>> --Yan
>>
>>> On Mar 20, 2009, at 1:38 PM, Yan Seiner wrote:
>>>
>>>> On Fri, March 20, 2009 9:55 am, Chris Ryland wrote:
>>>>> Very interesting, thanks.
>>>>>
>>>>> Can I ask what SpamAssassin adds to the mix?
>>>>
>>>> I use SA as input to dspam. It allows dspam to be more accurate as
>>>> the header tokens are nearly always the same.
>>>>
>>>> --
>>>> Yan Seiner, PE
>>>>
>>>> Support my bid for the 4J School Board
>>>> http://www.seiner.com
>>>
>>> Cheers!
>>> --Chris Ryland / Em Software, Inc. / www.emsoftware.com

--
Yan Seiner, PE

Support my bid for the 4J School Board
http://www.seiner.com

_______________________________________________
Dspam-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspam-user
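
For anyone who wants to experiment with the idea in this thread, here is a
rough sketch in Python. It parses an X-Spam-Status value like the ones quoted
above and builds factor-style tokens of the form Header*word and
Header*word+word. The function names parse_spam_status and header_tokens are
made up for this illustration, and the tokenizer only imitates the token
shapes visible in the X-DSPAM-Factors headers; it is an assumption, not
DSPAM's actual tokenizer.

```python
import re

# Signed decimal number, as used in "score=-2.8" or "required=5.0".
_NUM = r"(-?\d+(?:\.\d+)?)"


def parse_spam_status(value):
    """Split an X-Spam-Status value into verdict, scores and rule names.

    Hypothetical helper: it only understands the layout seen in the
    SpamAssassin 3.2.5 headers quoted in this thread.
    """
    # Unfold continuation lines and collapse runs of whitespace.
    text = " ".join(value.split())
    score = re.search(r"score=" + _NUM, text)
    required = re.search(r"required=" + _NUM, text)
    tests = re.search(r"tests=(.*?)(?:\s+autolearn=|$)", text)
    if not (score and required and tests):
        raise ValueError("unrecognised X-Spam-Status value")
    return {
        "is_spam": text.lower().startswith("yes"),
        "score": float(score.group(1)),
        "required": float(required.group(1)),
        "tests": [t.strip() for t in tests.group(1).split(",") if t.strip()],
    }


def header_tokens(name, value):
    """Build single-word and adjacent-pair tokens for one header.

    Assumption: words are split on whitespace only, which is enough to
    reproduce shapes such as 'X-Spam-Report*URL+listed'.
    """
    words = value.split()
    singles = [f"{name}*{w}" for w in words]
    pairs = [f"{name}*{a}+{b}" for a, b in zip(words, words[1:])]
    return singles + pairs


if __name__ == "__main__":
    status = parse_spam_status(
        "Yes, score=11.2 required=5.0 tests=AWL,BAYES_99,\n"
        " HTML_IMAGE_RATIO_04,HTML_MESSAGE,MIME_HTML_ONLY,RCVD_IN_XBL,\n"
        " URIBL_JP_SURBL,URIBL_RHS_DOB autolearn=no version=3.2.5"
    )
    print(status["is_spam"], status["score"], status["tests"])
    for token in header_tokens("X-Spam-Report", "1.5 URIBL_JP_SURBL Contains an URL listed"):
        print(token)
```

Because SA writes the same rule names into its headers every time a rule
fires, tokens such as X-Spam-Report*URIBL_JP_SURBL recur verbatim across
messages, which fits the observation above that the header tokens are nearly
always the same.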
