Re: [Dspam-user] Spam Identification Deteriorates with Time

Yan Seiner Fri, 20 Mar 2009 12:56:54 -0700

On Fri, March 20, 2009 12:18 pm, Jonathan Hall wrote:
> Why do you have SA's Bayes and AWL features turned on?  In this setup,
> it doesn't seem to me that these would really help, as dspam does both
> of these already, right?  Seems like you could reduce processing time in
> SA by disabling these features, without affecting accuracy at all.


Well, I set it up and it worked, and I haven't looked at it since.

>
> Maybe you have reasons to have enabled these I haven't thought of?

I'm lazy?  ;-)

Seriously, nice catch.  I'll read up on SA docs again over the weekend and
see what I can eliminate.

Thanks!

--Yan
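
For anyone following along, here is a minimal Python sketch of checking whether Bayes and AWL still look enabled in a SpamAssassin config. It assumes the settings sit in a single local.cf-style file and are controlled by the `use_bayes` and `use_auto_whitelist` options, which as far as I know default to on when unset (AWL only if its plugin is loaded). The helper name and the sample config text are made up for illustration, and the sketch ignores `include` lines and `ifplugin` blocks.

```python
# Sketch only: reports whether SpamAssassin's Bayes and AWL features look
# enabled in a chunk of local.cf-style text. Hypothetical helper, not part of SA.

# Assumption: both options count as enabled (1) when not set explicitly.
DEFAULTS = {"use_bayes": True, "use_auto_whitelist": True}


def enabled_features(config_text):
    """Return {option: bool} for the options in DEFAULTS; a later line overrides an earlier one."""
    state = dict(DEFAULTS)
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        parts = line.split()
        if len(parts) == 2 and parts[0] in state:
            state[parts[0]] = parts[1] != "0"
    return state


# Made-up config text: Bayes disabled, AWL left at its default.
sample = """
# local overrides
required_score 5.0
use_bayes 0
"""

result = enabled_features(sample)
print(result)
```

Running `spamassassin --lint` after editing the real config is still the safer check; this only shows which two settings are in play.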

>
> --
> Jonathan
>
>
> Yan Seiner wrote:
>> On Fri, March 20, 2009 10:43 am, Chris Ryland wrote:
>>
>>> Interesting--can you elaborate just a bit?
>>>
>>
>> OK, first mail passes through SA.  I have it configured to only add info
>> in the X- headers.
>>
>> Then the mail passes through dspam.  dspam uses the info in SA's X- headers
>> as tokens in its decision.  So your email has the following headers:
>>
>> X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on selene.seiner.lan
>> X-Spam-Level:
>> X-Spam-Status: No, score=-2.8 required=5.0 tests=AWL,BAYES_00,
>>      DNS_FROM_RFC_BOGUSMX autolearn=no version=3.2.5
>> X-DSPAM-Check: by www.seiner.com on Fri, 20 Mar 2009 11:08:38 -0700
>> X-DSPAM-Result: Innocent
>> X-DSPAM-Processed: Fri Mar 20 11:08:39 2009
>> X-DSPAM-Confidence: 0.9995
>> X-DSPAM-Probability: 0.0000
>> X-DSPAM-Signature: 49c3dba742621804284693
>> X-DSPAM-Factors: 27,
>>      Cc*lists.sourceforge.net, 0.00010,
>>      wrote+>>, 0.00010,
>>      On+Fri, 0.00010,
>>      Subject*user], 0.00010,
>>      as+>, 0.00011,
>>      >>+>>, 0.00013,
>>      wrote+>, 0.00015,
>>      >+On, 0.00017,
>>      Cc*user, 0.00021,
>>      the+>, 0.00022,
>>      References*mail.gmail.com>, 0.00023,
>>      References*mail.gmail.com>, 0.00023,
>>      same+>, 0.00024,
>>      Cc*user+lists.sourceforge.net, 0.00024,
>>      >+I, 0.00026,
>>      >+>, 0.00026,
>>      >+>, 0.00026,
>>      X-Mailer*Mail+(2.930.3), 0.00048,
>>      X-Mailer*(2.930.3), 0.00048,
>>      Mime-Version*v930.3), 0.00049,
>>      Mime-Version*framework+v930.3), 0.00049,
>>      References*www.datavault.us>, 0.00052,
>>      >+Yan, 0.00053,
>>      >>+Can, 0.00058,
>>      >+the, 0.00061,
>>      38+PM, 0.00067,
>>      From*Chris, 0.00092
>>
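
The factor names above seem to follow a simple pattern: a header name, a `*`, then either a single word or two adjacent words joined by `+`. Below is a rough Python sketch of that naming scheme. It is only an approximation inferred from the tokens shown in this thread, not dspam's actual tokenizer: the function name is invented, splitting on whitespace is an assumption, and the `X-Mailer` value is a guess at what the original header contained.

```python
def header_tokens(name, value):
    """Yield 'Name*word' and 'Name*word1+word2' tokens for one header."""
    words = value.split()  # assumption: plain whitespace splitting
    for i, word in enumerate(words):
        yield f"{name}*{word}"
        if i + 1 < len(words):
            # pair each word with the one that follows it
            yield f"{name}*{word}+{words[i + 1]}"


# Assumed header value; the thread only shows the resulting tokens.
tokens = list(header_tokens("X-Mailer", "Apple Mail (2.930.3)"))
for token in tokens:
    print(token)
```

Two of the printed tokens, `X-Mailer*Mail+(2.930.3)` and `X-Mailer*(2.930.3)`, match entries in the factor list above, which is why fixed-format headers such as SA's make for very stable tokens.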
>> Now let's look at a piece of junk:
>>
>> X-Spam-Flag: YES
>> X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on selene.seiner.lan
>> X-Spam-Level: ***********
>> X-Spam-Status: Yes, score=11.2 required=5.0 tests=AWL,BAYES_99,
>>      HTML_IMAGE_RATIO_04,HTML_MESSAGE,MIME_HTML_ONLY,RCVD_IN_XBL,URIBL_JP_SURBL,
>>      URIBL_RHS_DOB autolearn=no version=3.2.5
>> X-Spam-Report:
>>      * 1.5 URIBL_JP_SURBL Contains an URL listed in the JP SURBL blocklist
>>      * [URIs: batiaceo.org]
>>      * 3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
>>      * [score: 1.0000]
>>      * 0.2 HTML_IMAGE_RATIO_04 BODY: HTML has a low ratio of text to image area
>>      * 0.0 HTML_MESSAGE BODY: HTML included in message
>>      * 1.5 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
>>      * 3.0 RCVD_IN_XBL RBL: Received via a relay in Spamhaus XBL
>>      * [64.18.137.4 listed in zen.spamhaus.org]
>>      * 1.1 URIBL_RHS_DOB Contains an URI of a new domain (Day Old Bread)
>>      * [URIs: batiaceo.org]
>>      * 0.4 AWL AWL: From: address is in the auto white-list
>> X-DSPAM-Check: by www.seiner.com on Fri, 20 Mar 2009 11:42:38 -0700
>> X-DSPAM-Result: Spam
>> X-DSPAM-Processed: Fri Mar 20 11:42:39 2009
>> X-DSPAM-Confidence: 0.9997
>> X-DSPAM-Probability: 1.0000
>> X-DSPAM-Signature: 49c3e39f74883847820380
>> X-DSPAM-Factors: 15,
>>      X-Spam-Report*[URIs, 0.99990,
>>      X-Spam-Report*URL, 0.99990,
>>      X-Spam-Report*URL+listed, 0.99990,
>>      X-Spam-Report*1.5+URIBL_JP_SURBL, 0.99990,
>>      X-Spam-Report*URIBL_JP_SURBL, 0.99990,
>>      X-Spam-Report*URI+of, 0.99990,
>>      X-Spam-Report*an+URI, 0.99990,
>>      jpg"/>, 0.99990,
>>      X-Spam-Report*the, 0.99990,
>>      X-Spam-Report*3.5, 0.99990,
>>      X-Spam-Report*the+JP, 0.99990,
>>      X-Spam-Report*URIBL_JP_SURBL+Contains, 0.99990,
>>      X-Spam-Report*3.5+BAYES_99, 0.99990,
>>      X-Spam-Report*MIME_HTML_ONLY, 0.99990,
>>      X-Spam-Report*RCVD_IN_XBL+RBL, 0.99990
>>
>> You can see that almost all the tokens dspam used (14 of 15) came from the
>> X-Spam-Report header.
>>
>> --Yan
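
To put a number on that observation, here is a small Python sketch that parses an `X-DSPAM-Factors` value and reports what fraction of the tokens come from SpamAssassin's headers. The function names are hypothetical, the parsing assumes the "count, then one `token, probability` pair per line" layout seen in this thread, and the sample is an abbreviated copy of the spam message's factors rather than the full list.

```python
def parse_factors(header_value):
    """Parse an X-DSPAM-Factors value into (token, probability) pairs."""
    lines = [ln.strip().rstrip(",") for ln in header_value.strip().splitlines()]
    factors = []
    for ln in lines[1:]:  # lines[0] holds the factor count
        if ln:
            # split on the last ", " so the token itself is kept whole
            token, prob = ln.rsplit(", ", 1)
            factors.append((token, float(prob)))
    return factors


def sa_share(factors, prefix="X-Spam-"):
    """Fraction of factor tokens whose name starts with the given header prefix."""
    if not factors:
        return 0.0
    hits = sum(1 for token, _ in factors if token.startswith(prefix))
    return hits / len(factors)


# Abbreviated sample: five of the fifteen factors from the spam message above.
sample = """5,
     X-Spam-Report*[URIs, 0.99990,
     X-Spam-Report*URIBL_JP_SURBL, 0.99990,
     jpg"/>, 0.99990,
     X-Spam-Report*3.5+BAYES_99, 0.99990,
     X-Spam-Report*RCVD_IN_XBL+RBL, 0.99990"""

factors = parse_factors(sample)
print(len(factors), sa_share(factors))
```

Run over a whole mailbox, a share close to 1.0 would suggest dspam is leaning almost entirely on SA's verdict rather than on the message content.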
>>
>>
>>> On Mar 20, 2009, at 1:38 PM, Yan Seiner wrote:
>>>
>>>
>>>> On Fri, March 20, 2009 9:55 am, Chris Ryland wrote:
>>>>
>>>>> Very interesting, thanks.
>>>>>
>>>>> Can I ask what SpamAssassin adds to the mix?
>>>>>
>>>> I use SA as input to dspam.  It allows dspam to be more accurate as the
>>>> header tokens are nearly always the same.
>>>>
>>>> --
>>>> Yan Seiner, PE
>>>>
>>>> Support my bid for the 4J School Board
>>>> http://www.seiner.com
>>>>
>>>>
>>>>
>>> Cheers!
>>> --Chris Ryland / Em Software, Inc. / www.emsoftware.com


-- 
Yan Seiner, PE

Support my bid for the 4J School Board
http://www.seiner.com


_______________________________________________
Dspam-user mailing list
https://lists.sourceforge.net/lists/listinfo/dspam-user
