http://bugzilla.spamassassin.org/show_bug.cgi?id=3065

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED



------- Additional Comments From [EMAIL PROTECTED]  2004-02-19 01:50 -------
This feature already exists in HEAD:

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
 260784   188819    71965    0.724   0.00    0.00  (all messages)
100.000  72.4044  27.5956    0.724   0.00    0.00  (all messages as %)
  0.933   1.2891   0.0000    1.000   0.94    1.00  HTML_NONELEMENT_90_100
  1.101   1.5205   0.0000    1.000   0.94    1.00  HTML_NONELEMENT_80_90
  1.086   1.4993   0.0000    1.000   0.94    1.00  HTML_NONELEMENT_70_80
  1.185   1.6370   0.0000    1.000   0.95    1.00  HTML_NONELEMENT_60_70
 14.805  20.4471   0.0014    1.000   0.97    1.00  HTML_NONELEMENT_50_60
  0.682   0.9374   0.0125    0.987   0.91    1.00  HTML_NONELEMENT_40_50
  2.822   3.8894   0.0222    0.994   0.93    1.00  HTML_NONELEMENT_30_40
  0.350   0.4714   0.0306    0.939   0.78    1.00  HTML_NONELEMENT_20_30
  0.511   0.6572   0.1265    0.839   0.56    1.00  HTML_NONELEMENT_10_20
  1.411   1.8123   0.3571    0.835   0.55    1.00  HTML_NONELEMENT_00_10
  0.016   0.0222   0.0000    1.000   0.94    1.00  HTML_BADTAG_90_100
  0.112   0.1546   0.0000    1.000   0.94    1.00  HTML_BADTAG_80_90
  0.275   0.3797   0.0000    1.000   0.94    1.00  HTML_BADTAG_70_80
  0.805   1.1116   0.0000    1.000   0.94    1.00  HTML_BADTAG_60_70
  0.354   0.4888   0.0000    1.000   0.94    1.00  HTML_BADTAG_50_60
 15.185  20.9730   0.0000    1.000   0.97    1.00  HTML_BADTAG_40_50
  1.321   1.8240   0.0014    0.999   0.94    1.00  HTML_BADTAG_30_40
  0.851   1.1736   0.0028    0.998   0.94    1.00  HTML_BADTAG_20_30
  3.637   5.0006   0.0584    0.988   0.92    1.00  HTML_BADTAG_10_20
  2.330   3.0325   0.4877    0.861   0.61    1.00  HTML_BADTAG_00_10

For both of these new rules, the ranges at 40 or 50% and above match tons of spam.
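
For anyone reading the table cold: the S/O values above are consistent with
spam% / (spam% + ham%), computed from the two percentage columns rather than
from raw message counts. A minimal sketch that reproduces three of the rows
(the function name s_o_ratio is made up for illustration and is not part of
the hit-frequencies tooling):

```python
def s_o_ratio(spam_pct: float, ham_pct: float) -> float:
    """Return spam% / (spam% + ham%), or 0.0 when the rule hit nothing."""
    total = spam_pct + ham_pct
    return spam_pct / total if total else 0.0


# (SPAM%, HAM%) pairs copied from the table above.
rows = {
    "HTML_NONELEMENT_40_50": (0.9374, 0.0125),
    "HTML_NONELEMENT_00_10": (1.8123, 0.3571),
    "HTML_BADTAG_00_10": (3.0325, 0.4877),
}

for name, (spam, ham) in rows.items():
    print(f"{s_o_ratio(spam, ham):.3f}  {name}")
```

A value near 1.000 means the rule hits almost only spam; a value near 0.5
means it hits spam and ham at about the same rate.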

If you just look at HTML messages in our nightly corpus, the peaks of both
ranges are among the top 6 of all rules.

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
 166122   161128     4994    0.970   0.00    0.00  (all messages)
100.000  96.9938   3.0062    0.970   0.00    0.00  (all messages as %)
 46.790  48.2238   0.5206    0.989   1.00    0.75  BIZ_TLD
 25.606  26.4001   0.0000    1.000   1.00    0.01  T_DEEP_DISC_MEDS
 23.839  24.5774   0.0000    1.000   1.00    1.00  HTML_BADTAG_40_50     <---
 23.609  24.3403   0.0000    1.000   1.00    0.01  T_SUBJ_VALIUM
 22.192  22.8799   0.0000    1.000   0.99    0.01  T_MSGID_EVIL_20
 23.241  23.9611   0.0200    0.999   0.99    1.00  HTML_NONELEMENT_50_60 <---

(removed other test variations of T_MSGID_EVIL)
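
To pull the top rules out of output like this yourself, something along these
lines should work. It is only a sketch: RuleStats, parse_rows and top_rules
are hypothetical helper names, and the sort key (RANK, then SPAM%) is an
assumption, since the table only shows RANK to two decimals.

```python
from typing import List, NamedTuple


class RuleStats(NamedTuple):
    overall: float
    spam: float
    ham: float
    s_o: float
    rank: float
    score: float
    name: str


def parse_rows(text: str) -> List[RuleStats]:
    """Parse whitespace-separated hit-frequency rows, skipping non-rule lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue  # blank or too short to be a data row
        try:
            numbers = [float(f) for f in fields[:6]]
        except ValueError:
            continue  # header row
        name = fields[6]
        if name.startswith("("):
            continue  # summary rows such as "(all messages)"
        rows.append(RuleStats(*numbers, name))
    return rows


def top_rules(rows: List[RuleStats], n: int) -> List[RuleStats]:
    """Highest RANK first; ties broken by higher SPAM% (an assumption)."""
    return sorted(rows, key=lambda r: (r.rank, r.spam), reverse=True)[:n]


SAMPLE = """\
OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
 166122   161128     4994    0.970   0.00    0.00  (all messages)
 46.790  48.2238   0.5206    0.989   1.00    0.75  BIZ_TLD
 25.606  26.4001   0.0000    1.000   1.00    0.01  T_DEEP_DISC_MEDS
 23.839  24.5774   0.0000    1.000   1.00    1.00  HTML_BADTAG_40_50     <---
 23.609  24.3403   0.0000    1.000   1.00    0.01  T_SUBJ_VALIUM
 22.192  22.8799   0.0000    1.000   0.99    0.01  T_MSGID_EVIL_20
 23.241  23.9611   0.0200    0.999   0.99    1.00  HTML_NONELEMENT_50_60 <---
"""

parsed = parse_rows(SAMPLE)
for rule in top_rules(parsed, 6):
    print(f"{rule.rank:.2f}  {rule.spam:8.4f}  {rule.name}")
```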

In addition to the above tests for invalid tags, we also look for mid-word
tags.

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
  0.287   0.2960   0.0000    1.000   0.96    1.00  HTML_OBFUSCATION_90_100
  0.182   0.1874   0.0000    1.000   0.96    1.00  HTML_OBFUSCATION_80_90
  0.680   0.7013   0.0000    1.000   0.96    1.00  HTML_OBFUSCATION_70_80
  0.640   0.6597   0.0000    1.000   0.96    1.00  HTML_OBFUSCATION_60_70
  1.365   1.4076   0.0000    1.000   0.96    1.00  HTML_OBFUSCATION_50_60
  1.798   1.8538   0.0000    1.000   0.96    1.00  HTML_OBFUSCATION_40_50
  1.678   1.7297   0.0200    0.989   0.93    1.00  HTML_OBFUSCATION_30_40
  3.724   3.8342   0.1802    0.955   0.84    1.00  HTML_OBFUSCATION_20_30
  2.376   2.4285   0.6808    0.781   0.46    1.00  HTML_OBFUSCATION_10_20
 21.327  20.0276  63.2559    0.240   0.04    1.00  HTML_OBFUSCATION_00_10
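
As a rough illustration of the idea (not the code in HEAD), a mid-word tag
check could look something like the sketch below. The regexes, the choice of
"mid-word tags / all tags" as the ratio, and the helper names are all
assumptions made up for this example; only the NAME_LOW_HIGH bucket naming is
taken from the tables above.

```python
import re

# Assumption: a "mid-word" tag has a letter directly before "<" and directly after ">".
MID_WORD_TAG = re.compile(r"(?<=[A-Za-z])<[^<>]*>(?=[A-Za-z])")
ANY_TAG = re.compile(r"<[^<>]*>")


def mid_word_tag_percent(html: str) -> float:
    """Percentage of tags that sit in the middle of a word (hypothetical metric)."""
    total = len(ANY_TAG.findall(html))
    if total == 0:
        return 0.0
    return 100.0 * len(MID_WORD_TAG.findall(html)) / total


def bucket_name(prefix: str, percent: float) -> str:
    """Map a percentage to a 10-point bucket name such as PREFIX_40_50."""
    low = min(int(percent // 10) * 10, 90)  # 100% falls into the 90_100 bucket
    return f"{prefix}_{low:02d}_{low + 10}"


message = "Vi<!-- x -->agra <p>sale</p>"
percent = mid_word_tag_percent(message)
print(bucket_name("HTML_OBFUSCATION", percent))
```

Here one of the three tags splits a word, so the message lands in the 30_40
bucket under these assumptions.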

So, thanks for the suggestions.  :-)




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
