On 12/18/2014 06:27 PM, RW wrote:
On Tue, 16 Dec 2014 13:10:05 +0100
Axb wrote:
https://sourceforge.net/projects/sare/files/
replaces any older version.
leech while it lasts....
adjust scores if needed..
There are some rules that shouldn't be there. (I only tested a few that
looked the most dubious)
The first is a common phrase in mail from UK banks and other financial
services companies. Note the "ise" spelling which is common outside
the US.
body __RULEGEN_PHISH_BLR6YY /uthorised and regulated by the /
The following are common in legal disclaimer signatures:
body __RULEGEN_PHISH_UNQ4VP / may contain information that is /
body __RULEGEN_PHISH_B9HL3A /The information contained in this /
body __RULEGEN_PHISH_C6URDE / do not necessarily represent those of /
body __RULEGEN_PHISH_L3I0Z5 / is intended solely for the ..d/
This hits some of my ham:
body __RULEGEN_PHISH_SRX3XZ / apologize for any inconvenience/
Unless there's a bug, the fact that those disclaimer phrases got through
suggests that these rules are either intended to be very much more
aggressive than the SOUGHT rules, or the ham corpus isn't good enough.
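For anyone who wants to check rules like the ones quoted above against their own ham before loading them, here is a minimal sketch. It is not how SpamAssassin itself evaluates body rules: it assumes Python's re is a close enough stand-in for the Perl regexes, and that collapsing whitespace roughly approximates the rendered body text. The function names and the sample messages are made up for illustration.

```python
import re

# Patterns copied from the rules quoted in this thread. Python's re module is
# used as a stand-in for SpamAssassin's Perl regex engine (an assumption).
RULES = {
    "__RULEGEN_PHISH_BLR6YY": r"uthorised and regulated by the ",
    "__RULEGEN_PHISH_UNQ4VP": r" may contain information that is ",
    "__RULEGEN_PHISH_B9HL3A": r"The information contained in this ",
    "__RULEGEN_PHISH_C6URDE": r" do not necessarily represent those of ",
    "__RULEGEN_PHISH_L3I0Z5": r" is intended solely for the ..d",
    "__RULEGEN_PHISH_SRX3XZ": r" apologize for any inconvenience",
}

COMPILED = {name: re.compile(pattern) for name, pattern in RULES.items()}


def normalize(text):
    """Collapse runs of whitespace (including newlines) into single spaces."""
    return " ".join(text.split())


def ham_hits(messages):
    """Return {rule_name: number of ham messages the rule matches}."""
    hits = {name: 0 for name in COMPILED}
    for body in messages:
        flat = normalize(body)
        for name, rx in COMPILED.items():
            if rx.search(flat):
                hits[name] += 1
    return hits


# Hypothetical ham samples, invented for this example only.
SAMPLE_HAM = [
    "Example Bank is authorised and regulated by the Example Regulator.",
    "This email is intended solely for the addressee and\n"
    "may contain information that is confidential.",
    "We apologize for any inconvenience this may have caused.",
]

if __name__ == "__main__":
    for name, count in sorted(ham_hits(SAMPLE_HAM).items()):
        print(name, count)
```

Replace SAMPLE_HAM with bodies read from a real ham folder; any rule with a non-zero count is a candidate for removal or a lower score.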
As the rules were generated with donated corpus data, you're more than
welcome to send me an archive of ham samples to avoid these potential
issues.