http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5736

           Summary: FPs on FROM_DOMAIN_NOVOWEL & URI_NOVOWEL
           Product: Spamassassin
           Version: 3.2.3
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: normal
          Priority: P5
         Component: Rules
        AssignedTo: [email protected]
        ReportedBy: [EMAIL PROTECTED]


As per the list email, unterm-durchschnitt.de and unterm-durchschnitt.com are
both domains that are suffering from FPs caused by these rules.

The problem, of course, is that "rchschn" is a series of 7 non-vowels, which is
just enough to trigger the rules.
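
To make the trigger concrete, here is a minimal Python sketch of the check as
described above. It is not the actual rule source: the character class (any
ASCII letter other than a, e, i, o, u) and the helper names are assumptions
made for illustration, and the real rules may define "non-vowel" differently.

```python
import re

# Assumption: a "non-vowel" is any ASCII letter other than a, e, i, o, u.
# The real rule definitions are not quoted in this report.
NON_VOWEL_RUN = re.compile(r"[b-df-hj-np-tv-z]+", re.IGNORECASE)


def longest_non_vowel_run(domain: str) -> str:
    """Return the longest run of consecutive non-vowel letters in domain."""
    runs = NON_VOWEL_RUN.findall(domain)
    return max(runs, key=len, default="")


def would_trigger(domain: str, threshold: int = 7) -> bool:
    """True if domain contains at least `threshold` non-vowels in a row."""
    return len(longest_non_vowel_run(domain)) >= threshold


if __name__ == "__main__":
    for name in ("unterm-durchschnitt.de", "unterm-durchschnitt.com"):
        run = longest_non_vowel_run(name)
        print(name, run, len(run), would_trigger(name))
```

Under these assumptions both domains match at a threshold of 7 and stop
matching at 8, since "rchschn" is exactly 7 letters long.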

The intent of the rules is clearly to match obviously invalid domains that are
randomly generated. However, the assumption that 7 consonants in a row is well
beyond what any legitimate domain would have does not hold, particularly in
languages like German, where long consonant strings are more common.

Quite frankly, neither of these rules has a particularly high hit rate, and
FROM_DOMAIN_NOVOWEL gets almost no hits. FROM_LOCAL_NOVOWEL does much better,
but that's not a problem here.

The question is: do we alter the rules to require 8? 9? Or drop them? Or some
of each (i.e., drop FROM_DOMAIN_NOVOWEL and modify URI_NOVOWEL)?



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
