> Was wondering if some of the regex gurus on the list might be willing
> to tell me if a rule I have created could be made more efficient
> somehow.
Not a guru, but some comments...
> This rule checks the from address for some common spam-originating
> country codes and tacks on a half-point to the score.
>
> From =~ /[EMAIL PROTECTED](nl|ie|de|fr|pl|co\.za|co\.nz|dk|ch|ru|fi|mx|il|tw|ca|cz|lu|lt|ar).?$/i
^^ ^^
^^^
The first marked part, the leading ".*", is redundant. An RE that is not
anchored with "^" matches anywhere in the string, so any number of chars
may precede the @ anyway. Simply dropping it and starting the RE with @
will give the same results.
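
A quick way to convince yourself of this, sketched in Python rather than
Perl (the search semantics are the same for this case). Note that the
archive has redacted the real rule, so both patterns below are my
assumption of its rough shape, and the TLD list is shortened:

```python
import re

# Hypothetical stand-ins for the redacted rule; TLD list abbreviated.
TLDS = r"(nl|ie|de|fr)"
with_prefix = re.compile(r".*@.*\." + TLDS + r".?$", re.I)
without_prefix = re.compile(r"@.*\." + TLDS + r".?$", re.I)

samples = [
    "someone@example.de",
    "Some One <someone@example.fr>",
    "someone@example.com",
    "no at sign here.de",
]

def same_verdict(text):
    """True if both patterns agree on whether the text matches."""
    return bool(with_prefix.search(text)) == bool(without_prefix.search(text))

# search() already scans forward to the "@", so the leading ".*" adds nothing.
results = [same_verdict(s) for s in samples]
```

The shorter form also saves the engine some backtracking, though on a
single header line that hardly matters.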
The second marked part is bogus. It matches any string between the leading
"@" and the trailing TLD, spaces and other non-domain chars included. This
might not have a huge impact on the From: header, but will result in FPs
(false positives) in body tests.
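
To illustrate the body-test problem, here is a small Python sketch. The
pattern is again only my guess at the shape of the redacted rule, and the
body line is made up:

```python
import re

# Hypothetical pattern: "@", anything at all, then a dot and a listed TLD.
rule = re.compile(r"@.*\.(nl|ie|de|fr)$", re.I)

# Made-up body line: the address is a .com, the .de belongs to a URL.
body_line = "write to bob@example.com or see www.shop.de"

match = rule.search(body_line)
# The ".*" happily runs across the spaces, from the "@" to the last TLD.
matched_text = match.group(0) if match else None
```

Here matched_text covers everything from the "@" to the end of the line,
although the address itself is not in any of the listed TLDs.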
The third marked part feels incorrect as well. Do you really want chars
there? This can result in FP with (sub)-domains ending with the given
TLDs, like "foo.de.edu".
  From =~ /[EMAIL PROTECTED](nl|ie|de|...|ar)>?$/
The above RE will only trigger on TLDs that are at the end of the From:
header (no real name) or ending with the optional ">" (with real name).
The middle part should probably contain only chars from [-a-zA-Z0-9]
instead of the ".".
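
Putting both suggestions together, a rough Python sketch. Everything here
is an assumption on my side: "loose" is my guess at the original rule,
"tight" is one possible way to write the stricter version, and I allow the
restricted class to repeat per domain label so that sub-domains like
mail.example.de still match:

```python
import re

TLDS = r"(nl|ie|de|fr|ca)"  # abbreviated list, for illustration only

# Assumed shape of the original rule: anything after "@", any trailing char.
loose = re.compile(r"@.*\." + TLDS + r".?$", re.I)

# Stricter sketch: labels made of [-a-zA-Z0-9] only, optional closing ">".
tight = re.compile(r"@(?:[-a-zA-Z0-9]+\.)+" + TLDS + r">?$", re.I)

def verdicts(header):
    """Return (loose matches, tight matches) for a From: header value."""
    return bool(loose.search(header)), bool(tight.search(header))

with_name = verdicts("Some One <someone@mail.example.de>")
bare = verdicts("someone@example.ca")
dev_tld = verdicts("someone@example.dev")   # ".de" plus one more char
other = verdicts("someone@example.com")
```

Both versions hit the .de and .ca addresses, but only the loose one also
hits example.dev, because the trailing ".?" accepts the "v".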
> I apologize if anyone from the list is from any of these countries,
> please don't take it personally.. ;)
Although this is not spam, I get half a point added. Spammy me... ;-)
...guenther
--
char *t="[EMAIL PROTECTED]";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}