John Hardin wrote:
 On 4/28/10 3:13 PM, Kris Deugau wrote:
>   0.0 TO_EQ_FM_HTML_ONLY     To == From and HTML only
>   0.0 TO_EQ_FM_DIRECT_MX     To == From and direct-to-MX
>   1.7 TO_EQ_FM_HTML_DIRECT   To == From and HTML only, direct-to-MX

There was a bug in handling bare addresses in the first version of those rules, which has since been fixed. Unfortunately sa-update hasn't published the update yet, so I'm off to the dev list. Sorry!

Ah. These rules weren't my original concern; the TVD_PH_SUBJ_ACCOUNTS_POST and TVD_SUBJ_ACC_NUM rules were, since they thoroughly overbalanced Bayes (even with the more aggressive local BAYES_00 score) and caused the original FP.

I don't see anything obviously wrong with the root From == To meta subrules:

header __TO_EQ_FROM_1 ALL =~ /\nFrom:[^\n<]{0,80}<?([^\n\s>]+)>?\n(?:[^\n]{1,100}\n)*To:[^\n]+\1/ism
header __TO_EQ_FROM_2 ALL =~ /\nTo:[^\n<]{0,80}<?([^\n\s>]+)>?\n(?:[^\n]{1,100}\n)*From:[^\n]+\1/ism

They assume a human-readable comment and angle brackets are present on whichever header appears first, which was erroneous.
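
As a rough illustration of that assumption, here is a small Python sketch of how the __TO_EQ_FROM_1 pattern behaves with and without angle brackets on the From header. It is not the actual fix, and it assumes Python's re module treats this pattern the same way SpamAssassin's Perl regex engine does. The sample headers and the helper name `captured` are made up for the example.

```python
import re

# __TO_EQ_FROM_1, transcribed for Python's re module.
# Assumption: Perl and Python agree on this pattern's semantics.
TO_EQ_FROM_1 = re.compile(
    r"\nFrom:[^\n<]{0,80}<?([^\n\s>]+)>?\n(?:[^\n]{1,100}\n)*To:[^\n]+\1",
    re.I | re.S | re.M,
)

def captured(headers):
    """Return what group 1 captured, or None if the pattern does not match."""
    m = TO_EQ_FROM_1.search(headers)
    return m.group(1) if m else None

# Hypothetical header sets; the Received line supplies the "\n" before From.
bracketed_same = ("Received: x\n"
                  "From: Alice <alice@example.com>\n"
                  "To: Alice <alice@example.com>\n")
bracketed_diff = ("Received: x\n"
                  "From: Alice <alice@example.com>\n"
                  "To: Bob <bob@example.org>\n")
bare_diff = ("Received: x\n"
             "From: alice@example.com\n"
             "To: bob@example.org\n")

print(captured(bracketed_same))  # alice@example.com
print(captured(bracketed_diff))  # None
print(captured(bare_diff))       # m
```

With a bare From address, `[^\n<]{0,80}` swallows nearly the whole address and backtracks only far enough to leave the capture group its last character, so the backreference can hit an unrelated To that merely contains that character. That would fit matches that come and go as the username or domain changes, though it is only one plausible reading of the bug.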

Hmm. I'll be curious to see the updates; I'm far from a regex expert but I don't see what's actually broken.

Well, there _is_ a size limit on what will be accepted between those two headers, so other headers _can_ affect whether it will hit.
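
A minimal Python sketch of that size limit, under the same assumption that Python's re module behaves like Perl for this pattern: the `(?:[^\n]{1,100}\n)*` part only steps over intervening header lines of at most 100 characters each. The sample headers and the helper name `hits` are hypothetical.

```python
import re

# __TO_EQ_FROM_1, transcribed for Python's re module (assumed equivalent to Perl here).
TO_EQ_FROM_1 = re.compile(
    r"\nFrom:[^\n<]{0,80}<?([^\n\s>]+)>?\n(?:[^\n]{1,100}\n)*To:[^\n]+\1",
    re.I | re.S | re.M,
)

def hits(between):
    """Check the pattern with the given header text placed between From and To."""
    headers = ("Received: x\n"
               "From: Alice <alice@example.com>\n"
               + between +
               "To: Alice <alice@example.com>\n")
    return TO_EQ_FROM_1.search(headers) is not None

print(hits(""))                             # True: From and To adjacent
print(hits("Subject: hello\n"))             # True: short header in between
print(hits("X-Long: " + "x" * 150 + "\n"))  # False: line longer than 100 chars
```

So a single long header line between From and To is enough to stop the subrule from hitting, even when the two addresses are identical.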

*nod* So I can see in the subrules... but the From and To in the original, and the sanitized example I posted to Pastebin, were right next to each other. And with no pattern I could detect, removing or altering other headers, or even the username and/or domain part of either From or To, *sometimes* caused a previously matching header set to stop matching, or vice versa. O_o

IIRC even moving a header from above to below the To/From pair altered the behaviour at one point.

-kgd
