John Hardin wrote:
On Mon, 8 Mar 2010, Ned Slider wrote:

John Hardin wrote:
 On Mon, 8 Mar 2010, Ned Slider wrote:
> > So I've refined the rule to specifically exclude hitting on the sequence ../. which stops the rule triggering on multiple relative paths.
> >  uri        LOCAL_URI_HIDDEN_DIR    /(?!.{6}\.\.\/\..).{8}\/\../

 How about:

     uri         LOCAL_URI_HIDDEN_DIR    m;.{8}/\..(?!/);

Yes, that works too on my examples and is probably a more elegant solution than mine :-)
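For anyone wanting to compare the two variants outside SpamAssassin, here is a minimal sketch using Python's re module. The sample URIs are made up for illustration, and running re.search over the raw string is only an approximation of how a uri rule is evaluated (SpamAssassin may normalise URIs first). The names NED_RULE, JOHN_RULE and hits are hypothetical.

```python
import re

# The two Perl patterns above, transcribed for Python's re module.
NED_RULE = re.compile(r"(?!.{6}\.\./\..).{8}/\..")   # first LOCAL_URI_HIDDEN_DIR
JOHN_RULE = re.compile(r".{8}/\..(?!/)")             # suggested replacement

# Hypothetical URIs, invented for this sketch.
SAMPLES = [
    "http://example.com/.hidden/page.html",  # a dot-directory
    "../../../../images/x.gif",              # chained relative path
    "http://example.com/a/../b",             # single parent-directory step
]


def hits(rule, uri):
    """Return True if the pattern matches anywhere in the URI string."""
    return rule.search(uri) is not None


for uri in SAMPLES:
    print(f"{uri:40} ned={hits(NED_RULE, uri)} john={hits(JOHN_RULE, uri)}")
```

On these invented samples both patterns hit the dot-directory and skip the chained ../../ path, but only the first still fires on the single /../ step.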


Having done a little more testing, I can confirm this variant is more accurate than my revision above.


It took a little more work to generate a clean rule and it's somewhat more complex than the above.

    uri  URI_HIDDEN_2  m;.{8}(?:[/\\]|%(?i:5c|2f))(?!\.\.?[/%\\])\..;

Winders generates (and accepts) URIs with backslashes as directory separators (in violation of the URI RFC? I'll have to look) and URIs can have encoded directory separators (e.g. %2F).
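A rough way to see what the extra alternation and lookahead buy is to run the URI_HIDDEN_2 pattern over a few separator styles. This is a sketch in Python (3.6 or later, for the scoped (?i:...) group), not a substitute for testing in SpamAssassin itself. The URIs and the expected values beside them are invented for illustration.

```python
import re

# URI_HIDDEN_2 transcribed for Python's re module.
URI_HIDDEN_2 = re.compile(r".{8}(?:[/\\]|%(?i:5c|2f))(?!\.\.?[/%\\])\..")

# Hypothetical URIs mapped to whether the pattern is expected to hit.
SAMPLES = {
    "http://example.com/.hidden/page.html": True,     # plain slash
    "http://example.com\\.hidden\\page.html": True,   # backslash separator
    "http://example.com%2F.hidden/page.html": True,   # encoded slash
    "http://example.com%5c.hidden/page.html": True,   # encoded backslash
    "http://example.com/a/../b": False,               # parent directory
    "http://example.com/a/./b": False,                # current directory
    "http://example.com/..%2Fetc": False,             # .. then encoded slash
}

for uri, expected in SAMPLES.items():
    got = URI_HIDDEN_2.search(uri) is not None
    print(f"{uri:42} hit={got} expected={expected}")
```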


and your latest revision (above) performs as expected on my small collection of hidden_dir ham/spam, hitting on all the spam and missing the potential FPs (containing relative paths etc). I'm currently running both rules for further comparative testing.

A comparison between the older version that hits on /../ and the version that does not shows the somewhat counterintuitive result that hitting on /../ gives marginally better results (at least, as far as ruleqa is concerned). Sure, it's not _really_ a hidden directory, but it has false hits on spam to a greater degree than it does on ham, so the S/O ratio is better and the overall hits are higher...

    http://ruleqa.spamassassin.org/?rule=%2FURI_HIDDEN
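To illustrate how extra "false" hits can still raise S/O, here is a small arithmetic sketch. It assumes S/O is computed as spam% / (spam% + ham%). The hit rates are invented purely for illustration and are not ruleqa figures, and so_ratio is a hypothetical helper.

```python
def so_ratio(spam_pct, ham_pct):
    """Share of a rule's hit rate that comes from spam (assumed S/O definition)."""
    return spam_pct / (spam_pct + ham_pct)


# Invented hit rates (percent of each corpus), NOT real ruleqa numbers.
strict = so_ratio(spam_pct=0.50, ham_pct=0.010)  # variant that skips /../
loose = so_ratio(spam_pct=0.80, ham_pct=0.012)   # variant that also hits /../

print(f"strict S/O = {strict:.4f}, loose S/O = {loose:.4f}")
```

With these made-up numbers the looser variant gains proportionally more spam hits than ham hits, so both its hit count and its S/O go up.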


Interesting and indeed not what one might expect.

