John Hardin wrote:
On Mon, 8 Mar 2010, Ned Slider wrote:
John Hardin wrote:
On Mon, 8 Mar 2010, Ned Slider wrote:
> So I've refined the rule to specifically exclude hitting on the
> sequence ../. which stops the rule triggering on multiple relative
> paths.
>
> uri LOCAL_URI_HIDDEN_DIR /(?!.{6}\.\.\/\..).{8}\/\../
How about:
uri LOCAL_URI_HIDDEN_DIR m;.{8}/\..(?!/);
Yes, that works too on my examples and is probably a more elegant
solution than mine :-)
Having done a little more testing, I can confirm this variant is more
accurate than my revision above.
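
For anyone wanting to reproduce the comparison outside SpamAssassin, here is a rough sketch using Python's re module as a stand-in for the Perl engine (an assumption; the constructs used here should behave the same in both). The sample URIs and the names NED, JOHN and hits are made up for illustration and do not come from any corpus.

```python
import re

# Patterns copied from the two rules above, without the Perl delimiters.
NED = re.compile(r"(?!.{6}\.\.\/\..).{8}\/\..")
JOHN = re.compile(r".{8}/\..(?!/)")

# Hypothetical sample URIs, invented for this sketch.
SAMPLES = [
    "http://example.com/.hidden/page.html",  # dot-directory
    "http://example.com/a/../b/index.html",  # single relative step
    "../../../../img/x.gif",                 # chained relative steps
]


def hits(pattern, uri):
    """Return True if the pattern matches anywhere in the URI."""
    return pattern.search(uri) is not None


# Map each URI to (first rule hits?, second rule hits?).
results = {uri: (hits(NED, uri), hits(JOHN, uri)) for uri in SAMPLES}

for uri, (ned, john) in results.items():
    print(f"{uri:40} ned={ned!s:5} john={john}")
```

On these samples both patterns hit the dot-directory and skip the chained ../../ path, but only the first still fires on a single /../ that has at least eight characters in front of it.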
It took a little more work to generate a clean rule and it's somewhat
more complex than the above.
uri URI_HIDDEN_2 m;.{8}(?:[/\\]|%(?i:5c|2f))(?!\.\.?[/%\\])\..;
Winders generates (and accepts) URIs with backslashes as directory
separators (in violation of the URI RFC? I'll have to look) and URIs can
have encoded directory separators (e.g. %2F).
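
A minimal sketch of how the URI_HIDDEN_2 pattern treats the three separator forms, again using Python's re module in place of Perl (an assumption; the scoped (?i:...) group needs Python 3.6 or later). The URIs and the CASES list are hypothetical examples, not real samples.

```python
import re

# Pattern copied from URI_HIDDEN_2 above, without the m;...; delimiters.
URI_HIDDEN_2 = re.compile(r".{8}(?:[/\\]|%(?i:5c|2f))(?!\.\.?[/%\\])\..")

# Hypothetical URIs, one per separator form plus some relative paths.
CASES = [
    "http://example.com/.hidden/page.html",      # plain slash
    r"http://example.com\.hidden\page.html",     # backslash separator
    "http://example.com%2F.hidden%2Fpage.html",  # percent-encoded slash
    "http://example.com/a/../b/index.html",      # parent-directory step
    "http://example.com/a/./b/index.html",       # current-directory step
    "http://example.com/a%2F..%2Fb",             # encoded parent step
]

# True where the rule would hit, False where it would not.
matched = {uri: URI_HIDDEN_2.search(uri) is not None for uri in CASES}

for uri, hit in matched.items():
    print(f"{hit!s:5} {uri}")
```

The negative lookahead sits directly after the separator, so a dot or dot-dot segment that is itself followed by a separator (or a %) is skipped whichever separator form introduced it.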
and your latest revision (above) performs as expected on my small
collection of hidden_dir ham/spam, hitting on all the spam and missing
the potential FPs (containing relative paths etc). I'm currently running
both rules for further comparative testing.
A comparison between the older version that hits on /../ and the version
that does not shows the somewhat counterintuitive result that hitting on
/../ gives marginally better results (at least, as far as ruleqa is
concerned). Sure, it's not _really_ a hidden directory, but it has false
hits on spam to a greater degree than it does on ham, so the S/O ratio
is better and the overall hits are higher...
http://ruleqa.spamassassin.org/?rule=%2FURI_HIDDEN
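
To see how a looser rule can come out ahead, here is a small worked sketch. It assumes S/O is the spam hit percentage divided by the sum of the spam and ham hit percentages, which is how the ruleqa figure is understood here. The percentages are invented purely to show the arithmetic and are not taken from the ruleqa page; so_ratio is a hypothetical helper name.

```python
def so_ratio(spam_pct, ham_pct):
    """S/O as assumed here: spam% / (spam% + ham%)."""
    return spam_pct / (spam_pct + ham_pct)


# Made-up hit rates, for illustration only.
strict = so_ratio(spam_pct=0.50, ham_pct=0.020)  # rule that skips /../
loose = so_ratio(spam_pct=0.60, ham_pct=0.022)   # rule that hits /../

print(f"strict S/O = {strict:.4f}")
print(f"loose  S/O = {loose:.4f}")
```

With these invented figures the looser rule gains proportionally more spam hits than ham hits, so its S/O rises even though its ham hits also went up.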
Interesting and indeed not what one might expect.