On Thu, Aug 18, 2011 at 11:40 PM, Ryan Barnett <[email protected]> wrote:

>>I think previously I have ended up disabling many the PHP-IDS based
>>rules because of the false positives.
>>
>>Is there some difference between how PHP-IDS executes the rules and
>>how modsecurity treats them?

>>Similarly
>>3 Orchard Street
>>on the PHP_IDS demo is fine and
>>http://testing/index.php?street=3+Orchard+Street
>>gives me warnings (981248 on "3 Or" and 981242 on "3 Orc")
>
> The rule that triggered is not a converted PHPIDS rule but rather one that
> we developed as part of the SQL Injection Challenge analysis.  This
> particular rule is trying to identify common SQL Operators.  Examples here

Sorry, I probably didn't help there by jumping to a different rule ID
for my first example.
The second example (quoted above) did also use PHP IDS related rules
which trigger in modsec but not on the PHP IDS demo site I think, I'll
constrain discussion to the 981242 rule for now, which in the current
release version is:

======
SecRule 
REQUEST_COOKIES|REQUEST_COOKIES_NAMES|REQUEST_FILENAME|ARGS_NAMES|ARGS|XML:/*
"(?i:(?:(\"|'|`|´|’|‘)\s*x?or|div|like|between|and\s*(\"|'|`|´|’|‘)?\d)|(?:\\\\x(?:23|27|3d))|(?:^.?(\"|'|`|´|’|‘)$)|(?:(?:^[(\"|'|`|´|’|‘)\\\\]*(?:[\d(\"|'|`|´|’|‘)]+|[^(\"|'|`|´|’|‘)]+(\"|'|`|´|’|‘)))+\s*(?:n?and|x?x?or|div|like|between|and|not|\|\||\&\&)\s*[\w(\"|'|`|´|’|‘)[+&!@(),.-])|(?:[^\w\s]\w+\s*[|-]\s*(\"|'|`|´|’|‘)\s*\w)|(?:@\w+\s+(and|x?or|div|like|between|and)\s*[(\"|'|`|´|’|‘)\d]+)|(?:@[\w-]+\s(and|x?or|div|like|between|and)\s*[^\w\s])|(?:[^\w\s:]\s*\d\W+[^\w\s]\s*(\"|'|`|´|’|‘).)|(?:\Winformation_schema|table_name\W))"
"phase:2,capture,multiMatch,t:none,t:urlDecodeUni,t:replaceComments,block,msg:'Detects
classic SQL injection probings
1/2',id:'981242',tag:'WEB_ATTACK/SQLI',tag:'WEB_ATTACK/ID',tag:'WEB_ATTACK/LFI',logdata:'%{TX.0}',severity:'2',setvar:'tx.msg=%{rule.id}-%{rule.msg}',setvar:tx.anomaly_score=+6,setvar:'tx.%{tx.msg}-WEB_ATTACK/SQLI-%{matched_var_name}=%{tx.0}',setvar:'tx.%{tx.msg}-WEB_ATTACK/ID-%{matched_var_name}=%{tx.0}',setvar:'tx.%{tx.msg}-WEB_ATTACK/LFI-%{matched_var_name}=%{tx.0}'"
======

If I understand the rule correctly it is really 8 separate,
independent tests in one rule.
Effectively:

(?i:(\"|'|`|´|’|‘)\s*x?or|div|like|between|and\s*(\"|'|`|´|’|‘)?\d)
(?i:\\\\x(?:23|27|3d))
(?i:^.?(\"|'|`|´|’|‘)$)
(?:(?:^[(\"|'|`|´|’|‘)\\\\]*(?:[\d(\"|'|`|´|’|‘)]+|[^(\"|'|`|´|’|‘)]+(\"|'|`|´|’|‘)))+\s*(?:n?and|x?x?or|div|like|between|and|not|\|\||\&\&)\s*[\w(\"|'|`|´|’|‘)[+&!@(),.-])
(?i:[^\w\s]\w+\s*[|-]\s*(\"|'|`|´|’|‘)\s*\w)
(?i:@\w+\s+(and|x?or|div|like|between|and)\s*[(\"|'|`|´|’|‘)\d]+)
(?i:@[\w-]+\s(and|x?or|div|like|between|and)\s*[^\w\s])|(?:[^\w\s:]\s*\d\W+[^\w\s]\s*(\"|'|`|´|’|‘).)
(?i:\Winformation_schema|table_name\W)

are all evaluated independently.

What are the advantages to combining these into one rule?

>From my perspective there are some obvious downsides eg:
1) Makes the rule more opaque when trying to understand it.
2) Makes determining exactly what a particular exception is caused by
more difficult.
3) Makes rule removal more risky, ie if I need remove the rule
(perhaps to allow "3 Orchard" which the 4th regex picks up) then I
forgo a bunch of other checks that may still be useful.
It would seem to me that the simpler the rules can be the more
oversight of the rules there will be from people interested in the
area. A lot of the posts to the list seem to be of the "how do I
disable X" sort, perhaps because they take one look at the rule and
put it in the too hard basket. They might be more inclined to try and
improve them if they are more readily understandable.


Specifically for the 4th regex which is one I have butted up against:

(?:(?:^[(\"|'|`|´|’|‘)\\\\]*(?:[\d(\"|'|`|´|’|‘)]+|[^(\"|'|`|´|’|‘)]+(\"|'|`|´|’|‘)))+\s*(?:n?and|x?x?or|div|like|between|and|not|\|\||\&\&)\s*[\w(\"|'|`|´|’|‘)[+&!@(),.-])


1) Some of the () grouping in the regex seems to obfuscate rather than
assist understanding, ie are the following the same (I am certainly
not a regex expert):
^[(\"|'|`|´|’|‘)\\\\]*
^[\"'`´’‘\\\\]*

2) What are the four \ in the same class, doesn't it just produce two
literal \ in the class the second one being redundant?

3) Is the "x?x?" intentional? If so why and should other areas only
checking for xor also check for xxor?

4) Is it wise to try and handle word operators (or, and, not) in the
same rule code as symbol operators || and &&?
I think this might be the underlying root of my false positive as SQL
parsers will likely deal with them differently.
Ie no SQL parser (I hope) is going to see "3 Orchard" and interpret
the c as the start of a new word, whereas "3 ||chard" likely would and
could lead to exploitation (perhaps "3 Ortrue" and "3 ||true" is a
clearer example).
It seems that tighter regexs with less false positives could be
written if lexically different operators were treated separately.

Paul
_______________________________________________
Owasp-modsecurity-core-rule-set mailing list
[email protected]
https://lists.owasp.org/mailman/listinfo/owasp-modsecurity-core-rule-set

Reply via email to