On Thu, Aug 18, 2011 at 11:40 PM, Ryan Barnett <[email protected]> wrote:
>>I think previously I have ended up disabling many the PHP-IDS based >>rules because of the false positives. >> >>Is there some difference between how PHP-IDS executes the rules and >>how modsecurity treats them? >>Similarly >>3 Orchard Street >>on the PHP_IDS demo is fine and >>http://testing/index.php?street=3+Orchard+Street >>gives me warnings (981248 on "3 Or" and 981242 on "3 Orc") > > The rule that triggered is not a converted PHPIDS rule but rather one that > we developed as part of the SQL Injection Challenge analysis. This > particular rule is trying to identify common SQL Operators. Examples here Sorry, I probably didn't help there by jumping to a different rule ID for my first example. The second example (quoted above) did also use PHP IDS related rules which trigger in modsec but not on the PHP IDS demo site I think, I'll constrain discussion to the 981242 rule for now, which in the current release version is: ====== SecRule REQUEST_COOKIES|REQUEST_COOKIES_NAMES|REQUEST_FILENAME|ARGS_NAMES|ARGS|XML:/* "(?i:(?:(\"|'|`|´|’|‘)\s*x?or|div|like|between|and\s*(\"|'|`|´|’|‘)?\d)|(?:\\\\x(?:23|27|3d))|(?:^.?(\"|'|`|´|’|‘)$)|(?:(?:^[(\"|'|`|´|’|‘)\\\\]*(?:[\d(\"|'|`|´|’|‘)]+|[^(\"|'|`|´|’|‘)]+(\"|'|`|´|’|‘)))+\s*(?:n?and|x?x?or|div|like|between|and|not|\|\||\&\&)\s*[\w(\"|'|`|´|’|‘)[+&!@(),.-])|(?:[^\w\s]\w+\s*[|-]\s*(\"|'|`|´|’|‘)\s*\w)|(?:@\w+\s+(and|x?or|div|like|between|and)\s*[(\"|'|`|´|’|‘)\d]+)|(?:@[\w-]+\s(and|x?or|div|like|between|and)\s*[^\w\s])|(?:[^\w\s:]\s*\d\W+[^\w\s]\s*(\"|'|`|´|’|‘).)|(?:\Winformation_schema|table_name\W))" "phase:2,capture,multiMatch,t:none,t:urlDecodeUni,t:replaceComments,block,msg:'Detects classic SQL injection probings 1/2',id:'981242',tag:'WEB_ATTACK/SQLI',tag:'WEB_ATTACK/ID',tag:'WEB_ATTACK/LFI',logdata:'%{TX.0}',severity:'2',setvar:'tx.msg=%{rule.id}-%{rule.msg}',setvar:tx.anomaly_score=+6,setvar:'tx.%{tx.msg}-WEB_ATTACK/SQLI-%{matched_var_name}=%{tx.0}',setvar:'tx.%{tx.msg}-WEB_ATTACK/ID-%{matched_var_name}=%{tx.0}',setvar:'tx.%{tx.msg}-WEB_ATTACK/LFI-%{matched_var_name}=%{tx.0}'" ====== If I understand the rule correctly it is really 8 separate, independent tests in one rule. Effectively: (?i:(\"|'|`|´|’|‘)\s*x?or|div|like|between|and\s*(\"|'|`|´|’|‘)?\d) (?i:\\\\x(?:23|27|3d)) (?i:^.?(\"|'|`|´|’|‘)$) (?:(?:^[(\"|'|`|´|’|‘)\\\\]*(?:[\d(\"|'|`|´|’|‘)]+|[^(\"|'|`|´|’|‘)]+(\"|'|`|´|’|‘)))+\s*(?:n?and|x?x?or|div|like|between|and|not|\|\||\&\&)\s*[\w(\"|'|`|´|’|‘)[+&!@(),.-]) (?i:[^\w\s]\w+\s*[|-]\s*(\"|'|`|´|’|‘)\s*\w) (?i:@\w+\s+(and|x?or|div|like|between|and)\s*[(\"|'|`|´|’|‘)\d]+) (?i:@[\w-]+\s(and|x?or|div|like|between|and)\s*[^\w\s])|(?:[^\w\s:]\s*\d\W+[^\w\s]\s*(\"|'|`|´|’|‘).) (?i:\Winformation_schema|table_name\W) are all evaluated independently. What are the advantages to combining these into one rule? >From my perspective there are some obvious downsides eg: 1) Makes the rule more opaque when trying to understand it. 2) Makes determining exactly what a particular exception is caused by more difficult. 3) Makes rule removal more risky, ie if I need remove the rule (perhaps to allow "3 Orchard" which the 4th regex picks up) then I forgo a bunch of other checks that may still be useful. It would seem to me that the simpler the rules can be the more oversight of the rules there will be from people interested in the area. A lot of the posts to the list seem to be of the "how do I disable X" sort, perhaps because they take one look at the rule and put it in the too hard basket. They might be more inclined to try and improve them if they are more readily understandable. Specifically for the 4th regex which is one I have butted up against: (?:(?:^[(\"|'|`|´|’|‘)\\\\]*(?:[\d(\"|'|`|´|’|‘)]+|[^(\"|'|`|´|’|‘)]+(\"|'|`|´|’|‘)))+\s*(?:n?and|x?x?or|div|like|between|and|not|\|\||\&\&)\s*[\w(\"|'|`|´|’|‘)[+&!@(),.-]) 1) Some of the () grouping in the regex seems to obfuscate rather than assist understanding, ie are the following the same (I am certainly not a regex expert): ^[(\"|'|`|´|’|‘)\\\\]* ^[\"'`´’‘\\\\]* 2) What are the four \ in the same class, doesn't it just produce two literal \ in the class the second one being redundant? 3) Is the "x?x?" intentional? If so why and should other areas only checking for xor also check for xxor? 4) Is it wise to try and handle word operators (or, and, not) in the same rule code as symbol operators || and &&? I think this might be the underlying root of my false positive as SQL parsers will likely deal with them differently. Ie no SQL parser (I hope) is going to see "3 Orchard" and interpret the c as the start of a new word, whereas "3 ||chard" likely would and could lead to exploitation (perhaps "3 Ortrue" and "3 ||true" is a clearer example). It seems that tighter regexs with less false positives could be written if lexically different operators were treated separately. Paul _______________________________________________ Owasp-modsecurity-core-rule-set mailing list [email protected] https://lists.owasp.org/mailman/listinfo/owasp-modsecurity-core-rule-set
