Christian,

On 23.08.2012 20:04, Christian Klossek wrote:
> Ryan,
> 
> the Server and the page uses utf-8 encoding.
> When the form is being submitted the apache log looks like this:
> 
> "GET /test.php?keywords=%C4%99 HTTP/1.0" 403 375 "-" "Mozilla/5.0 (X11;
> Ubuntu; Linux x86_64; rv:14.0) Gecko/20100101 Firefox/14.0.1"
> 
> You can see the "ę" is correctly encoded using utf-8 as "%C4%99".

> Why is
> modsecurity not decoding this in a proper way before doing its tests?
> Or is it an apache configuration problem?

I'll answer the second part of the question first: no, it's not an Apache
configuration problem, and it's not an Apache problem at all.

Now the first part: modsecurity *does* decode properly, because it decodes
exactly the way you (I mean you, the human) configured it to.
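
To see why the rule fires at all, here is a minimal Python sketch (not
ModSecurity code) that applies the character class from rule 981318, as quoted
in the debug log further down, byte by byte. It assumes the regex engine works
without UTF-8 awareness, which is consistent with the log capturing a lone
\x99; the helper name matched_bytes is made up for illustration.

```python
import re
from typing import Optional

# Character class copied from the rule in the quoted debug log. In byte mode
# each byte of the multi-byte quote characters (e.g. \xe2\x80\x99) becomes a
# separate member of the class, so a single \x99 is a valid match.
PATTERN = re.compile(
    rb"(^[\"'`\xc2\xb4\xe2\x80\x99\xe2\x80\x98;]+"
    rb"|[\"'`\xc2\xb4\xe2\x80\x99\xe2\x80\x98;]+$)"
)


def matched_bytes(value: bytes) -> Optional[bytes]:
    """Return the bytes captured by the pattern, or None if nothing matches."""
    match = PATTERN.search(value)
    return match.group(0) if match else None


# The URL-decoded parameter value from the log: UTF-8 bytes of U+0119.
captured = matched_bytes(b"\xc4\x99")
```

With this sketch, captured holds the trailing byte \x99, the same value the
debug log reports for TX.0.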

No offense meant, I'll try to explain:
it's difficult to call this a "problem", because everything works as
expected; it's just you (the human again) who expects a different behaviour.
Why is this? Because the protocol (HTTP) is plain (old) 7-bit ASCII, which
requires anything that is not 7-bit ASCII to be URL-encoded (except if
specified as OCTET stream). And that's what your (and most likely every other)
browser does: it URL-encodes as the protocol requires, i.e. ę as %C4%99.

On the other side (think: the server side) this %C4%99 simply decodes to two
bytes. If we look at these bytes one at a time, %C4 is Ä (in ISO-8859-1),
perfect, we're done (my opinion :), and %99, well, I don't know.
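
A small Python illustration of that point, using only the standard library:
URL-decoding yields two raw bytes, and what they "mean" depends entirely on
the charset the reader assumes. The variable names are just for this example.

```python
from urllib.parse import unquote_to_bytes

# Step 1: URL-decoding only turns %XX escapes back into raw bytes.
raw = unquote_to_bytes("%C4%99")

# Step 2: interpreting those bytes is a separate decision.
as_utf8 = raw.decode("utf-8")      # one character: U+0119 (ę)
as_latin1 = raw.decode("latin-1")  # two characters: U+00C4 (Ä) and U+0099
```

Read as ISO-8859-1, the second byte becomes U+0099, a C1 control character
with no printable glyph, which is why it is hard to say what %99 "is".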

Before we dig deeper into the decoding jungle, let's look at HTTP:
the request line (GET /something....) is the very first part of the complete
request, it's neither the request header nor the request body, just the
request line itself. At this point the receiving side has no information
how the data should be decoded. It needs to rely on the core protocol
definitions (HTTP here), which define: 7-bit ASCII.
And that's what Apache (and every other web server too) usually gets.

Consequently, no web server decodes anything here unless it is told to do so.
They don't even decode the URL-encoding, as this is the application's job (any
CGI, module or whatever is an application).

That said, it's time to look at the devices/systems/whatever in front of the
web server, for example WAFs. modsecurity is a WAF (for simplicity I won't go
into the details of an embedded WAF like modsecurity).
Such a WAF (just like any IDS/IPS) inspects the traffic. And it also needs to
rely on protocol specifications. But it has the advantage that it handles the
request (request line + request header + request body) as a single object
and hence is able to decode this complete object as specified by the object
itself (i.e. the request header). *BUT* it (the WAF) has to be configured to
do so.

In practice, especially in this example, the WAF has to be configured to decode
everything as UTF-8. One way to tweak modsecurity is using @validateUtf8Encoding.
Then you need to adapt your rules accordingly (not a simple job).

However, this again is not the behaviour you (the human) expect, unfortunately;
e.g. try to decode %eb%b8%be%3e

Have fun, I guess you end up with: something>
:-))
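
For anyone who wants to try it, a short Python sketch of that decoding
exercise. It only shows what the four bytes are under two charset
assumptions; it does not claim to reproduce what modsecurity or any other
WAF does with them.

```python
from urllib.parse import unquote_to_bytes

# %3e is plain ASCII '>', the three bytes before it are not ASCII.
payload = unquote_to_bytes("%eb%b8%be%3e")

# As UTF-8 the first three bytes form one code point, followed by '>'.
utf8_text = payload.decode("utf-8")
first_code_point = ord(utf8_text[0])

# As ISO-8859-1 every byte is its own character.
latin1_text = payload.decode("latin-1")
```

Here first_code_point is 0xBE3E, so the UTF-8 reading is one Hangul syllable
plus '>', while the ISO-8859-1 reading is four separate characters ending in
'>'. Either way: something>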


This is probably not the answer you expected, but it hopefully helps to
understand the problem (it's a problem only if you expect malicious data :)

BTW, there are commercial WAFs which force you to configure the charset
and encoding to be expected and used for each request. Heaven (or the
developers) knows how reliably this works in practice (see my funny example
above :).


 
> Christian
> 
> On 08/23/2012 06:39 PM, Ryan Barnett wrote:
>> Christian,
>> Is this an application that you control?  It looks as though this is some
>> type of search field where the user can type in text.  If this is the
>> case, then I suggest that the application use proper Unicode encoding of
>> the input data so that the param field would be - keywords=%u0119 rather
>> than keywords=ę.
>>
>> In addition, I would suggest that you add the following to your
>> ModSecurity main config file -
>>
>> SecUnicodeCodePoint 20127
>> SecUnicodeMapFile /path/to/unicode.mapping
>>
>> You will need to adjust the first directive to suit your local code point.
>>  20127 maps to US-ASCII.  The unicode.mapping file comes bundled with the
>> ModSecurity source code.  By using these directives, any t:urlDecodeUni
>> transformation functions will map the Unicode data to the mappings you
>> specify.
>>
>> -Ryan
>>
>>
>>
>> On 8/23/12 9:34 AM, "Christian Klossek" <c.klos...@apodiscounter.de> wrote:
>>
>>> Hi,
>>>
>>> I'm using modsecurity 2.6.7 with CRS 2.2.5 on a debian squeeze system.
>>>
>>> Why is the rule 981318 triggering on a GET-param with a value of "ę"
>>> (Unicode U+0119)?
>>>
>>> I get this in my debug log (debug level 9):
>>> -------------------------------------
>>> SecRule
>>> "REQUEST_COOKIES|REQUEST_COOKIES_NAMES|REQUEST_FILENAME|ARGS_NAMES|ARGS|XM
>>> L:/*"
>>> "@rx
>>> (^[\"'`\xc2\xb4\xe2\x80\x99\xe2\x80\x98;]+|[\"'`\xc2\xb4\xe2\x80\x99\xe2\x
>>> 80\x98;]+$)"
>>> "phase:2,nolog,auditlog,rev:2.2.5,capture,t:none,t:urlDecodeUni,block,msg:
>>> 'SQL
>>> Injection Attack: Common Injection Testing
>>> Detected',id:981318,logdata:%{TX.0},severity:2,tag:WEB_ATTACK/SQL_INJECTIO
>>> N,tag:WASCTC/WASC-19,tag:OWASP_TOP_10/A1,tag:OWASP_AppSensor/CIE1,tag:PCI/
>>> 6.5.2,setvar:tx.msg=%{rule.msg},setvar:tx.sql_injection_score=+%{tx.critic
>>> al_anomaly_score},setvar:tx.anomaly_score=+%{tx.critical_anomaly_score},se
>>> tvar:tx.%{rule.id}-WEB_ATTACK/SQL_INJECTION-%{matched_var_name}=%{tx.0}"
>>>
>>> Expanded
>>> "REQUEST_COOKIES|REQUEST_COOKIES_NAMES|REQUEST_FILENAME|ARGS_NAMES|ARGS|XM
>>> L:/*"
>>> to "REQUEST_FILENAME|ARGS_NAMES:keywords|ARGS:keywords".
>>>
>>> T (0) urlDecodeUni: "/test.php"
>>> Transformation completed in 13 usec.
>>> Executing operator "rx" with param
>>> "(^[\"'`\xc2\xb4\xe2\x80\x99\xe2\x80\x98;]+|[\"'`\xc2\xb4\xe2\x80\x99\xe2\
>>> x80\x98;]+$)"
>>> against REQUEST_FILENAME.
>>> Target value: "/test.php"
>>> Operator completed in 9 usec.
>>>
>>> T (0) urlDecodeUni: "keywords"
>>> Transformation completed in 13 usec.
>>> Executing operator "rx" with param
>>> "(^[\"'`\xc2\xb4\xe2\x80\x99\xe2\x80\x98;]+|[\"'`\xc2\xb4\xe2\x80\x99\xe2\
>>> x80\x98;]+$)"
>>> against ARGS_NAMES:keywords.
>>> Target value: "keywords"
>>> Operator completed in 4 usec.
>>>
>>> T (0) urlDecodeUni: "\xc4\x99"
>>> Transformation completed in 14 usec.
>>> Executing operator "rx" with param
>>> "(^[\"'`\xc2\xb4\xe2\x80\x99\xe2\x80\x98;]+|[\"'`\xc2\xb4\xe2\x80\x99\xe2\
>>> x80\x98;]+$)"
>>> against ARGS:keywords.
>>> Target value: "\xc4\x99"
>>> Added regex subexpression to TX.0: \x99
>>> Added regex subexpression to TX.1: \x99
>>> Operator completed in 38 usec.
>>> Setting variable: tx.msg=%{rule.msg}
>>> Resolved macro %{rule.msg} to: SQL Injection Attack: Common Injection
>>> Testing Detected
_______________________________________________
Owasp-modsecurity-core-rule-set mailing list
Owasp-modsecurity-core-rule-set@lists.owasp.org
https://lists.owasp.org/mailman/listinfo/owasp-modsecurity-core-rule-set
