Hello,
I currently use the ModSecurity rules from Atomicorp. The UTF-8 rules are about
6 months old and look like this:
#UTF abuse protection
SecRule RESPONSE_HEADERS:Content-Type "charset=utf-8" \
"phase:3,t:none,pass,nolog,setvar:global.utf8_encoding_used=1"
SecRule GLOBAL:UTF8_ENCODING_USED "@eq 1" \
"chain,phase:2,t:none,auditlog,status:400,msg:'Atomicorp.com - FREE UNSUPPORTED DELAYED FEED - WAF Rules: UTF8 Encoding Abuse Attack Attempt',id:'390620',rev:3,severity:'7'"
SecRule REQUEST_FILENAME|ARGS|ARGS_NAMES|!ARGS:/text/|!ARGS:/txt/ \
"@validateUtf8Encoding"
I found reports that these rules are buggy, so I experimented with them a
little. The behavior looks like this:
1. The first rule sets global.utf8_encoding_used=1 when RESPONSE_HEADERS
contains charset=utf-8. (Actually, one of my servers sends charset=UTF-8 and
the variable was never set; that is why I saw different behavior.)
2. The second rule runs in phase 2: if global.utf8_encoding_used=1 and the
third rule matches, status 400 is returned.
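To illustrate point 1, here is a minimal Python sketch (not ModSecurity code). It assumes the pattern in the first rule is matched as a case-sensitive regular expression, which is how I understand the default operator to behave with t:none. The function name sets_utf8_flag is hypothetical and only mirrors the setvar action.

```python
import re

# Stand-in for the first rule's pattern "charset=utf-8".
# Assumption: the match is case-sensitive, like Python's re without IGNORECASE.
CHARSET_PATTERN = re.compile(r"charset=utf-8")


def sets_utf8_flag(content_type: str) -> bool:
    """Return True if this Content-Type value would match the pattern,
    i.e. if global.utf8_encoding_used would be set to 1."""
    return CHARSET_PATTERN.search(content_type) is not None


if __name__ == "__main__":
    for header in (
        "text/html; charset=utf-8",
        "text/html; charset=UTF-8",
        "text/html; charset=UTF=8",
    ):
        print(header, "->", sets_utf8_flag(header))
```

Under that assumption only the lowercase header sets the flag, so a server that sends a differently spelled charset would never reach the validation in the chained rule.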
This seems to generate some false positives (reacting to %C2%A0%), but it also
behaves correctly in other cases, for instance when bytes are missing, which is
exactly what @validateUtf8Encoding is supposed to detect.
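For comparison, this is a small Python sketch of a strict UTF-8 check on percent-encoded input. It uses Python's own decoder as a stand-in and is not the @validateUtf8Encoding implementation, so the two may disagree on edge cases. The helper name is_valid_utf8 is hypothetical.

```python
from urllib.parse import unquote_to_bytes


def is_valid_utf8(percent_encoded: str) -> bool:
    """Percent-decode the input and report whether the bytes are valid UTF-8."""
    raw = unquote_to_bytes(percent_encoded)
    try:
        raw.decode("utf-8")  # strict decoding: raises on malformed sequences
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    samples = [
        "%C2%A0",   # U+00A0 (no-break space), a complete two-byte sequence
        "%C2%A0%",  # same bytes followed by a stray percent sign
        "%C2",      # lead byte with its continuation byte missing
        "%E2%82",   # three-byte sequence cut short after two bytes
        "%A0",      # continuation byte without a lead byte
    ]
    for sample in samples:
        print(sample, "->", is_valid_utf8(sample))
```

With this stand-in, %C2%A0 decodes cleanly while the truncated sequences are rejected, which is why a 400 on %C2%A0 looks like a false positive rather than a missing-bytes case.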
I also noticed that the approach in the newer Atomicorp rules and in the OWASP
rules is less intrusive: it only logs.
What do more experienced people think? Maybe I am not looking in the right
place. Could you give me some pointers?
Thank you.
_______________________________________________
Owasp-modsecurity-core-rule-set mailing list
Owasp-modsecurity-core-rule-set@lists.owasp.org
https://lists.owasp.org/mailman/listinfo/owasp-modsecurity-core-rule-set