Re: Haproxy and UTF8-encoded chars
On Fri, Jul 27, 2012 at 8:49 AM, Stojan Rancic (Iprom) sto...@iprom.si wrote:
> On 26.7.2012 10:15, Sander Klein wrote:
> > So I definitely think Internet Explorer is doing it wrong. It relies on the fact that most web servers will encode the URL for them, which most actually do.
> >
> > If you really want to accept the 'bad' URLs then you might enable 'option accept-invalid-http-request', but I strongly recommend not enabling this in a production environment.
>
> I did a bit of ngrep-ing yesterday as well, and we are indeed seeing non-ASCII encoded characters flowing in URLs and Referrers, from MSIE clients. We'll see how we handle those on our actual backends (without HAProxy forwarding the traffic), and then decide on what the next step would be.
>
> Thanks for the input so far.
>
> br, Stojan

Hi,

You could also use 'option accept-invalid-http-request' to tell HAProxy not to run character compliance testing. This could be used temporarily, until the applications are fixed.

Baptiste
Re: Haproxy and UTF8-encoded chars
On 26.7.2012 10:15, Sander Klein wrote:
> So I definitely think Internet Explorer is doing it wrong. It relies on the fact that most web servers will encode the URL for them, which most actually do.
>
> If you really want to accept the 'bad' URLs then you might enable 'option accept-invalid-http-request', but I strongly recommend not enabling this in a production environment.

I did a bit of ngrep-ing yesterday as well, and we are indeed seeing non-ASCII encoded characters flowing in URLs and Referrers, from MSIE clients. We'll see how we handle those on our actual backends (without HAProxy forwarding the traffic), and then decide on what the next step would be.

Thanks for the input so far.

br, Stojan
Re: Haproxy and UTF8-encoded chars
On 25.7.2012 11:21, Sander Klein wrote:
> We are experiencing the same issue, but it only happens with Internet Explorer. So I figured it must be a bug on the Internet Explorer side and not on the HAProxy side, since Internet Explorer doesn't seem to encode the URL correctly.

I'm afraid I don't have any control over what browsers the users are using, and I'm sure a fair amount of those are IE. And the fact that I'm seeing \x escaped characters in both GET and Referrer headers isn't helping any either.

How do you deal with IE users then?

br, Stojan
Re: Haproxy and UTF8-encoded chars
On 26.07.2012 09:44, Stojan Rancic (Iprom) wrote:
> On 25.7.2012 11:21, Sander Klein wrote:
> > We are experiencing the same issue, but it only happens with Internet Explorer. So I figured it must be a bug on the Internet Explorer side and not on the HAProxy side, since Internet Explorer doesn't seem to encode the URL correctly.
>
> I'm afraid I don't have any control over what browsers the users are using, and I'm sure a fair amount of those are IE. And the fact that I'm seeing \x escaped characters in both GET and Referrer headers isn't helping any either.
>
> How do you deal with IE users then?

This is always a bit problematic. If the URL is being generated by our software then we fix our software to create pre-encoded URLs. If it's 3rd party software, we tell the 3rd party to fix their stuff. Currently we have one case where the 3rd party doesn't understand the issue, and there we just tell the users to start using a browser which does proper encoding.

Because of your question I wiresharked a bit yesterday to make sure I was giving you the right info. My tests showed that Safari, Firefox and Chrome do proper encoding of the URL before sending it, and Internet Explorer only encodes some parts of the URL.

I also checked RFC 3986, and it says in section 2.5, paragraph 6:

---
When a new URI scheme defines a component that represents textual data consisting of characters from the Universal Character Set [UCS], the data should first be encoded as octets according to the UTF-8 character encoding [STD63]; then only those octets that do not correspond to characters in the unreserved set should be percent-encoded. For example, the character A would be represented as "A", the character LATIN CAPITAL LETTER A WITH GRAVE would be represented as "%C3%80", and the character KATAKANA LETTER A would be represented as "%E3%82%A2".
---

So I definitely think Internet Explorer is doing it wrong. It relies on the fact that most web servers will encode the URL for them, which most actually do.

If you really want to accept the 'bad' URLs then you might enable 'option accept-invalid-http-request', but I strongly recommend not enabling this in a production environment.

Greets,
Sander Klein
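As a quick check of the RFC 3986 example quoted above, here is a minimal JavaScript sketch. It assumes a standard JavaScript engine (a browser console or Node.js); the variable names are made up for illustration. It relies on encodeURIComponent(), which converts a string to UTF-8 and percent-encodes the resulting octets.

```javascript
// The three sample characters from RFC 3986, section 2.5:
// "A", LATIN CAPITAL LETTER A WITH GRAVE, and KATAKANA LETTER A.
const samples = ["A", "\u00C0", "\u30A2"];

// encodeURIComponent() encodes each character as UTF-8 octets and then
// percent-encodes every octet that it does not leave as-is.
const encoded = samples.map((ch) => encodeURIComponent(ch));

console.log(encoded.join(" ")); // prints "A %C3%80 %E3%82%A2"
```

The output matches the representations given in the RFC text, which is what a compliant client is expected to put on the wire.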
Re: Haproxy and UTF8-encoded chars
On 07/25/2012 08:22 AM, Stojan Rancic (Iprom) wrote:
> Hello,
>
> we're experiencing issues with HAProxy 1.5-dev11 rejecting GET requests with UTF8-encoded characters. The encoding happens with Javascript's Encode function for east european characters (š, č, ž, etc).
>
> The requests (as seen from 'echo show errors | socat stdio unix-connect:/var/run/haproxy.sock') are:
>
> Total events captured on [25/Jul/2012:08:10:16.070] : 1347
>
> [25/Jul/2012:08:10:16.033] frontend http-in (#2): invalid request
>   backend NONE (#-1), server NONE (#-1), event #1346
>   src 91.216.172.145:40752, session #527199, session flags 0x
>   HTTP msg state 27, msg flags 0x, tx flags 0x
>   HTTP chunk len 0 bytes, HTTP body len 0 bytes
>   buffer flags 0x00809002, out 0 bytes, total 873 bytes
>   pending 873 bytes, wrapping at 16384, error at position 51:
>
>   0 GET /XXX/YYY?z=39;t=js;sid=index;ssid=\xC2\xA7=index;m=ZZ
>   00064+ ZZ;ref=http://www.ZZZ.com/lala;num=9;kw=;flash=0;res=lala;r
>   00134+ e=http%3A%2F%2Fwww.QQQ.com%2Fsi%2FZZZ.html;rmc=1343196615443;cpre
>   00204+ mium=false;url=http%3A//www.ZZZ.com/lala HTTP/1.1\r\n

I think that you should urlencode UTF-8 strings you want to put into a URI or query string. Check out encodeURIComponent().

--
Brane F. Gračnar
skrbnik aplikacij/applications manager
e: brane.grac...@tsmedia.si
TSmedia, d.o.o.
a: Cigaletova 15, 1000 Ljubljana; Slovenia
t: +386 1 473 00 10
f: +386 1 473 00 16
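To illustrate the suggestion above, here is a rough sketch of building a query string with encodeURIComponent() so that no raw non-ASCII bytes reach the proxy. The parameter names and values are hypothetical stand-ins loosely modelled on the captured request (including its ';' separator), not the actual application code.

```javascript
// Hypothetical parameters; "kw" carries the characters mentioned in the
// report and "ssid" carries U+00A7, whose UTF-8 form is \xC2\xA7.
const params = { sid: "index", kw: "š č ž", ssid: "§" };

// Encode every key and value separately, then join the pairs with ';'
// as the captured request does.
const query = Object.entries(params)
  .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
  .join(";");

console.log(query);
// prints "sid=index;kw=%C5%A1%20%C4%8D%20%C5%BE;ssid=%C2%A7"

// Sanity check: only printable ASCII (no spaces, no high bytes) remains.
const asciiOnly = /^[\x21-\x7E]*$/.test(query);
console.log(asciiOnly); // prints true
```

Encoding each value individually, rather than the finished string, keeps the '=' and ';' delimiters intact while still escaping any delimiter characters that appear inside a value.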
Re: Haproxy and UTF8-encoded chars
On 26.7.2012 14:51, Brane F. Gračnar wrote:
> >   0 GET /XXX/YYY?z=39;t=js;sid=index;ssid=\xC2\xA7=index;m=ZZ
> >   00064+ ZZ;ref=http://www.ZZZ.com/lala;num=9;kw=;flash=0;res=lala;r
> >   00134+ e=http%3A%2F%2Fwww.QQQ.com%2Fsi%2FZZZ.html;rmc=1343196615443;cpre
> >   00204+ mium=false;url=http%3A//www.ZZZ.com/lala HTTP/1.1\r\n
>
> I think that you should urlencode UTF-8 strings you want to put into a URI or query string. Check out encodeURIComponent().

After talking to our team, this is exactly what we're supposed to be using:

encodeURIComponent(header.referrer)
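For reference, a small sketch of what that call does to a referrer-style URL, compared with encodeURI(). The URL below is a made-up placeholder standing in for header.referrer, and the variable names are assumptions for illustration only.

```javascript
// Placeholder referrer containing non-ASCII characters in path and query.
const referrer = "http://www.example.com/novice/š?q=č";

// encodeURIComponent() also escapes reserved characters (: / ? =), which
// is what you want when the whole URL is passed as one parameter value.
const asValue = encodeURIComponent(referrer);
console.log(asValue);
// prints "http%3A%2F%2Fwww.example.com%2Fnovice%2F%C5%A1%3Fq%3D%C4%8D"

// encodeURI() keeps the URL structure and only escapes what cannot
// appear in a URI at all, such as the non-ASCII characters.
const asWholeUrl = encodeURI(referrer);
console.log(asWholeUrl);
// prints "http://www.example.com/novice/%C5%A1?q=%C4%8D"

// Decoding the parameter value gives back the original string.
console.log(decodeURIComponent(asValue) === referrer); // prints true
```

Either way the non-ASCII characters end up percent-encoded; the difference is only whether the URL's own delimiters survive, so encodeURIComponent() is the safer choice for a value embedded in another URL's query string.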
Re: Haproxy and UTF8-encoded chars
Hi,

On 25.07.2012 08:22, Stojan Rancic (Iprom) wrote:
> Hello,
>
> we're experiencing issues with HAProxy 1.5-dev11 rejecting GET requests with UTF8-encoded characters. The encoding happens with Javascript's Encode function for east european characters (š, č, ž, etc).

We are experiencing the same issue, but it only happens with Internet Explorer. So I figured it must be a bug on the Internet Explorer side and not on the HAProxy side, since Internet Explorer doesn't seem to encode the URL correctly.

Greets,
Sander