Re: Haproxy and UTF8-encoded chars

2012-08-01 Thread Baptiste
On Fri, Jul 27, 2012 at 8:49 AM, Stojan Rancic (Iprom) sto...@iprom.si wrote:
 On 26.7.2012 10:15, Sander Klein wrote:

 So I definitely think Internet Explorer is doing it wrong. It relies on
 the web server to encode the URL for it, which most actually do.

 If you really want to accept the 'bad' URLs then you might enable
 'option accept-invalid-http-request', but I strongly recommend not
 enabling this in a production environment.


 I did a bit of ngrep-ing yesterday as well, and we are indeed seeing
 unencoded non-ASCII characters flowing in URLs and Referer headers from
 MSIE clients. We'll see how we handle those on our actual backends
 (without HAProxy forwarding the traffic), and then decide what the next
 step should be.

 Thanks for the input so far.

 br, Stojan



Hi,

You could also use 'option accept-invalid-http-request' to tell HAProxy
not to run its character compliance checks.
This could be used temporarily, until the applications are fixed.
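
Something like this, just as a sketch of where the option goes (the
frontend name "http-in" is taken from the error dump further down; the
bind line and backend name are only placeholders):

  frontend http-in
      bind :80
      # relax HAProxy's HTTP character compliance checks (stop-gap only)
      option accept-invalid-http-request
      default_backend app_servers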

Baptiste



Re: Haproxy and UTF8-encoded chars

2012-07-27 Thread Stojan Rancic (Iprom)

On 26.7.2012 10:15, Sander Klein wrote:


So I definitely think Internet Explorer is doing it wrong. It relies on
the web server to encode the URL for it, which most actually do.

If you really want to accept the 'bad' URLs then you might enable
'option accept-invalid-http-request', but I strongly recommend not
enabling this in a production environment.


I did a bit of ngrep-ing yesterday as well, and we are indeed seeing
unencoded non-ASCII characters flowing in URLs and Referer headers from
MSIE clients. We'll see how we handle those on our actual backends
(without HAProxy forwarding the traffic), and then decide what the next
step should be.


Thanks for the input so far.

br, Stojan



Re: Haproxy and UTF8-encoded chars

2012-07-26 Thread Stojan Rancic (Iprom)

On 25.7.2012 11:21, Sander Klein wrote:


We are experiencing the same issue, but it only happens with Internet
Explorer. So I figured it must be a bug on the Internet Explorer side
and not on the HAProxy side, since Internet Explorer doesn't seem to
encode the URL correctly.


I'm afraid I don't have any control over what browsers the users are
using, and I'm sure a fair number of those are IE. And the fact that
I'm seeing \x-escaped characters in both the GET request line and the
Referer header isn't helping either.


How do you deal with IE users, then?

br, Stojan




Re: Haproxy and UTF8-encoded chars

2012-07-26 Thread Sander Klein

On 26.07.2012 09:44, Stojan Rancic (Iprom) wrote:

On 25.7.2012 11:21, Sander Klein wrote:

We are experiencing the same issue, but it only happens with Internet
Explorer. So I figured it must be a bug on the Internet Explorer side
and not on the HAProxy side, since Internet Explorer doesn't seem to
encode the URL correctly.


I'm afraid I don't have any control over what browsers the users are
using, and I'm sure a fair number of those are IE. And the fact that
I'm seeing \x-escaped characters in both the GET request line and the
Referer header isn't helping either.

How do you deal with IE users, then?


This is always a bit problematic.

If the URL is being generated by our software, then we fix our
software to create pre-encoded URLs. If it's 3rd-party software, we tell
the 3rd party to fix their stuff.


Currently we have one case where the 3rd party doesn't understand the
issue, so we just tell the users to start using a browser which does
proper encoding.


Because of your question I did a bit of Wireshark capturing yesterday to
make sure I was giving you the right info. My tests showed that Safari,
Firefox and Chrome properly encode the URL before sending it, while
Internet Explorer only encodes some parts of the URL.


I also checked RFC 3986, and it says in section 2.5, paragraph 6:

---
When a new URI scheme defines a component that represents textual
data consisting of characters from the Universal Character Set [UCS],
the data should first be encoded as octets according to the UTF-8
character encoding [STD63]; then only those octets that do not
correspond to characters in the unreserved set should be percent-
encoded.  For example, the character A would be represented as "A",
the character LATIN CAPITAL LETTER A WITH GRAVE would be represented
as "%C3%80", and the character KATAKANA LETTER A would be represented
as "%E3%82%A2".
---
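
For illustration (a quick sketch, not part of the RFC text), JavaScript's
encodeURIComponent() applies exactly this rule, encoding to UTF-8 first
and then percent-encoding the octets:

  // the same examples as in the RFC excerpt above
  encodeURIComponent('A');      // "A"         (unreserved, left as-is)
  encodeURIComponent('\u00C0'); // "%C3%80"    (LATIN CAPITAL LETTER A WITH GRAVE)
  encodeURIComponent('\u30A2'); // "%E3%82%A2" (KATAKANA LETTER A)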

So I definitely think Internet Explorer is doing it wrong. It relies on
the web server to encode the URL for it, which most actually do.


If you really want to accept the 'bad' URLs then you might enable
'option accept-invalid-http-request', but I strongly recommend not
enabling this in a production environment.


Greets,

Sander Klein



Re: Haproxy and UTF8-encoded chars

2012-07-26 Thread Brane F. Gračnar
On 07/25/2012 08:22 AM, Stojan Rancic (Iprom) wrote:
 Hello,
 
 we're experiencing issues with HAProxy 1.5-dev11 rejecting GET requests
 with UTF-8-encoded characters. The encoding happens with JavaScript's
 Encode function for Eastern European characters (š, č, ž, etc.).
 
 The requests (as seen from 'echo show errors | socat stdio 
 unix-connect:/var/run/haproxy.sock') are:
 
 Total events captured on [25/Jul/2012:08:10:16.070] : 1347
 
 [25/Jul/2012:08:10:16.033] frontend http-in (#2): invalid request
backend NONE (#-1), server NONE (#-1), event #1346
src 91.216.172.145:40752, session #527199, session flags 0x
HTTP msg state 27, msg flags 0x, tx flags 0x
HTTP chunk len 0 bytes, HTTP body len 0 bytes
buffer flags 0x00809002, out 0 bytes, total 873 bytes
pending 873 bytes, wrapping at 16384, error at position 51:
 
0  GET /XXX/YYY?z=39;t=js;sid=index;ssid=\xC2\xA7=index;m=ZZ
00064+ ZZ;ref=http://www.ZZZ.com/lala;num=9;kw=;flash=0;res=lala;r
00134+ e=http%3A%2F%2Fwww.QQQ.com%2Fsi%2FZZZ.html;rmc=1343196615443;cpre
00204+ mium=false;url=http%3A//www.ZZZ.com/lala HTTP/1.1\r\n

I think you should URL-encode the UTF-8 strings you want to put into the
URI or query string.

Check out encodeURIComponent()
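
A rough sketch of what that could look like for the request captured
above (the parameter names "ref" and "url" come from the error dump;
everything else, including the variable names, is made up):

  // percent-encode each value before concatenating it into the request URL
  var ref = document.referrer;           // may contain raw UTF-8 characters
  var url = window.location.href;
  var src = '/XXX/YYY?z=39;t=js'
          + ';ref=' + encodeURIComponent(ref)
          + ';url=' + encodeURIComponent(url);
  // e.g. encodeURIComponent('\u00A7') === '%C2%A7', so the raw bytes
  // 0xC2 0xA7 seen in the error dump never reach HAProxy unescaped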





-- 
Brane F. Gračnar
skrbnik aplikacij/applications manager

e: brane.grac...@tsmedia.si
TSmedia, d.o.o.
a: Cigaletova 15, 1000 Ljubljana; Slovenia
t: +386 1 473 00 10
f: +386 1 473 00 16



Re: Haproxy and UTF8-encoded chars

2012-07-26 Thread Stojan Rancic (Iprom)

On 26.7.2012 14:51, Brane F. Gračnar wrote:


0  GET /XXX/YYY?z=39;t=js;sid=index;ssid=\xC2\xA7=index;m=ZZ
00064+ ZZ;ref=http://www.ZZZ.com/lala;num=9;kw=;flash=0;res=lala;r
00134+ e=http%3A%2F%2Fwww.QQQ.com%2Fsi%2FZZZ.html;rmc=1343196615443;cpre
00204+ mium=false;url=http%3A//www.ZZZ.com/lala HTTP/1.1\r\n


I think you should URL-encode the UTF-8 strings you want to put into the
URI or query string.

Check out encodeURIComponent()


After talking to our team, this is exactly what we're supposed to be
using: encodeURIComponent(header.referrer).





Re: Haproxy and UTF8-encoded chars

2012-07-25 Thread Sander Klein

Hi,

On 25.07.2012 08:22, Stojan Rancic (Iprom) wrote:

Hello,

we're experiencing issues with HAProxy 1.5-dev11 rejecting GET
requests with UTF-8-encoded characters. The encoding happens with
JavaScript's Encode function for Eastern European characters (š, č, ž,
etc.).


We are experiencing the same issue, but it only happens with Internet
Explorer. So I figured it must be a bug on the Internet Explorer side
and not on the HAProxy side, since Internet Explorer doesn't seem to
encode the URL correctly.


Greets,

Sander