I recently ran into this problem on a production server, and it was causing 
my users to lose their sessions.

Many browsers will happily send UTF-8 encoded data in cookie strings. This 
results in cookie data such as the following, which I captured from my nginx 
log:

'Good=1234;bad=\xe6\xb8\x85\xe9\xa2\xa8;sessionid=abc'

When Django tries to parse this cookie input, it will lose the cookies from 
"bad" onwards:

Python 2.7.10 (default, Sep 23 2015, 04:34:21) 
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.72)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import django
>>> django.VERSION
(1, 9, 2, 'final', 0)
>>> from django.http.cookie import parse_cookie
>>> cookie = 'Good=1234;bad=\xe6\xb8\x85\xe9\xa2\xa8;sessionid=abc'
>>> print cookie
Good=1234;bad=清風;sessionid=abc
>>> parse_cookie(cookie)
django/http/cookie.py:92: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if cookie == '':
{'Good': '1234'}


Python 3.5.1 (default, Dec 26 2015, 18:11:22) 
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import django
>>> django.VERSION
(1, 9, 2, 'final', 0)
>>> from django.http.cookie import parse_cookie
>>> cookie = 'Good=1234;bad=\xe6\xb8\x85\xe9\xa2\xa8;sessionid=abc'
>>> print(cookie)
Good=1234;bad=æ¸ é¢¨;sessionid=abc
>>> parse_cookie(cookie)
{'Good': '1234'}
>>> cookie = b'Good=1234;bad=\xe6\xb8\x85\xe9\xa2\xa8;sessionid=abc'.decode('utf-8')
>>> print(cookie)
Good=1234;bad=清風;sessionid=abc
>>> parse_cookie(cookie)
{'Good': '1234'}
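
The two Python 3 runs above print differently because the first cookie is a 
str literal whose \xNN escapes are read as individual code points, while the 
second is decoded from bytes as UTF-8. A small sketch of that difference, 
using a shortened copy of the same cookie value (the variable names are mine, 
purely for illustration):

```python
# The "bad" cookie from the log, as raw bytes.
raw_bytes = b'bad=\xe6\xb8\x85\xe9\xa2\xa8'

# A Python 3 str literal with \xNN escapes holds code points U+00E6, U+00B8,
# and so on, which is the same text as decoding the bytes as Latin-1.
as_latin1 = raw_bytes.decode('latin-1')

# Decoding the same bytes as UTF-8 yields the two intended characters.
as_utf8 = raw_bytes.decode('utf-8')

print(as_latin1 == 'bad=\xe6\xb8\x85\xe9\xa2\xa8')  # True
print(as_utf8 == 'bad=\u6e05\u98a8')                # True
print(len(as_latin1), len(as_utf8))                 # 10 6
```

Either way the value contains characters outside printable ASCII, which is 
consistent with parse_cookie() returning the same truncated result in both 
runs.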

This Stack Overflow question has an interesting discussion about encoding in 
cookies:
http://stackoverflow.com/questions/1969232/allowed-characters-in-cookies
The takeaway for me was this statement: "so in practice you cannot use 
non-ASCII characters in cookies at all".

Unfortunately, in my case my server runs on a sub-domain, and another server 
in the same domain has set a domain cookie containing UTF-8 characters. Since 
that server is often the gateway to mine, my server receives those cookies 
too, and Django drops everything after the illegal characters.

I have worked around this in my installation by patching parse_cookie() in 
django/http/cookie.py as follows:

import re

def parse_cookie(cookie):
    # Replace each run of characters outside printable ASCII with one 'X'.
    cookie = re.sub(r'[^\x20-\x7e]+', 'X', cookie)
    ...

This replaces each run of characters outside the printable ASCII range (0x20 
to 0x7e) with a single 'X' before parsing. I consider anything else to be a 
bad use of cookies, and since I have control of my own cookies I'm not worried 
about this.
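
To show the effect of that substitution in isolation, here is a minimal 
standalone sketch. The helper name sanitize_cookie_header and the naive 
split-based parsing are my own stand-ins for illustration, not Django's 
parse_cookie():

```python
import re

# One or more consecutive characters outside printable ASCII (0x20-0x7e).
_NON_PRINTABLE_ASCII = re.compile(r'[^\x20-\x7e]+')


def sanitize_cookie_header(cookie):
    """Replace each run of non-printable or non-ASCII characters with 'X'."""
    return _NON_PRINTABLE_ASCII.sub('X', cookie)


# The cookie string from the nginx log, decoded as UTF-8.
raw = b'Good=1234;bad=\xe6\xb8\x85\xe9\xa2\xa8;sessionid=abc'.decode('utf-8')

cleaned = sanitize_cookie_header(raw)
print(cleaned)  # Good=1234;bad=X;sessionid=abc

# Naive stand-in for a cookie parser, only to show that "sessionid"
# is still present in the sanitized string.
pairs = dict(part.split('=', 1) for part in cleaned.split(';'))
print(pairs['sessionid'])  # abc
```

The value of "bad" is lost, but the cookies that follow it remain in the 
string handed to the parser.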

I'm not sure whether this is too much of an edge case to be worth a patch, 
but it might also be considered a DoS vector, since a cookie set by another 
server on the domain is enough to make Django drop the session cookie.

Interested to hear other thoughts or ideas for a better solution.

Will
