I recently ran into this problem on a production server, and it was causing my users to lose their sessions.
Many browsers will happily post UTF-8 encoded data in cookie strings. This will result in cookie data such as this, which I captured from my nginx log:

    'Good=1234;bad=\xe6\xb8\x85\xe9\xa2\xa8;sessionid=abc'

When Django tries to parse this cookie input, it will lose the cookies from "bad" onwards:

    Python 2.7.10 (default, Sep 23 2015, 04:34:21)
    [GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.72)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import django
    >>> django.VERSION
    (1, 9, 2, 'final', 0)
    >>> from django.http.cookie import parse_cookie
    >>> cookie = 'Good=1234;bad=\xe6\xb8\x85\xe9\xa2\xa8;sessionid=abc'
    >>> print cookie
    Good=1234;bad=清風;sessionid=abc
    >>> parse_cookie(cookie)
    django/http/cookie.py:92: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
      if cookie == '':
    {'Good': '1234'}

    Python 3.5.1 (default, Dec 26 2015, 18:11:22)
    [GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import django
    >>> django.VERSION
    (1, 9, 2, 'final', 0)
    >>> from django.http.cookie import parse_cookie
    >>> cookie = 'Good=1234;bad=\xe6\xb8\x85\xe9\xa2\xa8;sessionid=abc'
    >>> print(cookie)
    Good=1234;bad=æ¸ é¢¨;sessionid=abc
    >>> parse_cookie(cookie)
    {'Good': '1234'}
    >>> cookie = b'Good=1234;bad=\xe6\xb8\x85\xe9\xa2\xa8;sessionid=abc'.decode('utf-8')
    >>> print(cookie)
    Good=1234;bad=清風;sessionid=abc
    >>> parse_cookie(cookie)
    {'Good': '1234'}

This link on SO has an interesting discussion about encoding in cookies: http://stackoverflow.com/questions/1969232/allowed-characters-in-cookies. The take away for me was this statement: "so in practice you cannot use non-ASCII characters in cookies at all".

Unfortunately in my case, my server is running as a sub-domain, and some other server in the domain has set a domain cookie with UTF-8 characters in it.
Since this other server is often going to be a gateway to my server, I am also getting hit with those cookies, and Django is losing everything after the illegal characters.

I have worked around this on my instance by patching django/http/cookie.py as follows:

    import re  # only needed if the module does not already import it

    def parse_cookie(cookie):
        # Replace each run of characters outside printable ASCII with a single 'X'
        cookie = re.sub('[^\x20-\x7e]+', 'X', cookie)
        ...

This limits the cookie characters to the printable lower ASCII characters. I consider anything else to be a bad use of cookies, and since I have control of my own cookies I'm not worried about this.

I'm not sure if this would be considered such an edge case that it's not worthy of a patch, but it might also be considered a DoS vector, since a cookie set by another server can make my users lose their sessions.

Interested to hear other thoughts or ideas for a better solution.

Will
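
One possible way to get a similar effect without editing Django itself is to clean the Cookie header before Django sees it. The sketch below is only an illustration under assumptions: drop_non_ascii_pairs and CookieSanitizerMiddleware are hypothetical names, not part of Django, and it drops each offending name=value pair entirely rather than substituting 'X'. It also assumes a WSGI deployment, where the header arrives in environ['HTTP_COOKIE'].

```python
import re

# Anything outside printable ASCII (space through tilde) is treated as illegal.
_NON_PRINTABLE_ASCII = re.compile(r'[^\x20-\x7e]')


def drop_non_ascii_pairs(cookie_header):
    """Return the Cookie header with offending name=value pairs removed.

    Hypothetical helper, not a Django API.
    """
    pairs = [p.strip() for p in cookie_header.split(';')]
    clean = [p for p in pairs if p and not _NON_PRINTABLE_ASCII.search(p)]
    return '; '.join(clean)


class CookieSanitizerMiddleware(object):
    """Hypothetical WSGI wrapper that cleans HTTP_COOKIE before the app runs."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        raw = environ.get('HTTP_COOKIE')
        if raw:
            # Overwrite the header so downstream parsing only sees clean pairs.
            environ['HTTP_COOKIE'] = drop_non_ascii_pairs(raw)
        return self.app(environ, start_response)
```

Assuming a standard wsgi.py, the wrapper could be applied around the object returned by django.core.wsgi.get_wsgi_application(), so that nothing inside django/http/cookie.py needs to change. Dropping the whole pair keeps the other server's cookie out of request.COOKIES altogether, which seems acceptable here since I don't use that cookie anyway.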