Yuen Ho Wong <wyue...@gmail.com> added the comment:

Well, I think the WSGI 1.x spec made the mistake of mandating that all strings
in environ be byte strings while not defining a global environment variable to
give middlewares a hint of how to decode them. This is a recognized problem
that is addressed in WSGI 2 by mandating strings to be unicode.

The problem with not knowing how to decode byte strings is not how to decode
in the handlers but how to decode in the middlewares, which is supposed to be

To limit this problem to just repoze.who, say I have an IIdentifier that wants
to remember credentials according to different charsets on a per-request
basis. In a perfect world, the server would set a well-known variable in the
request at the first opportunity, and the plugin would just look for it and
encode accordingly. But in WSGI there's no request object, there's only an
environ, so we are stuck with that. In this less perfect world, there would be
a well-known charset variable in the environ to give hints to middlewares and
applications. But there isn't, so we application developers have to invent
one. Right now every framework deals with it differently, but at the end of
the day there is a threadlocal charset variable that __handlers__ can use.
There is no equivalent in repoze.who and repoze.what.
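
To make the plugin side of this concrete, here is a minimal sketch of how a
plugin could look for a charset hint in the environ and fall back to a default
when none is set. The key name "example.charset" and the helper
decode_with_hint are made up for illustration; neither WSGI nor repoze.who
defines them.

```python
# Hypothetical environ key, invented for this sketch. It is not defined by
# WSGI or by repoze.who.
CHARSET_KEY = "example.charset"


def decode_with_hint(environ, raw, fallback="latin-1"):
    """Decode a byte string using the charset hint in environ, if present."""
    charset = environ.get(CHARSET_KEY, fallback)
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name, or bytes that do not fit the hinted charset.
        # latin-1 maps every byte to a code point, so this cannot fail.
        return raw.decode("latin-1")


# Usage: the hint is only a default, so a plugin stays free to ignore it.
login = decode_with_hint({CHARSET_KEY: "utf-8"}, b"jos\xc3\xa9")
```

Falling back to latin-1 is one possible choice here, picked only because it
never raises; a real plugin might prefer to reject the request instead.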

As I have already said, until Py3k takes off and we are all using WSGI 2, this
will be a problem we are stuck with and middlewares will need to deal with it.
I have already proposed 3 solutions in comment #1. I'm in favor of solution
number 2.

To answer your questions: yes, just having a charset for repoze.who will not
solve all the problems of decoding in WSGI apps, but at least it's half of the
solution. I believe the repoze.who middleware should take a parameter in the
constructor, so that it can set it into the environ as early as possible. To
plugins, this serves as a hint (meaning only a default) of which charset to
use to decode bytestrings. No compliance is forced on plugins; it only serves
as a very helpful clue. Holistic support simply means 1) having a globally
visible charset variable for all of repoze.who at any scope, and 2) all the
plugins making a best effort to decode according to the global charset.
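
As a rough illustration of the constructor-parameter idea, a generic WSGI
middleware could look like the sketch below. This is not repoze.who's actual
middleware or API; the class name CharsetHintMiddleware, the charset and key
arguments, and the "example.charset" key are all assumptions made for the
example.

```python
class CharsetHintMiddleware:
    """Generic WSGI middleware that publishes a default charset hint.

    Illustrative only: the class name, arguments and environ key are
    hypothetical and not part of repoze.who or the WSGI spec.
    """

    def __init__(self, app, charset="utf-8", key="example.charset"):
        self.app = app
        self.charset = charset
        self.key = key

    def __call__(self, environ, start_response):
        # setdefault keeps any value set earlier in the stack (for example
        # by the server), so the constructor argument acts as a default.
        environ.setdefault(self.key, self.charset)
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    # Downstream code (plugins, handlers) can read the hint from environ.
    body = environ["example.charset"].encode("ascii")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


wrapped = CharsetHintMiddleware(demo_app, charset="big5")
```

Using setdefault rather than plain assignment is a design choice of this
sketch: it lets a per-request value set upstream win over the configured
default, which matches the "hint, not a mandate" idea.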

I hope this clears up the issue.

Repoze Bugs <b...@bugs.repoze.org>
Repoze-dev mailing list
