Mathis Hofer wrote on 6/29/07 12:29 PM:
> There were two problems:
>   1. I apparently used a unicode string for the "Content-Disposition"
> which I had to convert to ASCII.
>   2. I had to override the HTTPContent.write() method and put out the
> strings without converting them to "str". Otherwise WebKit tried to
> convert the CP1252-encoded string into ASCII which also resulted in a
> UnicodeDecodeError for special chars.
>
> The latter is a general problem for web pages in Unicode and I think it
> should be changed (if not already done):
> The HTTPContent.write() should not convert the output to ASCII, since
> this makes output in different encodings impossible... it should either
> leave the output like it is so it's up to the developer to perform the
> encoding, or it should use a customizable encoding.
>
>   
HTTPContent.write() is not explicitly converting to ASCII. It calls
str() on whatever object is about to be written. Because you are
passing a unicode object, Python needs to encode it with a character
set to get a string, and the default site-wide character set for a
Python installation is ASCII. You can configure this for your Python
installation if you need a different default character set (for
example, with sys.setdefaultencoding() in sitecustomize.py).
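
To make the failure concrete, here is a minimal sketch. It spells out
the encode step explicitly, since that is what str() does implicitly on
a unicode object when the default encoding is ASCII. The sample text and
the helper name encode_with_ascii are made up for illustration; they
are not part of Webware.

```python
# Sample text with non-ASCII characters (illustrative only).
text = u"Gr\u00fc\u00dfe"


def encode_with_ascii(value):
    """Hypothetical helper: mimic the implicit ASCII encode step.

    Returns the encoded bytes, or a short description of the failure.
    """
    try:
        return value.encode("ascii")
    except UnicodeEncodeError as exc:
        return "failed: %s" % exc.__class__.__name__


# Encoding with ASCII fails because of the non-ASCII characters...
ascii_result = encode_with_ascii(text)

# ...while naming a suitable character set explicitly works.
cp1252_result = text.encode("cp1252")

print(ascii_result)
print(repr(cp1252_result))
```

The point is that the error goes away once the code names the character
set itself instead of relying on the installation default.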

In general, if you are using unicode objects in your code, you as the
developer need to apply a character set to them before they can be
"written" anywhere (to a file, socket, etc.). Even without the explicit
str() call in HTTPContent.write(), I think you would get the same error,
because Python will need to have a string to append to the buffer, and
the default __str__ implementation for unicode objects is to encode them
with the site-wide character set (ASCII by default). If not, removing
that str() call would give you a TypeError or something along those lines.

I would recommend overriding .write() in your main servlet class and
having that function compensate for the lack of a character set. For
example, you could change the function signature to
def write(self, unicodeObject, charSet):, or you could have write()
accept either a string or both a unicode object and a character set, or
have it reference an attribute on the servlet that specifies the charset
to be used with unicode objects, etc.
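
A rough sketch of the attribute-based variant, with an optional
per-call override, might look like the following. FakeResponse and
EncodingPage are hypothetical stand-ins so the example runs on its own;
in a real application the write() override would live in your own
servlet base class and hand the encoded data to the real response
object. The "charset" attribute and keyword are names I picked for the
example, not existing Webware API.

```python
TEXT_TYPE = type(u"")  # the unicode text type on this interpreter


class FakeResponse(object):
    """Hypothetical stand-in for the response; collects byte strings."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)


class EncodingPage(object):
    """Hypothetical servlet base class that encodes text before writing."""

    charset = "utf-8"  # assumed default; override per servlet as needed

    def __init__(self):
        self._response = FakeResponse()

    def write(self, *args, **kwargs):
        # A per-call charset takes precedence over the servlet attribute.
        charset = kwargs.get("charset", self.charset)
        for arg in args:
            if isinstance(arg, TEXT_TYPE):
                # Unicode text: apply the character set explicitly.
                arg = arg.encode(charset)
            elif not isinstance(arg, bytes):
                # Other objects (numbers etc.): convert to text, then encode.
                arg = TEXT_TYPE(arg).encode(charset)
            # Byte strings are passed through untouched.
            self._response.write(arg)


page = EncodingPage()
page.write(u"caf\u00e9 ", 42)            # uses the servlet's charset
page.write(u"\u20ac", charset="cp1252")  # per-call override
output = b"".join(page._response.chunks)
```

Already-encoded byte strings pass through unchanged here, so existing
callers that do their own encoding keep working.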

Regards - Ben

_______________________________________________
Webware-devel mailing list
Webware-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/webware-devel
