Mathis Hofer wrote on 6/29/07 12:29 PM:
> There were two problems:
> 1. I apparently used a unicode string for the "Content-Disposition"
> which I had to convert to ASCII.
> 2. I had to override the HTTPContent.write() method and put out the
> strings without converting them to "str". Otherwise WebKit tried to
> convert the CP1252-encoded string into ASCII which also resulted in a
> UnicodeDecodeError for special chars.
>
> The latter is a general problem for web pages in Unicode and I think it
> should be changed (if not already done):
> The HTTPContent.write() should not convert the output to ASCII, since
> this makes output in different encodings impossible... it should either
> leave the output like it is so it's up to the developer to perform the
> encoding, or it should use a customizable encoding.
>

HTTPContent.write() is not explicitly converting to ASCII. It is converting to a string whatever object is about to be written. Because you are passing a unicode object, Python needs to encode it with a character set to get a string, and the default site-wide character set for a Python installation is ASCII. You can configure this for your Python installation if you need a different default character set.
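
To make the default character set point concrete, here is a small sketch. The sample string and the try_encode() helper are made up for illustration. The explicit encode("ascii") call stands in for what str() does implicitly to a unicode object when the site-wide default is ASCII: it fails on special characters, while CP1252 can represent them.

```python
# Hypothetical sample text; the escapes are u-umlaut and sharp s.
text = u"Gr\u00fc\u00dfe"


def try_encode(value, charset):
    """Return the encoded bytes, or None if charset cannot represent value."""
    try:
        return value.encode(charset)
    except UnicodeEncodeError:
        return None


# ASCII has no code for u-umlaut or sharp s, so the encode fails.
ascii_result = try_encode(text, "ascii")

# CP1252 covers both characters, so the encode succeeds.
cp1252_result = try_encode(text, "cp1252")
```
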
In general, if you are using unicode objects in your code, you as the developer need to apply a character set to them before they can be "written" anywhere (to a file, socket, etc).

Even without the explicit str() call in HTTPContent.write(), I think you would get the same error, because Python needs a string to append to the buffer, and the default __str__ implementation for unicode objects encodes them with the site-wide character set (ASCII by default). If not, removing that str() call would give you a TypeError or something along those lines.

I would recommend overriding .write() in your main servlet class and having that function compensate for the lack of a character set. For example, you could:

- change the function signature to def write(self, unicodeObject, charSet):
- have write() accept either a string, or both a unicode object and a character set
- reference an attribute on the servlet which specifies the charset to be used with unicode objects

Regards - Ben
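
P.S. A rough sketch of the attribute-based override suggested above, under stated assumptions: BufferedContent is a hypothetical stand-in for the servlet base class (it is not Webware's actual HTTPContent code), and the charset attribute name is made up for illustration.

```python
class BufferedContent(object):
    """Hypothetical stand-in for a servlet base class that buffers bytes."""

    def __init__(self):
        self._chunks = []

    def write(self, *args):
        # This stand-in expects byte strings only.
        for arg in args:
            self._chunks.append(arg)

    def output(self):
        return b"".join(self._chunks)


class EncodingContent(BufferedContent):
    """Encodes text with a configurable charset before buffering it."""

    charset = "cp1252"  # assumed attribute name; set per servlet as needed

    def write(self, *args):
        text_type = type(u"")  # unicode on Python 2, str on Python 3
        encoded = []
        for arg in args:
            if isinstance(arg, text_type):
                # Apply the servlet's charset instead of the ASCII default.
                arg = arg.encode(self.charset)
            elif not isinstance(arg, bytes):
                # Other objects (numbers etc.) go through str() first.
                arg = str(arg).encode(self.charset)
            encoded.append(arg)
        BufferedContent.write(self, *encoded)


page = EncodingContent()
page.write(u"Gr\u00fc\u00dfe", b" / ", 42)
result = page.output()
```

Byte strings pass through untouched, so already-encoded content is not encoded a second time. A subclass that needs a different encoding only has to change the charset attribute.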