[ https://issues.apache.org/jira/browse/THRIFT-395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689327#action_12689327 ]
David Reiss commented on THRIFT-395:
------------------------------------

I think that the attached patch is definitely how things should work in Python 3, and I definitely think we should try to keep consistency between the pure python implementation and the extension.

For Python 2, I can think of two possible approaches.
- Try to conform to the old behavior to the extent possible. Don't attempt to re-encode str objects when writing. Return raw str objects when reading.
- Implement type annotations for base types (I'm about 75% of the way through this) and require an annotation to trigger the unicode behavior in Python 2.

Unfortunately, the convention in Python 2 is that strings are blobs, not unicode.

Thoughts?

> Python library + compiler does not support unicode strings
> ----------------------------------------------------------
>
>                 Key: THRIFT-395
>                 URL: https://issues.apache.org/jira/browse/THRIFT-395
>             Project: Thrift
>          Issue Type: Bug
>          Components: Compiler (Python), Library (Python)
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Blocker
>             Fix For: 0.1
>
>         Attachments: python-utf8-v2.patch, python-utf8.patch
>
>
> Effectively, all strings in the python bindings are treated as binary strings
> -- no encoding/decoding to UTF-8 is done. So if a unicode object is passed
> to a (regular, non-binary) string, an exception is raised.
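
To make the two behaviors discussed in this comment concrete, here is a minimal sketch, written in Python 3 so that it runs as-is. The helper names (encode_string_field, decode_string_field) and the as_text flag are hypothetical; they are not part of the Thrift library or of the attached patch. The as_text flag only stands in for the proposed type annotation, and the real implementation may look quite different.

```python
# Hypothetical helpers for illustration only; not Thrift's actual API.

def encode_string_field(value):
    """Return the bytes to write for a (non-binary) string field."""
    if isinstance(value, bytes):
        # Already a blob: write it unchanged, with no re-encoding.
        return value
    if isinstance(value, str):
        # Text: encode to UTF-8 before writing.
        return value.encode("utf-8")
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


def decode_string_field(raw, as_text=True):
    """Turn bytes read from the wire back into a value.

    as_text=True models the unicode behavior (decode as UTF-8).
    as_text=False models the old behavior (return the raw blob).
    """
    return raw.decode("utf-8") if as_text else raw


wire = encode_string_field("caf\u00e9")
print(wire)                                      # UTF-8 encoded bytes
print(decode_string_field(wire))                 # decoded back to text
print(decode_string_field(wire, as_text=False))  # raw bytes, untouched
```

Passing bytes through untouched on write mirrors the first approach (do not re-encode blobs), while the opt-in decode on read mirrors the second (unicode behavior only when explicitly requested).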