[ 
https://issues.apache.org/jira/browse/THRIFT-395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689327#action_12689327
 ] 

David Reiss commented on THRIFT-395:
------------------------------------

I think the attached patch is definitely how things should work in Python 3, 
and we should try to keep the pure-Python implementation and the extension 
consistent.  For Python 2, I can think of two possible approaches.

- Try to conform to the old behavior to the extent possible.  Don't attempt to 
re-encode str objects when writing.  Return raw str objects when reading.
- Implement type annotations for base types (I'm about 75% of the way through 
this) and require an annotation to trigger the unicode behavior in Python 2.

Unfortunately, the convention in Python 2 is that strings are blobs, not 
unicode.
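
To make the two options concrete, here is a rough sketch. It is not taken from 
the patch or from Thrift's API: write_string, read_string, and the 
utf8_strings flag are hypothetical names standing in for "this field carries 
the annotation".  It is written to run on Python 3, where bytes plays the role 
of Python 2's str, and it assumes that text passed to an unannotated field is 
rejected, as the issue describes.

```python
def write_string(value, utf8_strings=True):
    """Return the bytes that would go on the wire for a string field.

    Hypothetical helper, not part of Thrift.
    """
    if isinstance(value, bytes):
        # Blob-style strings are never re-encoded, in either mode.
        return value
    if utf8_strings:
        # Unicode behavior: text is encoded as UTF-8 when writing.
        return value.encode("utf-8")
    # Assumed old behavior: text is rejected for a plain string field.
    raise TypeError("text value passed to a field without the utf8 annotation")


def read_string(raw, utf8_strings=True):
    """Return what the caller would see after reading a string field."""
    if utf8_strings:
        # Unicode behavior: wire bytes are decoded as UTF-8.
        return raw.decode("utf-8")
    # Old behavior: hand back the raw bytes untouched.
    return raw


if __name__ == "__main__":
    wire = write_string("caf\u00e9")
    print(repr(wire))
    print(repr(read_string(wire)))
    print(repr(read_string(wire, utf8_strings=False)))
```

With the flag off, both helpers pass bytes through unchanged, which is the 
first option; the second option amounts to letting an annotation decide the 
flag for each field.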

Thoughts?

> Python library + compiler does not support unicode strings
> ----------------------------------------------------------
>
>                 Key: THRIFT-395
>                 URL: https://issues.apache.org/jira/browse/THRIFT-395
>             Project: Thrift
>          Issue Type: Bug
>          Components: Compiler (Python), Library (Python)
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Blocker
>             Fix For: 0.1
>
>         Attachments: python-utf8-v2.patch, python-utf8.patch
>
>
> Effectively, all strings in the python bindings are treated as binary strings 
> -- no encoding/decoding to UTF-8 is done.  So if a unicode object is passed 
> to a (regular, non-binary) string, an exception is raised.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
