As I recall, Jonathan was the only person who really seemed to care about this 
issue, and he wasn't satisfied with my suggestion, so I put it aside. Chad also 
requested some changes to my diff for the JSON protocol. I'll try to reevaluate 
the status some time soon, but I am away from a computer today. 
Sent from my BlackBerry.

----- Original Message -----
From: Terry Jones (JIRA) <j...@apache.org>
To: David Reiss
Sent: Thu Jul 09 03:29:15 2009
Subject: [jira] Commented: (THRIFT-395) Python library + compiler does not 
support unicode strings


    [ 
https://issues.apache.org/jira/browse/THRIFT-395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729145#action_12729145
 ] 

Terry Jones commented on THRIFT-395:
------------------------------------

It seems that a decision/consensus was almost reached here, specifically 
David's suggestion at http://bit.ly/ofFr0

Can we re-animate this issue and get it resolved?  I somehow skipped this 
discussion when it was going on as I knew (or thought I knew) that strings were 
sent as UTF-8 and was mistakenly assuming that the Python support did the Right 
Thing and that if an app passed a Python unicode object in a call you'd get a 
Python unicode object out on the other end. Last night I found out to my great 
surprise that that's not the case.

It would be *really* nice to have this resolved. Otherwise it's going to mean a 
bunch of crufty manual encoding and decoding. And it's made worse in our case as we 
have a dozen internal services that all speak to each other extensively using 
Thrift. So not only do we need to deal with outside clients being able to 
somehow pass unicode, we'd have to manually decode each arg in each method in 
each service, and then manually encode them again to call another Thrift method 
inside our own service. Either that or keep things as UTF-8 strings, which 
isn't an option.
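
To make the manual workaround concrete, here is a minimal sketch of the encode/decode step that would be needed around every string argument. The helper names to_wire and from_wire are hypothetical and are not part of the Thrift library; the sketch only assumes that strings cross the wire as UTF-8 bytes.

```python
# Hypothetical helpers for illustration only; not part of Thrift.

def to_wire(text):
    """Encode a unicode string to UTF-8 bytes before handing it to a Thrift call."""
    return text.encode("utf-8")


def from_wire(data):
    """Decode UTF-8 bytes received from a Thrift call back into a unicode string."""
    return data.decode("utf-8")


name = u"caf\u00e9"
payload = to_wire(name)             # what would be passed as the string argument
round_tripped = from_wire(payload)  # what the receiving side would have to do
```

Repeating this for each argument of each method in each service is the cost being described here.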

The patches are in, and backwards compatibility is not an issue with David's 
suggestion. Real users need it ASAP to avoid real pain :-)  What's still 
stopping this from being resolved/applied/committed?

Terry


> Python library + compiler does not support unicode strings
> ----------------------------------------------------------
>
>                 Key: THRIFT-395
>                 URL: https://issues.apache.org/jira/browse/THRIFT-395
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Compiler (Python), Library (Python)
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.2
>
>         Attachments: 
> 0001-python-Minor-cleanup-of-protocols-don-t-use-str.patch, 
> 0002-THRIFT-395.-python-Phase-One-of-support-for-unicode.patch, 
> 0003-THRIFT-395.-python-Phase-Two-of-support-for-unicode.patch, 
> 0004-python-Remove-ridiculous-semicolons-from-gen-code.patch, 
> python-utf8-v2.patch, python-utf8.patch
>
>
> Effectively, all strings in the python bindings are treated as binary strings 
> -- no encoding/decoding to UTF-8 is done.  So if a unicode object is passed 
> to a (regular, non-binary) string, an exception is raised.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
