[ 
https://issues.apache.org/jira/browse/THRIFT-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bryan newbold updated THRIFT-1229:
----------------------------------

    Attachment: python_fastbinary_utf8.patch

This git-style patch adds an extra arg to the fastbinary methods; see also 
https://github.com/octopart/thrift/commit/79611ef2cad47714d8addaa429bec4ce51bdf297

As a disclaimer, I don't have much experience writing Python C-API code, and 
this was not tested with Python 3 at all. Using global variables or separate 
functions ('encode_binary_utf8') may be a more appropriate style. 

The binary encode function checks every string argument it is passed and 
UTF-8 encodes only Unicode PyObjects. This is arguably poor behavior, but it 
fit our use case best. If the utf8strings flag is set, then all strings read 
are decoded as UTF-8. This could lead to a situation where a client writes a 
non-UTF-8 byte string with the utf8strings flag set and gets no error, but 
the server (also with the utf8strings flag set) has trouble decoding it.
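
To make the write/read asymmetry concrete, here is a minimal Python 3 sketch 
of the behavior as described. It is not the C code from the patch: 
encode_string and decode_string are hypothetical names, Python 3's str and 
bytes stand in for Python 2's unicode and str, and the TypeError for 
unencoded text is an assumption about how the unpatched path behaves.

```python
def encode_string(value, utf8strings=False):
    """Hypothetical model of the write path; not the patch's C code."""
    if utf8strings and isinstance(value, str):
        # Text objects are the only values that get UTF-8 encoded.
        return value.encode("utf-8")
    if isinstance(value, bytes):
        # Byte strings pass through unchecked, valid UTF-8 or not.
        return value
    # Assumption: text without the flag is rejected.
    raise TypeError("expected bytes, got %s" % type(value).__name__)


def decode_string(raw, utf8strings=False):
    """Hypothetical model of the read path; not the patch's C code."""
    # With the flag set, every string read is decoded as UTF-8.
    return raw.decode("utf-8") if utf8strings else raw


# Client side: an invalid UTF-8 byte string is written without complaint.
wire = encode_string(b"\xff\xfe", utf8strings=True)
try:
    # Server side: the same bytes cannot be decoded.
    decode_string(wire, utf8strings=True)
except UnicodeDecodeError as exc:
    print("server failed to decode:", exc)
```

Validating byte strings on the write side would surface the error at the 
client instead, at the cost of an extra decode pass per string.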

Code generated with the optional utf8strings flag to fastbinary would require 
the most recent version of the Python libraries to be installed. I'm not sure 
whether that kind of backwards incompatibility is an issue. 

The Fastbinary.py is non-functional; see 
https://github.com/octopart/thrift/commit/1152508165783dcf624471ac66458dac3ca67e62
 for a partial fix. 
                
> Python fastbinary.c can not handle unicode as generated python code
> -------------------------------------------------------------------
>
>                 Key: THRIFT-1229
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1229
>             Project: Thrift
>          Issue Type: Bug
>          Components: Python - Compiler, Python - Library
>    Affects Versions: 0.7
>         Environment: mac osx 10.6
>            Reporter: Favo
>         Attachments: python_fastbinary_utf8.patch
>
>
> #THRIFT-395 
> ([r959516|http://svn.apache.org/viewvc?view=revision&revision=959516]) fixed 
> Python Unicode support by adding a parameter to the thrift command line for 
> the py generator. However, this does not affect fastbinary.c. A normally 
> generated read/write function looks like the one below; notice that the 
> function returns before reaching the Unicode handling logic.
> {code:title=TType.py|borderStyle=solid}
>   def write(self, oprot):
>     if oprot.__class__ == TBinaryProtocol.TBinaryProtocolAccelerated and self.thrift_spec is not None and fastbinary is not None:
>       oprot.trans.write(fastbinary.encode_binary(self, (self.__class__, self.thrift_spec)))
>       return
>     if self.ip is not None:
>       oprot.writeFieldBegin('ip', TType.STRING, 6)
>       oprot.writeString(self.ip.encode('utf-8'))
>       oprot.writeFieldEnd()
> {code}
> Any suggestions for this?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
