I've had a chance to revisit this and test it against the Python Riak 
library.  As a result, I made some changes and now have a much better 
Python 3 implementation.

My original Python 3 protobufs code expected byte data to be passed in as 
strs.  The Riak library reads data from a socket and passes it to protobufs 
to decode.  I believe most libraries do the same: they read from a socket 
or maybe a file (opened in binary mode), either of which yields 'bytes' in 
Python 3.  I changed my code so that it works with bytes instead of 
strings.  I translated portions of the code until I was able to read from 
Riak, then went back and got the unit tests to pass as well.  Both are 
working now!
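
To illustrate why working with bytes directly is a natural fit, here is a 
minimal sketch (not the actual code from the repository).  The 
decode_varint name is made up for this example, and io.BytesIO stands in 
for the socket.  It decodes a protobuf base-128 varint straight from bytes:

```python
import io


def decode_varint(buf, pos=0):
    """Decode a base-128 varint from bytes; return (value, next_position).

    Hypothetical helper for illustration only. A truncated buffer
    raises IndexError.
    """
    result = 0
    shift = 0
    while True:
        byte = buf[pos]  # indexing bytes yields an int in Python 3
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result, pos
        shift += 7


# io.BytesIO stands in for a socket here; read() returns bytes, not str.
stream = io.BytesIO(b"\xac\x02rest")
data = stream.read()
value, pos = decode_varint(data)
print(type(data).__name__, value, pos)
```

Because indexing bytes already gives integers, no ord() calls or str 
conversions are needed on the decode path.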

My next goal is to get Python 3 to use the _pb2 suffix just like the Python 
2 code does.  I am currently using _py3_pb2 as the suffix because of the C++ 
tests, but this means I had to go into the Riak library and change a bunch 
of imports, for example import riak_pb2 --> import riak_py3_pb2 as riak_pb2.
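
Until the suffixes match, one possible way to avoid editing every import 
by hand is a small fallback helper.  This is only a sketch under my own 
assumptions: import_first is a hypothetical name, and the demo uses the 
stdlib json module as a stand-in because the generated modules are not 
available here.

```python
import importlib


def import_first(*names):
    """Return the first module in names that can be imported.

    Hypothetical helper for illustration only.
    """
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(
        "none of these modules could be imported: " + ", ".join(names)
    )


# In the Riak library this might look like (module names from this thread):
#     riak_pb2 = import_first("riak_pb2", "riak_py3_pb2")
# Demo with a stdlib stand-in: the first name is deliberately missing.
mod = import_first("no_such_module_for_this_demo", "json")
print(mod.__name__)
```

The calling code keeps using the riak_pb2 name either way, so it would not 
need to change again once the suffix is unified.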


On Monday, October 1, 2012 4:13:11 PM UTC-7, Charles Law wrote:
>
> I assumed that the type/value errors are no longer valid in Python 3, so I 
> removed the 3 checks in reflection_test.testStringUTF8Encoding().  All unit 
> tests now pass!
>
>
> On Tuesday, September 25, 2012 1:47:09 PM UTC-7, Charles Law wrote:
>>
>> I thought about this a little, and realized that both unicode and str 
>> strings are passed into fields that have cpp_type CPP_STRING and 
>> field_type TYPE_STRING.  I know the 7-bit character limit is only imposed 
>> on str strings - all the extreme value tests use unicode strings.  In 
>> Python 3, all strings are unicode, so should this limit only exist in 
>> Python 2.x?
>>
>>
>> On Friday, September 21, 2012 6:16:41 PM UTC-7, Charles Law wrote:
>>>
>>> I've made an attempt to create a Python3 compatible version of 
>>> protobufs.  I have some code that passes pretty much all the unit tests 
>>> which I've posted here:
>>>
>>> https://github.com/openx/python3-protobuf
>>>
>>> I probably won't have a chance to look at this again for a couple of 
>>> weeks, if not longer, so I want to get it out there.  In my attempt I 
>>> decided to follow the advice in another post and treat Python 3 as a new 
>>> language.  To get Python 3 working, you'll have to compile the C code.  
>>> There are also a few issues I ran into along the way:
>>>
>>>    - I decided to use strings where unicode is used in Python 2.  I was 
>>>    originally going to try to use bytes/bytearrays, but they do not 
>>>    support characters wider than 8 bits, and some of the setup.py tests 
>>>    use "exotic" 16-bit characters.  (Warning: I might have something 
>>>    conceptually wrong here.)
>>>    - There are places where byte data is stored as strings, then 
>>>    converted to unicode.  I ended up converting these strings (I called 
>>>    them bytestr's) to normal strings.  I'm not sure this is done 
>>>    correctly everywhere though.
>>>    - Data is packed/unpacked using struct.pack/unpack, which operate on 
>>>    bytes instead of strings in Python 3.  I have simple string_to_bytes() 
>>>    and bytes_to_string() functions to do this conversion.
>>>
>>>
>>> What's left is:
>>>
>>>    - There are a couple of exceptions that I don't raise.  They are 
>>>    supposed to occur where the Python 2 code converts from unicode 
>>>    strings to regular strings.  I am definitely missing something 
>>>    conceptually here: I haven't figured out how Python 2.x supports 
>>>    strings with "exotic" characters, but not strings like u'a\x80a'.  If 
>>>    someone can solve this problem and figure out when to raise the 
>>>    exceptions, Python 3 will be *fully* working.
>>>
>>>
>>> I might have small bits of time here or there but I don't think I can 
>>> devote the time I need to get this finished for several weeks, so if 
>>> someone wants to finish this up, feel free to fork this code.  If anyone 
>>> wants to see what I did, the best way to do this is to diff between the 
>>> latest commit and commit 49ccf5d8b3b688c335dc35bcb9f219eca78c7210.
>>> Thanks!
>>> Charles
>>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/protobuf/-/us3gNXh6TSEJ.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.
