I thought about this a little more and realized that both unicode and str 
strings are passed into fields that have cpp_type CPP_STRING and field_type 
TYPE_STRING.  I know the 7-bit character limit is only imposed on str 
strings; all the extreme value tests use unicode strings.  In Python 3, all 
strings are unicode, so should this limit only exist in Python 2.x?
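
To make the question concrete, here is a rough Python 3 sketch of what I 
think the check would look like if the 7-bit limit only applies to byte 
strings.  check_string_field() is a name I made up for illustration, it is 
not a function in the protobuf code, and the behavior is my assumption of 
what the port should do rather than what it does today.

```python
def check_string_field(value):
    """Hypothetical validator for a TYPE_STRING field under Python 3."""
    if isinstance(value, str):
        # Python 3 str is already unicode, so no 7-bit limit is applied.
        return value
    if isinstance(value, bytes):
        # Assumption: bytes play the role of the Python 2 str type, so
        # they must be 7-bit ASCII before they can be promoted to text.
        try:
            return value.decode('ascii')
        except UnicodeDecodeError:
            raise ValueError(
                '%r has non 7-bit ASCII bytes; pass a str instead' % (value,)
            ) from None
    raise TypeError('expected str or bytes, got %s' % type(value).__name__)
```

With this, check_string_field('a\x80a') is accepted because it is text, 
while check_string_field(b'a\x80a') raises ValueError.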


On Friday, September 21, 2012 6:16:41 PM UTC-7, Charles Law wrote:
>
> I've made an attempt to create a Python3 compatible version of protobufs. 
>  I have some code that passes pretty much all the unit tests which I've 
> posted here:
>
> https://github.com/openx/python3-protobuf
>
> I probably won't have a chance to look at this again for a couple weeks if 
> not longer, so I want to get it out there.  In my attempt I decided to 
> follow the advice in another post, and treat python3 as a new language.  To 
> get python3 working, you'll have to compile the C code.  There are also a 
> few issues I ran into along the way:
>
>    - I decided to use strings where unicode is used in Python 2.  I was 
>    originally going to try to use bytes/bytearrays, but they do not 
>    support >8 bit characters, and some of the setup.py tests use "exotic" 
>    16 bit chrs.  (Warning: I might have something conceptually wrong here)
>    - There are places where byte data is stored as strings, then 
>    converted to unicode.  I ended up converting strings (I called them 
>    bytestr's) to normal strings.  I'm not sure this is done correctly 
>    everywhere though.
>    - Data is packed/unpacked using struct.pack/unpack which is done using 
>    bytes instead of strings in Python3.  I have simple string_to_bytes() and 
>    bytes_to_string() functions to do this.
>
>
> What's left is:
>
>    - There are a couple Exceptions that I don't throw.  They are supposed 
>    to be where the Python2 code converts from unicode strings to regular 
>    strings.  I am definitely missing something conceptually here - I haven't 
>    figured out how Python 2.x supports strings with "exotic" characters, but 
>    not strings like u'a\x80a'.  If someone can solve this problem & figure 
>    out when to throw the exceptions, Python3 will be *fully* working.
>
>
> I might have small bits of time here or there but I don't think I can 
> devote the time I need to get this finished for several weeks, so if 
> someone wants to finish this up, feel free to fork this code.  If anyone 
> wants to see what I did, the best way to do this is to diff between the 
> latest commit and commit 49ccf5d8b3b688c335dc35bcb9f219eca78c7210.
> Thanks!
> Charles
>
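For anyone reading along, here is a minimal sketch of how helpers like the 
string_to_bytes() and bytes_to_string() mentioned above could work with 
struct in Python 3.  The function names come from the quoted message, but 
the bodies are my guess: I am assuming latin-1, which maps code points 
0-255 one-to-one onto byte values.  The code in the repository may do this 
differently.

```python
import struct

def string_to_bytes(text):
    # Assumption: every character is in the range 0-255, so latin-1 maps
    # each one to exactly one byte.  Anything larger raises an error.
    return text.encode('latin-1')

def bytes_to_string(data):
    # Inverse of string_to_bytes(): each byte becomes one character.
    return data.decode('latin-1')

# struct.pack() returns bytes in Python 3, so convert at the boundary.
packed = struct.pack('<I', 300)
as_text = bytes_to_string(packed)

# struct.unpack() needs bytes again, so convert back before unpacking.
(value,) = struct.unpack('<I', string_to_bytes(as_text))
```

The round trip is lossless for every byte value, but string_to_bytes() 
raises UnicodeEncodeError for any character above U+00FF, so it only suits 
strings that are really carrying byte data.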
