On Sun, Mar 9, 2014 at 5:19 AM, liao liao <[email protected]> wrote:

> "new design/implementation in the next release" to significantly improve
> python sdk's performance, when?
>
Probably in Q3.


> the c++ binding has been experimental for years.
>
>
> On Saturday, March 1, 2014 2:49:06 AM UTC+8, Feng Xiao wrote:
>
>> For patch #1, you can submit it to the issue tracker. We don't have time
>> to look at it immediately but will get to it in the next release.
>>
>> For patch #2, we don't accept patches to the experimental C++ binding
>> support any more because it will be replaced by a new design/implementation
>> in the next release. Sorry for that but AFAIK the new design/implementation
>> is significantly better so it's worthwhile to try if you are currently
>> using the old one (which has hard-to-address memory ownership problems).
>>
>> On Wed, Feb 26, 2014 at 4:11 PM, <[email protected]> wrote:
>>
>>> Hey guys,
>>>
>>>
>>> I wrote two different patches which optimize python proto performance.
>>>  Both patches are running in production at Dropbox.  What is the best way
>>> to upstream these changes?
>>>
>>>
>>> Patrick
>>>
>>> ============================
>>>
>>> Patch #1. Python message patch (https://www.dropbox.com/s/
>>> q0y44ypti0by779/protobuf-2.5.0.patch1):
>>>
>>> Changes:
>>> - precompute various varint tables
>>> - don't use proto's ByteSize function for serialization
>>> - simplified some code (got rid of the listener)
>>> - got rid of StringIO
>>>
>>> Internal benchmark:
>>> - random repeated int32s - ~18% faster
>>> - random repeated int64s - ~20% faster
>>> - random repeated strings - 27% faster
>>> - random repeated bytes - 27% faster
>>> - repeated message with each with a single random string - ~20% faster
>>>
>>> NOTE:
>>> - predefined_varints.py is generated by generate_predefined_varints.py
>>>
>>> ============================
>>>
>>> Patch #2. C++ experimental binding patch (https://www.dropbox.com/s/
>>> 5nr0v76nfraaxif/protobuf-2.5.0.patch2):
>>>
>>> Changes:
>>> - fixed memory ownership / dangling pointer (see NOTE #1 for known
>>> issues)
>>>     1. inc ref count parent message when accessing a field,
>>>     2. a cleared field's is freed only when the parent is deleted
>>> - fixed MakeDescriptor to correctly generating simple proto (see NOTE #2)
>>> - fixed MergeFrom to not crash on check failure due to self merge
>>> - fixed both repeated and non-repeated field clearing
>>> - modified varint deserialization to always return PyLong (to match
>>> existing python implementation)
>>> - always mark message as mutate when extending a repeated field (even
>>> when extending by an empty list)
>>> - deleted/updated bad tests from the protobuf test suite
>>>
>>> Internal benchmark (relative to the first patch):
>>> - 30x faster for repeated varints
>>> - 8x faster for repeated string
>>> - 6x faster for repeated bytes
>>> - 26x speed up for repeated nested msg
>>>
>>> NOTE:
>>> 1. In the current implementation, a new python object is created each
>>> time a field is accessed. To make this 100% correct, we should return the
>>> same python object whenever the same field is accessed; however, I don't
>>>  think the accounting overhead is worth it.  Implications due to the
>>> current implementation:
>>>     - repeatedly clearing / mutating the same message can cause memory
>>> blow up
>>>     - There's a subtle bug with clearing / mutating default message
>>> fields:
>>>
>>>     This is correct. Holding a reference to a MUTATED field X, then
>>> clearing the parent, then mutate X. e.g.,
>>>         child = parent.optional_nested_msg
>>>         child.field = 123 # this mutates the field
>>>         parent.Clear()
>>>         child.field = 321
>>>         assert not parent.HasField('child') # passes
>>>
>>>     This is incorrect. Holding a reference to a UNMUTATED field X, then
>>> clearing the parent, then mutate X.
>>>         child = parent.optional_nested_msg
>>>         parent.Clear()
>>>         child.field = 321 # this inadvertently causes parent to generate
>>> a different empty msg for optional_nested_msg.
>>>         assert not parent.HasField('optional_nested_msg') # fail
>>>
>>>     Luckily, these access patterns are extremely rare (at least at
>>> dropbox).
>>>
>>> 2. I wrote a fully functional MakeDescriptor for c++ protos when I was
>>> at google.  Talk to the F1 team (specifically Bart Samwel / Chad Whipkey)
>>> if you're interested in upstreaming that to the opensource community.
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Protocol Buffers" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>>
>>> Visit this group at http://groups.google.com/group/protobuf.
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>
>>  --
> You received this message because you are subscribed to the Google Groups
> "Protocol Buffers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/protobuf.
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/protobuf.
For more options, visit https://groups.google.com/d/optout.

Reply via email to