For patch #1, you can submit it to the issue tracker. We don't have time to
look at it immediately but will get to it in the next release.

For patch #2, we don't accept patches to the experimental C++ binding
support any more because it will be replaced by a new design/implementation
in the next release. Sorry for that but AFAIK the new design/implementation
is significantly better so it's worthwhile to try if you are currently
using the old one (which has hard-to-address memory ownership problems).

On Wed, Feb 26, 2014 at 4:11 PM, <[email protected]> wrote:

> Hey guys,
>
>
> I wrote two different patches which optimize python proto performance.
>  Both patches are running in production at Dropbox.  What is the best way
> to upstream these changes?
>
>
> Patrick
>
> ============================
>
> Patch #1. Python message patch (
> https://www.dropbox.com/s/q0y44ypti0by779/protobuf-2.5.0.patch1):
>
> Changes:
> - precompute various varint tables
> - don't use proto's ByteSize function for serialization
> - simplified some code (got rid of the listener)
> - got rid of StringIO
>
> Internal benchmark:
> - random repeated int32s - ~18% faster
> - random repeated int64s - ~20% faster
> - random repeated strings - 27% faster
> - random repeated bytes - 27% faster
> - repeated message with each with a single random string - ~20% faster
>
> NOTE:
> - predefined_varints.py is generated by generate_predefined_varints.py
>
> ============================
>
> Patch #2. C++ experimental binding patch (
> https://www.dropbox.com/s/5nr0v76nfraaxif/protobuf-2.5.0.patch2):
>
> Changes:
> - fixed memory ownership / dangling pointer (see NOTE #1 for known issues)
>     1. inc ref count parent message when accessing a field,
>     2. a cleared field's is freed only when the parent is deleted
> - fixed MakeDescriptor to correctly generating simple proto (see NOTE #2)
> - fixed MergeFrom to not crash on check failure due to self merge
> - fixed both repeated and non-repeated field clearing
> - modified varint deserialization to always return PyLong (to match
> existing python implementation)
> - always mark message as mutate when extending a repeated field (even when
> extending by an empty list)
> - deleted/updated bad tests from the protobuf test suite
>
> Internal benchmark (relative to the first patch):
> - 30x faster for repeated varints
> - 8x faster for repeated string
> - 6x faster for repeated bytes
> - 26x speed up for repeated nested msg
>
> NOTE:
> 1. In the current implementation, a new python object is created each time
> a field is accessed. To make this 100% correct, we should return the same
> python object whenever the same field is accessed; however, I don't think
> the accounting overhead is worth it.  Implications due to the current
> implementation:
>     - repeatedly clearing / mutating the same message can cause memory
> blow up
>     - There's a subtle bug with clearing / mutating default message
> fields:
>
>     This is correct. Holding a reference to a MUTATED field X, then
> clearing the parent, then mutate X. e.g.,
>         child = parent.optional_nested_msg
>         child.field = 123 # this mutates the field
>         parent.Clear()
>         child.field = 321
>         assert not parent.HasField('child') # passes
>
>     This is incorrect. Holding a reference to a UNMUTATED field X, then
> clearing the parent, then mutate X.
>         child = parent.optional_nested_msg
>         parent.Clear()
>         child.field = 321 # this inadvertently causes parent to generate a
> different empty msg for optional_nested_msg.
>         assert not parent.HasField('optional_nested_msg') # fail
>
>     Luckily, these access patterns are extremely rare (at least at
> dropbox).
>
> 2. I wrote a fully functional MakeDescriptor for c++ protos when I was at
> google.  Talk to the F1 team (specifically Bart Samwel / Chad Whipkey) if
> you're interested in upstreaming that to the opensource community.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Protocol Buffers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/protobuf.
> For more options, visit https://groups.google.com/groups/opt_out.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/protobuf.
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to