Am 27.02.2014 um 01:11 schrieb [email protected]: > Hey guys, > > > I wrote two different patches which optimize python proto performance. Both > patches are running in production at Dropbox.
I would love to see these patches being reintegrated. Q: what is the relation to Python C bindings, if any? regards, Michael > What is the best way to upstream these changes? > > > Patrick > > ============================ > > Patch #1. Python message patch > (https://www.dropbox.com/s/q0y44ypti0by779/protobuf-2.5.0.patch1): > > Changes: > - precompute various varint tables > - don't use proto's ByteSize function for serialization > - simplified some code (got rid of the listener) > - got rid of StringIO > > Internal benchmark: > - random repeated int32s - ~18% faster > - random repeated int64s - ~20% faster > - random repeated strings - 27% faster > - random repeated bytes - 27% faster > - repeated message with each with a single random string - ~20% faster > > NOTE: > - predefined_varints.py is generated by generate_predefined_varints.py > > ============================ > > Patch #2. C++ experimental binding patch > (https://www.dropbox.com/s/5nr0v76nfraaxif/protobuf-2.5.0.patch2): > > Changes: > - fixed memory ownership / dangling pointer (see NOTE #1 for known issues) > 1. inc ref count parent message when accessing a field, > 2. a cleared field's is freed only when the parent is deleted > - fixed MakeDescriptor to correctly generating simple proto (see NOTE #2) > - fixed MergeFrom to not crash on check failure due to self merge > - fixed both repeated and non-repeated field clearing > - modified varint deserialization to always return PyLong (to match existing > python implementation) > - always mark message as mutate when extending a repeated field (even when > extending by an empty list) > - deleted/updated bad tests from the protobuf test suite > > Internal benchmark (relative to the first patch): > - 30x faster for repeated varints > - 8x faster for repeated string > - 6x faster for repeated bytes > - 26x speed up for repeated nested msg > > NOTE: > 1. In the current implementation, a new python object is created each time a > field is accessed. To make this 100% correct, we should return the same > python object whenever the same field is accessed; however, I don't think the > accounting overhead is worth it. Implications due to the current > implementation: > - repeatedly clearing / mutating the same message can cause memory blow up > - There's a subtle bug with clearing / mutating default message fields: > > This is correct. Holding a reference to a MUTATED field X, then clearing > the parent, then mutate X. e.g., > child = parent.optional_nested_msg > child.field = 123 # this mutates the field > parent.Clear() > child.field = 321 > assert not parent.HasField('child') # passes > > This is incorrect. Holding a reference to a UNMUTATED field X, then > clearing the parent, then mutate X. > child = parent.optional_nested_msg > parent.Clear() > child.field = 321 # this inadvertently causes parent to generate a > different empty msg for optional_nested_msg. > assert not parent.HasField('optional_nested_msg') # fail > > Luckily, these access patterns are extremely rare (at least at dropbox). > > 2. I wrote a fully functional MakeDescriptor for c++ protos when I was at > google. Talk to the F1 team (specifically Bart Samwel / Chad Whipkey) if > you're interested in upstreaming that to the opensource community. > > -- > You received this message because you are subscribed to the Google Groups > "Protocol Buffers" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To post to this group, send email to [email protected]. > Visit this group at http://groups.google.com/group/protobuf. > For more options, visit https://groups.google.com/groups/opt_out. -- You received this message because you are subscribed to the Google Groups "Protocol Buffers" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at http://groups.google.com/group/protobuf. For more options, visit https://groups.google.com/groups/opt_out.
