On Thu, Feb 21, 2013 at 8:37 PM, Mike Grove <[email protected]> wrote:

>
>
>
> On Thu, Feb 21, 2013 at 12:25 AM, Feng Xiao <[email protected]> wrote:
>
>>
>>
>> On Thu, Feb 21, 2013 at 12:11 AM, Michael Grove <[email protected]> wrote:
>>
>>> I am using protobuf for the wire format of a protocol I'm working on as
>>> a replacement for JSON.  The original protobuf messages were not much more
>>> than JSON as protobuf; my protobuf message just contained the same fields
>>> w/ the same format as the JSON structure.  This worked fine, but the
>>> payloads tended to be the same size as or larger than their JSON
>>> equivalent.  I tried using the union types technique, specifically with
>>> extensions as outlined in the docs [1], and this worked very well wrt
>>> compression; the resulting messages were much smaller than with the
>>> previous approach.
>>>
>>> However, the cost of parsing the smaller messages far outweighs the
>>> advantage of less IO.
>>>
>>
>
>>  You mean parsing protobufs performs worse than parsing JSON?
>>
>
> For the nested structure based on extensions, as described in the techniques
> section of the protobuf docs, throughput is about the same.  I assume that
> means parsing is slower, because I'm sending fewer bytes over the wire.  My
> original attempt at a protobuf-based format was the fastest option, but it
> tended to send the most bytes over the wire, often more than the raw
> data I was sending.
>
>
>>
>>
>>> When I run a simple profiling example, the top 10-15 hot spots are all
>>> parsing of the messages.  The top ten most expensive methods are as follows:
>>>
>>> MessageType1$Builder.mergeFrom
>>> MessageType2$Builder.mergeFrom
>>> MessageType1.getDescriptor()
>>> MessageType1$Builder.getDescriptorForType
>>> MessageType3$Builder.mergeFrom
>>> MessageType2.getDescriptor
>>> MessageType2$Builder.getDescriptorForType
>>> MessageType1$Builder.create
>>> MessageType1$Builder.buildPartial
>>> MessageType3.isInitialized
>>>
>>> The organization is pretty straightforward: MessageType3 contains a
>>> repeated list of MessageType2.  MessageType2 has three required fields of
>>> type MessageType1.  MessageType1 has a single required value, which is an
>>> enum.  The value of the enum defines which of the extensions, again as
>>> shown in [1], are present on the message.  There are a total of 6 possible
>>> extensions to MessageType1, each of which is a single primitive value, such
>>> as an int or a string.  There tend to be no more than 3 of the 6 possible
>>> extensions used at any given time.
>>>
>>> The top two mergeFrom hot spots take ~32% of execution time; the test is
>>> the transmission of 1.85M objects of MessageType2 from client to server.
>>> These are bundled in roughly 64k chunks, using 58 top level MessageType3
>>> objects.
>>>
>> You can try the new parser API introduced in 2.5.0rc1, i.e., use
>> MessageType3.parseFrom()  instead of the Builder API to parse the message.
>> Another option is to simplify the message structure. Instead of nesting
>> many small MessageType2 in MessageType3, you can simply put the repeated
>> extensions in MessageType3.
>>
>
> This sounds good, I will try both of these options.
>
> Is 2.5.0rc1 fairly stable?
>
Yes, no big changes made since then.
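
To put rough numbers on why the nested layout is parse-heavy, here is a
back-of-the-envelope sketch in plain Java (no protobuf dependency). The counts
come from the figures quoted above. The class name, the method names, and the
assumption that a flattened layout leaves only the 58 top-level messages to
construct are hypothetical and for illustration only.

```java
public class NestedMessageCount {
    // Figures quoted in the thread.
    static final long TYPE2_MESSAGES = 1_850_000L; // "1.85M objects of MessageType2"
    static final long TYPE3_MESSAGES = 58L;        // "58 top level MessageType3 objects"
    static final long TYPE1_PER_TYPE2 = 3L;        // three required MessageType1 fields

    /** MessageType1 instances parsed when each MessageType2 nests three of them. */
    static long type1Messages() {
        return TYPE2_MESSAGES * TYPE1_PER_TYPE2;
    }

    /** Total message instances parsed when every level is its own message. */
    static long nestedTotal() {
        return TYPE3_MESSAGES + TYPE2_MESSAGES + type1Messages();
    }

    /**
     * Assumption: a fully flattened layout stores the values as repeated
     * fields directly on MessageType3, so only the top-level messages remain.
     */
    static long flattenedTotal() {
        return TYPE3_MESSAGES;
    }

    public static void main(String[] args) {
        System.out.println("MessageType1 instances: " + type1Messages());
        System.out.println("Nested layout, messages parsed: " + nestedTotal());
        System.out.println("Flattened layout, messages parsed: " + flattenedTotal());
    }
}
```

Under these assumptions the nested layout constructs about 7.4M message
objects for the test run, against 58 for a fully flattened one, which would be
consistent with mergeFrom and buildPartial dominating the profile.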


>
> Thanks.
>
> Michael
>
>
>>
>>
>>> Obviously all of the hot spot methods are auto-generated (Java).  There
>>> might be some hand changes I could make to that code, but if I ever
>>> re-generate, then I'd lose that work.  I am wondering if there are any
>>> tricks or changes that could be made to improve the parse time of the
>>> messages?
>>>
>>> Thanks.
>>>
>>> Michael
>>>
>>> [1] https://developers.google.com/protocol-buffers/docs/techniques
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Protocol Buffers" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at http://groups.google.com/group/protobuf?hl=en.
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>>
>>>
>>
>>
>
