On Thu, Feb 21, 2013 at 8:37 PM, Mike Grove <[email protected]> wrote:
> On Thu, Feb 21, 2013 at 12:25 AM, Feng Xiao <[email protected]> wrote:
>
>> On Thu, Feb 21, 2013 at 12:11 AM, Michael Grove <[email protected]> wrote:
>>
>>> I am using protobuf for the wire format of a protocol I'm working on as
>>> a replacement to JSON. The original protobuf messages were not much more
>>> than JSON as protobuf; my protobuf message just contained the same fields
>>> w/ the same format as the JSON structure. This worked fine, but the
>>> payloads tended to be the same size or larger than their JSON equivalent.
>>> I tried using the union types technique, specifically with extensions as
>>> outlined in the docs [1], and this worked very well wrt compression; the
>>> resulting messages were much smaller than the previous approach.
>>>
>>> However, the cost of parsing the smaller messages far outweighs the
>>> advantage of less IO.
>>
>> You mean parsing protobufs performs worse than parsing JSON?
>
> For the nested structure based on extensions, as described in the
> techniques section of the protobuf docs, throughput is about the same. I
> assume that means parsing is slower because I'm sending fewer bytes over
> the wire. My original attempt at a protobuf-based format was the fastest
> option, but it tended to send the most bytes over the wire, often more
> than the raw data I was sending.
>
>>> When I run a simple profiling example, the top 10-15 hot spots are all
>>> parsing of the messages. The top ten most expensive methods are as follows:
>>>
>>> MessageType1$Builder.mergeFrom
>>> MessageType2$Builder.mergeFrom
>>> MessageType1.getDescriptor()
>>> MessageType1$Builder.getDescriptorForType
>>> MessageType3$Builder.mergeFrom
>>> MessageType2.getDescriptor
>>> MessageType2$Builder.getDescriptorForType
>>> MessageType1$Builder.create
>>> MessageType1$Builder.buildPartial
>>> MessageType3.isInitialized
>>>
>>> The organization is pretty straightforward: MessageType3 contains a
>>> repeated list of MessageType2. MessageType2 has three required fields of
>>> type MessageType1. MessageType1 has a single required value, which is an
>>> enum. The value of the enum defines which of the extensions, again as
>>> shown in [1], are present on the message. There are a total of 6 possible
>>> extensions to MessageType1, each of which is a single primitive value, such
>>> as an int or a string. There tend to be no more than 3 of the 6 possible
>>> extensions used at any given time.
>>>
>>> The top two mergeFrom hot spots take ~32% of execution time; the test is
>>> the transmission of 1.85M objects of MessageType2 from client to server.
>>> These are bundled in roughly 64k chunks, using 58 top level MessageType3
>>> objects.
>>
>> You can try the new parser API introduced in 2.5.0rc1, i.e., use
>> MessageType3.parseFrom() instead of the Builder API to parse the message.
>> Another option is to simplify the message structure. Instead of nesting
>> many small MessageType2 in MessageType3, you can simply put the repeated
>> extensions in MessageType3.
>
> This sounds good, I will try both of these options.
>
> Is 2.5.0rc1 fairly stable?

Yes, no big changes made since then.

> Thanks.
>
> Michael
>
>>> Obviously all of the hot spot methods are auto-generated (Java). There
>>> might be some hand changes I could make to that code, but if I ever
>>> re-generate, then I'd lose that work. I am wondering if there are any
>>> tricks or changes that could be made to improve the parse time of the
>>> messages?
>>>
>>> Thanks.
>>>
>>> Michael
>>>
>>> [1] https://developers.google.com/protocol-buffers/docs/techniques
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Protocol Buffers" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at http://groups.google.com/group/protobuf?hl=en.
>>> For more options, visit https://groups.google.com/groups/opt_out.
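
For a rough sense of scale, the figures quoted in this thread can be turned into a count of how many message objects a parse of the test payload has to build. This is only a back-of-the-envelope sketch: the class and method names are made up for illustration, and the "one builder per message" factor is an assumption about the Builder-based parse path, not something measured here.

```java
// Rough count of message objects built while parsing the test payload.
// The input figures come from the quoted message; the one-builder-per-message
// factor is an assumption about the Builder-based parse path.
final class ParseCostEstimate {

    // Nested messages = every MessageType2 plus its required MessageType1 children.
    static long nestedMessages(long type2Count, int type1PerType2) {
        return type2Count + type2Count * type1PerType2;
    }

    // Average number of MessageType2 entries in each top-level MessageType3.
    static double averagePerTopLevel(long type2Count, int topLevelCount) {
        return (double) type2Count / topLevelCount;
    }

    public static void main(String[] args) {
        long type2Count = 1_850_000L; // "1.85M objects of MessageType2"
        int type1PerType2 = 3;        // "three required fields of type MessageType1"
        int topLevelCount = 58;       // "58 top level MessageType3 objects"

        long nested = nestedMessages(type2Count, type1PerType2);
        System.out.println("nested messages: " + nested);
        System.out.println("assumed builders (one per message): " + (nested + topLevelCount));
        System.out.printf("avg MessageType2 per MessageType3: %.0f%n",
                averagePerTopLevel(type2Count, topLevelCount));
    }
}
```

Under those assumptions the parser builds several million tiny messages, which fits with mergeFrom, create and buildPartial dominating the profile and with the suggestion to flatten the structure so that fewer small messages are created.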
