On Tue, Nov 16, 2010 at 7:28 PM, Kenton Varda <ken...@google.com> wrote:
> On Tue, Nov 9, 2010 at 10:42 PM, Christopher Smith <cbsm...@gmail.com> wrote:
>> This aspect could be mostly mitigated by integrating a metadata header
>> into files. For systems with this kind of approach, look at Avro & Hessian.
> Problems with that:
> 1) Protobufs are routinely used to encode small messages of just a few
> bytes. Metadata would almost certainly be larger than the actual messages
> in such cases.
> 2) This metadata would add an extra layer of indirection into the parsing
> process which would probably make it much slower than it is today.
> 3) Interpreting the metadata itself to build that table would add
> additional time and memory overhead. Presumably this would have to involve
> looking up field names in hash maps -- expensive operations compared to the
> things the protobuf parser does today.
Sorry, I didn't mean to suggest that changes be made to protobuf. I mostly
meant that if you want that, there are other solutions that are
a better fit. I think Avro in particular has a solution that mitigates
drawbacks 1-3, at the expense of some additional complexity.
You can hack this into a protobuf solution, though. You just encode the
FileDescriptorSet into your file header. Then when you start a scan, you
read it in, find the field numbers that correspond to the field names
you want, and then parse the protobufs as before. The key thing is that the
overhead is paid only once per file (which presumably has tons of small
messages), and that after reading the header you transform the parse/query
into exactly what you'd have had if you had used the field numbers to start
with.
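To make that concrete, here is a rough sketch of the idea in Python. It is not real protobuf library code: to stay dependency-free, a small JSON name-to-number map stands in for the serialized FileDescriptorSet, the file layout (4-byte header length, then varint-length-prefixed messages) is made up for illustration, and the hand-rolled scanner only understands varint fields (wire type 0). All function names are hypothetical.

```python
import json
import struct


def encode_varint(n):
    """Encode a non-negative int as a protobuf-style base-128 varint."""
    out = bytearray()
    while True:
        low = n & 0x7F
        n >>= 7
        if n:
            out.append(low | 0x80)  # more bytes follow
        else:
            out.append(low)
            return bytes(out)


def read_varint(buf, pos):
    """Decode a varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def write_file(schema, messages):
    """schema: {field_name: field_number}; messages: list of {number: int}."""
    header = json.dumps(schema).encode()  # stand-in for a FileDescriptorSet
    out = bytearray(struct.pack("<I", len(header)) + header)
    for msg in messages:
        # Key = (field_number << 3) | wire_type, with wire type 0 (varint).
        body = b"".join(
            encode_varint(num << 3) + encode_varint(val)
            for num, val in msg.items()
        )
        out += encode_varint(len(body)) + body
    return bytes(out)


def scan(data, field_name):
    """Return the named field's value from every message (None if absent)."""
    (header_len,) = struct.unpack_from("<I", data, 0)
    schema = json.loads(data[4:4 + header_len])
    wanted = schema[field_name]  # name lookup happens once per file
    pos = 4 + header_len
    results = []
    while pos < len(data):
        msg_len, pos = read_varint(data, pos)
        end = pos + msg_len
        value = None
        while pos < end:  # from here on, only field numbers are compared
            key, pos = read_varint(data, pos)
            val, pos = read_varint(data, pos)
            if key >> 3 == wanted:
                value = val
        results.append(value)
    return results


data = write_file({"id": 1, "score": 2}, [{1: 7, 2: 300}, {1: 8, 2: 5}])
print(scan(data, "score"))
```

The only string lookup is the single `schema[field_name]` at the top of `scan`; the per-message loop compares integers only, same as a parse written against field numbers.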
Honestly, for me the win with field numbers tends to be long-term
forward and backward compatibility.