The end-tag approach is more efficient than your idea -- it's faster (no
need to count elements at all) and it takes no more space (no need to write
a count, which makes up for the extra space taken by the end tag).
But in any case, the encoding is not something we can change at this point,
since protocol buffers is nothing without backwards-compatibility.
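
For illustration, here is a rough Python sketch of the end-tag ("group") style applied to the example values from the quoted message. It uses the standard wire types (0 = varint, 2 = length-delimited, 3 = start group, 4 = end group); the helper names are made up for this sketch and are not part of any protobuf library.

```python
# Wire types from the protocol buffers encoding.
VARINT, LENGTH_DELIMITED, START_GROUP, END_GROUP = 0, 2, 3, 4

def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def key(field_number, wire_type):
    """Field key: (field_number << 3) | wire_type, written as a varint."""
    return encode_varint((field_number << 3) | wire_type)

def write_group(out, field_number, write_body):
    """Stream a group: start tag, body, end tag.

    Nothing about the body (size or element count) is needed up front.
    """
    out.extend(key(field_number, START_GROUP))
    write_body(out)
    out.extend(key(field_number, END_GROUP))

def write_embedded_body(out):
    # Hypothetical body using the values from the example in this thread.
    out.extend(key(1, LENGTH_DELIMITED) + encode_varint(5) + b"Hello")
    out.extend(key(3, VARINT) + encode_varint(120))
    out.extend(key(4, VARINT) + encode_varint(0))

buf = bytearray()
write_group(buf, 1, write_embedded_body)
buf.extend(key(2, LENGTH_DELIMITED) + encode_varint(5) + b"World")
```

The writer never counts or measures anything, but a reader has to walk every field inside the group to find the end tag.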

And yes, some existing parsers do, in fact, take advantage of the ability to
skip over messages without parsing them, and there are many features that
people are considering implementing (like lazy parsing) which would need it.

It actually turns out that pre-computing the size of the embedded message
does not take very long compared to actually writing it.
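
As a rough sketch of both points (a cheap size pass before writing, and skipping a sub-message by its length), something like the following Python would do. The function names are invented for this example, it assumes non-negative integer values and field numbers below 16 (so each key is one byte), and it is not how any particular protobuf implementation is written.

```python
VARINT, LENGTH_DELIMITED = 0, 2

def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def decode_varint(buf, pos):
    """Return (value, new_pos) for the varint starting at buf[pos]."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def varint_size(value):
    """Number of bytes encode_varint would produce, without producing them."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size

def key(field_number, wire_type):
    return encode_varint((field_number << 3) | wire_type)

def embedded_size(greeting, good, evil):
    """Size pass: only arithmetic, no bytes written (1 byte per key)."""
    return (1 + varint_size(len(greeting)) + len(greeting)
            + 1 + varint_size(good)
            + 1 + varint_size(evil))

def write_the_message(greeting, good, evil, target):
    out = bytearray()
    out += key(1, LENGTH_DELIMITED)
    out += encode_varint(embedded_size(greeting, good, evil))  # length prefix
    out += key(1, LENGTH_DELIMITED) + encode_varint(len(greeting)) + greeting
    out += key(3, VARINT) + encode_varint(good)
    out += key(4, VARINT) + encode_varint(evil)
    out += key(2, LENGTH_DELIMITED) + encode_varint(len(target)) + target
    return bytes(out)

def read_target(buf):
    """Find top-level field 2 without parsing the embedded message."""
    pos = 0
    while pos < len(buf):
        k, pos = decode_varint(buf, pos)
        field_number, wire_type = k >> 3, k & 7
        if wire_type == LENGTH_DELIMITED:
            length, pos = decode_varint(buf, pos)
            if field_number == 2:
                return bytes(buf[pos:pos + length])
            pos += length  # skip the whole sub-message in one jump
        elif wire_type == VARINT:
            _, pos = decode_varint(buf, pos)
        else:
            raise ValueError("wire type not handled in this sketch")
    return None

msg = write_the_message(b"Hello", 120, 0, b"World")
```

Here read_target(msg) gets to the second field by reading one length and adding it to the position; it never looks at the bytes of the embedded message.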

On Wed, Jun 24, 2009 at 12:57 AM, etorri wrote:

> Does any existing parser actually implement that skipping feature?
> There would not be any need for an end-tag. Let's assume that there
> would be two different tags:
> 2 - Length_Delimited,  which could contain a packed list of bytes
> (string, memory block) or other types where the parser needs to know
> what is packed inside (no tags)
> 6 -  Group or Element_Delimited - which would be like Length_Delimited
> but have the number of elements that follow that belong to this field
> So, for an example message where the first field is a group:
> (1,6),3 - field number 1 of the message, type 6 = Group, and the 3
> elements that follow belong to this group
>  (1,2),5,"Hello" - field number 1 of the embedded message would be a
> string
>  (3,1),120 - field number 3 of the embedded message, varint of value 120
>  (4,1),0 - field number 4 of the embedded message, varint of value 0
> (2,2),5,"World" - field number 2 of the message
> This would be the encoding of the following TheMessage:
> message Embedded {
>  required string greeting = 1;
>  optional int32 useless = 2;
>  required int32  good = 3;
>  required int32  evil = 4;
> }
> message TheMessage {
>  required Embedded e = 1;
>  required string target = 2;
> }
> So in this case there would be no need for an end tag. When
> constructing the message it should be relatively easy to count the
> number of embedded elements instead of knowing how much space they
> occupy. This would enable streaming/serializing the elements
> recursively out one by one.
> On Jun 23, 9:07 pm, Kenton Varda wrote:
> > The advantage of writing the length is that a parser can skip the entire
> > sub-message easily without having to parse its contents.  Otherwise, we
> > would probably use the "group" encoding for sub-messages, where a special
> > end tag marks the end of the message.