[protobuf] Re: tag number range explanation

nicolas hofmann Mon, 11 Jul 2011 01:47:15 -0700

OK, I was expecting the historical-reasons explanation, but now I’m
sure. Thanks for that.


My idea still supports unknown fields. It just makes it impossible to
skip an embedded message that uses the end-marker technique without
parsing it, which is less efficient but still possible. You are free to
choose which embedded message fields use it, and free to switch from
one method to the other whenever you want. It's just a tool that lets
you be more efficient when you are sure you will not have to deal with
unknown fields. The worst case (having to parse an unknown embedded
message in order to skip it) costs the same as the case where you know
all the fields. It also allows you to send multiple messages without
having to compute their length.
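
To make the size comparison concrete, here is a minimal Python sketch of
the two framings. The function names are hypothetical and none of this is
protobuf library API; the end-marker variant is only the idea discussed
in this thread, not something the library supports. The payload is the
usual three-byte example message (field 1, varint, value 150).

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def frame_with_length(payload: bytes) -> bytes:
    """Documented technique: varint length prefix, then the message bytes."""
    return encode_varint(len(payload)) + payload


def frame_with_end_marker(payload: bytes) -> bytes:
    """Proposed technique (hypothetical): message bytes, then a zero tag."""
    return payload + b"\x00"


# Serialized message with field 1 (wire type varint) set to 150.
payload = bytes([0x08, 0x96, 0x01])
length_framed = frame_with_length(payload)      # 03 08 96 01
marker_framed = frame_with_end_marker(payload)  # 08 96 01 00
```

For payloads shorter than 128 bytes both framings cost one extra byte.
The end marker only saves space on larger messages, and its main appeal
is that the writer does not need to know the length before it starts
writing.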

I know about the exception issue. Assuming we drop the idea for
embedded messages, the only way to avoid it without rewriting the
generated code each time you change something, or rewriting the code
generator, is to wrap the stream in a fake stream that emulates the
end of stream when it finds the null tag. It’s highly inefficient.
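
A rough Python sketch of the work such a wrapper would have to do, under
the assumption that messages are terminated by a single zero tag and
contain no groups. The helper names are hypothetical and this is not
protobuf library code. The wrapper cannot simply search for a zero byte,
because zero bytes may legitimately appear inside field values; it has to
walk the fields, and the real parser then reads the same bytes again.

```python
from typing import List, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a base-128 varint at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def find_end_marker(buf: bytes, pos: int = 0) -> int:
    """Return the offset of the zero tag ending the message that starts at pos."""
    while True:
        tag, after_tag = read_varint(buf, pos)
        if tag == 0:
            return pos
        wire_type = tag & 0x07
        pos = after_tag
        if wire_type == 0:    # varint
            _, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit
            pos += 8
        elif wire_type == 2:  # length-delimited
            length, pos = read_varint(buf, pos)
            pos += length
        elif wire_type == 5:  # fixed 32-bit
            pos += 4
        else:
            raise ValueError("wire type %d not handled in this sketch" % wire_type)
        if pos > len(buf):
            raise ValueError("truncated field")


def split_messages(stream: bytes) -> List[bytes]:
    """Cut a buffer of zero-tag-terminated messages into per-message slices."""
    messages = []
    pos = 0
    while pos < len(stream):
        end = find_end_marker(stream, pos)
        messages.append(stream[pos:end])
        pos = end + 1  # step over the zero tag
    return messages
```

Each slice returned by `split_messages` could then be handed to the
ordinary generated parser, which is where the double scan comes from.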

In any case, I thank both of you for your help. I now have enough
information to continue my work.

On 8 Jul, 22:55, Jason Hsueh <[email protected]> wrote:
> Zero is reserved for historical reasons; it used to be used for propagating
> error codes.
>
> The wire format was designed with extensibility in mind; while in many
> applications unknown fields won't be encountered, this support is critical
> in many internal applications.
>
> Note that simply adding a zero tag would cause the parser (CodedInputStream)
> to stop reading bytes, but would also make ConsumedEntireMessage() false.
> This would make most of the higher level message-parsing routines fail.
>
> On Fri, Jul 8, 2011 at 3:43 AM, nicolas hofmann <[email protected]> wrote:
>
> > Thanks for your help, but I don't think you understood my problem. I
> > will give more detail.
>
> > I'm trying to make two processes communicate using protocol buffers
> > over pipes (the group approach won’t work). For efficiency, they will
> > keep their connection open as long as they are alive, but the messages
> > have to be processed as soon as possible. The easiest known way to
> > delimit the messages is described in the Techniques section of the
> > documentation, but I’m uncomfortable with having to compute and write
> > the total length, because it seems useless if you never skip entire
> > messages, which is my case. This argument also works for embedded
> > messages if none of them are unexpected, which is also my case. The end
> > marker technique avoids the length calculation without making it
> > impossible to skip your message/embedded message if you really need
> > to.
>
> > Moreover, if you use a fixed-width format for your size field, the
> > whole thing will fail miserably if the size field is too small, or be
> > inefficient if your messages are small and you chose too large a size
> > field. The varint format is a solution, but the null tag would in any
> > case be more efficient, since it always takes only one byte.
>
> > I'm just asking whether it would be a good idea to let protocol buffers
> > treat a null field number the same way as the end of the stream instead
> > of throwing an exception, and to do something similar for embedded
> > messages. Doing this efficiently would require integrating it into the
> > generated code, with all the problems that involves. That's why I will
> > probably not have the time to implement the whole thing elegantly, and
> > therefore do not plan to do so. So my question is purely theoretical.
>
> > On 7 Jul, 20:07, Marc Gravell <[email protected]> wrote:
> > > I don't pretend to know the original thinking, but it would be very
> > > hard to add such now without breaking existing clients. However, note
> > > that if you *really* don't want to have to get the lengths, you could
> > > encode your data inside a "group", since this has a terminator rather
> > > than a length prefix. Treat the data as a "repeated" set of the group,
> > > and job done.
>
> > > Of course, writing the length isn't usually a massive task either, but
> > > *not* writing it is easier :p
>
> > > Caveat: technically groups are semi-deprecated, giving preference to
> > > length-prefixed messages. I believe that part of the reasoning here is
> > > the higher cost of reading *unexpected* groups since you must parse the
> > > stream rather than just copy (or skip) the next [n] bytes, but in my
> > > *own* use of protobuf this is rarely an issue: in the majority of cases
> > > all my clients know about the fields. There is also a difference in the
> > > size, but "difference" is the key term here - neither approach is
> > > always longer or always shorter; any comparison depends on both the
> > > field number and the size of the data.
>
> > > I openly confess to having a strong like for "groups" - they do make
> > > the encoding process simpler :p
>
> > > Marc
>
> > > On 7 Jul 2011, at 14:18, nicolas hofmann <[email protected]> wrote:
>
> > > > My question is simple, but I didn't find the answer anywhere. I was
> > > > wondering why the tag number range starts at one and not zero.
>
> > > > I was looking for a way to stream multiple messages without having to
> > > > compute their sizes, and realized that just adding a zero tag number
> > > > at the end could be a good way to mark the end of the current message,
> > > > since it's illegal. I searched here and in the documentation
> > > > (especially the encoding section) but saw no good reason for this
> > > > limitation.
>
> > > > It looks so simple that I think I'm probably missing something but, by
> > > > definition, I don't know what.
>
> > > > Thanks,...

-- 
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.
