Thanks a lot for the time spent on this topic. Really appreciate it. I
think we will end up using your optional message suggestion. There's
several places where the amount of messages might get us into trouble
in the future.

Once again thank you.

Dan

On May 22, 10:14 pm, Kenton Varda <ken...@google.com> wrote:
> Again, I want to emphasize that any program which expects parse failures to
> happen during normal operation is probably doing something wrong.  Whether
> or not the parser accepts a message actually tells you very little about the
> integrity of the message.  Remember that the parser is designed to be
> tolerant of unexpected information in the message, since this is how it
> provides forward- and backward-compatibility.
> For example, take these two messages:
>
>   message A {
>     optional int32 i = 1;
>     required bool b = 2;
>   }
>
>   enum E { FOO = 1; BAR = 2; }
>   message B {
>     optional double d = 1;
>     repeated E e = 2;
>   }
>
> What do you think would happen if you had a valid encoded message of type A
> and you tried to parse it as a B.  Guess what?  It succeeds.  In fact, *any*
> valid protocol message of *any* type will parse successfully as a B, since B
> has no required fields.
>
> But what if you parsed a B as an A?  A has a required field, and B doesn't
> even have any fields of that type, so it can't succeed, right?  Wrong.
>  Booleans and enums are both encoded as varints in the wire.  So if B.e is
> non-empty, then A will interpret it as a value of A.b, satisfying the
> required field.
>
> Now, you say that you are doing extra validation afterwards which you are
> convinced will ensure that a B cannot be mistaken for an A or vice versa.
>  This could work, but I still think it is a hacky approach, and hacky
> approaches tend to lead to unexpected problems down the road.  If at all
> possible, you should move to using the AorB message I proposed earlier.
>
> All that said, the reason for the error message is simply for debugging
> help:  It identifies exactly which field was missing from the message.
>  Typically the caller doesn't particularly care which field was missing,
> since it will reject message regardless.  However, a developer testing the
> program would be very interested to know what was missing.  We discovered
> that without this message, it was very hard to debug these kinds of errors.
>  In general, error messages logged by libprotobuf are for such debugging
> purposes.
>
> If you want to avoid printing this error, what you should really do is use
> one of the Message::ParsePartial*() methods when parsing.  This skips the
> required fields check, and will never print an error message.  You can
> manually call IsInitialized() on the result to check for required fields
> yourself -- this also will not log any errors.
>
> Another alternative is to use 
> LogSilencer:http://code.google.com/apis/protocolbuffers/docs/reference/cpp/google...
>
> But using ParsePartial will avoid even constructing the error message, which
> can be expensive.
>
> On Fri, May 22, 2009 at 11:06 AM, <dan.schmidt.va...@gmail.com> wrote:
>
> > Thanks a lot for that reply.
>
> > I do appreciate that two messages with the same layout will parse
> > indifferently as each others type. In the particular situation we are
> > just now, A and B are completely different, so there is no possibility
> > of confusion there. Dealing with specific messages, we do not rely
> > exclusively on the correct parsing of them to identify their type.
> > However, we do use it to verify that specific attributes are in fact
> > present in one of the messages. We then have other constraints to
> > check that the content of the message is valid. This is where our
> > debate is centred. Is it reasonable to use the pb to verify at least
> > that initial structure even though that every wrong type of message
> > parsing gives an ERROR message? Does the fact that there is such a
> > message mean that something could go wrong or is it just information
> > equivalent to the false return value of the function?
>
> > Many thanks again.
>
> > Dan
>
> > On May 22, 5:44 pm, Kenton Varda <ken...@google.com> wrote:
> > > You can capture log output using SetLogHandler():
> >http://code.google.com/apis/protocolbuffers/docs/reference/cpp/google...
>
> > > <http://code.google.com/apis/protocolbuffers/docs/reference/cpp/google..
> > .>However,
> > > you have a deeper problem:  Protocol Buffers does not guarantee that
> > trying
> > > to parse an A from data which is actually of type B will fail.  In fact,
> > if
> > > your input is a valid protocol buffer of *some* type, then the only
> > reason
> > > parsing it as type A would fail is if the message did not happen to
> > satisfy
> > > A's required fields.  Remember that protocol buffers are not
> > > self-describing, so there is no way to know for sure what a particular
> > > message's type is or isn't.  It may be worth reading the encoding
> > > documentation to understand what is actually being sent on the wire.
>
> > > In any case, what you really need to do is create a message like:
>
> > > message AorB {
> > >   optional A a = 1;
> > >   optional B b = 2;
>
> > > }
>
> > > Then send *that* message.  Now you can check has_a() and has_b() to
> > > distinguish the types.
>
> > > On Fri, May 22, 2009 at 3:59 AM, <dan.schmidt.va...@gmail.com> wrote:
>
> > > > Hey, there.
>
> > > > Massive discussion at work. We have two types of pb messages with
> > > > required fields (A & B, let's say) and we're using A.ParseFromString()
> > > > to determine the type. Obviously, when the message that arrives is of
> > > > type B, the parsing fails and we get the pertinent "Can't parse
> > > > message of type". Now, that comes with a capitalised ERROR at the
> > > > beginning of the message. Several questions now:
>
> > > > 1. Does that mean something really wrong happens inside the pb code?
> > > > 2. Why would this message appear inconsistently?
> > > > 3. Diving into the code, the output comes from line 121 of message.cc
> > > > (as the title of the post says ;) ). Output should redirect to the
> > > > GOOGLE_LOG. Is there anyway of defining where this out actually goes
> > > > to? Like a flag when you compile them?
>
> > > > I am aware of the ongoing discussion between required and repeated
> > > > fields, but the confusing bit to me as a developer is that the
> > > > function returns a boolean. So when I ask it to parse, should I not
> > > > only expect just a true and a false? Or if an error actually happens,
> > > > would it not be better to let me handle it? Maybe an int or error code
> > > > return value would be handy to indicate the exact nature of the
> > > > error...
>
> > > > Anyway, thanks for any thoughts on that.
>
> > > > Oh, and the response to this post is going to settle who gets to hold
> > > > the TV remote at the office, so this is important stuff ;)
>
> > > > Regards,
>
> > > > Dan
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to