On Tue, Sep 29, 2009 at 12:41, alopecoid <alopec...@gmail.com> wrote:
>
>> Given that the serialized bytes have to be able to *deserialize* back
>> to the original messages, surely if those original messages aren't
>> equal, the serialized forms would have to be different too - assuming
>> we're talking about the same message type
>
> But, as in my example, that doesn't seem to be the case (necessarily).
> Again, for example, let's say you have two messages, both of the same
> type. The proto defines two optional fields, both of type variable
> int64.
>
> Say message A populates both optional fields:
> [1 byte tag] [3 byte value] [1 byte tag] [2 byte value] = 7 bytes
>
> And message B populates only one optional field:
> [1 byte tag] [6 byte value] = 7 bytes

Varints are self-delimiting:
  http://code.google.com/apis/protocolbuffers/docs/encoding.html#varints
i.e. the most significant bit (msb) is set in every byte except the
last one. So the 3 byte value will look something like 1xxxxxxx
1xxxxxxx 0xxxxxxx, while the 6 byte value will have its msb set to 1
in the third byte. So the two byte sequences will always differ there.
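
To make that concrete, here is a rough Python sketch of the varint
encoding applied to the two example messages. The field numbers (1 and
2) and the values (100000, 300, 2**40) are my own picks to get the
3/2/6 byte lengths from the example; encode_varint and tag are
hand-rolled helpers for illustration, not part of any protobuf library.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if n < 0:
        raise ValueError("this sketch handles non-negative values only")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # more bytes follow: set the msb
        else:
            out.append(low7)         # last byte: msb stays clear
            return bytes(out)


def tag(field_number: int, wire_type: int = 0) -> bytes:
    """Field key for a varint field (wire type 0)."""
    return encode_varint((field_number << 3) | wire_type)


# Message A: two fields, 3 byte value + 2 byte value (values are assumed).
msg_a = tag(1) + encode_varint(100_000) + tag(2) + encode_varint(300)
# Message B: one field, 6 byte value (value is assumed).
msg_b = tag(1) + encode_varint(2**40)

print(len(msg_a), len(msg_b))           # both are 7 bytes long
print(msg_a.hex(" "), "|", msg_b.hex(" "))
# Index 3 is the third value byte: msb clear in A, msb set in B.
print(bool(msg_a[3] & 0x80), bool(msg_b[3] & 0x80))
```

Whatever values you pick, the third value byte ends the varint in A
and continues it in B, so the two 7 byte strings cannot be equal.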

So yes, they will be different. As Jon said: the protocol decoder
needs to be able to decode it properly - a confusion between a (3 byte
varint + tag + 2 byte varint) vs. a (6 byte varint) would not work. So
two different messages of the same message type always serialize to
different bytes (however, two messages of different types could
theoretically encode to the same bytes).

The thing you have to worry about more is the _sequence_ in which the
tags are encoded. The decoder does not care in which sequence the
fields are encoded, so it could be that messages with the same content
can be encoded in different ways.
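
A small Python sketch of that point, assuming a message with two
varint fields (numbers 1 and 2, values 150 and 5, all chosen by me for
illustration). The decoder below is a hand-written toy that only
understands varint fields; it is not the real protobuf parser.

```python
def decode_varint(buf: bytes, pos: int):
    """Read one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:      # msb clear: this was the last byte
            return result, pos
        shift += 7


def decode_varint_fields(buf: bytes) -> dict:
    """Toy decoder: maps field number -> value, varint fields only."""
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type != 0:
            raise ValueError("this sketch only handles varint fields")
        value, pos = decode_varint(buf, pos)
        fields[field_number] = value
    return fields


# Same content, two field orders (field 1 = 150, field 2 = 5).
in_order = bytes([0x08, 0x96, 0x01, 0x10, 0x05])
swapped = bytes([0x10, 0x05, 0x08, 0x96, 0x01])

print(in_order != swapped)                    # the raw bytes differ
print(decode_varint_fields(in_order) ==
      decode_varint_fields(swapped))          # the decoded content matches
```

So comparing or hashing the raw bytes is only safe if every writer
emits the fields in the same order.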

However the encoding in the Google implementation guarantees that the
fields are always in a consistent order (I guess too many people
relied on the fact that messages can be used as a key/can be hashed).

-h

>
> Couldn't these generate, by chance, the same 7 bytes? Yes, using
> deserialize will correctly parse two unequal messages, but if you look
> at the raw serialized byte sequences, they could actually be the same.
