> That makes no sense.  If the bytes were the same, how would deserializing
> them be able to produce unequal messages?

Yes, I guess if we can rely on the canonical ordering of the fields,
that should be enough.

> If possible, I would recommend designing your application such that it only
> requires that equal messages have the same serialization *most* of the time.
>  For example, if you were designing a cache where the cache key is the hash
> of a serialized message, then the worst that can happen if two equal
> messages had different serializations is that you'd perform the same
> operation twice rather than hitting cache.  As long as this is relatively
> rare, it's no big deal.

I was thinking more along the lines of keys in a MapReduce job (between
the Mapper and Reducer phases). I don't think the "equal most of the
time" approach would work in that case.

> If you must know the details, in the varint encoding, the upper bit of each
> byte is used to indicate whether there are more bytes in the value.  So, in
> a 3-byte varint, the first two bytes have the upper bit set, but the last
> byte does not.  So obviously a 3-byte varint cannot start with the same
> bytes as a 6-byte varint, because in a 6-byte varint the third byte would
> have the upper bit set.

Yes, I actually know this :) I've implemented a similar encoding myself,
and I've also looked at the source... the beauty of open source. It was
just hard to grok the entirety of the serialization scheme from the code.
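
In case it helps anyone else following the thread, here is a quick sketch
of that varint scheme in Python (not the actual protobuf implementation;
the test values 300000 and 2**40 are just arbitrary examples I picked):

def encode_varint(value):
    """Encode a non-negative integer as a protobuf-style varint:
    7 bits of payload per byte, least-significant group first, with the
    high bit of a byte set whenever more bytes follow."""
    out = bytearray()
    while True:
        bits = value & 0x7F
        value >>= 7
        if value:
            out.append(bits | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(bits)         # final byte: high bit clear
            return bytes(out)

three = encode_varint(300000)  # 19 significant bits -> 3 bytes
six = encode_varint(2**40)     # 41 significant bits -> 6 bytes
assert len(three) == 3 and len(six) == 6
# The third byte of the 3-byte varint has its high bit clear, while the
# third byte of the 6-byte varint has it set, so the shorter encoding can
# never be a byte-for-byte prefix of the longer one.
assert three[2] & 0x80 == 0
assert six[2] & 0x80 != 0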