> That makes no sense. If the bytes were the same, how would deserializing
> them be able to produce unequal messages?
Yes, I guess if we can rely on the canonical ordering of the fields,
that should be enough.
> If possible, I would recommend designing your application such that it only
> requires that equal messages have the same serialization *most* of the time.
> For example, if you were designing a cache where the cache key is the hash
> of a serialized message, then the worst that can happen if two equal
> messages had different serializations is that you'd perform the same
> operation twice rather than hitting cache. As long as this is relatively
> rare, it's no big deal.
I was thinking more along the lines of keys in a MapReduce (between the
Mapper and Reducer phases). I don't think this would work in this case.
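The quoted cache design is easy to sketch. In the fragment below (the names
`compute_or_cached` and `cache_key` are mine, not from the thread), two equal
messages that happen to serialize differently simply hash to different keys, so
the worst case is redundant computation rather than a wrong result:

```python
import hashlib

# Hypothetical cache keyed on the hash of a message's serialized bytes.
cache = {}

def cache_key(msg_bytes):
    # Key on the serialization, not the message object itself.
    return hashlib.sha256(msg_bytes).hexdigest()

def compute_or_cached(msg_bytes, expensive_fn):
    # If an equal message serialized differently, we miss the cache and
    # recompute; the result is still correct, just duplicated work.
    key = cache_key(msg_bytes)
    if key not in cache:
        cache[key] = expensive_fn(msg_bytes)
    return cache[key]
```

For MapReduce keys, by contrast, two serializations of the same logical key
would land in different reduce groups, which is a correctness problem rather
than a performance one.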
> If you must know the details, in the varint encoding, the upper bit of each
> byte is used to indicate whether there are more bytes in the value. So, in
> a 3-byte varint, the first two bytes have the upper bit set, but the last
> byte does not. So obviously a 3-byte varint cannot start with the same
> bytes as a 6-byte varint, because in a 6-byte varint the third byte would
> have the upper bit set.
Yes, I actually know this :) I've implemented a similar encoding myself, and
also looked at the source... the beauty of open source. It was just
hard to grok the entirety of the serialization scheme from the code.
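For anyone else following the thread, the continuation-bit scheme quoted above
can be sketched in a few lines (the function name and example values below are
mine, not from the protobuf source):

```python
def encode_varint(value):
    """Encode a non-negative integer as a protobuf-style varint.

    The low 7 bits of each byte carry payload; the upper bit is a
    continuation flag, set on every byte except the last. So a 3-byte
    varint can never be a prefix of a 6-byte one: in the 6-byte varint
    the third byte has its upper bit set, in the 3-byte one it doesn't.
    """
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set the upper bit
        else:
            out.append(byte)         # final byte: upper bit clear
            return bytes(out)

# 300 = 0b1_0010_1100 encodes as b'\xac\x02': the low 7 bits (0x2c)
# with the continuation bit set, then the remaining bits (0x02).
```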
You received this message because you are subscribed to the Google Groups
"Protocol Buffers" group.