On Wed, Mar 10, 2010 at 11:03 AM, Evan Jones <ev...@mit.edu> wrote:

> Kenton Varda wrote:
>> The output of your serializer will no longer be canonical.  People won't
>> be able to, say, use serialized messages as map keys or to compute hash
>> values.  That alone is probably enough reason not to do this.
> Please correct me if I'm wrong, but isn't this *exactly* the UTF-8 security
> issue? My understanding is that the problem is that a single character could have
> different byte representations, and in many cases people will only compare
> the byte representations to check for a match.
> Of course, it is difficult to imagine a case where someone would use a
> serialized protocol buffer in the same way, but it seems *possible*?

With Protocol Buffers, we already say that fields can appear in any order,
so any security mechanism built around matching the raw bytes of a protobuf
message against some blacklist is obviously flawed.  But many people rely on
the ability to parse a message and re-serialize it to produce a canonical
representation -- usually for caching purposes rather than security.
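
To make the field-order point concrete, here is a minimal sketch in Python. It
hand-rolls the varint wire format and does not use the real protobuf library;
the helper names (encode_fields, parse_fields, canonicalize) are hypothetical,
it handles only varint (wire type 0) fields, and it assumes "canonical" means
"re-serialized in field-number order".

```python
def encode_varint(n):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def decode_varint(buf, pos):
    """Decode a varint starting at buf[pos]; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, pos
        shift += 7


def encode_fields(fields):
    """Serialize (field_number, value) pairs in the order given (wire type 0 only)."""
    out = bytearray()
    for number, value in fields:
        out += encode_varint(number << 3)  # key = field number, wire type 0
        out += encode_varint(value)
    return bytes(out)


def parse_fields(buf):
    """Parse varint-only fields into a {field_number: value} dict."""
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        value, pos = decode_varint(buf, pos)
        fields[key >> 3] = value
    return fields


def canonicalize(buf):
    """Parse, then re-serialize in field-number order (assumed canonical form)."""
    return encode_fields(sorted(parse_fields(buf).items()))


a = encode_fields([(1, 150), (2, 7)])
b = encode_fields([(2, 7), (1, 150)])

assert a != b                              # raw bytes differ
assert parse_fields(a) == parse_fields(b)  # same logical message
assert canonicalize(a) == canonicalize(b)  # parse + re-serialize agrees
```

Comparing a and b byte-for-byte fails, while comparing their re-serialized
forms succeeds, which is why a cache keyed on serialized bytes depends on the
serializer being deterministic.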

Also, people commonly need to escape text strings in order to prevent
injection attacks of various kinds, and escaping mechanisms can easily break
if there are multiple ways to represent the same character.  But, this use
case obviously doesn't apply to protocol buffers.
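
For readers unfamiliar with the UTF-8 issue being compared against, a small
sketch in Python follows. The lenient decoder is a deliberately broken,
hypothetical illustration that covers only 1- and 2-byte sequences; it is not
how any particular library decodes text.

```python
def lenient_decode(data):
    """Decode 1- and 2-byte UTF-8 sequences WITHOUT rejecting overlong forms."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC0 <= b <= 0xDF and i + 1 < len(data):
            # A correct decoder would reject lead bytes 0xC0 and 0xC1 here.
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            raise ValueError("unsupported byte in this sketch")
    return "".join(out)


def strict_accepts(data):
    """Return True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


overlong_slash = b"\xc0\xaf"  # overlong 2-byte encoding of "/"

assert b"/" not in overlong_slash             # a byte-level filter sees no slash
assert lenient_decode(overlong_slash) == "/"  # a lenient decoder produces one
assert not strict_accepts(overlong_slash)     # a strict decoder rejects the input
```

An escaping or filtering step that inspects raw bytes misses the slash, while a
lenient decoder further down the pipeline reintroduces it.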

But yes, it is a very similar problem.
