On Wed, May 16, 2018 at 7:39 AM Dmitry Timofeev <[email protected]> wrote:
> Hi,
>
> I am considering whether Protocol Buffers can be used in an application that
> requires a canonical representation of messages coming from an external source.
>
> The encoding and proto3 guides [1, 2] include several requirements for a
> parser that make it accept non-canonical data (this list is probably not
> exhaustive):
> - Message fields may appear in any order
> - There might be multiple instances of the same *non-repeated* field
> - A message may contain unknown fields
> - Default values of primitives may appear on the wire
> - Map entries may appear in any order
> - Repeated fields of primitives may be packed or unpacked.
>
> 1. Is there any natural way to extend the parser with checks of canonical
> form?
>

No.

> By "natural" I mean a compiler and/or runtime plugin, something that does
> not require a fork of the project.
> 2. If not, does such an optional feature make sense in Protocol Buffers? Would
> you accept an option that makes the generated reader code 'strict',
> rejecting non-canonical representations, and, consequently, not
> forward-compatible?
>

Also no here. Compatibility is one of the main reasons to use protobuf because it allows you to evolve your protocol without breaking anyone in a complex system. If you don't need compatibility at all (i.e., you will never change your protocol), using a C++ struct will be much more performant than protobuf because you can skip the whole parsing/serialization cost.

There is a way to mimic the behavior you want, though:
1. Parse the input data into a proto message.
2. Check whether the proto message has any unknown fields; if it does, report an error.
3. Serialize the proto message using deterministic serialization (https://github.com/google/protobuf/blob/master/src/google/protobuf/io/coded_stream.h#L842).
4. Compare the serialized data against the input data; if they match, the input data is in the "canonical form"; if not, report an error.

It will incur an additional serialization cost, but can get you close enough to the canonical form.

>
> Thanks,
> Dmitry
>
> [1] https://developers.google.com/protocol-buffers/docs/encoding
> [2] https://developers.google.com/protocol-buffers/docs/proto3
>
> --
> You received this message because you are subscribed to the Google Groups
> "Protocol Buffers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/protobuf.
> For more options, visit https://groups.google.com/d/optout.
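
The parse / re-serialize / compare check suggested in the reply can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not code from the thread: the names `is_canonical`, `new_message` and `MessageLike` are hypothetical, and the sketch relies on the `ParseFromString`, `DiscardUnknownFields` and `SerializeToString(deterministic=True)` methods of the protobuf Python `Message` class. It also varies step 2 slightly: instead of testing for unknown fields explicitly, it discards them before re-serializing, so any input that carried unknown fields no longer matches its re-serialized form.

```python
from typing import Callable, Protocol


class MessageLike(Protocol):
    """The subset of the protobuf Python Message API this sketch relies on."""

    def ParseFromString(self, data: bytes) -> object: ...
    def DiscardUnknownFields(self) -> None: ...
    def SerializeToString(self, *, deterministic: bool = ...) -> bytes: ...


def is_canonical(data: bytes, new_message: Callable[[], MessageLike]) -> bool:
    """Return True if `data` round-trips byte-for-byte through the parser.

    `new_message` is a hypothetical factory, typically a generated message
    class. Errors raised by ParseFromString for malformed input are
    deliberately not caught here and propagate to the caller.
    """
    msg = new_message()
    # Step 1: parse the input data into a message.
    msg.ParseFromString(data)
    # Step 2 (variation, an assumption of this sketch): drop unknown fields
    # rather than testing for them, so the comparison below fails whenever
    # the input contained any.
    msg.DiscardUnknownFields()
    # Step 3: re-serialize using deterministic serialization.
    reserialized = msg.SerializeToString(deterministic=True)
    # Step 4: accept the input only if it is identical to the re-serialized form.
    return reserialized == data
```

With the `protobuf` package installed, this could be called as `is_canonical(payload, timestamp_pb2.Timestamp)` after `from google.protobuf import timestamp_pb2`. As the reply notes, the result is only "close enough": deterministic serialization is stable for a given binary, but it is not a canonical form guaranteed across languages or library versions.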
