Since I've implemented this before, I have a fairly lengthy list.
Most of these constraints are in the docs, at least in some way/shape/form.
But some are enforced by protoc yet never actually mentioned in the docs
(such as disallowing the use of the "map_entry" and "uninterpreted_option"
options):
- Tags for all fields of a given message must be valid:
  - Must be in the range 1 to 536,870,911 (2^29-1) inclusive.
  - Must *not* be in the reserved range 19,000 to 19,999 (inclusive).
  - All fields must have a unique tag (i.e. no tag re-use allowed).
  - Tags from reserved ranges defined in the message are not allowed.
  - Tags from extension ranges defined in the message are not allowed
    for normal fields. Similarly, extension fields *must* use a tag in
    one of the message's extension ranges.
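As a rough illustration, the tag rules above could be checked along these lines. This is only a sketch in Python: the (name, tag, is_extension) tuples and the function names are invented here and are not part of protoc or any protobuf library. It also lumps normal fields and extensions into one list, which is a simplification.

```python
MAX_TAG = 2**29 - 1             # 536,870,911
IMPL_RESERVED = (19000, 19999)  # the range fields may never use


def in_any(tag, ranges):
    """True if tag falls inside any (start, end) range; both ends inclusive."""
    return any(start <= tag <= end for start, end in ranges)


def check_field_tags(fields, reserved=(), extension_ranges=()):
    """Return a list of error strings for one message's field tags.

    fields: iterable of (name, tag, is_extension) tuples (hypothetical model).
    reserved / extension_ranges: iterables of inclusive (start, end) pairs.
    """
    errors = []
    seen = {}  # tag -> name of the first field that used it
    for name, tag, is_extension in fields:
        if not 1 <= tag <= MAX_TAG:
            errors.append(f"{name}: tag {tag} is outside 1..{MAX_TAG}")
            continue
        if IMPL_RESERVED[0] <= tag <= IMPL_RESERVED[1]:
            errors.append(f"{name}: tag {tag} is in 19000..19999")
        if tag in seen:
            errors.append(f"{name}: tag {tag} already used by {seen[tag]}")
        else:
            seen[tag] = name
        if in_any(tag, reserved):
            errors.append(f"{name}: tag {tag} is reserved in this message")
        if is_extension and not in_any(tag, extension_ranges):
            errors.append(f"{name}: extension tag {tag} is not in an extension range")
        if not is_extension and in_any(tag, extension_ranges):
            errors.append(f"{name}: tag {tag} falls in an extension range")
    return errors
```

Calling check_field_tags([("id", 1, False), ("ext", 100, True)], extension_ranges=[(100, 199)]) returns an empty list; any violation adds one message per broken rule.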
- Other message properties must be valid:
  - No field may be named using a reserved name.
  - Any given reserved range or extension range must not overlap with
    any other reserved range or extension range defined in the message.
  - Reserved names may not contain duplicates.
  - No message is allowed to use the "map_entry" option. (This option
    is used in representing a message as a descriptor, in which a message
    descriptor is synthesized for every map field. Only those synthetic
    messages may have this option.)
  - A field whose type refers to a message must not have a "default"
    option.
  - Map fields and repeated fields also must not have "default" options.
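The overlap and duplicate-name rules above can be sketched as follows. This is an illustrative Python snippet, not protoc's implementation; the function names are made up, and ranges are assumed to be inclusive (start, end) pairs as written in a .proto source file. Reserved and extension ranges would be passed in together as one list, since neither kind may overlap the other.

```python
def find_overlaps(ranges):
    """Return (earlier, later) pairs for every range that overlaps an
    earlier one. Ranges are inclusive (start, end) tuples."""
    overlaps = []
    widest = None  # range seen so far that reaches furthest to the right
    for cur in sorted(ranges):
        if widest is not None and cur[0] <= widest[1]:
            overlaps.append((widest, cur))
        if widest is None or cur[1] > widest[1]:
            widest = cur
    return overlaps


def duplicate_names(names):
    """Return each reserved name that appears more than once, in order."""
    seen, dupes = set(), []
    for name in names:
        if name in seen and name not in dupes:
            dupes.append(name)
        seen.add(name)
    return dupes
```

For example, find_overlaps([(5, 10), (10, 12)]) reports one overlap because both ends are inclusive, so tag 10 belongs to both ranges.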
- Numeric values for all enum values must be valid:
  - Numeric values must be in range for signed 32-bit integers
    (-2,147,483,648 to 2,147,483,647).
  - Numeric values for a single enum must be unique and may not be
    reused *unless* the enum includes the option allow_alias set to true.
  - Numeric values from reserved ranges defined in the enum are not
    allowed.
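A minimal sketch of these enum number checks, assuming an enum is modelled as a list of (name, number) pairs. The model and function name are hypothetical; the signed 32-bit bounds are -2^31 and 2^31-1.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def check_enum_values(values, allow_alias=False, reserved=()):
    """Return error strings for one enum's numeric values.

    values: list of (name, number) in declaration order (hypothetical model).
    reserved: inclusive (start, end) pairs reserved in the enum.
    """
    errors = []
    seen = {}  # number -> name of the first value that used it
    for name, number in values:
        if not INT32_MIN <= number <= INT32_MAX:
            errors.append(f"{name}: {number} does not fit in a signed 32-bit integer")
            continue
        if number in seen and not allow_alias:
            errors.append(f"{name}: {number} already used by {seen[number]}")
        seen.setdefault(number, name)
        if any(lo <= number <= hi for lo, hi in reserved):
            errors.append(f"{name}: {number} is reserved in this enum")
    return errors
```

With allow_alias=True the duplicate check is skipped, so two names may share a number; the range and reserved checks still apply.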
- Other enum properties must be valid:
  - No value may be named using a reserved name.
  - Any given reserved range must not overlap with any other reserved
    range defined in the enum.
  - Reserved names may not contain duplicates.
- Must be able to resolve all relative references using protobuf scoping
  rules:
  - Types in field definitions must resolve to an element that is a
    message or enum.
  - Targets of "extend" blocks must resolve to a message.
  - Request/response types in methods must resolve to a message.
  - Names in custom options must resolve to an extension. That resolved
    extension must extend the appropriate option type (e.g.
    "google.protobuf.MessageOptions" for options that are scoped to
    a message).
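To make the scoping rules concrete, here is a sketch of how a relative type reference might be resolved, going from the innermost scope outward. It follows my understanding of protoc's behavior: the *first* component of the name is searched for scope by scope, and once it is found as a package or message, the rest of the name must resolve inside it (there is no further fallback to outer scopes). The symbol table layout and the function name are assumptions for illustration only, and treating just packages and messages as containers is a simplification.

```python
def resolve_type(name, scope, symbols):
    """Resolve a type reference `name` used inside `scope`.

    symbols: dict of fully-qualified name (no leading dot) -> kind, where
             kind is e.g. "package", "message", "enum", or "field".
    Returns the fully-qualified name, or None if it cannot be resolved.
    """
    if name.startswith("."):  # leading dot: already fully qualified
        full = name[1:]
        return full if full in symbols else None

    first, _, rest = name.partition(".")
    parts = scope.split(".") if scope else []
    while True:
        candidate = ".".join(parts + [first])
        kind = symbols.get(candidate)
        if kind is not None:
            if not rest:
                if kind in ("message", "enum"):
                    return candidate
                # Found a non-type (e.g. a field); keep looking outward.
            elif kind in ("package", "message"):
                # First component found: the remainder must live inside it.
                full = candidate + "." + rest
                return full if full in symbols else None
        if not parts:
            return None
        parts.pop()  # move out one scope and retry
```

One consequence worth noting: if a message nested in the current scope happens to share its name with an outer package, a reference like "foo.Inner" binds its first component to the nested message and then fails, even though "foo.Inner" exists at the top level. A leading dot (".foo.Inner") avoids that.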
- All options must have a value of the correct type:
  - The "correct type" of the value is determined by resolving the option
    name to a field (or extension) of the appropriate option type.
    - Options in file scope must resolve to fields or extensions of
      "google.protobuf.FileOptions"; options in a message scope must
      resolve to fields or extensions of "google.protobuf.MessageOptions";
      etc.
  - Normal options (as opposed to custom options) will refer to field
    names on the option type itself.
    - There are two exceptions, "default" and "json_name" in field
      options: there are no fields with these names on
      "google.protobuf.FieldOptions". (These instead correspond to
      fields named "default_value" and "json_name" on the
      "google.protobuf.FieldDescriptorProto".)
      - The "json_name" option value must be a string.
      - The "default" option value must be the same type as the field
        itself. So it must be a literal integer for fields with an
        integer type, a literal string for fields with a string or bytes
        type, etc. (Default options are not allowed on repeated fields,
        map fields, and fields whose type is a message.)
  - Custom options (those that use parentheses around the first portion
    of the option name; e.g. "(custom.option)") will refer to
    extension fields.
  - Any two option statements in the same scope may not attempt to set
    the same option field(s).
  - If any path element of the option name refers to a message type,
    there can be additional path elements that refer to fields of that
    message type. For example, in a service option that has name
    "(custom.option).foo.bar":
    - "(custom.option)" refers to an extension that extends
      "google.protobuf.ServiceOptions" whose type is a message
      named MessageOne.
    - "(custom.option).foo" refers to a field named "foo" in
      MessageOne. This field's type is a message named MessageTwo.
    - "(custom.option).foo.bar" refers to a field named "bar" in
      MessageTwo.
  - References to repeated fields may not have values that appear to
    be list literals. Instead, a single option that references a
    repeated field defines a single element of the repeated option.
    Subsequent options that reference the same repeated field define
    subsequent elements.
    - Only the *leaf* field named can be a repeated field. If earlier
      path elements name repeated fields, the option is invalid.
      Instead, the option statement must refer only to the *first*
      repeated field in the path and then use an aggregate value (which
      then includes all values for nested repeated fields).
  - References to message fields may use an aggregate value: an
    aggregate value is enclosed in curly braces "{ }" and uses the
    protobuf text format therein to define the message value.
  - Option values whose type is an enum must use unqualified
    identifiers. The identifier must be one of the values in the enum.
  - No option statement may refer to an option named
    "uninterpreted_option". (This field of the various option types is
    used internally by protoc and other parsers to represent
    unresolved/uninterpreted option statements.)
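As an illustration of the option name paths described above, the following sketch splits a name such as "(custom.option).foo.bar" into its path elements and marks which ones are extension references. It is a simplified Python parser written for this example (the function names are made up), and it only handles the name, not the option value. A real implementation would then resolve each element in turn against the option message type.

```python
import re

_IDENT = r"[A-Za-z_][A-Za-z0-9_]*"
# Either "(possibly.dotted.extension.name)" or a plain field name.
_PART = re.compile(r"\(\s*(\.?" + _IDENT + r"(?:\." + _IDENT + r")*)\s*\)|(" + _IDENT + r")")


def parse_option_name(text):
    """Split an option name into (element, is_extension) pairs."""
    parts = []
    pos = 0
    while True:
        match = _PART.match(text, pos)
        if match is None:
            raise ValueError(f"bad option name at offset {pos}: {text!r}")
        if match.group(1) is not None:
            parts.append((match.group(1), True))   # parenthesized extension
        else:
            parts.append((match.group(2), False))  # normal field name
        pos = match.end()
        if pos == len(text):
            return parts
        if text[pos] != ".":
            raise ValueError(f"expected '.' at offset {pos}: {text!r}")
        pos += 1


def check_option_name(text):
    """Parse the name and reject direct use of uninterpreted_option."""
    parts = parse_option_name(text)
    if parts[0] == ("uninterpreted_option", False):
        raise ValueError("option statements may not set uninterpreted_option")
    return parts
```

For "(custom.option).foo.bar" this yields one extension element followed by two plain field elements, matching the MessageOne/MessageTwo walk-through above.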
- If the file indicates syntax = "proto3":
  - No "optional" or "required" labels are allowed on any field
    definition. (Fields are always optional in proto3, unless they have
    the "repeated" label or are map fields.)
  - No messages may define extension ranges.
  - No messages may define groups.
  - No messages may include a field whose type is an enum defined in a
    file with syntax = "proto2".
  - The first value for an enum *must* have a numeric value of zero.
    (Consequently, every enum must have a value whose numeric value is
    zero.)
  - Field definitions may not use the "default" option (default field
    values in proto3 are always the zero value for the field's type).
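A few of the proto3 rules above, sketched in Python. The dict-based field model and the function names are hypothetical. The label check mirrors the rule exactly as listed here; newer protoc releases do accept "optional" in proto3 files, so that check would need adjusting when targeting those.

```python
def check_proto3_enum(values):
    """values: non-empty list of (name, number) in declaration order."""
    first_name, first_number = values[0]
    if first_number != 0:
        return [f"first enum value {first_name} must be 0, got {first_number}"]
    return []


def check_proto3_field(field):
    """field: dict with optional keys 'name', 'label', 'type', 'options'
    (a made-up model for this example)."""
    errors = []
    name = field.get("name", "?")
    if field.get("label") in ("optional", "required"):
        errors.append(f"{name}: label {field['label']!r} not allowed in proto3")
    if field.get("type") == "group":
        errors.append(f"{name}: groups are not allowed in proto3")
    if "default" in field.get("options", {}):
        errors.append(f"{name}: 'default' option not allowed in proto3")
    return errors
```

A field such as {"name": "id", "type": "int32"} passes, while adding "label": "required" or a "default" option produces one error each.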
- If the file indicates syntax = "proto2":
  - Fields and extensions must specify a label: "optional", "repeated",
    or "required" (excluding those in oneof declarations and excluding
    map fields, which must *not* have a label).
  - Extension fields must *not* use the "required" label.
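The proto2 label rules can be sketched the same way. Again this is illustrative Python with an invented function signature, where label is None when the source gives no label.

```python
LABELS = ("optional", "repeated", "required")


def check_proto2_label(label, in_oneof=False, is_map=False, is_extension=False):
    """Return error strings for one proto2 field's label."""
    if in_oneof or is_map:
        # Fields in a oneof and map fields must be written without a label.
        return [] if label is None else ["oneof and map fields must not have a label"]
    if label not in LABELS:
        return ["field must have a label: optional, repeated, or required"]
    if is_extension and label == "required":
        return ["extension fields must not be required"]
    return []
```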
----
*Josh Humphries*
[email protected]
On Sat, Mar 9, 2019 at 2:03 PM Michael Powell <[email protected]> wrote:
> Hello,
>
> I am looking for guidance as far as what steps one must take in order to
> verify that a Proto is valid, link the names, i.e. considering ambiguities
> of Element (i.e. Message or Enum) Type Names during parsing, etc.
>
> Some things seem obvious, such as the Field Numbers must be unique, that
> sort of thing, but a more comprehensive set of guidelines would be helpful.
>
> Cheers, thank you.
>
> Michael W Powell
>
> --
> You received this message because you are subscribed to the Google Groups
> "Protocol Buffers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/protobuf.
> For more options, visit https://groups.google.com/d/optout.
>