On Tuesday, March 12, 2019 at 9:43:38 PM UTC-4, Josh Humphries wrote:
>
> On Tue, Mar 12, 2019 at 7:31 PM Michael Powell <[email protected]> wrote:
>
>> On Monday, March 11, 2019 at 12:24:57 PM UTC-4, Josh Humphries wrote:
>>>
>>> Since I've implemented this before, I have a fairly lengthy list.
>>>
>>> Most of these constraints are in the docs, at least in some
>>> way/shape/form. But some are enforced by protoc but never actually
>>> mentioned in the docs (such as disallowing the use of the "map_entry"
>>> and "uninterpreted_option" options):
>>
>> So, this may sound like a stupid question, but where do you even go to
>> discover heuristics such as these? Well, besides this super comprehensive
>> list, that is.
>
> I go to the source: in this case protoc (and sometimes its C++ source code).
>
> Many I discovered because I was writing a parser/compiler in Go. I
> realized that these were areas where the spec was light and the comments in
> descriptor.proto weren't crystal clear. So I just tried a few things out to
> observe the behavior of protoc.
>
I see, I see. Thanks for pointing that out.

>> For example, it just seems like these are so arbitrary. I'm sure there is a
>> reason why, but in and of itself, "just because" does not seem like an
>> adequate response:
>>
>>    - If the file indicates syntax = "proto2":
>>       - Fields and extensions must specify a label: "optional",
>>         "repeated", or "required" (excluding those in oneof declarations
>>         and excluding map fields, which must *not* have a label).
>>       - Extension fields must *not* use the "required" label.
>>
>> Are these heuristics internal bits? Or are they really coming from Google
>> Protocol Buffers?
>
> I'm not sure I understand the question. I think the answer is that they
> come from Google Protocol Buffers.
>
> Regarding labels, this is a difference in the actual language
> <https://developers.google.com/protocol-buffers/docs/reference/proto2-spec>
> specs
> <https://developers.google.com/protocol-buffers/docs/reference/proto3-spec>.
> I was writing a single parser that could support either, so instead of
> implementing two different specs, I merged the two. And the label thing
> shook out as a difference between the two specs.
>
> For extensions not being required, it's not explicitly documented. But
> given that extensions are not meant to even be known by all clients, it's
> intuitive that they can't be required. It's also hinted at in the docs, where
> it is stated
> <https://developers.google.com/protocol-buffers/docs/proto#updating> that
> non-required fields can safely be converted to extensions and vice versa.
>
>>> - Tags for all fields of a given message must be valid:
>>>    - Must be in the range 1 to 536,870,911 (2^29-1) inclusive.
>>>    - Must *not* be in the reserved range 19,000 to 19,999 (inclusive).
>>>    - All fields must have a unique tag (i.e. no tag re-use allowed).
>>>    - Tags from reserved ranges defined in the message are not allowed.
>>>    - Tags from extension ranges defined in the message are not
>>>      allowed for normal fields. Similarly, extension fields *must* use
>>>      a tag in one of the message's extension ranges.
>>> - Other message properties must be valid:
>>>    - No field may be named using a reserved name.
>>>    - Any given reserved range or extension range must not overlap
>>>      with any other reserved range or extension range defined in the
>>>      message.
>>>    - Reserved names may not contain duplicates.
>>>    - No message is allowed to use the "map_entry" option. (This
>>>      option is used in representing a message as a descriptor, in which a
>>>      message descriptor is synthesized for every map field. Only those
>>>      synthetic messages may have this option.)
>>>    - A field whose type refers to a message must not have a
>>>      "default" option.
>>>    - Map fields and repeated fields also must not have "default"
>>>      options.
>>> - Numeric values for all enum values must be valid:
>>>    - Numeric values must be in range for signed 32-bit integers
>>>      (-2,147,483,648 to 2,147,483,647).
>>>    - Numeric values for a single enum must be unique and may not be
>>>      reused *unless* the enum includes the option allow_alias set to
>>>      true.
>>>    - Numeric values from reserved ranges defined in the enum are not
>>>      allowed.
>>> - Other enum properties must be valid:
>>>    - No value may be named using a reserved name.
>>>    - Any given reserved range must not overlap with any other
>>>      reserved range defined in the enum.
>>>    - Reserved names may not contain duplicates.
>>> - Must be able to resolve all relative references using protobuf
>>>   scoping rules:
>>>    - Types in field definitions must resolve to an element that is a
>>>      message or enum.
>>>    - Targets of "extend" blocks must resolve to a message.
>>>    - Request/response types in methods must resolve to a message.
>>>    - Names in custom options must resolve to an extension. That
>>>      resolved extension must extend the appropriate option type (e.g.
>>>      "google.protobuf.MessageOptions" for options that are scoped to a
>>>      message).
>>> - All options must have a value of the correct type:
>>>    - The "correct type" of the value is determined by resolving the
>>>      option name to a field (or extension) of the appropriate option type.
>>>       - Options in file scope must resolve to fields or extensions
>>>         of "google.protobuf.FileOptions"; options in a message scope must
>>>         resolve to fields or extensions of
>>>         "google.protobuf.MessageOptions"; etc.
>>>       - Normal options (as opposed to custom options) will refer to
>>>         field names on the option type itself.
>>>          - There are two exceptions: "default" and "json_name" in field
>>>            options: there are no fields with these names on
>>>            "google.protobuf.FieldOptions". (These instead correspond to
>>>            fields named "default_value" and "json_name" on the
>>>            "google.protobuf.FieldDescriptorProto".)
>>>          - The "json_name" option value must be a string.
>>>          - The "default" option value must be the same type as the
>>>            field itself. So it must be a literal integer for fields with
>>>            an integer type, a literal string for fields with a string or
>>>            bytes type, etc. (Default options are not allowed on repeated
>>>            fields, map fields, and fields whose type is a message.)
>>>       - Custom options (those that use parentheses around the first
>>>         portion of the option name; e.g. "(custom.option)") will refer to
>>>         extension fields.
>>>    - Any two option statements in the same scope may not attempt to
>>>      set the same option field(s).
>>>    - If any path of the option name refers to a message type, there
>>>      can be additional path elements that refer to fields of that message
>>>      type. For example, in a service option that has name
>>>      "(custom.option).foo.bar":
>>>       - "(custom.option)" refers to an extension that extends
>>>         "google.protobuf.ServiceOptions" whose type is a message named
>>>         MessageOne.
>>>       - "(custom.option).foo" refers to a field named "foo" in
>>>         MessageOne. This field's type is a message named MessageTwo.
>>>       - "(custom.option).foo.bar" refers to a field named "bar" in
>>>         MessageTwo.
>>>    - References to repeated fields may not have values that
>>>      appear to be list literals. Instead, a single option that references
>>>      a repeated field defines a single element of the repeated option.
>>>      Subsequent options that reference the same repeated field define
>>>      subsequent elements.
>>>    - Only the *leaf* field named can be a repeated field. If earlier
>>>      path elements name repeated fields, the option is invalid.
>>>      Instead, the option statement must refer only to the *first*
>>>      repeated field in the path and then use an aggregate value (which
>>>      then includes all values for nested repeated fields).
>>>    - References to message fields may use an aggregate value: an
>>>      aggregate value is enclosed in curly braces "{ }" and uses the
>>>      protobuf text format therein to define the message value.
>>>    - Option values whose type is an enum must use unqualified
>>>      identifiers. The identifier must be one of the values in the enum.
>>> - No option statement may refer to an option named
>>>   "uninterpreted_option". (This field of the various option types is used
>>>   internally by protoc and other parsers to represent
>>>   unresolved/uninterpreted option statements.)
>>> - If the file indicates syntax = "proto3":
>>>    - No "optional" or "required" labels are allowed on any field
>>>      definition. (Fields are always optional in proto3, unless they have
>>>      the "repeated" label or are map fields.)
>>>    - No messages may define extension ranges.
>>>    - No messages may define groups.
>>>    - No messages may include a field whose type is an enum defined
>>>      in a file with syntax = "proto2".
>>>    - The first value for an enum *must* have a numeric value of
>>>      zero. (Similarly: every enum must have a value whose numeric value is
>>>      zero.)
>>>    - Field definitions may not use the "default" option (default
>>>      field values in proto3 are always the zero value for the field's
>>>      type).
>>> - If the file indicates syntax = "proto2":
>>>    - Fields and extensions must specify a label: "optional",
>>>      "repeated", or "required" (excluding those in oneof declarations and
>>>      excluding map fields, which must *not* have a label).
>>>    - Extension fields must *not* use the "required" label.
>>>
>>> ----
>>> *Josh Humphries*
>>> [email protected]
>>>
>>> On Sat, Mar 9, 2019 at 2:03 PM Michael Powell <[email protected]>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I am looking for guidance as far as what steps one must take in order
>>>> to verify that a Proto is valid, link the names, i.e. considering
>>>> ambiguities of Element (i.e. Message or Enum) Type Names during parsing,
>>>> etc.
>>>>
>>>> Some things seem obvious, such as the Field Numbers must be unique,
>>>> that sort of thing, but a more comprehensive set of guidelines would be
>>>> helpful.
>>>>
>>>> Cheers, thank you.
>>>>
>>>> Michael W Powell
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Protocol Buffers" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at https://groups.google.com/group/protobuf.
>>>> For more options, visit https://groups.google.com/d/optout.
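For what it's worth, the tag rules from the quoted list (the 1 to 2^29-1 range, the 19,000 to 19,999 block, uniqueness, and the reserved/extension range checks) can be sketched in a few lines. This is only an illustration, not how protoc does it: the Field class, check_tags, and the (start, end) range tuples are hypothetical names made up for the example, and it lumps a message's normal fields and its extensions into one uniqueness check for simplicity.

```python
from dataclasses import dataclass

MAX_TAG = 2**29 - 1                  # 536,870,911
IMPL_RESERVED = range(19000, 20000)  # 19,000 to 19,999 inclusive


@dataclass
class Field:
    """Hypothetical stand-in for a parsed field or extension field."""
    name: str
    tag: int
    is_extension: bool = False


def in_any(tag, ranges):
    """True if tag falls inside any inclusive (start, end) range."""
    return any(start <= tag <= end for start, end in ranges)


def check_tags(fields, reserved=(), extension_ranges=()):
    """Return a list of error strings; an empty list means every tag passed."""
    errors = []
    seen = {}  # tag -> name of the first field that used it
    for f in fields:
        if not 1 <= f.tag <= MAX_TAG:
            errors.append(f"{f.name}: tag {f.tag} is outside 1..{MAX_TAG}")
        elif f.tag in IMPL_RESERVED:
            errors.append(f"{f.name}: tag {f.tag} is in 19000..19999")

        if f.tag in seen:
            errors.append(f"{f.name}: tag {f.tag} already used by {seen[f.tag]}")
        else:
            seen[f.tag] = f.name

        if in_any(f.tag, reserved):
            errors.append(f"{f.name}: tag {f.tag} is in a reserved range")

        in_ext = in_any(f.tag, extension_ranges)
        if f.is_extension and not in_ext:
            errors.append(f"{f.name}: extension tag {f.tag} is not in an extension range")
        if not f.is_extension and in_ext:
            errors.append(f"{f.name}: normal field uses extension tag {f.tag}")
    return errors
```

Calling check_tags with a message's fields plus its reserved and extension ranges returns one string per violation, so a caller can report all problems at once instead of stopping at the first.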
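The enum rules from the list can be sketched the same way. Again this is a rough illustration under my own assumptions: check_enum and its parameters are hypothetical names, values are plain (name, number) pairs in declaration order, and the numeric bound used is the standard signed 32-bit range (-2^31 to 2^31-1).

```python
INT32_MIN = -2**31      # -2,147,483,648
INT32_MAX = 2**31 - 1   #  2,147,483,647


def check_enum(values, allow_alias=False, reserved_names=(),
               reserved_ranges=(), syntax="proto2"):
    """Validate (name, number) pairs; return a list of error strings."""
    errors = []
    seen = {}  # number -> name of the first value that used it
    for name, number in values:
        if not INT32_MIN <= number <= INT32_MAX:
            errors.append(f"{name}: {number} does not fit in a signed 32-bit integer")

        # Re-using a number is only legal when allow_alias is set to true.
        if number in seen and not allow_alias:
            errors.append(f"{name}: {number} already used by {seen[number]}")
        seen.setdefault(number, name)

        if name in reserved_names:
            errors.append(f"{name}: name is reserved")
        if any(lo <= number <= hi for lo, hi in reserved_ranges):
            errors.append(f"{name}: {number} is in a reserved range")

    # proto3 only: the first declared value must be zero.
    if syntax == "proto3" and values and values[0][1] != 0:
        errors.append("first enum value must be zero in proto3")
    return errors
```

An empty result means the enum passed these particular checks; it says nothing about the name-resolution or option rules, which need the whole file (and its imports) to evaluate.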
