On Tuesday, March 12, 2019 at 9:43:38 PM UTC-4, Josh Humphries wrote:
>
>
> On Tue, Mar 12, 2019 at 7:31 PM Michael Powell <[email protected]> wrote:
>
>>
>>
>> On Monday, March 11, 2019 at 12:24:57 PM UTC-4, Josh Humphries wrote:
>>>
>>> Since I've implemented this before, I have a fairly lengthy list.
>>>
>>> Most of these constraints are in the docs, at least in some 
>>> way/shape/form. But some are enforced by protoc but never actually 
>>> mentioned in the docs (such as disallowing the use of the "map_entry" and 
>>> "uninterpreted_option" options):
>>>
>>
>> So, this may sound like a stupid question, but where do you even go to 
>> discover heuristics such as these? Well, besides this super comprehensive 
>> list, that is.
>>
>
> I go to the source: in this case protoc (and sometimes its C++ source code).
>
> Many I discovered because I was writing a parser/compiler in Go. I 
> realized that these were areas where the spec was light and the comments in 
> descriptor.proto weren't crystal clear. So I just tried a few things out to 
> observe the behavior of protoc.
>

I see, I see. Thanks for pointing that out. 

>> For example, it just seems like these are so arbitrary. I'm sure there is 
>> a reason why, but in and of itself, "just because" does not seem like an 
>> adequate response:
>>
>>
>>    - If the file indicates syntax = "proto2":
>>       - Fields and extensions must specify a label: "optional", 
>>       "repeated", or "required" (excluding those in oneof declarations and 
>>       excluding map fields, which must *not* have a label).
>>       - Extension fields must *not* use the "required" label.
>>    
>>
>> Are these heuristics internal bits? Or are they really coming from Google 
>> Protocol Buffers?
>>
>
> I'm not sure I understand the question. I think the answer is that they 
> come from Google Protocol Buffers.
>
> Regarding labels, this is a difference in the actual language 
> <https://developers.google.com/protocol-buffers/docs/reference/proto2-spec> 
> specs 
> <https://developers.google.com/protocol-buffers/docs/reference/proto3-spec>. 
> I was writing a single parser that could support either, so instead of 
> implementing two different specs, I merged the two. And the label thing 
> shook out as a difference between the two specs.
>
> For extensions not being required, it's not explicitly documented. But 
> given that extensions are not meant to even be known by all clients, it's 
> intuitive that they can't be required. It's also hinted at in the docs, where 
> it is stated 
> <https://developers.google.com/protocol-buffers/docs/proto#updating> that 
> non-required fields can safely be converted to extensions and vice versa.
>  
>
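To make the label rule above concrete, here is a minimal sketch in Python of how a single parser might check labels for both syntaxes. The function name `check_label` and its parameters are hypothetical, not taken from protoc or from any existing parser. It also assumes that oneof members and map fields reject labels under proto3 as well; the list only states that for proto2.

```python
from typing import Optional


def check_label(syntax: str, label: Optional[str], *, in_oneof: bool = False,
                is_map: bool = False, is_extension: bool = False) -> Optional[str]:
    """Return an error message, or None if the label is acceptable.

    `label` is "optional", "repeated", "required", or None when absent.
    """
    if in_oneof or is_map:
        # Assumption: oneof members and map fields carry no label in
        # either syntax.
        return "label not allowed here" if label is not None else None
    if syntax == "proto2":
        if label not in ("optional", "repeated", "required"):
            return "proto2 fields and extensions must specify a label"
        if is_extension and label == "required":
            return "extension fields must not be required"
        return None
    if syntax == "proto3":
        # Only "repeated" (or no label at all) is accepted here.
        if label not in (None, "repeated"):
            return "proto3 does not allow this label"
        return None
    return f"unknown syntax {syntax!r}"
```

For instance, `check_label("proto2", "required", is_extension=True)` returns an error string, while `check_label("proto3", None)` returns `None`.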
>>  
>>
>>>
>>>    - Tags for all fields of a given message must be valid:
>>>       - Must be in the range 1 to 536,870,911 (2^29-1) inclusive.
>>>       - Must *not* be in the reserved range 19,000 to 19,999 
>>>       (inclusive).
>>>       - All fields must have a unique tag (i.e. no tag re-use allowed).
>>>       - Tags from reserved ranges defined in the message are not 
>>>       allowed.
>>>       - Tags from extension ranges defined in the message are not 
>>>       allowed for normal fields. Similarly, extension fields *must* use 
>>>       a tag in one of the message's extension ranges.
>>>    - Other message properties must be valid:
>>>       - No field may be named using a reserved name.
>>>       - Any given reserved range or extension range must not overlap 
>>>       with any other reserved range or extension range defined in the 
>>>       message.
>>>       - Reserved names may not contain duplicates.
>>>       - No message is allowed to use the "map_entry" option. (This 
>>>       option is used in representing a message as a descriptor, in which a 
>>>       message descriptor is synthesized for every map field. Only those 
>>>       synthetic messages may have this option.)
>>>       - A field whose type refers to a message must not have a 
>>>       "default" option.
>>>       - Map fields and repeated fields also must not have "default" 
>>>       options.
>>>    - Numeric values for all enum values must be valid:
>>>       - Numeric values must be in range for signed 32-bit integers 
>>>       (-2,147,483,648 to 2,147,483,647).
>>>       - Numeric values for a single enum must be unique and may not be 
>>>       reused *unless* the enum includes the option allow_alias set to 
>>>       true.
>>>       - Numeric values from reserved ranges defined in the enum are not 
>>>       allowed.
>>>    - Other enum properties must be valid:
>>>       - No value may be named using a reserved name.
>>>       - Any given reserved range must not overlap with any other 
>>>       reserved range defined in the enum.
>>>       - Reserved names may not contain duplicates.
>>>    - Must be able to resolve all relative references using protobuf 
>>>    scoping rules:
>>>       - Types in field definitions must resolve to an element that is a 
>>>       message or enum.
>>>       - Targets of "extend" blocks must resolve to a message.
>>>       - Request/response types in methods must resolve to a message.
>>>       - Names in custom options must resolve to an extension. That 
>>>       resolved extension must extend the appropriate option type (e.g. 
>>>       "google.protobuf.MessageOptions" for options that are scoped to a 
>>>       message).
>>>    - All options must have a value of the correct type:
>>>       - The "correct type" of the value is determined by resolving the 
>>>       option name to a field (or extension) of the appropriate option type.
>>>          - Options in file scope must resolve to fields or extensions 
>>>          of "google.protobuf.FileOptions"; options in a message scope must 
>>>          resolve to fields or extensions of 
>>>          "google.protobuf.MessageOptions"; etc.
>>>       - Normal options (as opposed to custom options) will refer to 
>>>       field names on the option type itself.
>>>          - There are two exceptions: "default" and "json_name" in field 
>>>          options: there are no fields with these names on 
>>>          "google.protobuf.FieldOptions". (These instead correspond to 
>>>          fields named "default_value" and "json_name" on the 
>>>          "google.protobuf.FieldDescriptorProto".)
>>>          - The "json_name" option value must be a string.
>>>          - The "default" option value must be the same type as the 
>>>          field itself. So it must be a literal integer for fields with an 
>>>          integer type, a literal string for fields with a string or bytes 
>>>          type, etc. (Default options are not allowed on repeated fields, 
>>>          map fields, and fields whose type is a message.)
>>>       - Custom options (those that use parentheses around the first 
>>>       portion of the option name; e.g. "(custom.option)") will refer to 
>>>       extension fields.
>>>       - Any two option statements in the same scope may not attempt to 
>>>       set the same option field(s).
>>>       - If any path of the option name refers to a message type, there 
>>>       can be additional path elements that refer to fields of that message 
>>>       type. For example, in a service option that has name 
>>>       "(custom.option).foo.bar":
>>>          - "(custom.option)" refers to an extension that extends 
>>>          "google.protobuf.ServiceOptions" whose type is a message named 
>>>          MessageOne.
>>>          - "(custom.option).foo" refers to a field named "foo" in 
>>>          MessageOne. This field's type is a message named MessageTwo.
>>>          - "(custom.option).foo.bar" refers to a field named "bar" in 
>>>          MessageTwo.
>>>       - References to repeated fields may not have values that 
>>>       appear to be list literals. Instead, a single option that references 
>>>       a repeated field defines a single element of the repeated option. 
>>>       Subsequent options that reference the same repeated field define 
>>>       subsequent elements.
>>>          - Only the *leaf* field named can be a repeated field. If earlier 
>>>          path elements name repeated fields, the option is invalid. 
>>>          Instead, the option statement must refer only to the *first* 
>>>          repeated field in the path and then use an aggregate value (which 
>>>          then includes all values for nested repeated fields).
>>>       - References to message fields may use an aggregate value: an 
>>>       aggregate value is enclosed in curly braces "{ }" and uses the 
>>>       protobuf text format therein to define the message value.
>>>       - Option values whose type is an enum must use unqualified 
>>>       identifiers. The identifier must be one of the values in the enum.
>>>    - No option statement may refer to an option named 
>>>    "uninterpreted_option". (This field of the various option types is used 
>>>    internally by protoc and other parsers to represent 
>>>    unresolved/uninterpreted option statements.)
>>>    - If the file indicates syntax = "proto3":
>>>       - No "optional" or "required" labels are allowed on any field 
>>>       definition. (Fields are always optional in proto3, unless they have 
>>>       the "repeated" label or are map fields.)
>>>       - No messages may define extension ranges.
>>>       - No messages may define groups.
>>>       - No messages may include a field whose type is an enum defined 
>>>       in a file with syntax = "proto2".
>>>       - The first value for an enum *must* have a numeric value of 
>>>       zero. (Similarly: every enum must have a value whose numeric value is 
>>>       zero.)
>>>       - Field definitions may not use the "default" option (default 
>>>       field values in proto3 are always the zero value for the field's 
>>>       type).
>>>    - If the file indicates syntax = "proto2":
>>>       - Fields and extensions must specify a label: "optional", 
>>>       "repeated", or "required" (excluding those in oneof declarations and 
>>>       excluding map fields, which must *not* have a label).
>>>       - Extension fields must *not* use the "required" label.
>>>    
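The field tag rules in the list above can be sketched roughly as follows. This is only an illustration in Python: `check_tags` and its tuple-based inputs are hypothetical, and for simplicity it treats a message's normal fields and the extensions targeting that message as one flat list.

```python
MAX_TAG = 2**29 - 1                    # 536,870,911
IMPL_RESERVED = range(19_000, 20_000)  # 19,000 to 19,999 inclusive


def check_tags(fields, reserved=(), extension_ranges=()):
    """Validate field tags for one message.

    fields: iterable of (name, tag, is_extension) tuples.
    reserved, extension_ranges: iterables of inclusive (start, end) pairs.
    Returns a list of error strings, empty when every tag is valid.
    """
    def in_any(tag, ranges):
        return any(lo <= tag <= hi for lo, hi in ranges)

    errors = []
    seen = {}  # tag -> name of the first field that used it
    for name, tag, is_ext in fields:
        if not 1 <= tag <= MAX_TAG:
            errors.append(f"{name}: tag {tag} is out of range")
        elif tag in IMPL_RESERVED:
            errors.append(f"{name}: tag {tag} is in 19000-19999")
        if tag in seen:
            errors.append(f"{name}: tag {tag} already used by {seen[tag]}")
        seen.setdefault(tag, name)
        if in_any(tag, reserved):
            errors.append(f"{name}: tag {tag} is reserved")
        # Extensions must fall inside an extension range; normal fields
        # must stay outside of every extension range.
        if is_ext != in_any(tag, extension_ranges):
            errors.append(f"{name}: tag {tag} conflicts with extension ranges")
    return errors
```

Collecting every error rather than stopping at the first one is a design choice made for this sketch; it says nothing about how protoc reports them.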
>>>
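The enum rules from the list could be checked along the same lines. Again a hedged Python sketch: `check_enum` and its arguments are made-up names, and the "first value must be zero" rule is applied only when the caller says the file is proto3.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def check_enum(values, *, syntax="proto2", allow_alias=False, reserved=()):
    """Validate the numeric values of one enum.

    values: list of (name, number) pairs in declaration order.
    reserved: iterable of inclusive (start, end) pairs.
    Returns a list of error strings, empty when the enum is valid.
    """
    errors = []
    if syntax == "proto3" and values and values[0][1] != 0:
        errors.append("first enum value must be zero in proto3")
    seen = set()
    for name, number in values:
        if not INT32_MIN <= number <= INT32_MAX:
            errors.append(f"{name}: {number} does not fit in a signed 32-bit int")
        if number in seen and not allow_alias:
            errors.append(f"{name}: {number} reused without allow_alias")
        seen.add(number)
        if any(lo <= number <= hi for lo, hi in reserved):
            errors.append(f"{name}: {number} is reserved")
    return errors
```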
>>>
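For option names such as "(custom.option).foo.bar", the first step before any resolution is splitting the name into path elements and noting which ones are parenthesized extension references. Below is a small Python sketch of that step only. `split_option_name` is a hypothetical helper, and the regular expression is a simplification of the identifier grammar rather than a copy of the spec.

```python
import re

# Either "(qualified.name)" for an extension, or a plain field identifier.
_PART = re.compile(r"\(\s*(\.?[A-Za-z_][\w.]*)\s*\)|([A-Za-z_]\w*)")


def split_option_name(name):
    """Split an option name into (element, is_extension) pairs."""
    parts, pos = [], 0
    while True:
        m = _PART.match(name, pos)
        if m is None:
            raise ValueError(f"bad option name {name!r} at offset {pos}")
        if m.group(1) is not None:
            parts.append((m.group(1), True))   # parenthesized: extension
        else:
            parts.append((m.group(2), False))  # bare: field of option type
        pos = m.end()
        if pos == len(name):
            return parts
        if name[pos] != ".":
            raise ValueError(f"bad option name {name!r} at offset {pos}")
        pos += 1  # skip the "." between path elements
```

Each resulting element would then be resolved in turn: the first against the option type for the current scope (e.g. "google.protobuf.ServiceOptions"), and each later one against the message type of the preceding element.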
>>> ----
>>> *Josh Humphries*
>>> [email protected]
>>>
>>>
>>> On Sat, Mar 9, 2019 at 2:03 PM Michael Powell <[email protected]> 
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I am looking for guidance as far as what steps one must take in order 
>>>> to verify that a Proto is valid, link the names, i.e. considering 
>>>> ambiguities of Element (i.e. Message or Enum) Type Names during parsing, 
>>>> etc.
>>>>
>>>> Some things seem obvious, such as the Field Numbers must be unique, 
>>>> that sort of thing, but a more comprehensive set of guidelines would be 
>>>> helpful.
>>>>
>>>> Cheers, thank you.
>>>>
>>>> Michael W Powell
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "Protocol Buffers" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at https://groups.google.com/group/protobuf.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>