This was a great description. I appreciate you taking the time to write 
that out. I hadn't been able to find something as clear as this in the 
documentation. Thank you!
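
A minimal Python sketch of the path walk described in the quoted reply below. The FIELDS table is copied from the descriptor.proto excerpts in that reply and is deliberately partial. FIELDS and describe_path are illustrative names, not part of the protobuf library, and the sketch needs nothing beyond the standard library.

```python
# Partial field-number tables, copied from the descriptor.proto excerpts
# quoted in this thread. Each entry maps
#   field number -> (field name, message type of the element or None, repeated?)
FIELDS = {
    "FileDescriptorProto": {
        1: ("name", None, False),
        2: ("package", None, False),
        3: ("dependency", None, True),
        4: ("message_type", "DescriptorProto", True),
        5: ("enum_type", "EnumDescriptorProto", True),
        6: ("service", "ServiceDescriptorProto", True),
        7: ("extension", "FieldDescriptorProto", True),
    },
    "DescriptorProto": {
        1: ("name", None, False),
        2: ("field", "FieldDescriptorProto", True),
        3: ("nested_type", "DescriptorProto", True),
        4: ("enum_type", "EnumDescriptorProto", True),
        6: ("extension", "FieldDescriptorProto", True),
    },
}


def describe_path(path, root="FileDescriptorProto"):
    """Turn a SourceCodeInfo path into a dotted accessor string."""
    parts = []
    current = root
    i = 0
    while i < len(path):
        table = FIELDS.get(current)
        if table is None or path[i] not in table:
            # Outside the partial tables above: keep the raw numbers.
            parts.append("?" + str(list(path[i:])))
            break
        name, child, repeated = table[path[i]]
        i += 1
        if repeated and i < len(path):
            # For a repeated field, the next number is an array index.
            parts.append(f"{name}[{path[i]}]")
            i += 1
        else:
            parts.append(name)
        current = child
    return ".".join(parts)


print(describe_path([4, 0, 2, 1]))        # message_type[0].field[1]
print(describe_path([4, 0, 3, 1, 2, 3]))  # message_type[0].nested_type[1].field[3]
```

Extending the table with more messages from descriptor.proto (enums, services, fields, options) follows the same pattern. Anything the table does not cover is left as raw numbers after a "?".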

On Friday, September 9, 2022 at 5:07:10 PM UTC-4 [email protected] wrote:

> Yeah, it's a bit confusing, but the numbers are not types.  They are 
> field numbers and array indices -- that's all. So the table is 
> descriptor.proto. 
>
> Start with the FileDescriptorProto and dereference by field number. If you 
> hit an array (a repeated field), the next number is an index:
> message FileDescriptorProto { 
> optional string name = 1; // file name, relative to root of source tree 
> optional string package = 2; // e.g. "foo", "foo.bar", etc. 
> // Names of files imported by this file. 
> repeated string dependency = 3; 
> ...
> // All top-level definitions in this file. 
> repeated DescriptorProto message_type = 4; 
> repeated EnumDescriptorProto enum_type = 5; 
> repeated ServiceDescriptorProto service = 6; 
> repeated FieldDescriptorProto extension = 7; 
> }
>
> A path starting with 1 would refer to the file name (shouldn't have any 
> further numbers).
> A path starting with 2 would refer to the package (shouldn't have any 
> further numbers).
> A path starting with 3 would refer to a dependency (import), the next 
> number in the path is an array index (which dependency)
> A path starting with 4 refers to a message, the next number is the array 
> index that tells you which message.
> A path starting with 5 refers to an enum, the next number is the array 
> index that tells you which enum.
> A path starting with 6 refers to a service, the next number is the array 
> index that tells you which service.
>
> If you have a path starting with [4,0,...] you are looking at 
> fileDescriptor.getMessageType(0);
> If you have a path starting with [4,1,...] you are looking at 
> fileDescriptor.getMessageType(1);
> If you have a path starting with [5,2,...] you are looking at 
> fileDescriptor.getEnumType(2);
>
> The next path element tells you the field number within the top-level 
> item's descriptor. For example, paths that point into a top-level message 
> definition:
> message DescriptorProto { 
> optional string name = 1; 
> repeated FieldDescriptorProto field = 2; 
> repeated FieldDescriptorProto extension = 6; 
> repeated DescriptorProto nested_type = 3; 
> repeated EnumDescriptorProto enum_type = 4; 
> ...
> }
>
> A path [4,0,2,1,...] corresponds to: fileDescriptor.getMessageType(0).
> getField(1);
> A path [4,0,3,1,2,3...] corresponds to: fileDescriptor.getMessageType(0).
> getNestedType(1).getField(3);
>
> Just follow the proto field numbers.
>
>
> On Fri, Sep 9, 2022 at 2:45 PM Kyle Papili <[email protected]> wrote:
>
>> My question is far dumber haha. Is there a table that describes what 
>> Field numbers correlate to what object types?
>>
>> I've seen 1,2,3,4,5,6,7 show up in paths as field numbers. My naive brain 
>> was under the impression that they correlated to object types, no?
>>
>> 4: Message, 5: Enum, 6: Extension??
>>
>> Is this not correct? Is there a table that can show me what each field 
>> number correlates to?
>>
>> On Friday, September 9, 2022 at 4:38:21 PM UTC-4 [email protected] wrote:
>>
>>> The ints in the path should be the field numbers and array indices along 
>>> the way from the top-level file descriptor proto, like this:
>>> Path: [4, 0, 2, 0]
>>> Starting with the FileDescriptorProto:
>>> 4 -> FileDescriptorProto {
>>>   ...
>>> * repeated DescriptorProto message_type = 4;*
>>>  repeated EnumDescriptorProto enum_type = 5;
>>> }
>>> 0 -> index into FileDescriptorProto.message_type[0]
>>> 2 -> DescriptorProto {
>>>   optional string name = 1;
>>> *  repeated FieldDescriptorProto field = 2;*
>>>  ...
>>> }
>>> 0 -> index into DescriptorProto.field[0] 
>>>
>>> Thus this path/Location [4, 0, 2, 0] applies to the whole field 
>>> statement.
>>> I believe the index of a message in the message_type array generally 
>>> corresponds to the order of all top-level message items in the file.
>>> I also believe that the index of a field likewise corresponds to the 
>>> ordering of fields within the message.
>>>
>>> So if you have to deal with nested messages, the path will start with:
>>> [4, (top-level-message-index), 3, (index-of-nested-message-type), ...]
>>>
>>> If I remember correctly, this breaks down for options because sometimes 
>>> the comments/location for an option are dropped, and when they are present 
>>> the path points to field 999, the uninterpreted options.
>>>
>>> But maybe you already had gotten that far and I misunderstood 
>>> your question.
>>>
>>>
>>> On Fri, Sep 9, 2022 at 2:15 PM Kyle Papili <[email protected]> wrote:
>>>
>>>> Is there somewhere in the documentation that provides a clear table 
>>>> describing which numbers in the path correlate to which types? I have 
>>>> found some inconsistencies with what I had thought. Any link to a table 
>>>> like this?
>>>>
>>>> On Friday, September 9, 2022 at 4:14:46 PM UTC-4 [email protected] 
>>>> wrote:
>>>>
>>>>> Ah, no, there is no magic. I only meant that if you wanted to have one 
>>>>> part of your code match up location data to descriptor object and attach 
>>>>> the location info directly, you could do it in a custom option. There's 
>>>>> no getting around the actual awkward stepping through the paths to match 
>>>>> them up.
>>>>>
>>>>> On Fri, Sep 9, 2022 at 2:08 PM Kyle Papili <[email protected]> wrote:
>>>>>
>>>>>> Yes, the "hacky method" proposed by shaod@ is basically what I am 
>>>>>> doing currently. It just seems to be unnecessarily complicated. 
>>>>>>
>>>>>> What do you mean by "store the location object in a custom option 
>>>>>> extension on the object in question"? How would I store the location 
>>>>>> object as a custom extension of the object without knowing the object? 
>>>>>> If I knew the object that the location corresponded to, then my problem 
>>>>>> would be resolved. The only way to match up location objects to Proto 
>>>>>> objects, from what I've found, is the hacky path traversal suggested by 
>>>>>> shaod@. Am I missing something here?
>>>>>>
>>>>>> On Friday, September 9, 2022 at 4:02:05 PM UTC-4 [email protected] 
>>>>>> wrote:
>>>>>>
>>>>>>> Unfortunately, the only way to know the path to the Location object 
>>>>>>> is to know the path to the descriptor proto object in question.
>>>>>>> Alternatively, you could iterate through all the sourcecodeinfo 
>>>>>>> elements and use their paths to navigate to the correct descriptor 
>>>>>>> object.
>>>>>>> One technique I have used in the past is to iterate through all the 
>>>>>>> sourcecodeinfo elements and store the location object in a custom 
>>>>>>> option extension on the object in question (or the parent object if it 
>>>>>>> is something that doesn't have options).
>>>>>>>
>>>>>>> Also, as shaod@ points out, some comments will not show up in 
>>>>>>> sourcecodeinfo.
>>>>>>>
>>>>>>> On Wednesday, September 7, 2022 at 4:11:32 PM UTC-6 [email protected] 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> First keep in mind that some comments are detached and thus ignored 
>>>>>>>> by SourceCodeInfo.
>>>>>>>>
>>>>>>>> That being said, IIRC I've seen a very hacky way to achieve similar 
>>>>>>>> goals: 
>>>>>>>> https://github.com/protocolbuffers/protobuf/blob/3322c0b92a5001ade92608d75891d63c749d624d/src/google/protobuf/compiler/parser_unittest.cc#L2472
>>>>>>>> On Thursday, September 1, 2022 at 7:16:52 AM UTC-7 
>>>>>>>> [email protected] wrote:
>>>>>>>>
>>>>>>>>> I'm parsing a large number of protobuf files and am using the 
>>>>>>>>> Source Code Info descriptor to extract comment data from the source 
>>>>>>>>> files as well. I currently use the FileDescriptorProto.ListFields() 
>>>>>>>>> method to extract the DescriptorProto objects I care about as well 
>>>>>>>>> as the SourceCodeInfo.
>>>>>>>>>
>>>>>>>>> To my knowledge, the only way to pair up Location fields with the 
>>>>>>>>> corresponding objects is via the path attribute 
>>>>>>>>> <https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.descriptor.pb>.
>>>>>>>>>  
>>>>>>>>> This is fine, except for the fact that it involves me manually 
>>>>>>>>> stepping through said path to land at my parsed Protobuf object. This 
>>>>>>>>> gets complicated when dealing with layers of nested_types, and I am 
>>>>>>>>> convinced there must be a way for me to extract the path from the 
>>>>>>>>> particular DescriptorProto object and then use that to match up the 
>>>>>>>>> object with the path specified in the corresponding Location field.
>>>>>>>>>
>>>>>>>>> In short: How can I easily pair up DescriptorProto objects with 
>>>>>>>>> the Location objects that correspond to them? Specifically for 
>>>>>>>>> comment parsing purposes.
>>>>>>>>>
>>>>>>>> -- 
>>>>>> You received this message because you are subscribed to the Google 
>>>>>> Groups "Protocol Buffers" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>> send an email to [email protected].
>>>>>> To view this discussion on the web visit 
>>>>>> https://groups.google.com/d/msgid/protobuf/0c8d36db-53f9-4179-942f-201cd205b9dfn%40googlegroups.com
>>>>>>  
>>>>>> <https://groups.google.com/d/msgid/protobuf/0c8d36db-53f9-4179-942f-201cd205b9dfn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>>
>>>>>
>>>>>
>>>>> -- 
>>>>> Jerry Berg | Software Engineer | [email protected] | 720-808-1188
>>>>>
>
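
For the original question at the bottom of the thread (pairing DescriptorProto objects with their Location entries), one option is to go in the other direction: compute each message's path while walking the descriptor tree, then look that path up in an index built from the Location list. The sketch below is only an illustration under stated assumptions. iter_messages and comments_by_path are hypothetical helper names, and the SimpleNamespace objects are stand-ins that mimic the attribute names from descriptor.proto (message_type, nested_type, source_code_info.location, path, leading_comments) so the example runs without the protobuf package. It covers messages only, not fields, enums or services.

```python
from types import SimpleNamespace

MESSAGE_TYPE = 4  # FileDescriptorProto.message_type
NESTED_TYPE = 3   # DescriptorProto.nested_type


def iter_messages(file_proto):
    """Yield (path, message) for every message in the file, nested ones included."""
    def walk(prefix, messages, field_number):
        for index, message in enumerate(messages):
            # A path is the parent's path plus (field number, array index).
            path = prefix + (field_number, index)
            yield path, message
            yield from walk(path, message.nested_type, NESTED_TYPE)

    yield from walk((), file_proto.message_type, MESSAGE_TYPE)


def comments_by_path(file_proto):
    """Index leading comments by path, using tuples so they can be dict keys."""
    return {
        tuple(location.path): location.leading_comments
        for location in file_proto.source_code_info.location
    }


# Stand-in data shaped like a FileDescriptorProto (hypothetical content).
inner = SimpleNamespace(name="Inner", nested_type=[])
outer = SimpleNamespace(name="Outer", nested_type=[inner])
other = SimpleNamespace(name="Other", nested_type=[])
file_proto = SimpleNamespace(
    message_type=[outer, other],
    source_code_info=SimpleNamespace(location=[
        SimpleNamespace(path=[4, 0], leading_comments=" Outer docs\n"),
        SimpleNamespace(path=[4, 0, 3, 0], leading_comments=" Inner docs\n"),
    ]),
)

comments = comments_by_path(file_proto)
paired = {
    message.name: comments.get(path, "")
    for path, message in iter_messages(file_proto)
}
print(paired)
```

Messages without a Location entry (for example, where the comment was detached and dropped) simply end up with an empty string here.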

To view this discussion on the web visit 
https://groups.google.com/d/msgid/protobuf/9b5d819b-76a6-4344-80a6-90fedf1ca756n%40googlegroups.com.
