I am mostly familiar with the Java API.
How are you getting the SourceCodeInfo now? If you have access to
it, you should also be able to access the FileDescriptor.
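
In case it helps on the Python side, the path walk described further down
the thread can be sketched without any special API support. This is only an
illustration under assumptions: the descriptor is represented as plain dicts
and lists keyed by proto field name (roughly the shape
json_format.MessageToDict with preserving_proto_field_name=True would give
you), the two tables only cover the descriptor.proto fields quoted in this
thread, and resolve(), TABLES and the example data are hypothetical names of
mine, not part of protobuf.

```python
# Field-number tables copied from the descriptor.proto excerpts quoted in this
# thread: number -> (field name, is_repeated, kind of the child node or None).
FILE_FIELDS = {
    1: ("name", False, None),
    2: ("package", False, None),
    3: ("dependency", True, None),
    4: ("message_type", True, "message"),
    5: ("enum_type", True, "enum"),
    6: ("service", True, "service"),
    7: ("extension", True, "field"),
}
MESSAGE_FIELDS = {
    1: ("name", False, None),
    2: ("field", True, "field"),
    3: ("nested_type", True, "message"),
    4: ("enum_type", True, "enum"),
    6: ("extension", True, "field"),
}
# Only files and messages are modelled; enums, services and fields are not.
TABLES = {"file": FILE_FIELDS, "message": MESSAGE_FIELDS}


def resolve(file_proto, path):
    """Follow a SourceCodeInfo-style path; return (readable trail, node)."""
    node, kind, trail = file_proto, "file", []
    numbers = iter(path)
    for number in numbers:
        table = TABLES.get(kind)
        if table is None or number not in table:
            raise KeyError(f"no table entry for field {number} in a {kind}")
        name, repeated, kind = table[number]
        node = node[name]
        if repeated:
            # After a repeated field, the next number is an array index.
            index = next(numbers, None)
            if index is None:
                trail.append(name)  # path names the whole repeated field
                break
            node = node[index]
            trail.append(f"{name}[{index}]")
        else:
            trail.append(name)
    return ".".join(trail), node


# Hypothetical descriptor data, just to exercise the walk.
example = {
    "name": "demo.proto",
    "package": "demo",
    "message_type": [
        {
            "name": "Outer",
            "field": [{"name": "id"}, {"name": "label"}],
            "nested_type": [{"name": "Inner", "field": [{"name": "value"}]}],
        }
    ],
}

print(resolve(example, [4, 0, 2, 1]))
# ('message_type[0].field[1]', {'name': 'label'})
print(resolve(example, [4, 0, 3, 0, 2, 0]))
# ('message_type[0].nested_type[0].field[0]', {'name': 'value'})
```

A path that continues into a field, enum or service (for example the name
of a field) raises KeyError here, because I only filled in the file and
message tables; the other tables would come from descriptor.proto in the
same way.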

On Sun, Sep 11, 2022 at 1:40 PM Kyle Papili <[email protected]> wrote:

> I'm not seeing these methods supported in the Python API. Any idea if this
> is just unsupported?
> On Friday, September 9, 2022 at 5:11:33 PM UTC-4 Kyle Papili wrote:
>
>> This was a great description. I appreciate you taking the time to write
>> that out. I hadn't been able to find something as clear as this in the
>> documentation. Thank you!
>>
>> On Friday, September 9, 2022 at 5:07:10 PM UTC-4 [email protected] wrote:
>>
>>> Yeah, it's a bit confusing, but the numbers are not types.  They are
>>> field numbers and array indices -- that's all. So the table is
>>> descriptor.proto.
>>>
>>> Start with the FileDescriptorProto and dereference by field number; if
>>> you hit an array (a repeated field), the next number is an index:
>>> message FileDescriptorProto {
>>> optional string name = 1; // file name, relative to root of source tree
>>> optional string package = 2; // e.g. "foo", "foo.bar", etc.
>>> // Names of files imported by this file.
>>> repeated string dependency = 3;
>>> ...
>>> // All top-level definitions in this file.
>>> repeated DescriptorProto message_type = 4;
>>> repeated EnumDescriptorProto enum_type = 5;
>>> repeated ServiceDescriptorProto service = 6;
>>> repeated FieldDescriptorProto extension = 7;
>>> }
>>>
>>> A path starting with 1 would refer to the file name (shouldn't have any
>>> further numbers).
>>> A path starting with 2 would refer to the package (shouldn't have any
>>> further numbers).
>>> A path starting with 3 would refer to a dependency (import), the next
>>> number in the path is an array index (which dependency)
>>> A path starting with 4 refers to a message, the next number is the array
>>> index that tells you which message.
>>> A path starting with 5 refers to an enum, the next number is the array
>>> index that tells you which enum.
>>> A path starting with 6 refers to a service, the next number is the
>>> array index that tells you which service.
>>>
>>> If you have a path starting with [4,0,...] you are looking at
>>> fileDescriptor.getMessageType(0);
>>> If you have a path starting with [4,1,...] you are looking at
>>> fileDescriptor.getMessageType(1);
>>> If you have a path starting with [5,2,...] you are looking at
>>> fileDescriptor.getEnumType(2);
>>>
>>> The next path element tells you the field number within the top-level
>>> item's descriptor. For example, paths that point into a top-level message
>>> definition:
>>> message DescriptorProto {
>>> optional string name = 1;
>>> repeated FieldDescriptorProto field = 2;
>>> repeated FieldDescriptorProto extension = 6;
>>> repeated DescriptorProto nested_type = 3;
>>> repeated EnumDescriptorProto enum_type = 4;
>>> ...
>>> }
>>>
>>> A path [4,0,2,1,...] corresponds to:
>>> fileDescriptor.getMessageType(0).getField(1);
>>> A path [4,0,3,1,2,3,...] corresponds to:
>>> fileDescriptor.getMessageType(0).getNestedType(1).getField(3);
>>>
>>> Just follow the proto field numbers.
>>>
>>>
>>> On Fri, Sep 9, 2022 at 2:45 PM Kyle Papili <[email protected]> wrote:
>>>
>>>> My question is far dumber haha. Is there a table that describes what
>>>> Field numbers correlate to what object types?
>>>>
>>>> I've seen 1,2,3,4,5,6,7 show up in paths as field numbers. My naive
>>>> brain was under impression that they correlated to object types, no?
>>>>
>>>> 4: Message, 5: Enum, 6: Extension??
>>>>
>>>> Is this not correct? Is there a table that can show me what each field
>>>> number correlates to?
>>>>
>>>> On Friday, September 9, 2022 at 4:38:21 PM UTC-4 [email protected]
>>>> wrote:
>>>>
>>>>> The ints in the path should be the field numbers and array indices
>>>>> along the way from the top-level FileDescriptorProto, like this:
>>>>> Path: [4, 0, 2, 0]
>>>>> Starting with the FileDescriptorProto:
>>>>> 4 -> FileDescriptorProto {
>>>>>   ...
>>>>>   repeated DescriptorProto message_type = 4;  // <- this field
>>>>>   repeated EnumDescriptorProto enum_type = 5;
>>>>> }
>>>>> 0 -> index into FileDescriptorProto.message_type[0]
>>>>> 2 -> DescriptorProto {
>>>>>   optional string name = 1;
>>>>>   repeated FieldDescriptorProto field = 2;  // <- this field
>>>>>   ...
>>>>> }
>>>>> 0 -> index into DescriptorProto.field[0]
>>>>>
>>>>> Thus this path/Location [4, 0, 2, 0] applies to the whole field
>>>>> statement.
>>>>> I believe the index of a message in the message_type array generally
>>>>> corresponds to the order of all top-level message items in the file.
>>>>> I also believe that the index of a field likewise corresponds to the
>>>>> ordering of fields within the message.
>>>>>
>>>>> So if you have to deal with nested messages, the path will start with:
>>>>> [4, (top-level-message-index), 3, (index-of-nested-message-type), ...]
>>>>>
>>>>> If I remember correctly, this breaks down for options because
>>>>> sometimes the comment/location for an option is dropped, and when it
>>>>> is present the path points to field 999 (uninterpreted_option).
>>>>>
>>>>> But maybe you already had gotten that far and I misunderstood
>>>>> your question.
>>>>>
>>>>>
>>>>> On Fri, Sep 9, 2022 at 2:15 PM Kyle Papili <[email protected]> wrote:
>>>>>
>>>>>> Is there somewhere in the documentation that provides a clear table
>>>>>> describing which numbers in the path correlate to which types? I have
>>>>>> found some inconsistencies with what I had thought. Any link to a
>>>>>> table like this?
>>>>>>
>>>>>> On Friday, September 9, 2022 at 4:14:46 PM UTC-4 [email protected]
>>>>>> wrote:
>>>>>>
>>>>>>> Ah, no, there is no magic. I only meant that if you wanted to have
>>>>>>> one part of your code match up location data to descriptor object and
>>>>>>> attach the location info directly, you could do it in a custom option.
>>>>>>> There's no getting around the actual awkward stepping through the paths
>>>>>>> to match them up.
>>>>>>>
>>>>>>> On Fri, Sep 9, 2022 at 2:08 PM Kyle Papili <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Yes, the "hacky method" proposed by shaod@ is basically what I am
>>>>>>>> doing currently. It just seems to be unnecessarily complicated.
>>>>>>>>
>>>>>>>> What do you mean by "store the location object in a custom option
>>>>>>>> extension on the object in question"? How would I store the location
>>>>>>>> object as a custom extension of the object without knowing the object?
>>>>>>>> If I knew the object that the location corresponded to, then my problem
>>>>>>>> would be resolved. The only way to match up location objects to proto
>>>>>>>> objects, from what I've found, is the hacky path traversal suggested by
>>>>>>>> shaod@. Am I missing something here?
>>>>>>>>
>>>>>>>> On Friday, September 9, 2022 at 4:02:05 PM UTC-4 [email protected]
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Unfortunately, the only way to know the path to the Location
>>>>>>>>> object is to know the path to the descriptor proto object in question.
>>>>>>>>> Alternatively, you could iterate through all the SourceCodeInfo
>>>>>>>>> elements and use their paths to navigate to the correct descriptor
>>>>>>>>> object.
>>>>>>>>> One technique I have used in the past is to iterate through all
>>>>>>>>> the SourceCodeInfo elements and store the location object in a custom
>>>>>>>>> option extension on the object in question (or the parent object if it
>>>>>>>>> is something that doesn't have options).
>>>>>>>>>
>>>>>>>>> Also, as shaod@ points out, some comments will not show up in
>>>>>>>>> sourcecodeinfo.
>>>>>>>>>
>>>>>>>>> On Wednesday, September 7, 2022 at 4:11:32 PM UTC-6
>>>>>>>>> [email protected] wrote:
>>>>>>>>>
>>>>>>>>>> First keep in mind that some comments are detached and thus
>>>>>>>>>> ignored by SourceCodeInfo.
>>>>>>>>>>
>>>>>>>>>> That being said, IIRC I've seen a very hacky way to achieve
>>>>>>>>>> similar goals:
>>>>>>>>>> https://github.com/protocolbuffers/protobuf/blob/3322c0b92a5001ade92608d75891d63c749d624d/src/google/protobuf/compiler/parser_unittest.cc#L2472
>>>>>>>>>> On Thursday, September 1, 2022 at 7:16:52 AM UTC-7
>>>>>>>>>> [email protected] wrote:
>>>>>>>>>>
>>>>>>>>>>> I'm parsing a large number of protobuf files and am using the
>>>>>>>>>>> Source Code Info descriptor to extract comment data from the source
>>>>>>>>>>> files as well. I currently use the FileDescriptorProto.ListFields()
>>>>>>>>>>> method to extract the DescriptorProto objects I care about as well as
>>>>>>>>>>> the SourceCodeInfo.
>>>>>>>>>>>
>>>>>>>>>>> To my knowledge, the only way to pair up Location fields with
>>>>>>>>>>> the corresponding objects is via the path attribute
>>>>>>>>>>> <https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.descriptor.pb>.
>>>>>>>>>>> This is fine, except that it involves me manually stepping
>>>>>>>>>>> through said path to land at my parsed protobuf object. This gets
>>>>>>>>>>> complicated when dealing with layers of nested_types, and I am
>>>>>>>>>>> convinced there must be a way for me to extract the path from the
>>>>>>>>>>> particular DescriptorProto object and then use that to match up the
>>>>>>>>>>> object with the path specified in the corresponding Location field.
>>>>>>>>>>>
>>>>>>>>>>> In short: How can I easily pair up DescriptorProto objects with
>>>>>>>>>>> the Location objects that correspond to them? Specifically for
>>>>>>>>>>> comment parsing purposes.
>>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "Protocol Buffers" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/protobuf/0c8d36db-53f9-4179-942f-201cd205b9dfn%40googlegroups.com
>>>>>>>> <https://groups.google.com/d/msgid/protobuf/0c8d36db-53f9-4179-942f-201cd205b9dfn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>> .
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Jerry Berg | Software Engineer | [email protected] | 720-808-1188
>>>>>>>


-- 
Jerry Berg | Software Engineer | [email protected] | 720-808-1188
