Yes, I had the FileDescriptor without any problem, but those functions do not
exist in Python. I figured it out, though: I can access the elements using
.ListFields(), .nested_type, .message_type, .enum_type, .service,
.extension, and .options.
A few questions I still have:
// Test Comment 1
message mainMessage {
  // Test Comment 2
  enum internalEnum {
    SomeField = 1; // Test Comment 3
    SomeOtherField = 2;
  }
}
The location path reported for Test Comment 2 is [4, 0, 4, 0]. Shouldn't it
be [4, 0, 5, 0]?
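
Working it through against the DescriptorProto excerpt quoted further down in
this thread: enum_type is field 4 of DescriptorProto, while 5 is the top-level
enum_type of FileDescriptorProto, so [4, 0, 4, 0] looks consistent for an enum
nested in the first message. A minimal Python sketch of that reading follows.
The FIELDS table and describe_path are hypothetical helpers hand-copied from
those excerpts, not part of the protobuf API, and the table only covers the
repeated fields that appear in the quoted snippets.

```python
# Field numbers copied from the descriptor.proto excerpts quoted in this thread.
# Maps message type -> {field number: (field name, element message type)}.
# This table is an illustrative assumption, not something protobuf provides.
FIELDS = {
    "FileDescriptorProto": {
        4: ("message_type", "DescriptorProto"),
        5: ("enum_type", "EnumDescriptorProto"),
        6: ("service", "ServiceDescriptorProto"),
        7: ("extension", "FieldDescriptorProto"),
    },
    "DescriptorProto": {
        2: ("field", "FieldDescriptorProto"),
        3: ("nested_type", "DescriptorProto"),
        4: ("enum_type", "EnumDescriptorProto"),
        6: ("extension", "FieldDescriptorProto"),
    },
}


def describe_path(path):
    """Render a SourceCodeInfo path as a chain of repeated-field accesses."""
    current = "FileDescriptorProto"
    parts = []
    # Every field in the table is repeated, so the path is read as
    # (field number, array index) pairs.
    for i in range(0, len(path), 2):
        number = path[i]
        known = FIELDS.get(current, {})
        if number not in known or i + 1 >= len(path):
            # Outside the table (or a trailing singular field): stop here.
            parts.append(f"<field {number} of {current}>")
            break
        name, current = known[number]
        parts.append(f"{name}[{path[i + 1]}]")
    return ".".join(parts)


print(describe_path([4, 0, 4, 0]))  # message_type[0].enum_type[0]
print(describe_path([4, 0, 5, 0]))  # message_type[0].<field 5 of DescriptorProto>
```

Read this way, [4, 0, 5, 0] would point at field 5 of the message rather than
at a nested enum.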
Also, how exactly do you set and retrieve custom option extensions via the
Python API? So far I have tried:
setattr(pointer.Extensions, "my_custom_option", comments)
but that is not correct.
Any ideas here?
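
In the meantime, one way to sidestep custom options altogether might be a side
table keyed by the path tuple: generate the path of every message and enum
while walking the descriptor, then look each one up among the locations. A
rough sketch, assuming stand-in objects (SimpleNamespace) in place of real
descriptor protos; iter_paths and comments_by_path are hypothetical names, and
the field numbers come from the excerpts quoted below. It only touches the
message_type, nested_type, enum_type, and name attributes, so it may carry
over to real FileDescriptorProto objects, but that is untested here.

```python
from types import SimpleNamespace as NS


def iter_paths(file_proto):
    """Yield (path, object) for every message and enum, nested ones included."""

    def walk_message(msg, path):
        yield path, msg
        for i, nested in enumerate(msg.nested_type):  # DescriptorProto.nested_type = 3
            yield from walk_message(nested, path + (3, i))
        for i, enum in enumerate(msg.enum_type):  # DescriptorProto.enum_type = 4
            yield path + (4, i), enum

    for i, msg in enumerate(file_proto.message_type):  # FileDescriptorProto.message_type = 4
        yield from walk_message(msg, (4, i))
    for i, enum in enumerate(file_proto.enum_type):  # FileDescriptorProto.enum_type = 5
        yield (5, i), enum


# Stand-ins for the example file in this message (not real protobuf objects).
internal_enum = NS(name="internalEnum")
main_message = NS(name="mainMessage", nested_type=[], enum_type=[internal_enum])
file_proto = NS(message_type=[main_message], enum_type=[])

# Stand-in for SourceCodeInfo.location, reduced to path -> leading comment.
comments_by_path = {
    (4, 0): "Test Comment 1",
    (4, 0, 4, 0): "Test Comment 2",
}

# Pair each object with its comment by looking up the generated path.
paired = {obj.name: comments_by_path.get(path) for path, obj in iter_paths(file_proto)}
print(paired)
```

With real protos the lookup table would presumably be built from
tuple(location.path) for each entry in source_code_info.location.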
On Sunday, September 11, 2022 at 11:31:25 PM UTC-4 [email protected] wrote:
> I am mostly familiar with the Java API.
> How are you currently getting the SourceCodeInfo? If you have access to
> it, you should be able to access the FileDescriptor.
>
> On Sun, Sep 11, 2022 at 1:40 PM Kyle Papili <[email protected]> wrote:
>
>> I'm not seeing these methods supported in the Python API. Any idea if
>> this is just unsupported?
>> On Friday, September 9, 2022 at 5:11:33 PM UTC-4 Kyle Papili wrote:
>>
>>> This was a great description. I appreciate you taking the time to write
>>> that out. I hadn't been able to find something as clear as this in the
>>> documentation. Thank you!
>>>
>>> On Friday, September 9, 2022 at 5:07:10 PM UTC-4 [email protected] wrote:
>>>
>>>> Yeah, it's a bit confusing, but the numbers are not types. They are
>>>> field numbers and array indices -- that's all. So the table is
>>>> descriptor.proto.
>>>>
>>>> Start with the FileDescriptorProto and dereference by field number; if
>>>> you hit an array, the next number is an index:
>>>> message FileDescriptorProto {
>>>>   optional string name = 1; // file name, relative to root of source tree
>>>>   optional string package = 2; // e.g. "foo", "foo.bar", etc.
>>>>   // Names of files imported by this file.
>>>>   repeated string dependency = 3;
>>>>   ...
>>>>   // All top-level definitions in this file.
>>>>   repeated DescriptorProto message_type = 4;
>>>>   repeated EnumDescriptorProto enum_type = 5;
>>>>   repeated ServiceDescriptorProto service = 6;
>>>>   repeated FieldDescriptorProto extension = 7;
>>>> }
>>>>
>>>> A path starting with 1 would refer to the file name (shouldn't have any
>>>> further numbers).
>>>> A path starting with 2 would refer to the package (shouldn't have any
>>>> further numbers).
>>>> A path starting with 3 would refer to a dependency (import); the next
>>>> number in the path is an array index (which dependency).
>>>> A path starting with 4 refers to a message; the next number is the
>>>> array index that tells you which message.
>>>> A path starting with 5 refers to an enum; the next number is the array
>>>> index that tells you which enum.
>>>> A path starting with 6 refers to a service; the next number is the
>>>> array index that tells you which service.
>>>>
>>>> If you have a path starting with [4,0,...] you are looking at
>>>> fileDescriptor.getMessageType(0);
>>>> If you have a path starting with [4,1,...] you are looking at
>>>> fileDescriptor.getMessageType(1);
>>>> If you have a path starting with [5,2,...] you are looking at
>>>> fileDescriptor.getEnumType(2);
>>>>
>>>> The next path element tells you the field number within the top-level
>>>> item's descriptor. For example, paths that point into a top-level message
>>>> definition:
>>>> message DescriptorProto {
>>>>   optional string name = 1;
>>>>   repeated FieldDescriptorProto field = 2;
>>>>   repeated FieldDescriptorProto extension = 6;
>>>>   repeated DescriptorProto nested_type = 3;
>>>>   repeated EnumDescriptorProto enum_type = 4;
>>>>   ...
>>>> }
>>>>
>>>> A path [4,0,2,1,...] corresponds to:
>>>> fileDescriptor.getMessageType(0).getField(1);
>>>> A path [4,0,3,1,2,3,...] corresponds to:
>>>> fileDescriptor.getMessageType(0).getNestedType(1).getField(3);
>>>>
>>>> Just follow the proto field numbers.
>>>>
>>>>
>>>> On Fri, Sep 9, 2022 at 2:45 PM Kyle Papili <[email protected]> wrote:
>>>>
>>>>> My question is far dumber haha. Is there a table that describes what
>>>>> Field numbers correlate to what object types?
>>>>>
>>>>> I've seen 1,2,3,4,5,6,7 show up in paths as field numbers. My naive
>>>>> brain was under the impression that they correlated to object types, no?
>>>>>
>>>>> 4: Message, 5: Enum, 6: Extension??
>>>>>
>>>>> Is this not correct? Is there a table that can show me what each field
>>>>> number correlates to?
>>>>>
>>>>> On Friday, September 9, 2022 at 4:38:21 PM UTC-4 [email protected]
>>>>> wrote:
>>>>>
>>>>>> The ints in the path should be the field numbers and array indices
>>>>>> along the way from the top-level file descriptor proto, like this:
>>>>>> Path: [4, 0, 2, 0]
>>>>>> Starting with the FileDescriptorProto:
>>>>>> 4 -> FileDescriptorProto {
>>>>>>        ...
>>>>>>        *repeated DescriptorProto message_type = 4;*
>>>>>>        repeated EnumDescriptorProto enum_type = 5;
>>>>>>      }
>>>>>> 0 -> index into FileDescriptorProto.message_type[0]
>>>>>> 2 -> DescriptorProto {
>>>>>>        optional string name = 1;
>>>>>>        *repeated FieldDescriptorProto field = 2;*
>>>>>>        ...
>>>>>>      }
>>>>>> 0 -> index into DescriptorProto.field[0]
>>>>>>
>>>>>> Thus this path/Location [4, 0, 2, 0] applies to the whole field
>>>>>> statement.
>>>>>> I believe the index of a message in the message_type array generally
>>>>>> corresponds to the order of all top-level message items in the file.
>>>>>> I also believe that the index of a field likewise corresponds to the
>>>>>> ordering of fields within the message.
>>>>>>
>>>>>> So if you have to deal with nested messages, the path will start with:
>>>>>> [4, (top-level-message-index), 3, (index-of-nested-message-type), ...]
>>>>>>
>>>>>> If I remember correctly, this breaks down for options, because
>>>>>> sometimes the comments/location for an option are dropped, and when
>>>>>> they are present the path points to field 999, the uninterpreted
>>>>>> options.
>>>>>>
>>>>>> But maybe you already had gotten that far and I misunderstood
>>>>>> your question.
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 9, 2022 at 2:15 PM Kyle Papili <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Is there somewhere in the documentation that provides a clear table
>>>>>>> describing which numbers in the path correlate to which types? I have
>>>>>>> found some inconsistencies with what I had thought. Any link to a
>>>>>>> table like this?
>>>>>>>
>>>>>>> On Friday, September 9, 2022 at 4:14:46 PM UTC-4 [email protected]
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Ah, no, there is no magic. I only meant that if you wanted to have
>>>>>>>> one part of your code match up location data to descriptor objects
>>>>>>>> and attach the location info directly, you could do it in a custom
>>>>>>>> option. There's no getting around the awkward stepping through the
>>>>>>>> paths to match them up.
>>>>>>>>
>>>>>>>> On Fri, Sep 9, 2022 at 2:08 PM Kyle Papili <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Yes, the "hacky method" proposed by shaod@ is basically what I am
>>>>>>>>> doing currently. It just seems to be unnecessarily complicated.
>>>>>>>>>
>>>>>>>>> What do you mean by "store the location object in a custom option
>>>>>>>>> extension on the object in question"? How would I store the
>>>>>>>>> location object as a custom extension of the object without
>>>>>>>>> knowing the object? If I knew the object that the location
>>>>>>>>> corresponded to, my problem would be resolved. The only way to
>>>>>>>>> match up location objects to Proto objects, from what I've found,
>>>>>>>>> is the hacky path traversal suggested by shaod@. Am I missing
>>>>>>>>> something here?
>>>>>>>>>
>>>>>>>>> On Friday, September 9, 2022 at 4:02:05 PM UTC-4 [email protected]
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Unfortunately, the only way to know the path to the Location
>>>>>>>>>> object is to know the path to the descriptor proto object in
>>>>>>>>>> question. Alternatively, you could iterate through all the
>>>>>>>>>> SourceCodeInfo elements and use their paths to navigate to the
>>>>>>>>>> correct descriptor object.
>>>>>>>>>> One technique I have used in the past is to iterate through all
>>>>>>>>>> the SourceCodeInfo elements and store the location object in a
>>>>>>>>>> custom option extension on the object in question (or the parent
>>>>>>>>>> object if it is something that doesn't have options).
>>>>>>>>>>
>>>>>>>>>> Also, as shaod@ points out, some comments will not show up in
>>>>>>>>>> sourcecodeinfo.
>>>>>>>>>>
>>>>>>>>>> On Wednesday, September 7, 2022 at 4:11:32 PM UTC-6
>>>>>>>>>> [email protected] wrote:
>>>>>>>>>>
>>>>>>>>>>> First keep in mind that some comments are detached and thus
>>>>>>>>>>> ignored by SourceCodeInfo.
>>>>>>>>>>>
>>>>>>>>>>> That being said, IIRC I've seen a very hacky way to achieve
>>>>>>>>>>> similar goals:
>>>>>>>>>>> https://github.com/protocolbuffers/protobuf/blob/3322c0b92a5001ade92608d75891d63c749d624d/src/google/protobuf/compiler/parser_unittest.cc#L2472
>>>>>>>>>>> On Thursday, September 1, 2022 at 7:16:52 AM UTC-7
>>>>>>>>>>> [email protected] wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I'm parsing a large number of protobuf files and am using the
>>>>>>>>>>>> SourceCodeInfo descriptor to extract comment data from the
>>>>>>>>>>>> source files as well. I currently use the
>>>>>>>>>>>> FileDescriptorProto.ListFields() method to extract the
>>>>>>>>>>>> DescriptorProto objects I care about as well as the
>>>>>>>>>>>> SourceCodeInfo.
>>>>>>>>>>>>
>>>>>>>>>>>> To my knowledge, the only way to pair up Location fields with
>>>>>>>>>>>> the corresponding objects is via the path attribute
>>>>>>>>>>>> <https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.descriptor.pb>.
>>>>>>>>>>>>
>>>>>>>>>>>> This is fine, except that it involves me manually stepping
>>>>>>>>>>>> through said path to land at my parsed Protobuf object. This
>>>>>>>>>>>> gets complicated when dealing with layers of nested_types, and I
>>>>>>>>>>>> am convinced there must be a way for me to extract the path from
>>>>>>>>>>>> the particular DescriptorProto object and then use that to match
>>>>>>>>>>>> up the object with the path specified in the corresponding
>>>>>>>>>>>> Location field.
>>>>>>>>>>>>
>>>>>>>>>>>> In short: how can I easily pair up DescriptorProto objects with
>>>>>>>>>>>> the Location objects that correspond to them? Specifically for
>>>>>>>>>>>> comment-parsing purposes.
>>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>>> Groups "Protocol Buffers" group.
>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>> send an email to [email protected].
>>>>>>>>> To view this discussion on the web visit
>>>>>>>>> https://groups.google.com/d/msgid/protobuf/0c8d36db-53f9-4179-942f-201cd205b9dfn%40googlegroups.com
>>>>>>>>>
>>>>>>>>> <https://groups.google.com/d/msgid/protobuf/0c8d36db-53f9-4179-942f-201cd205b9dfn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>> .
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Jerry Berg | Software Engineer | [email protected] | 720-808-1188
>>>>>>>>