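[4, 0, 4, 0] is the correct path to Test Comment 2:
  4 = FileDescriptorProto.message_type
  0 = FileDescriptorProto.message_type[0]
  4 = DescriptorProto.enum_type
  0 = DescriptorProto.enum_type[0]

The third element is 4 rather than 5 because the enum is nested inside a message, so its field number comes from DescriptorProto (enum_type = 4), not from FileDescriptorProto (enum_type = 5).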
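To make the "follow the field numbers" rule concrete, here is a minimal Python sketch that turns a path into a readable accessor string. It does not use the protobuf library at all: the `SCHEMA` table and the `describe_path` helper are hypothetical names made up for this illustration, and the table is only a hand-copied subset of the field numbers in descriptor.proto, so treat it as a sketch under those assumptions rather than a complete resolver.

```python
# Subset of descriptor.proto, copied by hand for illustration only.
# message name -> {field number: (field name, is_repeated, child message or None)}
SCHEMA = {
    "FileDescriptorProto": {
        1: ("name", False, None),
        2: ("package", False, None),
        3: ("dependency", True, None),
        4: ("message_type", True, "DescriptorProto"),
        5: ("enum_type", True, "EnumDescriptorProto"),
        6: ("service", True, "ServiceDescriptorProto"),
        7: ("extension", True, "FieldDescriptorProto"),
    },
    "DescriptorProto": {
        1: ("name", False, None),
        2: ("field", True, "FieldDescriptorProto"),
        3: ("nested_type", True, "DescriptorProto"),
        4: ("enum_type", True, "EnumDescriptorProto"),
        6: ("extension", True, "FieldDescriptorProto"),
    },
    "EnumDescriptorProto": {
        1: ("name", False, None),
        2: ("value", True, "EnumValueDescriptorProto"),
    },
}


def describe_path(path, root="FileDescriptorProto"):
    """Translate a SourceCodeInfo path into a dotted accessor string."""
    parts = []
    current = root
    i = 0
    while i < len(path):
        fields = SCHEMA.get(current)
        if fields is None or path[i] not in fields:
            # Outside the hand-copied subset: report the remainder as-is.
            parts.append(f"<unresolved {list(path[i:])}>")
            break
        name, repeated, child = fields[path[i]]
        i += 1
        if repeated and i < len(path):
            # After a repeated field, the next number is an array index.
            parts.append(f"{name}[{path[i]}]")
            i += 1
        else:
            parts.append(name)
        current = child
    return ".".join(parts)


print(describe_path([4, 0, 4, 0]))        # message_type[0].enum_type[0]
print(describe_path([4, 0, 3, 1, 2, 3]))  # message_type[0].nested_type[1].field[3]
```

The same walk could probably be driven by the real descriptors instead of a hand-written table, but that is not shown here.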
I have never done custom options using Python. I've only used them in Java and C++.

On Sun, Sep 11, 2022 at 9:35 PM Kyle Papili <[email protected]> wrote:

> Yes, I had the FileDescriptor no problem but the functions are
> non-existent in Python. I figured it out though, I can access the elements
> using .ListFields(), .nested_type, .message_type, .enum_type, .service,
> .extension, and .options.
>
> A few questions I still had:
> // Test Comment 1
> message mainMessage {
>   // Test Comment 2
>   enum internalEnum {
>     SomeField = 1; // Test Comment 3
>     SomeOtherField = 2;
>   }
> }
>
> The location for Test Comment 2 is [4, 0, 4, 0]. Shouldn't it be [4, 0, 5,
> 0]??
>
> Also, how exactly do you set and retrieve custom option extensions via the
> Python API? So far I have tried:
> setattr(pointer.Extensions, "my_custom_option", comments)
> but that is not correct.
>
> Any ideas here?
> On Sunday, September 11, 2022 at 11:31:25 PM UTC-4 [email protected] wrote:
>
>> I am mostly familiar with the Java API.
>> How are you currently getting the SourceCodeInfo now? If you have access
>> to it, you should be able to access the FileDescriptor.
>>
>> On Sun, Sep 11, 2022 at 1:40 PM Kyle Papili <[email protected]> wrote:
>>
>>> I'm not seeing these methods supported in the Python API. Any idea if
>>> this is just unsupported?
>>> On Friday, September 9, 2022 at 5:11:33 PM UTC-4 Kyle Papili wrote:
>>>
>>>> This was a great description. I appreciate you taking the time to write
>>>> that out. I hadn't been able to find something as clear as this in the
>>>> documentation. Thank you!
>>>>
>>>> On Friday, September 9, 2022 at 5:07:10 PM UTC-4 [email protected]
>>>> wrote:
>>>>
>>>>> Yeah, it's a bit confusing, but the numbers are not types. They are
>>>>> field numbers and array indices -- that's all. So the table is
>>>>> descriptor.proto.
>>>>>
>>>>> Start with the FileDescriptorProto and dereference by field number; if
>>>>> you hit an array, the next number is an index:
>>>>> message FileDescriptorProto {
>>>>>   optional string name = 1;     // file name, relative to root of source tree
>>>>>   optional string package = 2;  // e.g. "foo", "foo.bar", etc.
>>>>>   // Names of files imported by this file.
>>>>>   repeated string dependency = 3;
>>>>>   ...
>>>>>   // All top-level definitions in this file.
>>>>>   repeated DescriptorProto message_type = 4;
>>>>>   repeated EnumDescriptorProto enum_type = 5;
>>>>>   repeated ServiceDescriptorProto service = 6;
>>>>>   repeated FieldDescriptorProto extension = 7;
>>>>> }
>>>>>
>>>>> A path starting with 1 would refer to the file name (shouldn't have
>>>>> any further numbers).
>>>>> A path starting with 2 would refer to the package (shouldn't have any
>>>>> further numbers).
>>>>> A path starting with 3 would refer to a dependency (import); the next
>>>>> number in the path is an array index (which dependency).
>>>>> A path starting with 4 refers to a message; the next number is the
>>>>> array index that tells you which message.
>>>>> A path starting with 5 refers to an enum; the next number is the array
>>>>> index that tells you which enum.
>>>>> A path starting with 6 refers to a service; the next number is the
>>>>> array index that tells you which service.
>>>>>
>>>>> If you have a path starting with [4,0,...] you are looking at
>>>>> fileDescriptor.getMessageType(0);
>>>>> If you have a path starting with [4,1,...] you are looking at
>>>>> fileDescriptor.getMessageType(1);
>>>>> If you have a path starting with [5,2,...] you are looking at
>>>>> fileDescriptor.getEnumType(2);
>>>>>
>>>>> The next path element tells you the field number within the top-level
>>>>> item's descriptor.
>>>>> For example, paths that point into a top-level message
>>>>> definition:
>>>>> message DescriptorProto {
>>>>>   optional string name = 1;
>>>>>   repeated FieldDescriptorProto field = 2;
>>>>>   repeated FieldDescriptorProto extension = 6;
>>>>>   repeated DescriptorProto nested_type = 3;
>>>>>   repeated EnumDescriptorProto enum_type = 4;
>>>>>   ...
>>>>> }
>>>>>
>>>>> A path [4,0,2,1,...] corresponds to:
>>>>> fileDescriptor.getMessageType(0).getField(1);
>>>>> A path [4,0,3,1,2,3...] corresponds to:
>>>>> fileDescriptor.getMessageType(0).getNestedType(1).getField(3);
>>>>>
>>>>> Just follow the proto field numbers.
>>>>>
>>>>>
>>>>> On Fri, Sep 9, 2022 at 2:45 PM Kyle Papili <[email protected]> wrote:
>>>>>
>>>>>> My question is far dumber haha. Is there a table that describes what
>>>>>> field numbers correlate to what object types?
>>>>>>
>>>>>> I've seen 1,2,3,4,5,6,7 show up in paths as field numbers. My naive
>>>>>> brain was under the impression that they correlated to object types, no?
>>>>>>
>>>>>> 4: Message, 5: Enum, 6: Extension??
>>>>>>
>>>>>> Is this not correct? Is there a table that can show me what each
>>>>>> field number correlates to?
>>>>>>
>>>>>> On Friday, September 9, 2022 at 4:38:21 PM UTC-4 [email protected]
>>>>>> wrote:
>>>>>>
>>>>>>> The ints in the path should be the field numbers and array indices
>>>>>>> along the way from the top-level FileDescriptorProto, like this:
>>>>>>> Path: [4, 0, 2, 0]
>>>>>>> Starting with the FileDescriptorProto:
>>>>>>> 4 -> FileDescriptorProto {
>>>>>>>        ...
>>>>>>>        *repeated DescriptorProto message_type = 4;*
>>>>>>>        repeated EnumDescriptorProto enum_type = 5;
>>>>>>>      }
>>>>>>> 0 -> index into FileDescriptorProto.message_type[0]
>>>>>>> 2 -> DescriptorProto {
>>>>>>>        optional string name = 1;
>>>>>>>        *repeated FieldDescriptorProto field = 2;*
>>>>>>>        ...
>>>>>>>      }
>>>>>>> 0 -> index into DescriptorProto.field[0]
>>>>>>>
>>>>>>> Thus this path/Location [4, 0, 2, 0] applies to the whole field
>>>>>>> statement.
>>>>>>> I believe the index of a message in the message_type array generally
>>>>>>> corresponds to the order of all top-level message items in the file.
>>>>>>> I also believe that the index of a field likewise corresponds to the
>>>>>>> ordering of fields within the message.
>>>>>>>
>>>>>>> So if you have to deal with nested messages, the path will start
>>>>>>> with:
>>>>>>> [4, (top-level-message-index), 3, (index-of-nested-message-type),
>>>>>>> ...]
>>>>>>>
>>>>>>> If I remember correctly, this breaks down for options
>>>>>>> because sometimes the comments/location for an option is dropped, and
>>>>>>> when it is present the path points to field 999, the uninterpreted
>>>>>>> options.
>>>>>>>
>>>>>>> But maybe you already had gotten that far and I misunderstood
>>>>>>> your question.
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Sep 9, 2022 at 2:15 PM Kyle Papili <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Is there somewhere in the documentation that provides a clear table
>>>>>>>> describing which numbers in the path correlate to which types? I have
>>>>>>>> found some inconsistencies with what I had thought. Any link to a
>>>>>>>> table like this?
>>>>>>>>
>>>>>>>> On Friday, September 9, 2022 at 4:14:46 PM UTC-4 [email protected]
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Ah, no, there is no magic. I only meant that if you wanted to have
>>>>>>>>> one part of your code match up location data to descriptor object and
>>>>>>>>> attach the location info directly, you could do it in a custom option.
>>>>>>>>> There's no getting around the actual awkward stepping through the
>>>>>>>>> paths to match them up.
>>>>>>>>>
>>>>>>>>> On Fri, Sep 9, 2022 at 2:08 PM Kyle Papili <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Yes, the "hacky method" proposed by shaod@ is basically what I
>>>>>>>>>> am doing currently. It just seems to be unnecessarily complicated.
>>>>>>>>>>
>>>>>>>>>> What do you mean by "store the location object in a custom option
>>>>>>>>>> extension on the object in question"? How would I store the location
>>>>>>>>>> object as a custom extension of the object without knowing the
>>>>>>>>>> object? If I knew the object that that location corresponded to then
>>>>>>>>>> my problem would be resolved. The only way to match up location
>>>>>>>>>> objects to Proto objects from what I've found is the hacky path
>>>>>>>>>> traversal suggested by Shaod@.
>>>>>>>>>> Am I missing something here?
>>>>>>>>>>
>>>>>>>>>> On Friday, September 9, 2022 at 4:02:05 PM UTC-4 [email protected]
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Unfortunately, the only way to know the path to the Location
>>>>>>>>>>> object is to know the path to the descriptor proto object in
>>>>>>>>>>> question.
>>>>>>>>>>> Alternatively, you could iterate through all the sourcecodeinfo
>>>>>>>>>>> elements and use their paths to navigate to the correct descriptor
>>>>>>>>>>> object.
>>>>>>>>>>> One technique I have used in the past is to iterate through all
>>>>>>>>>>> the sourcecodeinfo elements and store the location object in a
>>>>>>>>>>> custom option extension on the object in question (or the parent
>>>>>>>>>>> object if it is something that doesn't have options).
>>>>>>>>>>>
>>>>>>>>>>> Also, as shaod@ points out, some comments will not show up in
>>>>>>>>>>> sourcecodeinfo.
>>>>>>>>>>>
>>>>>>>>>>> On Wednesday, September 7, 2022 at 4:11:32 PM UTC-6
>>>>>>>>>>> [email protected] wrote:
>>>>>>>>>>>
>>>>>>>>>>>> First keep in mind that some comments are detached and thus
>>>>>>>>>>>> ignored by SourceCodeInfo.
>>>>>>>>>>>>
>>>>>>>>>>>> That being said, IIRC I've seen a very hacky way to achieve
>>>>>>>>>>>> similar goals:
>>>>>>>>>>>> https://github.com/protocolbuffers/protobuf/blob/3322c0b92a5001ade92608d75891d63c749d624d/src/google/protobuf/compiler/parser_unittest.cc#L2472
>>>>>>>>>>>> On Thursday, September 1, 2022 at 7:16:52 AM UTC-7
>>>>>>>>>>>> [email protected] wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I'm parsing a large number of protobuf files and am using the
>>>>>>>>>>>>> Source Code Info descriptor to extract comment data from the
>>>>>>>>>>>>> source files as well. I currently use the
>>>>>>>>>>>>> FileDescriptorProto.ListFields() method to extract the
>>>>>>>>>>>>> DescriptorProto objects I care about as well as the
>>>>>>>>>>>>> SourceCodeInfo.
>>>>>>>>>>>>>
>>>>>>>>>>>>> To my knowledge, the only way to pair up Location fields with
>>>>>>>>>>>>> the corresponding objects is via the path attribute
>>>>>>>>>>>>> <https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.descriptor.pb>.
>>>>>>>>>>>>> This is fine; except for the fact that involves me manually
>>>>>>>>>>>>> stepping through said path to land at my parsed Protobuf Object.
>>>>>>>>>>>>> This gets complicated when dealing with layers of nested_types
>>>>>>>>>>>>> and I am convinced there must be a way for me to extract the path
>>>>>>>>>>>>> from the particular DescriptorProto Object and then use that to
>>>>>>>>>>>>> match up the object with the path specified in the corresponding
>>>>>>>>>>>>> Location field.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In short: How can I easily pair up DescriptorProto objects
>>>>>>>>>>>>> with the Location objects that correspond to them? Specifically
>>>>>>>>>>>>> for comment parsing purposes.
>>>>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>> Google Groups "Protocol Buffers" group.
>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>>> send an email to [email protected].
>>>>>>>>>> To view this discussion on the web visit
>>>>>>>>>> https://groups.google.com/d/msgid/protobuf/0c8d36db-53f9-4179-942f-201cd205b9dfn%40googlegroups.com.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Jerry Berg | Software Engineer | [email protected] | 720-808-1188
--
Jerry Berg | Software Engineer | [email protected] | 720-808-1188

--
You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/protobuf/CAHLB6Rfekqjcvop%2BN3j-Bi7c18JjyWBh-bAqpKj-di%3DprfJ4TA%40mail.gmail.com.
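
For the original question in this thread (pairing descriptor objects with their Location entries), one possible approach is to run the traversal the other way: walk the descriptor tree, build each element's path along the way, and look that path up in a dictionary of locations. The sketch below is only an illustration under assumptions. The function names (`walk_message`, `walk_file`, `pair_comments`) are hypothetical, it covers only messages, fields, nested messages and enums, and it is exercised with `SimpleNamespace` stand-ins rather than real protobuf objects. It relies only on attribute names from descriptor.proto (`message_type`, `nested_type`, `enum_type`, `field`, `source_code_info.location`, `path`, `leading_comments`), so it should in principle accept a parsed FileDescriptorProto as well, but that has not been tried here.

```python
from types import SimpleNamespace as NS

# Field numbers from descriptor.proto, as quoted earlier in the thread.
FILE_MESSAGE_TYPE = 4  # FileDescriptorProto.message_type
FILE_ENUM_TYPE = 5     # FileDescriptorProto.enum_type
MSG_FIELD = 2          # DescriptorProto.field
MSG_NESTED_TYPE = 3    # DescriptorProto.nested_type
MSG_ENUM_TYPE = 4      # DescriptorProto.enum_type


def walk_message(msg, path):
    """Yield (path, descriptor) for a message and what is declared inside it."""
    yield path, msg
    for i, field in enumerate(msg.field):
        yield path + (MSG_FIELD, i), field
    for i, nested in enumerate(msg.nested_type):
        yield from walk_message(nested, path + (MSG_NESTED_TYPE, i))
    for i, enum in enumerate(msg.enum_type):
        yield path + (MSG_ENUM_TYPE, i), enum


def walk_file(file_proto):
    """Yield (path, descriptor) for top-level messages and enums."""
    for i, msg in enumerate(file_proto.message_type):
        yield from walk_message(msg, (FILE_MESSAGE_TYPE, i))
    for i, enum in enumerate(file_proto.enum_type):
        yield (FILE_ENUM_TYPE, i), enum


def pair_comments(file_proto):
    """Return (path, name, leading comment) for every element that has one."""
    by_path = {tuple(loc.path): loc for loc in file_proto.source_code_info.location}
    pairs = []
    for path, desc in walk_file(file_proto):
        loc = by_path.get(path)
        if loc is not None and loc.leading_comments:
            pairs.append((path, desc.name, loc.leading_comments.strip()))
    return pairs


# Hypothetical stand-in for the mainMessage / internalEnum example above.
internal_enum = NS(name="internalEnum")
main_message = NS(name="mainMessage", field=[], nested_type=[],
                  enum_type=[internal_enum])
demo_file = NS(
    message_type=[main_message],
    enum_type=[],
    source_code_info=NS(location=[
        NS(path=[4, 0], leading_comments=" Test Comment 1\n"),
        NS(path=[4, 0, 4, 0], leading_comments=" Test Comment 2\n"),
    ]),
)

for path, name, comment in pair_comments(demo_file):
    print(list(path), name, comment)
```

Keying the dictionary on `tuple(path)` avoids re-walking the tree for every Location. Detached comments, and anything SourceCodeInfo drops, would still be missing, as noted in the thread.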
