I'm not seeing these methods supported in the Python API. Any idea if this is just unsupported?

On Friday, September 9, 2022 at 5:11:33 PM UTC-4 Kyle Papili wrote:
> This was a great description. I appreciate you taking the time to write
> that out. I hadn't been able to find something as clear as this in the
> documentation. Thank you!
>
> On Friday, September 9, 2022 at 5:07:10 PM UTC-4 [email protected] wrote:
>
>> Yeah, it's a bit confusing, but the numbers are not types. They are
>> field numbers and array indices -- that's all. So the table is
>> descriptor.proto.
>>
>> Start with the FileDescriptorProto and dereference by field number; if
>> you hit an array, the next number is an index:
>>
>> message FileDescriptorProto {
>>   optional string name = 1;     // file name, relative to root of source tree
>>   optional string package = 2;  // e.g. "foo", "foo.bar", etc.
>>   // Names of files imported by this file.
>>   repeated string dependency = 3;
>>   ...
>>   // All top-level definitions in this file.
>>   repeated DescriptorProto message_type = 4;
>>   repeated EnumDescriptorProto enum_type = 5;
>>   repeated ServiceDescriptorProto service = 6;
>>   repeated FieldDescriptorProto extension = 7;
>> }
>>
>> A path starting with 1 would refer to the file name (shouldn't have any
>> further numbers).
>> A path starting with 2 would refer to the package (shouldn't have any
>> further numbers).
>> A path starting with 3 would refer to a dependency (import); the next
>> number in the path is an array index (which dependency).
>> A path starting with 4 refers to a message; the next number is the array
>> index that tells you which message.
>> A path starting with 5 refers to an enum; the next number is the array
>> index that tells you which enum.
>> A path starting with 6 refers to a service; the next number is the array
>> index that tells you which service.
>>
>> If you have a path starting with [4,0,...] you are looking at
>> fileDescriptor.getMessageType(0);
>> If you have a path starting with [4,1,...] you are looking at
>> fileDescriptor.getMessageType(1);
>> If you have a path starting with [5,2,...] you are looking at
>> fileDescriptor.getEnumType(2);
>>
>> The next path element tells you the field number within the top-level
>> item's descriptor. For example, paths that point into a top-level message
>> definition:
>>
>> message DescriptorProto {
>>   optional string name = 1;
>>   repeated FieldDescriptorProto field = 2;
>>   repeated FieldDescriptorProto extension = 6;
>>   repeated DescriptorProto nested_type = 3;
>>   repeated EnumDescriptorProto enum_type = 4;
>>   ...
>> }
>>
>> A path [4,0,2,1,...] corresponds to:
>> fileDescriptor.getMessageType(0).getField(1);
>> A path [4,0,3,1,2,3...] corresponds to:
>> fileDescriptor.getMessageType(0).getNestedType(1).getField(3);
>>
>> Just follow the proto field numbers.
>>
>> On Fri, Sep 9, 2022 at 2:45 PM Kyle Papili <[email protected]> wrote:
>>
>>> My question is far dumber haha. Is there a table that describes what
>>> field numbers correlate to what object types?
>>>
>>> I've seen 1,2,3,4,5,6,7 show up in paths as field numbers. My naive
>>> brain was under the impression that they correlated to object types, no?
>>>
>>> 4: Message, 5: Enum, 6: Extension??
>>>
>>> Is this not correct? Is there a table that can show me what each field
>>> number correlates to?
>>>
>>> On Friday, September 9, 2022 at 4:38:21 PM UTC-4 [email protected] wrote:
>>>
>>>> The ints in the path should be the field numbers and array indices
>>>> along the way from a top-level file descriptor proto, like this:
>>>>
>>>> Path: [4, 0, 2, 0]
>>>> Starting with the FileDescriptorProto:
>>>> 4 -> FileDescriptorProto {
>>>>        ...
>>>>        *repeated DescriptorProto message_type = 4;*
>>>>        repeated EnumDescriptorProto enum_type = 5;
>>>>      }
>>>> 0 -> index into FileDescriptorProto.message_type[0]
>>>> 2 -> DescriptorProto {
>>>>        optional string name = 1;
>>>>        *repeated FieldDescriptorProto field = 2;*
>>>>        ...
>>>>      }
>>>> 0 -> index into DescriptorProto.field[0]
>>>>
>>>> Thus this path/Location [4, 0, 2, 0] applies to the whole field
>>>> statement.
>>>> I believe the index of a message in the message_type array generally
>>>> corresponds to the order of all top-level message items in the file.
>>>> I also believe that the index of a field likewise corresponds to the
>>>> ordering of fields within the message.
>>>>
>>>> So if you have to deal with nested messages, the path will start with:
>>>> [4, (top-level-message-index), 3, (index-of-nested-message-type), ...]
>>>>
>>>> If I remember correctly, this breaks down for options because sometimes
>>>> the comments/location for an option is dropped, and when it is present
>>>> the path points to field 999, the uninterpreted options.
>>>>
>>>> But maybe you already had gotten that far and I misunderstood
>>>> your question.
>>>>
>>>> On Fri, Sep 9, 2022 at 2:15 PM Kyle Papili <[email protected]> wrote:
>>>>
>>>>> Is there somewhere in the documentation that provides a clear table
>>>>> describing which numbers in the path correlate to which types? I have
>>>>> found some inconsistencies with what I had thought. Any link to a
>>>>> table like this?
>>>>>
>>>>> On Friday, September 9, 2022 at 4:14:46 PM UTC-4 [email protected] wrote:
>>>>>
>>>>>> Ah, no, there is no magic. I only meant that if you wanted to have
>>>>>> one part of your code match up location data to descriptor objects
>>>>>> and attach the location info directly, you could do it in a custom
>>>>>> option. There's no getting around the actual awkward stepping
>>>>>> through the paths to match them up.
>>>>>>
>>>>>> On Fri, Sep 9, 2022 at 2:08 PM Kyle Papili <[email protected]> wrote:
>>>>>>
>>>>>>> Yes, the "hacky method" proposed by shaod@ is basically what I am
>>>>>>> doing currently. It just seems to be unnecessarily complicated.
>>>>>>>
>>>>>>> What do you mean by "store the location object in a custom option
>>>>>>> extension on the object in question"? How would I store the location
>>>>>>> object as a custom extension of the object without knowing the object?
>>>>>>> If I knew the object that the location corresponded to, then my
>>>>>>> problem would be resolved. The only way to match up location objects
>>>>>>> to proto objects, from what I've found, is the hacky path traversal
>>>>>>> suggested by shaod@. Am I missing something here?
>>>>>>>
>>>>>>> On Friday, September 9, 2022 at 4:02:05 PM UTC-4 [email protected] wrote:
>>>>>>>
>>>>>>>> Unfortunately, the only way to know the path to the Location object
>>>>>>>> is to know the path to the descriptor proto object in question.
>>>>>>>> Alternatively, you could iterate through all the SourceCodeInfo
>>>>>>>> elements and use their paths to navigate to the correct descriptor
>>>>>>>> object. One technique I have used in the past is to iterate through
>>>>>>>> all the SourceCodeInfo elements and store the location object in a
>>>>>>>> custom option extension on the object in question (or the parent
>>>>>>>> object if it is something that doesn't have options).
>>>>>>>>
>>>>>>>> Also, as shaod@ points out, some comments will not show up in
>>>>>>>> SourceCodeInfo.
>>>>>>>>
>>>>>>>> On Wednesday, September 7, 2022 at 4:11:32 PM UTC-6 [email protected] wrote:
>>>>>>>>
>>>>>>>>> First keep in mind that some comments are detached and thus
>>>>>>>>> ignored by SourceCodeInfo.
>>>>>>>>>
>>>>>>>>> That being said, IIRC I've seen a very hacky way to achieve
>>>>>>>>> similar goals:
>>>>>>>>> https://github.com/protocolbuffers/protobuf/blob/3322c0b92a5001ade92608d75891d63c749d624d/src/google/protobuf/compiler/parser_unittest.cc#L2472
>>>>>>>>>
>>>>>>>>> On Thursday, September 1, 2022 at 7:16:52 AM UTC-7 [email protected] wrote:
>>>>>>>>>
>>>>>>>>>> I'm parsing a large number of protobuf files and am using the
>>>>>>>>>> SourceCodeInfo descriptor to extract comment data from the source
>>>>>>>>>> files as well.
>>>>>>>>>> I currently use the FileDescriptorProto.ListFields() method to
>>>>>>>>>> extract the DescriptorProto objects I care about as well as the
>>>>>>>>>> SourceCodeInfo.
>>>>>>>>>>
>>>>>>>>>> To my knowledge, the only way to pair up Location fields with the
>>>>>>>>>> corresponding objects is via the path attribute
>>>>>>>>>> <https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.descriptor.pb>.
>>>>>>>>>>
>>>>>>>>>> This is fine, except for the fact that it involves me manually
>>>>>>>>>> stepping through said path to land at my parsed protobuf object.
>>>>>>>>>> This gets complicated when dealing with layers of nested_types,
>>>>>>>>>> and I am convinced there must be a way for me to extract the path
>>>>>>>>>> from the particular DescriptorProto object and then use that to
>>>>>>>>>> match up the object with the path specified in the corresponding
>>>>>>>>>> Location field.
>>>>>>>>>>
>>>>>>>>>> In short: how can I easily pair up DescriptorProto objects with
>>>>>>>>>> the Location objects that correspond to them? Specifically for
>>>>>>>>>> comment parsing purposes.
>>>>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "Protocol Buffers" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>> To view this discussion on the web visit
>>>>>>> https://groups.google.com/d/msgid/protobuf/0c8d36db-53f9-4179-942f-201cd205b9dfn%40googlegroups.com
>>>>>>> <https://groups.google.com/d/msgid/protobuf/0c8d36db-53f9-4179-942f-201cd205b9dfn%40googlegroups.com?utm_medium=email&utm_source=footer>.
>>>>>>>
>>>>>> --
>>>>>> Jerry Berg | Software Engineer | [email protected] | 720-808-1188
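
Regarding the Python question at the top of this thread, here is a minimal sketch of the path walk described above. It uses no protobuf imports: the two tables are transcribed by hand from the descriptor.proto excerpts quoted in this thread, so they cover only FileDescriptorProto and DescriptorProto. The helper name describe_path and the "<n>" notation for unresolved numbers are made up for illustration and are not part of any library.

```python
# Field tables transcribed from the descriptor.proto excerpts quoted in this
# thread: field number -> (field name, is_repeated, table of the element type).
# None means the element type's own fields were not quoted here.
MESSAGE = {}
MESSAGE.update({
    1: ("name", False, None),
    2: ("field", True, None),
    3: ("nested_type", True, MESSAGE),  # nested messages are DescriptorProtos
    4: ("enum_type", True, None),
    6: ("extension", True, None),
})
FILE = {
    1: ("name", False, None),
    2: ("package", False, None),
    3: ("dependency", True, None),
    4: ("message_type", True, MESSAGE),
    5: ("enum_type", True, None),
    6: ("service", True, None),
    7: ("extension", True, None),
}


def describe_path(path):
    """Turn a Location.path into an attribute chain (hypothetical helper)."""
    table, parts, i = FILE, [], 0
    while i < len(path) and table is not None and path[i] in table:
        # A field number selects a field of the current message type.
        name, repeated, table = table[path[i]]
        i += 1
        # If that field is repeated, the next number (if any) is an index.
        if repeated and i < len(path):
            name += "[%d]" % path[i]
            i += 1
        parts.append(name)
    # Anything left over points into a message this sketch has no table for.
    parts.extend("<%d>" % n for n in path[i:])
    return ".".join(parts)


print(describe_path([4, 0, 2, 1]))        # message_type[0].field[1]
print(describe_path([4, 0, 3, 1, 2, 3]))  # message_type[0].nested_type[1].field[3]
print(describe_path([5, 2]))              # enum_type[2]
```

In Python the generated descriptor_pb2 messages expose these fields as plain attributes rather than get*() methods, so the printed chain is the expression you would evaluate on a parsed FileDescriptorProto, for example file_proto.message_type[0].nested_type[1].field[3].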
