dichlorodiphen opened a new pull request, #53828:
URL: https://github.com/apache/spark/pull/53828
### What changes were proposed in this pull request?
This PR adds support for proto2 extensions to `from_protobuf` and
`to_protobuf` when a file descriptor set is provided. Java classes are not
covered, because they do not contain enough information to support extensions.
This is done by building an `ExtensionRegistry` and an index that maps each
descriptor name to its extensions. The registry is used during construction of
the `DynamicMessage` so that the Protobuf library can see the extensions. The
index is plumbed through the various helper classes for use in schema
conversion and serde.
### Why are the changes needed?
Proto2 extensions are a valid, if somewhat uncommon, feature of Protobuf.
Dropping them can confuse users, so it makes sense to include them in the
schema when they are provided.
### Does this PR introduce _any_ user-facing change?
Yes. Previously, extension fields would be dropped by both `from_protobuf`
and `to_protobuf`. Now, they are retained. This can be demonstrated with the
minimal example below. See the unit tests for more examples.
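```proto
syntax = "proto2";

// Base message that reserves field numbers 100-200 for extensions.
message Person {
  optional int32 id = 1;
  extensions 100 to 200;
}

// Extension that adds an `age` field to Person.
extend Person {
  optional int32 age = 100;
}
```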
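The dropped-versus-retained difference for this `Person` message can be sketched with a toy varint decoder. This is only an illustration of the idea of indexing extensions by the name of the message they extend; it is not the code in this PR, which relies on the Protobuf library's `ExtensionRegistry` and `DynamicMessage`. The names `Field`, `REGULAR_FIELDS`, `build_extension_index`, and `decode` are hypothetical, and the decoder assumes every field is a varint.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a field descriptor; not a Protobuf or Spark class.
@dataclass(frozen=True)
class Field:
    name: str
    number: int


# Regular (non-extension) fields, keyed by message name.
REGULAR_FIELDS = {"Person": [Field("id", 1)]}


def build_extension_index(extensions):
    """Group (extended message name, field) pairs by the message they extend."""
    index = {}
    for message_name, field in extensions:
        index.setdefault(message_name, []).append(field)
    return index


def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode(message_name, payload, extension_index):
    """Decode varint-only fields; fields without a known descriptor are dropped."""
    known = {f.number: f.name for f in REGULAR_FIELDS[message_name]}
    known.update({f.number: f.name for f in extension_index.get(message_name, [])})
    row, pos = {}, 0
    while pos < len(payload):
        tag, pos = read_varint(payload, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type != 0:
            raise ValueError("this sketch only handles varint fields")
        value, pos = read_varint(payload, pos)
        if number in known:
            row[known[number]] = value
    return row


# Person with id = 7 and age = 30: field 1 -> 08 07, field 100 -> A0 06 1E.
PAYLOAD = bytes([0x08, 0x07, 0xA0, 0x06, 0x1E])

without_extensions = decode("Person", PAYLOAD, {})
with_extensions = decode(
    "Person", PAYLOAD, build_extension_index([("Person", Field("age", 100))])
)
print(without_extensions)  # {'id': 7}
print(with_extensions)     # {'id': 7, 'age': 30}
```

With an empty index the decoder has no descriptor for field 100, so `age` is lost, which mirrors the previous behavior. Once the extension is indexed under `Person`, the same bytes yield both fields.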
### How was this patch tested?
Unit tests were added for the new behavior, including basic behavior,
extending nested messages, and extensions defined in separate files.
### Was this patch authored or co-authored using generative AI tooling?
Initial draft authored with Claude Code.
Generated-by: claude-4.5-opus