Khaled Hammouda created FLINK-38689:
---------------------------------------
Summary: Support Protobuf 4.x field name conflict resolution in
dynamic codegen
Key: FLINK-38689
URL: https://issues.apache.org/jira/browse/FLINK-38689
Project: Flink
Issue Type: Improvement
Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 2.1.1, 2.1.0
Reporter: Khaled Hammouda
Accessor name conflicts arise when a protobuf message contains fields such as:
* Enum field `status` + string field `status_value`
* Repeated field `tags` + string field `tags_list` and/or `tags_count`
In such cases, the protoc compiler renames the accessor methods by suffixing
them with field numbers to avoid conflicts in the generated code (e.g.,
`getStatus1Value()` instead of `getStatusValue()`, `getTags4List()` instead of
`getTagsList()`, etc.). Flink's dynamic code generation assumes the standard
naming and therefore fails at runtime because it does not call the right methods.
Users typically work around this issue by renaming their proto fields to avoid
those conflicts in the generated code, but renaming a field is sometimes an
expensive choice because it is a breaking change.
### Solution
Implement a `PbFieldConflictResolver` that:
- Detects accessor name conflicts by analyzing message descriptors
- Applies field number suffixes matching protoc behavior
- Caches resolved mappings for performance
- Integrates with serialization/deserialization codegen
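
The steps above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the proposed `PbFieldConflictResolver` itself: the class `FieldConflictSketch`, the simplified `Field` type (a stand-in for a protobuf field descriptor), and the helper names are all hypothetical. The conflict rules are modeled only on the cases described in this issue (`Value` for enum fields, `List` and `Count` for repeated fields), and the sketch assumes that both fields of a conflicting pair receive the field number suffix; the real protoc rules may cover more cases.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of accessor-name conflict resolution; not Flink's actual code. */
final class FieldConflictSketch {

    /** Simplified stand-in for a protobuf field descriptor (assumption). */
    static final class Field {
        final String name;
        final int number;
        final boolean repeated;
        final boolean isEnum;

        Field(String name, int number, boolean repeated, boolean isEnum) {
            this.name = name;
            this.number = number;
            this.repeated = repeated;
            this.isEnum = isEnum;
        }
    }

    /** Cache of resolved accessor stems, keyed by message full name. */
    private static final Map<String, Map<String, String>> CACHE = new ConcurrentHashMap<>();

    /** Simplified snake_case to CamelCase conversion, e.g. "status_value" to "StatusValue". */
    static String toCapitalizedCamelCase(String fieldName) {
        StringBuilder sb = new StringBuilder();
        for (String part : fieldName.split("_")) {
            if (!part.isEmpty()) {
                sb.append(Character.toUpperCase(part.charAt(0))).append(part.substring(1));
            }
        }
        return sb.toString();
    }

    /** True if an accessor derived from field a would collide with the plain getter of field b. */
    static boolean conflictsOneWay(Field a, String nameA, Field b, String nameB) {
        if (a.repeated && !b.repeated
                && (nameB.equals(nameA + "List") || nameB.equals(nameA + "Count"))) {
            return true;
        }
        return a.isEnum && nameB.equals(nameA + "Value");
    }

    /** Computes the accessor stem for every field, adding the field number on conflict. */
    static Map<String, String> computeStems(List<Field> fields) {
        int n = fields.size();
        String[] names = new String[n];
        boolean[] conflict = new boolean[n];
        for (int i = 0; i < n; i++) {
            names[i] = toCapitalizedCamelCase(fields.get(i).name);
        }
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                Field a = fields.get(i);
                Field b = fields.get(j);
                if (names[i].equals(names[j])
                        || conflictsOneWay(a, names[i], b, names[j])
                        || conflictsOneWay(b, names[j], a, names[i])) {
                    conflict[i] = true;
                    conflict[j] = true;
                }
            }
        }
        Map<String, String> stems = new HashMap<>();
        for (int i = 0; i < n; i++) {
            Field f = fields.get(i);
            stems.put(f.name, conflict[i] ? names[i] + f.number : names[i]);
        }
        return stems;
    }

    /** Returns cached stems for a message, computing them on first use. */
    static Map<String, String> resolve(String messageName, List<Field> fields) {
        return CACHE.computeIfAbsent(messageName, key -> computeStems(fields));
    }

    public static void main(String[] args) {
        // Field numbers for status and tags follow the examples in this issue;
        // the message name and the remaining fields are made up for illustration.
        List<Field> fields = Arrays.asList(
                new Field("status", 1, false, true),
                new Field("status_value", 2, false, false),
                new Field("id", 3, false, false),
                new Field("tags", 4, true, false),
                new Field("tags_list", 5, false, false));
        Map<String, String> stems = resolve("demo.Event", fields);
        System.out.println("get" + stems.get("status") + "Value"); // getStatus1Value
        System.out.println("get" + stems.get("tags") + "List");    // getTags4List
        System.out.println("get" + stems.get("id"));               // getId
    }
}
```

Codegen would then build method names from the resolved stem rather than from the raw field name, so non-conflicting fields such as `id` keep their standard accessors.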
--
This message was sent by Atlassian Jira
(v8.20.10#820010)