Sai Sharath Dandi created FLINK-37528:
-----------------------------------------
Summary: Protobuf Format (proto3): Handle default values for
optional primitive types and primitive types in one of fields
Key: FLINK-37528
URL: https://issues.apache.org/jira/browse/FLINK-37528
Project: Flink
Issue Type: Improvement
Affects Versions: 2.0-preview
Reporter: Sai Sharath Dandi
The Read Default Values option is
[forced|https://github.com/apache/flink/blob/master/flink-formats/flink-protobuf/src/main/java/org/apache/flink/formats/protobuf/deserialize/ProtoToRowConverter.java#L74]
to true for primitive types in proto3. This can cause bugs for messages that
contain a oneof such as the one below:
{code:java}
oneof test {
  string aa = 1;
  int32 bb = 2;
  bool cc = 3;
  Corpus dd = 4;
}
{code}
Even if only the first field of the oneof is set, reading default values makes
all of its fields non-null after decoding. When such data is encoded back to
protobuf, the result differs from the original message, which causes data
correctness issues.
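The round trip described above can be imitated with a small toy model. This is only a sketch: OneofRoundTrip, OneofTest, decode and encode are hypothetical names, not Flink or protobuf APIs, and the enum member dd is left out for brevity. The one protobuf behavior it relies on is that setting a oneof member clears whichever member was set before.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of the decode/encode round trip; not Flink or protobuf code. */
class OneofRoundTrip {

    /** Mimics a protobuf oneof: setting a member replaces any previously set member. */
    static final class OneofTest {
        String setCase = null; // name of the member that is currently set
        Object value = null;

        void set(String field, Object v) {
            setCase = field;
            value = v;
        }
    }

    /** Decodes to a row; with readDefaults, unset members get their type default. */
    static Map<String, Object> decode(OneofTest msg, boolean readDefaults) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("aa", pick(msg, "aa", "", readDefaults));
        row.put("bb", pick(msg, "bb", 0, readDefaults));
        row.put("cc", pick(msg, "cc", false, readDefaults));
        return row;
    }

    static Object pick(OneofTest msg, String field, Object dflt, boolean readDefaults) {
        if (field.equals(msg.setCase)) {
            return msg.value;
        }
        return readDefaults ? dflt : null;
    }

    /** Encodes by setting every non-null column in field order (an assumption). */
    static OneofTest encode(Map<String, Object> row) {
        OneofTest msg = new OneofTest();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            if (e.getValue() != null) {
                msg.set(e.getKey(), e.getValue());
            }
        }
        return msg;
    }

    public static void main(String[] args) {
        OneofTest original = new OneofTest();
        original.set("aa", "hello");

        // With defaults forced, every column is non-null, so the last one wins.
        System.out.println(encode(decode(original, true)).setCase);  // cc
        // With presence respected, only aa is non-null and the case survives.
        System.out.println(encode(decode(original, false)).setCase); // aa
    }
}
```

In this model the message that originally had aa set comes back with cc set once defaults are read, which is the kind of mismatch the issue describes.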
Proposed solution:
{code:java}
if (PbFormatUtils.isSimpleType(subType)
        && !(elementFd.getContainingOneof() != null || elementFd.hasOptionalKeyword())) {
    readDefaultValues = formatContext.isReadDefaultValuesForPrimitiveTypes();
}
{code}
For primitive types in proto3, field presence can still be checked when the
field is declared optional or is a member of a oneof.
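The decision in the proposed condition can be exercised in isolation with a sketch like the one below. The Field class is a hypothetical stand-in for the few descriptor properties the check reads, and PresenceSketch and effectiveReadDefaults are illustrative names only; the actual Flink code path may differ.

```java
/** Standalone sketch of the proposed check; not the Flink implementation. */
class PresenceSketch {

    /** Hypothetical stand-in for the descriptor properties used by the check. */
    static final class Field {
        final boolean simpleType;
        final boolean inOneof;
        final boolean optionalKeyword;

        Field(boolean simpleType, boolean inOneof, boolean optionalKeyword) {
            this.simpleType = simpleType;
            this.inOneof = inOneof;
            this.optionalKeyword = optionalKeyword;
        }
    }

    /**
     * Mirrors the proposed condition: the primitive-type setting overrides
     * readDefaultValues only for simple fields that have no presence tracking.
     */
    static boolean effectiveReadDefaults(
            Field f, boolean readDefaultValues, boolean readDefaultsForPrimitives) {
        boolean hasPresence = f.inOneof || f.optionalKeyword;
        if (f.simpleType && !hasPresence) {
            return readDefaultsForPrimitives;
        }
        return readDefaultValues;
    }

    public static void main(String[] args) {
        Field plain = new Field(true, false, false);
        Field oneofMember = new Field(true, true, false);
        Field optional = new Field(true, false, true);

        // The user asked for no defaults, while the primitive setting is true.
        System.out.println(effectiveReadDefaults(plain, false, true));       // true
        System.out.println(effectiveReadDefaults(oneofMember, false, true)); // false
        System.out.println(effectiveReadDefaults(optional, false, true));    // false
    }
}
```

Only the plain field picks up the forced setting; the oneof member and the optional field keep the caller's choice, so an unset value can still be decoded as null.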