rangadi commented on code in PR #40686:
URL: https://github.com/apache/spark/pull/40686#discussion_r1180870422
##########
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/ProtobufDeserializer.scala:
##########
@@ -299,6 +298,29 @@ private[sql] class ProtobufDeserializer(
}
}
+ private def getFieldValue(record: DynamicMessage, field: FieldDescriptor):
AnyRef = {
+ // We return a value if one of:
+ // - the field is repeated
+ // - the field is explicitly present in the serialized proto
+ // - the field is proto2 with a default
+ // - field presence is not available and materializeZeroValues is set
+ //
+ // Repeated fields have to be treated separately as they cannot have
`hasField`
+ // called on them. And we only materialize zero values for fields without
presence
+ // information because that flag controls the behavior for those fields in
the ambiguous
+ // case of "unset" or "set to zero value". See the docs in
[[ProtobufOptions]] for more
+ // details.
Review Comment:
Messages are not getting serialized because of this check (those fields have
`hasPresence()`). It would be better if we explicitly exclude them there.
This check in fact does not matter for scalar fields since `hasField()`
returns true any way. So `hasPresense()` check has no effect. I.e if you remove
this check only thing that will change in your tests should be with message
fields.
Overall:
- Exclude message types from serializing explicitly (no matter what our
presense policy)
- Give a rational for using presence information. I.e. why should
`optional string a` and `string a` behave differently with this flag.
- In fact this check does not even matter exception for Message types.
So there is no need to have it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]