justaparth commented on code in PR #41498:
URL: https://github.com/apache/spark/pull/41498#discussion_r1223202896
##########
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/ProtobufDeserializer.scala:
##########
@@ -247,12 +247,86 @@ private[sql] class ProtobufDeserializer(
updater.setLong(ordinal, micros +
TimeUnit.NANOSECONDS.toMicros(nanoSeconds))
case (MESSAGE, StringType)
- if protoType.getMessageType.getFullName == "google.protobuf.Any" =>
+ if protoType.getMessageType.getFullName == "google.protobuf.Any" =>
(updater, ordinal, value) =>
// Convert 'Any' protobuf message to JSON string.
val jsonStr = jsonPrinter.print(value.asInstanceOf[DynamicMessage])
updater.set(ordinal, UTF8String.fromString(jsonStr))
+ // Handle well known wrapper types. We unpack the value field instead of
keeping
Review Comment:
Also one point that may not have been clear, but these wrapper types for
primitives are not inherently useful as "messages"; rather they exist for the
purpose of distinguishing between empty and 0 in proto3, before the existence
of optional.
As an example
```
syntax = "proto3";
message A {
int64 val = 1;
}
```
and then
```
A()
A(0)
```
are not distinguishable from one another because default values are [not
serialized](https://protobuf.dev/programming-guides/field_presence/#presence-in-tag-value-stream-wire-format-serialization)
to the wire for primitives in proto3.
But if you had
```
syntax = "proto3";
message A {
google.protobuf.Int64Value val = 1;
}
```
then
```
A()
A(Int64Value(0))
```
Now become distinguishable from one another.
Basically it's not like it's useful to just nest primitives inside a message
wrapper for the sake of it, the purpose is simply to disambiguate in that case
above (and you're right that `optional` in newer versions of protos basically
solves this problem).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]