rangadi commented on code in PR #41498:
URL: https://github.com/apache/spark/pull/41498#discussion_r1222534443
##########
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/ProtobufDeserializer.scala:
##########
@@ -247,12 +247,86 @@ private[sql] class ProtobufDeserializer(
updater.setLong(ordinal, micros +
TimeUnit.NANOSECONDS.toMicros(nanoSeconds))
case (MESSAGE, StringType)
- if protoType.getMessageType.getFullName == "google.protobuf.Any" =>
+ if protoType.getMessageType.getFullName == "google.protobuf.Any" =>
(updater, ordinal, value) =>
// Convert 'Any' protobuf message to JSON string.
val jsonStr = jsonPrinter.print(value.asInstanceOf[DynamicMessage])
updater.set(ordinal, UTF8String.fromString(jsonStr))
+ // Handle well known wrapper types. We unpack the value field instead of keeping
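
For context, a minimal standalone sketch of what the 'Any' branch above does (the packed Int64Value payload is illustrative, and the PR builds its jsonPrinter elsewhere; this sketch assumes a type registry is needed to resolve the packed type): JsonFormat renders the packed message as JSON, which Spark then stores as a StringType value.

    import com.google.protobuf.{Any, Int64Value}
    import com.google.protobuf.util.JsonFormat

    // Pack a well-known wrapper message into an 'Any'.
    val any = Any.pack(Int64Value.of(42L))

    // The printer needs a type registry to resolve the type packed inside the Any.
    val registry =
      JsonFormat.TypeRegistry.newBuilder().add(Int64Value.getDescriptor).build()
    val printer = JsonFormat.printer().usingTypeRegistry(registry)

    // Prints: {"@type":"type.googleapis.com/google.protobuf.Int64Value","value":"42"}
    println(printer.print(any))
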
Review Comment:
Not sure I follow. This is a serde between Protobuf and Spark structs. Consumers
and producers are expected to know the schema.
Can we have a concrete example where this makes a difference? What problem
are we solving? Wrapper types are used because the wrapper itself is
important; otherwise there is no need to use one. I don't see how stripping
the wrapper is the right thing.
These are just utilities, not a Protobuf spec.
Did you check the generated Java code? It treats a wrapper as just another
Protobuf message; there is no special treatment. Why should Spark be different?
Can we have a fully spelled-out example in Spark that shows the benefits?
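
To make the "no special treatment" point concrete, a minimal sketch (the
values are illustrative): in generated Java code a wrapper such as
google.protobuf.Int64Value is just another message, and inside a containing
message it would be accessed through the usual hasField()/getField() pattern.

    import com.google.protobuf.Int64Value

    // Int64Value is an ordinary generated message: the payload sits in a
    // regular singular field named "value" and is read through the normal
    // accessor. In a containing message, generated code likewise exposes the
    // usual hasField()/getField() pair with no special casing for wrappers.
    val wrapped: Int64Value = Int64Value.of(30L)
    val unwrapped: Long = wrapped.getValue // the caller unwraps explicitly
    println(unwrapped) // prints 30

Kept as a message, the corresponding Catalyst value would presumably be a
struct with a single value field (e.g. struct<value: bigint>); the change
under discussion flattens it to the bare bigint.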