pang-wu commented on code in PR #41498:
URL: https://github.com/apache/spark/pull/41498#discussion_r1223754931


##########
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/ProtobufDeserializer.scala:
##########
@@ -247,12 +247,86 @@ private[sql] class ProtobufDeserializer(
          updater.setLong(ordinal, micros + TimeUnit.NANOSECONDS.toMicros(nanoSeconds))
 
       case (MESSAGE, StringType)
-          if protoType.getMessageType.getFullName == "google.protobuf.Any" =>
+        if protoType.getMessageType.getFullName == "google.protobuf.Any" =>
         (updater, ordinal, value) =>
           // Convert 'Any' protobuf message to JSON string.
           val jsonStr = jsonPrinter.print(value.asInstanceOf[DynamicMessage])
           updater.set(ordinal, UTF8String.fromString(jsonStr))
 
+      // Handle well known wrapper types. We unpack the value field instead of keeping

Review Comment:
   So the motivating example is: if someone wants to convert the struct generated by Spark to JSON and compare (or maintain compatibility of) that JSON with another JSON generated from the same protobuf message using Go or Java's JsonFormat, they are not able to assert equality, because Spark doesn't follow the spec when translating well-known types.
   A team that wants to do such a comparison has to write a custom converter, and writing such a converter is painful (if not impossible) because 1) the type information is lost, and 2) even with the type info, there is no easy way to know at what level we should get rid of the struct and replace it with a scalar -- this is a real use case we are running into.
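
   For concreteness, a minimal sketch of the mismatch (not Spark's actual code; the field name `ttl` is a hypothetical example): per the protobuf JSON mapping, a well-known wrapper such as `google.protobuf.Int64Value` serializes as its bare value (a JSON string, since int64 maps to a string), while a generic struct translation keeps the enclosing object with a `value` field.

   ```scala
   // Hypothetical sketch of the two JSON shapes for an Int64Value field
   // named "ttl" holding 42. Wrapper types serialize as the bare value
   // per the spec; a naive struct translation keeps the wrapper object.
   object WrapperJsonShapes {
     // Spec-compliant form, as produced by Go/Java JsonFormat:
     val specJson = """{"ttl": "42"}"""
     // Shape produced when the wrapper message is kept as a struct:
     val structJson = """{"ttl": {"value": 42}}"""

     def main(args: Array[String]): Unit = {
       // The two forms differ, so equality assertions between the
       // Spark-derived JSON and JsonFormat output fail.
       println(specJson == structJson) // false
     }
   }
   ```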



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

