chaoqin-li1123 commented on code in PR #44643:
URL: https://github.com/apache/spark/pull/44643#discussion_r1469066080
##########
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/ProtobufOptions.scala:
##########
@@ -207,6 +207,23 @@ private[sql] class ProtobufOptions(
// nil => nil, Int32Value(0) => 0, Int32Value(100) => 100.
val unwrapWellKnownTypes: Boolean =
parameters.getOrElse("unwrap.primitive.wrapper.types", false.toString).toBoolean
+
+ // Since Spark doesn't allow writing empty StructType, empty proto message type will be
+ // dropped by default. Setting this option to true will insert a dummy column to empty proto
+ // message so that the empty message will be retained.
+ // For example, an empty message is used as field in another message:
+ //
+ // ```
+ // message A {}
+ // Message B {A a = 1, string name = 2}
+ // ```
+ //
+ // By default, in the spark schema field a will be dropped, which result in schema
+ // b struct<name: string>
+ // If retain.empty.message.types=true, field a will be retained by inserting a dummy column.
+ // b struct<name: string, a struct<__dummy_field_in_empty_struct: string>>
Review Comment:
Nice catch, fixed.
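
For readers following along, here is a minimal sketch of how a caller might turn the option on. It assumes the public `from_protobuf` overload that takes a descriptor file path and a `java.util.Map` of options. The helper name `parseRetainingEmpty`, the `binary` column name, and the use of message `B` from the example above are illustrative assumptions, not code from this PR.

```scala
import java.util.{HashMap => JHashMap}

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.protobuf.functions.from_protobuf

object RetainEmptyMessageUsage {
  // Hypothetical helper: `df` is assumed to hold serialized `B` messages in a
  // column named "binary"; `descFilePath` points at a compiled descriptor file.
  def parseRetainingEmpty(df: DataFrame, descFilePath: String): DataFrame = {
    val options = new JHashMap[String, String]()
    // Option name taken from the code comment under review.
    options.put("retain.empty.message.types", "true")
    df.select(from_protobuf(col("binary"), "B", descFilePath, options).as("b"))
  }
}
```

Leaving the option out of the map should give the default behavior described in the comment, where the empty field `a` is dropped from the schema.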
##########
connector/protobuf/src/test/scala/org/apache/spark/sql/protobuf/ProtobufFunctionsSuite.scala:
##########
@@ -1136,7 +1208,8 @@ class ProtobufFunctionsSuite extends QueryTest with SharedSparkSession with Prot
val df = emptyBinaryDF.select(
from_protobuf_wrapper($"binary", name, descFilePathOpt, options).as("empty_proto")
)
- assert(df.schema == structFromDDL("empty_proto struct<>"))
+ assert(df.schema ==
+ structFromDDL("empty_proto struct<>"))
Review Comment:
This test doesn't configure the option explicitly, so it exercises the default
behavior, which is the same as before.
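
To illustrate why an unconfigured test sees the old behavior, here is a small sketch of the default lookup. It assumes the new option is parsed with the same `getOrElse(..., false.toString).toBoolean` pattern as `unwrap.primitive.wrapper.types` in the hunk above. The object and method names are hypothetical, and a plain `Map` stands in for the options map Spark actually uses.

```scala
object RetainEmptyOptionSketch {
  // Assumed parsing pattern, mirroring the neighbouring option in the diff;
  // the method name is illustrative, not taken from the PR.
  def retainEmptyMessage(parameters: Map[String, String]): Boolean =
    parameters.getOrElse("retain.empty.message.types", false.toString).toBoolean

  def main(args: Array[String]): Unit = {
    // No option supplied: falls back to false, so empty messages are dropped.
    assert(!retainEmptyMessage(Map.empty))
    // Option supplied explicitly: empty messages are retained.
    assert(retainEmptyMessage(Map("retain.empty.message.types" -> "true")))
  }
}
```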
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]