harshmotw-db commented on code in PR #53120:
URL: https://github.com/apache/spark/pull/53120#discussion_r2548267850


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -1593,6 +1593,14 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
+  val PARQUET_IGNORE_VARIANT_ANNOTATION =
+    buildConf("spark.sql.parquet.ignoreVariantAnnotation")
+      .doc("When true, ignore the variant logical type annotation and treat the Parquet " +
+        "column in the same way as the underlying struct type")
+      .version("4.1.0")
+      .booleanConf
+      .createWithDefault(false)

Review Comment:
   I have added a new test, `variant logical type annotation - ignore variant annotation`, to demonstrate this point.
   
   So, if the `ignoreVariantAnnotation` config is enabled, you can read a Parquet file with an underlying variant column using a struct-of-binaries schema. For a variant column `v`, you could run:
   `spark.read.format("parquet").schema("v struct<value: BINARY, metadata: BINARY>").load(...)` and it would load the value and metadata columns into those fields, even though the data is logically not a struct of two binaries but a variant. People could use this to debug the physical variant values.
   
   If the config is disabled, which is the default, this read would fail with an error, and you would need to read variant columns with a variant schema.
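   As a rough sketch, the debug workflow described above might look like the following (the config name is taken from the diff; the file path and schema string are illustrative assumptions, not part of the PR):
   
   ```scala
   import org.apache.spark.sql.SparkSession
   
   object VariantDebugRead {
     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder()
         .appName("variant-debug")
         .master("local[*]")
         .getOrCreate()
   
       // With the config enabled, the variant annotation on the Parquet column
       // is ignored, so it can be read as the underlying struct of two binary
       // fields for debugging.
       spark.conf.set("spark.sql.parquet.ignoreVariantAnnotation", "true")
       val physical = spark.read
         .format("parquet")
         .schema("v struct<value: BINARY, metadata: BINARY>")
         .load("/tmp/variant_table") // hypothetical path
       physical.show()
   
       // With the config disabled (the default), the same read would raise an
       // error; the column must instead be read with a variant schema.
   
       spark.stop()
     }
   }
   ```
   
   This is only a usage sketch; running it requires a Spark build that includes this PR and a Parquet file at the given path containing a variant column `v`.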



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

