Re: [PR] [SPARK-45891][SQL][FOLLOWUP] Disable `spark.sql.variant.allowReadingShredded` by default [spark]

via GitHub Wed, 12 Feb 2025 10:30:41 -0800


cashmand commented on PR #49874:
URL: https://github.com/apache/spark/pull/49874#issuecomment-2654532466


   Hi @pan3793, I opened https://github.com/apache/spark/pull/49910 to remove 
the Variant docs from Spark, and link to the Parquet repo.
   
   Regarding the status of Variant:
   
   1) Shredded writes are still in a test-only state, although they should be 
compatible with the latest version of the shredding spec in Parquet. There's 
currently no API to enable shredded writes other than 
`spark.sql.variant.forceShreddingSchemaForTest`, which is clearly marked as 
test-only and isn't practical for real use cases.
   
   2) Reads from shredded Variant should work correctly. That being said, there 
are currently no production writers, so there hasn't been much testing outside 
of the unit tests added to Spark. It's also possible that the shredding spec 
could change, although I'm hoping that's unlikely at this point. Given those 
concerns, I think it's reasonable to disable 
`spark.sql.variant.allowReadingShredded` by default, but I'll leave the 
decision to you, @gene-db and @cloud-fan.
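
   If the default does change, a job that needs shredded reads would have to 
opt in explicitly. Below is a minimal sketch of pinning the flag per job rather 
than relying on the default. The helper names (`variant_read_conf`, 
`to_submit_args`) are hypothetical and only the config key comes from this 
thread. I'm assuming the flag takes a plain `true`/`false` value.

```python
from typing import Dict, List

# Config key discussed in this PR; the helpers below are illustrative only.
ALLOW_READING_SHREDDED = "spark.sql.variant.allowReadingShredded"


def variant_read_conf(allow_shredded_reads: bool) -> Dict[str, str]:
    """Return an explicit setting so a job does not depend on the default."""
    # Assumption: the flag is a boolean conf rendered as "true"/"false".
    return {ALLOW_READING_SHREDDED: "true" if allow_shredded_reads else "false"}


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render settings as `--conf key=value` pairs for spark-submit."""
    args: List[str] = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the arguments a job would pass to opt in to shredded reads.
    print(" ".join(to_submit_args(variant_read_conf(True))))
```

   The same key/value pair could presumably be passed to 
`SparkSession.builder.config(...)` instead. Either way, the job's behavior no 
longer depends on whichever default ends up being merged.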


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


