[I] [FEATURE] Native scan support for VariantType columns (Iceberg + Spark 4.0) [datafusion-comet]

via GitHub Tue, 12 May 2026 04:17:21 -0700


Shekharrajak opened a new issue, #4295:
URL: https://github.com/apache/datafusion-comet/issues/4295


   ### What is the problem the feature request solves?
   
   Today, Comet falls back to JVM Spark for any query that touches a VariantType column. The fallback removes Comet's acceleration for the entire query, including unrelated operators that could otherwise run natively. This issue tracks adding native execution for Variant column scans, for both plain Parquet and Iceberg tables, by consuming new Variant primitives from arrow-rs and iceberg-rust.
   
   ### Describe the potential solution
   
   
   ```
   spark/src/main/scala/org/apache/comet/rules/CometScanRule.scala (line 729):
   
   case s: StructType if isVariantStruct(s) =>
     fallbackReasons +=
       s"Unsupported $name of type VariantType (shredded; not supported by $scanImpl scan)"
     false
   ```
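
To make the effect of that match arm concrete, here is a minimal, self-contained sketch of how a recursive type check of this shape could reject a schema and record why. It is not Comet's code: the `DataType` stand-ins, the `VariantFallbackSketch` object, and the `isVariantStruct` heuristic (a struct carrying binary `metadata` and `value` fields, as in the Parquet Variant layout) are all assumptions made for illustration, and the real `isVariantStruct` in `CometScanRule.scala` may detect variants differently.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-ins for Spark's DataType hierarchy, so the sketch runs without Spark.
sealed trait DataType
case object IntType extends DataType
case object BinaryType extends DataType
final case class Field(name: String, dataType: DataType)
final case class StructType(fields: Seq[Field]) extends DataType

object VariantFallbackSketch {
  // Assumption: a variant column surfaces as a struct with binary `metadata` and `value` fields.
  def isVariantStruct(s: StructType): Boolean = {
    val binaryNames = s.fields.collect { case Field(n, BinaryType) => n }.toSet
    binaryNames("metadata") && binaryNames("value")
  }

  // Returns false if any (nested) field is a variant struct, appending a reason for each one found.
  def isTypeSupported(
      dt: DataType,
      name: String,
      scanImpl: String,
      fallbackReasons: ListBuffer[String]): Boolean =
    dt match {
      case s: StructType if isVariantStruct(s) =>
        fallbackReasons +=
          s"Unsupported $name of type VariantType (shredded; not supported by $scanImpl scan)"
        false
      case s: StructType =>
        // map before forall so every field is visited and every reason is collected
        s.fields.map(f => isTypeSupported(f.dataType, f.name, scanImpl, fallbackReasons)).forall(identity)
      case _ => true
    }
}
```

In this sketch a single variant field makes the whole schema unsupported, which mirrors the behavior described in the problem statement: one Variant column is enough to send the scan back to JVM Spark.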
   
   
   
   ### Additional context
   
   Follow-ups
   
   •  Comet support for parse_json / to_variant (write side)
   •  Comet support for schema_of_variant (metadata operation)
   •  Predicate pushdown on variant_get(...) = literal into Parquet's column index
   •  Variant in shuffle / spill paths (separate issue)



