qlong opened a new issue, #16448:
URL: https://github.com/apache/iceberg/issues/16448
### Feature Request / Improvement
Iceberg can write shredded variant columns to Parquet (#14297). On the read
path,
`SparkScanBuilder` does not implement Spark 4.1's
`SupportsPushDownVariantExtractions`,
so Spark never rewrites `variant_get(...)` into struct field accesses and
never prunes the
scan output schema to the requested shredded fields.
As a result, queries that only need one or two paths (e.g.
`variant_get(payload, '$.size', 'long')`)
still read the full shredded Parquet layout for each variant column
(hundreds of
`typed_value.*` columns in practice), materialize a full `VARIANT` value per
row, and evaluate
`variant_get` in a Spark `Filter` above `BatchScan`. That reconstruction is
expensive: on the
GitHub Activities 1-day shredded Iceberg table (GHA), a simple filter/count
query was ~14×
slower than the same workload on a plain JSON string column (~63s vs ~4.4s),
despite only
~1.6× more storage.
This issue implements the DSv2 contract (plan rewrite). A follow-on change
will wire the annotated readSchema() into the Parquet reader and avoid
full-variant reconstruction (I/O reduction).
**Plan rewrite example**
For this query:
```sql
CREATE TABLE events (
id INT,
type STRING,
payload VARIANT
) USING iceberg
TBLPROPERTIES ('format-version' = '3');
SELECT count(*) AS large_events
FROM events
WHERE type = 'PushEvent'
AND variant_get(payload, '$.size', 'long') > 5;
```
**Before (today — no SupportsPushDownVariantExtractions)**
```
Aggregate [count(1)]
+- Filter (... type = PushEvent)
AND (variant_get(payload#22, $.size, LongType, ...) > 5)) ←
still a function call
+- RelationV2[type#19, payload#22]
↑
payload is VariantType (full variant)
Filter (variant_get(payload#22, $.size, ...) > 5) ← runs per row AFTER
scan
+- BatchScan [type#15, payload#18]
IcebergScan(..., filters=type IS NOT NULL, payload IS NOT NULL, type
= 'PushEvent')
ReadSchema: full payload variant / all shredded columns
```
**After (DSv2 plan rewrite)**
```
Aggregate [count(1)]
+- Filter (... type = PushEvent)
AND (payload#25.0 > 5)) ← struct field
access, not variant_get
+- RelationV2[type#24, payload#25]
↑
payload is StructType { "0": LongType [VariantMetadata
path=$.size] }
Filter (isnotnull(payload#25) AND (payload#25.0 > 5)) ← compares long
field directly
+- BatchScan [type#24, payload#25]
readSchema(): struct<payload: struct<0: bigint, ...>>
```
**Note**
This issue changes the logical/physical plan shape and readSchema()
contract. Parquet may still read all shredded columns until the follow-on
wires readSchema() into the reader.
### Query engine
Spark
### Willingness to contribute
- [x] I can contribute this improvement/feature independently
- [x] I would be willing to contribute this improvement/feature with
guidance from the Iceberg community
- [ ] I cannot contribute this improvement/feature at this time
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]