Hi Iceberg devs,

I was thinking about a general use case where data arrives with a partial
schema relative to that of the dataset. With that dataset backed by an
Iceberg table, we need to make sure that before each file is committed we
at least add the new columns to the Iceberg schema.

I was looking through
https://github.com/apache/incubator-iceberg/blob/master/spark/src/main/java/org/apache/iceberg/spark/SparkSchemaUtil.java
and couldn't find a matching utility method.
Do you think it would make sense to have such a utility method, i.e. one
that detects the new (nested) fields in the schema of an inbound file by
comparing it against the table's current schema, and generates the
corresponding schema update commit?
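To illustrate what I mean, here is a rough, self-contained sketch of the
field-diff step. It is not based on any existing Iceberg API: schemas are
modeled as plain nested maps (a String for a primitive type, a Map for a
struct) instead of real Iceberg Schema/Types.NestedField objects, and the
class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: collect dotted paths of fields that are present in
// the inbound file's schema but missing from the table's current schema.
public class SchemaDiff {

    // A "type" is either a String (primitive) or a Map<String, Object> (struct).
    static List<String> newFields(Map<String, Object> current, Map<String, Object> inbound) {
        List<String> added = new ArrayList<>();
        collect("", current, inbound, added);
        return added;
    }

    @SuppressWarnings("unchecked")
    private static void collect(String prefix, Map<String, Object> current,
                                Map<String, Object> inbound, List<String> added) {
        for (Map.Entry<String, Object> e : inbound.entrySet()) {
            String path = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            Object cur = current.get(e.getKey());
            if (cur == null) {
                added.add(path);  // entirely new field
            } else if (cur instanceof Map && e.getValue() instanceof Map) {
                // both sides are structs: recurse to find new nested fields
                collect(path, (Map<String, Object>) cur,
                        (Map<String, Object>) e.getValue(), added);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> current = new LinkedHashMap<>();
        current.put("id", "long");
        Map<String, Object> addr = new LinkedHashMap<>();
        addr.put("city", "string");
        current.put("address", addr);

        Map<String, Object> inbound = new LinkedHashMap<>();
        inbound.put("id", "long");
        Map<String, Object> addr2 = new LinkedHashMap<>();
        addr2.put("city", "string");
        addr2.put("zip", "string");
        inbound.put("address", addr2);
        inbound.put("email", "string");

        // prints [address.zip, email]
        System.out.println(newFields(current, inbound));
    }
}
```

The utility could then feed each detected path into the table's
UpdateSchema (table.updateSchema().addColumn(...).commit()) before
appending the file.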

-- 
/Filip
