[ 
https://issues.apache.org/jira/browse/NIFI-12130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Bathori reassigned NIFI-12130:
-----------------------------------

    Assignee: Mark Bathori

> PutIceberg: Ability to configure snapshot properties via dynamic attributes
> ---------------------------------------------------------------------------
>
>                 Key: NIFI-12130
>                 URL: https://issues.apache.org/jira/browse/NIFI-12130
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>            Reporter: William Dyson
>            Assignee: Mark Bathori
>            Priority: Minor
>              Labels: iceberg
>
> *Motivation*
> Iceberg's Spark integration lets users add snapshot properties when writing 
> data to an Iceberg table by passing write options prefixed with 
> "snapshot-property.", like so:
> {{df.write}}
> {{  .option("write-format", "avro")}}
> {{  .option("snapshot-property.key", "value")}}
> {{  .insertInto("catalog.db.table") }}
> [https://iceberg.apache.org/docs/latest/spark-configuration/#write-options]
> These properties can be used to add context to Iceberg snapshots and help 
> users locate snapshots in recovery scenarios.
> In fact, Spark automatically adds the application ID as {_}spark.app.id{_}.
> Examples of when these properties might be useful include:
>  * Recording the data source used to produce the new records
>  * Recording the UUID of the flow file used to update the table, so the 
> snapshot can be matched to NiFi provenance
> They can be queried from Iceberg's snapshots metadata table.
> *Feature request*
> It would be great if we could configure PutIceberg to add these properties in 
> a similar fashion (e.g. using dynamic properties of the form 
> snapshot-property.*). Continuing with the comparison to Spark, it may also be 
> worth automatically adding the flowfile UUID as something like 
> {_}nifi.flowfile.id{_}.
> *Further details*
> I'm not entirely clued up on the Iceberg API, but it looks like these are set 
> on the SnapshotUpdate (AppendFiles inherits from this class):
> [https://iceberg.apache.org/javadoc/master/org/apache/iceberg/SnapshotUpdate.html]
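
A minimal sketch of the prefix handling the request describes, assuming the processor's dynamic properties are available as a plain map. The class name, the helper name, and the sample values are hypothetical and are not part of NiFi or Iceberg; only the "snapshot-property." prefix comes from the issue text. The sketch uses the JDK alone, and the Iceberg call is mentioned only in a comment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SnapshotPropertySketch {
    // Prefix taken from the issue text; everything else here is illustrative.
    static final String PREFIX = "snapshot-property.";

    /** Returns the entries whose key starts with PREFIX, with the prefix stripped. */
    static Map<String, String> extractSnapshotProperties(Map<String, String> dynamicProperties) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : dynamicProperties.entrySet()) {
            String key = entry.getKey();
            // Skip unrelated properties and a bare prefix with no key after it.
            if (key.startsWith(PREFIX) && key.length() > PREFIX.length()) {
                result.put(key.substring(PREFIX.length()), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> dynamicProperties = new LinkedHashMap<>();
        dynamicProperties.put("snapshot-property.source", "orders-feed");
        dynamicProperties.put("write-format", "avro");

        Map<String, String> snapshotProperties = extractSnapshotProperties(dynamicProperties);

        // With Iceberg on the classpath, each entry could then be applied before
        // commit through SnapshotUpdate.set(property, value), for example on the
        // AppendFiles instance. That step is left out to keep this self-contained.
        System.out.println(snapshotProperties);
    }
}
```

Stripping the prefix before storing the key would mirror the Spark write option shown in the description, where "snapshot-property.key" supplies a snapshot property named "key".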



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
