pan3793 opened a new issue #2231:
URL: https://github.com/apache/iceberg/issues/2231


   Currently, we build ETL workflows on Hive tables. To version-control the 
data, we add a top-level partition named `ts` to every table and assign `ts` a 
specific value when the workflow is triggered. We can then get the expected 
version of the data in all tables with a single `ts` value.
   
   I know I can use the auto-generated `snapshot_id` to fetch a specific 
version of the data in an Iceberg table. However, this means that to identify 
the snapshots of all tables involved in a workflow, I need to persist each 
table's `snapshot_id` whenever the table is updated so that I can use it later.
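
   To illustrate the bookkeeping this implies, here is a minimal sketch in plain 
Python. It does not call any Iceberg API: `SnapshotRegistry`, `record`, and 
`resolve` are hypothetical names, the table names are invented, and the snapshot 
ids are made-up values standing in for whatever each table reports after a commit.

```python
from collections import defaultdict
from typing import Dict


class SnapshotRegistry:
    """Hypothetical helper: maps a workflow-level `ts` to each table's snapshot_id."""

    def __init__(self) -> None:
        # ts -> {table name -> snapshot_id}
        self._by_ts: Dict[str, Dict[str, int]] = defaultdict(dict)

    def record(self, ts: str, table: str, snapshot_id: int) -> None:
        # Called after every table update in the workflow run identified by `ts`.
        self._by_ts[ts][table] = snapshot_id

    def resolve(self, ts: str) -> Dict[str, int]:
        # Returns a copy of the snapshot ids recorded for one workflow run.
        if ts not in self._by_ts:
            raise KeyError(f"no snapshots recorded for ts={ts!r}")
        return dict(self._by_ts[ts])


# Made-up values for illustration only.
registry = SnapshotRegistry()
registry.record("2021-02-10", "db.orders", 1001)
registry.record("2021-02-10", "db.customers", 2001)
registry.record("2021-02-11", "db.orders", 1002)

snapshots = registry.resolve("2021-02-10")
```

   Every workflow would have to maintain such a mapping outside the tables, 
which is the extra step a user-assigned name would avoid.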
   
   Is there a way to assign a snapshot a `snapshot_name` in addition to its 
`snapshot_id`, so that we can conveniently track the snapshots of related 
tables, such as those generated by the same workflow?

