cccs-jc commented on issue #3417: URL: https://github.com/apache/iceberg/issues/3417#issuecomment-1664323373
@Fokko We have encountered the same issue. We run a process that appends data files to an Iceberg table, as described in this article: https://medium.com/towards-data-science/leveraging-azure-event-grid-to-create-a-java-iceberg-table-d419da06dbc6 This creates a lot of commits (one per minute), so we run maintenance jobs every hour. These jobs age off the data, expire snapshots, and rewrite manifests.

After running for a few hours, we run into the same issue as above. We have a new metadata.json file (say v1001.metadata.json). This file points to many snapshots, but one of them, the newly added one, is not on disk. Version file v1000.metadata.json is okay. Also of note: the version-hint.text file points to v1000.metadata.json. We are using the Hadoop catalog. We have no idea how the table can get into this state.
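
For anyone who wants to check whether a table is in the state described above, here is a minimal Python sketch. It is not part of Iceberg. It assumes the metadata directory is readable as a local path, that metadata files are uncompressed and named `v<N>.metadata.json`, and that each snapshot entry carries a `manifest-list` location. The function names, the `exists` callback, and the `hint_name` default are hypothetical choices for illustration.

```python
import json
import os
import re
from pathlib import Path
from typing import Callable, Dict, List, Tuple


def missing_manifest_lists(metadata: dict, exists: Callable[[str], bool]) -> List[int]:
    """Return the ids of snapshots whose manifest-list file cannot be found."""
    missing = []
    for snap in metadata.get("snapshots", []):
        location = snap.get("manifest-list")
        # Snapshots without a manifest-list entry are skipped, not reported.
        if location is not None and not exists(location):
            missing.append(snap["snapshot-id"])
    return missing


def check_table(
    metadata_dir: Path,
    exists: Callable[[str], bool] = os.path.exists,
    hint_name: str = "version-hint.text",  # assumption: name of the hint file
) -> Tuple[int, Dict[int, List[int]]]:
    """Check every metadata version at or after the hinted one.

    Returns the hinted version and a mapping of
    version -> snapshot ids whose manifest list is missing.
    """
    hinted = int((metadata_dir / hint_name).read_text().strip())
    report: Dict[int, List[int]] = {}
    for path in metadata_dir.iterdir():
        match = re.fullmatch(r"v(\d+)\.metadata\.json", path.name)
        if match is None:
            continue  # not an uncompressed metadata file
        version = int(match.group(1))
        if version >= hinted:
            metadata = json.loads(path.read_text())
            report[version] = missing_manifest_lists(metadata, exists)
    return hinted, report
```

Calling `check_table(Path("/path/to/table/metadata"))` on a table in this state would be expected to list the newly added snapshot under version 1001 and nothing under version 1000. For tables on object storage, pass an `exists` callback that understands the storage URIs recorded in the metadata, since `os.path.exists` only handles local paths.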
