rdblue commented on issue #1286: URL: https://github.com/apache/iceberg/issues/1286#issuecomment-670972927
> looks like applying the changes into current base snapshot is executed per retrial

Yes, but work that has already been done is reused. If a table has 2 manifests, A and B, when a rewrite starts and another commit adds manifest C, then the retry won't need to filter A and B a second time. It will only filter C, and it will use metadata to determine if C needs to be rewritten. By reusing this work, reattempts should only require writing the manifest list and root metadata file.

> Probably the cheapest approach would be allowing reorder of snapshots

Reordering snapshots is not allowed because it changes history. A reattempt should be a cheap operation, we just need the logs to know why it isn't in this case.

> Alternatively, we can list up manifests written from snapshots after base snapshots . . .

I don't quite understand what you're suggesting here, but it sounds very similar to what is already done. Most of the time, an append results in a new manifest added, so the situation I described above is how the commit is reattempted.

A couple things can go wrong:

1. Manifests are compacted: A, B, and C might be rewritten into D. If that happens, then D must be scanned and rewritten because A and B had to be. That takes time, which could result in the retry failing. The next retry would probably get manifests D and E, and the original situation (D is already done, E doesn't match) would apply for a quick retry.
2. New manifests must be scanned: If the metadata for C shows that it might contain files that were rewritten, then the commit must scan C to check for them. The metadata that we use is the range of partitions in a manifest. So if you're partitioning by hour, for example, then a compaction that rewrites data in the hour currently being written will need to scan all new manifests each retry.

The most likely situation is the second one, but logs that show what files are getting created will tell us what is happening.
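
To make the reuse described above concrete, here is a minimal Python sketch of the retry behavior. It is not Iceberg's implementation: `Manifest`, `RewriteAttempt`, and every field and method name below are hypothetical, and partitions are modeled as plain integers (for example, hour ordinals). It only illustrates two ideas from the explanation: results for manifests that were already filtered are cached across retries, and a new manifest is scanned only when its partition range overlaps the rewritten partitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Manifest:
    # Hypothetical stand-in for a manifest: a name, the partition range it
    # covers (the metadata used to skip scans), and the data files it tracks.
    name: str
    min_partition: int
    max_partition: int
    files: tuple = ()


class RewriteAttempt:
    """Hypothetical rewrite operation that survives across commit retries."""

    def __init__(self, rewritten_files, rewritten_partitions):
        self.rewritten_files = set(rewritten_files)
        self.lo, self.hi = rewritten_partitions
        self.cache = {}    # manifest name -> already filtered result
        self.scanned = []  # names of manifests that had to be opened

    def might_contain_rewritten(self, manifest):
        # Range overlap check on manifest metadata only; no file scan.
        return manifest.min_partition <= self.hi and manifest.max_partition >= self.lo

    def filter_manifest(self, manifest):
        if manifest.name in self.cache:
            return self.cache[manifest.name]  # work from an earlier attempt
        if not self.might_contain_rewritten(manifest):
            result = manifest                 # metadata rules it out
        else:
            self.scanned.append(manifest.name)
            kept = tuple(f for f in manifest.files if f not in self.rewritten_files)
            if kept == manifest.files:
                result = manifest             # scanned, nothing to remove
            else:
                result = Manifest(manifest.name + "-filtered",
                                  manifest.min_partition,
                                  manifest.max_partition, kept)
        self.cache[manifest.name] = result
        return result

    def apply(self, base_manifests):
        # Called once per attempt against the current base snapshot.
        return [self.filter_manifest(m) for m in base_manifests]


# First attempt sees A and B; both overlap the rewritten hour and are scanned.
a = Manifest("A", 10, 10, ("f1", "f2"))
b = Manifest("B", 10, 10, ("f3",))
attempt = RewriteAttempt({"f1", "f3"}, (10, 10))
attempt.apply([a, b])

# Retry after an append added C for a different hour: nothing new is scanned.
c = Manifest("C", 11, 11, ("f4",))
retry = attempt.apply([a, b, c])
```

In this model, a compacted manifest D that replaces A, B, and C has a new name and an overlapping range, so it misses the cache and must be scanned (case 1). A new manifest whose range covers the rewritten hour is scanned on the retry that first sees it (case 2).
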
I think we should find out what is taking so long in the reattempt in order to plan how to fix it.
