Thanks for sharing those observations. They are very pertinent.

On Mon, Jun 3, 2019 at 5:19 PM Ryan Blue <rb...@netflix.com> wrote:
> Repeated conflicts are something that we keep an eye on in our
> infrastructure. We have streaming tables that are written to every 10
> minutes from multiple regions, commits to move the files back to a single
> region, and compaction all happening at the same time. We don't really see
> a significant problem with several writers. The manifest list files are
> generally small enough that it's okay. Definitely better than keeping all
> that information in the root metadata file.
>
> On Mon, Jun 3, 2019 at 2:13 PM Erik Wright <erik.wri...@shopify.com>
> wrote:
>
>> Thanks for the response, Ryan. I can certainly see the benefits of
>> manifest files. I can see that with potentially long lists of valid
>> snapshots, each having long lists of manifest files, the mere process of
>> committing a new snapshot could, itself, become costly and increase the
>> likelihood of commit conflicts.
>>
>> I gather that the potential for repeated commit conflicts due to the cost
>> of rewriting the manifest list file after each failed attempt is not
>> something that has really materialized yet.
>>
>> On Mon, Jun 3, 2019 at 4:50 PM Ryan Blue <rb...@netflix.com.invalid>
>> wrote:
>>
>>> Hi Erik,
>>>
>>> Manifest lists serve two purposes:
>>>
>>> 1. Reduce the amount of data tracked by the root metadata file
>>> 2. Provide a rough index over manifest files to cut down on planning
>>> time
>>>
>>> Manifests are reused to cut down on the amount of work required in a
>>> commit, but by doing this we end up with a large number of manifests. That
>>> list gets expensive if it is added to the root metadata, which includes all
>>> valid snapshots. So moving that list to its own file allows Iceberg to
>>> avoid reading the list unless it is used, and to avoid re-writing the list
>>> for every valid snapshot.
>>>
>>> As long as the list is written to its own file, we may as well write
>>> metadata about partitions in each manifest so that we can skip manifests
>>> that don’t match a query. That’s where the rough index comes from, and it
>>> really does speed up queries. In fact, we have a new PR out to rewrite
>>> manifests to take advantage of this:
>>> https://github.com/apache/incubator-iceberg/pull/200/files
>>>
>>> Does that answer your question?
>>>
>>> On Mon, Jun 3, 2019 at 1:38 PM Erik Wright
>>> <erik.wri...@shopify.com.invalid> wrote:
>>>
>>>> In the process of following up on the "Updates/Deletes/Upserts" thread,
>>>> I'm re-reading the table spec. I have a question about Manifest List files.
>>>>
>>>> If I understand correctly, the manifest list files are separate files
>>>> that are created prior to attempting to commit a new snapshot. Each
>>>> snapshot may have a single manifest list file. The manifest list file
>>>> references _all_ manifest files included in the snapshot.
>>>>
>>>> During a commit collision, two writers will produce new manifest list
>>>> files. Assuming the two writes are compatible (one is append, one is
>>>> replace, for example) the loser should be able to re-process their commit
>>>> without rewriting any data files but will, nonetheless, need to rewrite
>>>> their manifest list file in addition to rewriting their snapshot file.
>>>>
>>>> I was under the impression that it was a design objective to minimize
>>>> the amount of work required in order to retry a commit. The inability to
>>>> compose multiple manifest list files together seems like it adds mandatory
>>>> read and write steps to almost every commit collision.
>>>>
>>>> Can someone clarify what the philosophy is with regard to minimizing
>>>> the cost of commit retries?
>>>>
>>>> Thanks!
>>>>
>>>> -Erik
>>>>
>>>
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Netflix
>>>
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>
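
For anyone following the thread, here is a rough Python sketch of the two ideas discussed above: using per-manifest partition summaries in the manifest list as a rough index, and retrying a commit by rewriting only the manifest list while reusing the manifests that were already written. This is a toy model and not Iceberg's API. Every name in it (ManifestEntry, Snapshot, Table, compare_and_swap, plan, commit_append, and the min_day/max_day summary fields) is hypothetical, and the in-memory compare-and-swap stands in for whatever atomic swap of the root metadata a real table would use.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ManifestEntry:
    """One row of a hypothetical manifest list: a manifest path plus a partition summary."""
    path: str
    min_day: int  # lowest partition value covered by this manifest (assumed field)
    max_day: int  # highest partition value covered by this manifest (assumed field)


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    manifest_list: List[ManifestEntry]


def plan(snapshot: Snapshot, lo: int, hi: int) -> List[ManifestEntry]:
    """Rough index: keep only manifests whose partition range overlaps [lo, hi]."""
    return [m for m in snapshot.manifest_list if m.max_day >= lo and m.min_day <= hi]


class Table:
    """Toy stand-in for the root metadata pointer, with a compare-and-swap commit."""

    def __init__(self) -> None:
        self.current = Snapshot(0, [])

    def compare_and_swap(self, expected_id: int, new: Snapshot) -> bool:
        # The swap succeeds only if nobody else committed since we read the base.
        if self.current.snapshot_id != expected_id:
            return False
        self.current = new
        return True


def commit_append(table: Table, new_manifest: ManifestEntry, max_retries: int = 5) -> Snapshot:
    """Append one already-written manifest, retrying on conflict.

    On each attempt only the manifest list is rebuilt on top of the latest
    snapshot; the manifest itself (and the data files it tracks) is reused.
    """
    for _ in range(max_retries):
        base = table.current
        candidate = Snapshot(base.snapshot_id + 1, base.manifest_list + [new_manifest])
        if table.compare_and_swap(base.snapshot_id, candidate):
            return candidate
    raise RuntimeError("commit failed after retries")
```

In this model the cost of a retry is proportional to the length of the manifest list rather than to the number of data files, which is the trade-off the thread is discussing.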