This is great. Thanks!

-d

On Fri, Sep 13, 2019 at 2:37 PM Ryan Blue <[email protected]> wrote:

> Okay, thanks for explaining. I understand now.
>
> The Hadoop table implementation is the only place where rename is used,
> and it requires a file system that supports atomic rename. If you're using
> an object store like S3 or GCS, then you should be using the HMS
> implementation or a custom catalog instead of Hadoop tables.
>
> The difference between these is how Iceberg keeps track of the current
> root metadata file. HMS tables store the metadata location as a table
> property of a table in the Hive MetaStore, and use the table locking API to
> coordinate updates. If you're using the Hive MetaStore, then this should
> work out of the box.
>
> If you are using an alternative metastore, then you just need to implement
> a custom catalog that handles the atomic swap from one metadata location to
> another. Mouli just added a guide for doing this here (thanks!):
> http://iceberg.apache.org/custom-catalog/
>
> That's where you'd plug in your preferred method for making an atomic
> update. That could be locking with ZooKeeper, using a database transaction,
> or some other method. You just need to provide a way to atomically swap
> metadata file location strings, and a way to get the current location.
>
> I hope that helps! In the end it should be easier, since the API for
> plugging in already exists.
> --
> Ryan Blue
> Software Engineer
> Netflix
>
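
For anyone finding this thread later, here is a rough sketch of the two operations Ryan describes: a way to get the current metadata location, and a way to atomically swap it. This is not the Iceberg API. The class name, method names, and file paths are all made up for illustration, and an in-process lock stands in for whatever the real metastore would provide (a ZooKeeper lock, a database transaction, and so on).

```python
import threading


class MetadataPointer:
    """Hypothetical stand-in for a metastore entry that holds one table's
    current root metadata file location."""

    def __init__(self, location=None):
        self._location = location
        # Assumption: a local lock plays the role of the external
        # coordination mechanism (ZooKeeper, database transaction, ...).
        self._lock = threading.Lock()

    def current(self):
        """Return the current metadata file location string."""
        with self._lock:
            return self._location

    def swap(self, expected, new):
        """Replace the location with `new` only if it still equals `expected`.

        Returns True when the swap happened, False when another writer
        committed first and the caller must refresh and retry.
        """
        with self._lock:
            if self._location != expected:
                return False
            self._location = new
            return True


# Illustrative paths only.
v1 = "s3://bucket/db/tbl/metadata/v1.metadata.json"
v2 = "s3://bucket/db/tbl/metadata/v2.metadata.json"
v3 = "s3://bucket/db/tbl/metadata/v3.metadata.json"

pointer = MetadataPointer(v1)
first_commit = pointer.swap(v1, v2)   # writer A, based on v1
stale_commit = pointer.swap(v1, v3)   # writer B, also based on v1
```

Writer B's commit is rejected because the pointer no longer matches the location it started from, so B would re-read `current()` and retry on top of v2. The check and the update have to happen as one step. On an object store there is no atomic rename to provide that, so the catalog has to.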
