Re: Iceberg table format

Ryan Blue Thu, 04 Jan 2018 11:15:06 -0800

I've also created a google group for discussion:
[email protected]


You can join the group here:
https://groups.google.com/forum/#!forum/iceberg-devel

rb

On Thu, Jan 4, 2018 at 11:03 AM, Ryan Blue <[email protected]> wrote:

> Just made it public a minute ago. The repo is here:
> https://github.com/Netflix/iceberg
>
> It's built with Gradle and requires Spark 2.3.0-SNAPSHOT (for DataSource
> V2) and Parquet 1.9.1-SNAPSHOT (for API additions and bug fixes).
>
> An early version of the spec is available for comments here:
> https://docs.google.com/document/d/1Q-zL5lSCle6NEEdyfiYsXYzX_Q8Qf0ctMyGBKslOswA/edit?usp=sharing
>
> Feedback is definitely welcome.
>
> rb
>
> On Wed, Jan 3, 2018 at 6:28 PM, Julien Le Dem <[email protected]>
> wrote:
>
>> Happy new year!
>> I'm interested as well.
>> Did you get to publish your code on github?
>> Thanks
>>
>> On Fri, Dec 8, 2017 at 8:42 AM, Ryan Blue <[email protected]>
>> wrote:
>>
>>> I'm working on getting the code out to our open source GitHub org,
>>> probably early next week. I'll set up a mailing list for it as well.
>>>
>>> rb
>>>
>>> On Thu, Dec 7, 2017 at 6:38 PM, Jacques Nadeau <[email protected]>
>>> wrote:
>>>
>>> > Sounds super interesting. Would love to collaborate on this. Do you
>>> > have a repo or mailing list where you are working on this?
>>> >
>>> >
>>> >
>>> > On Wed, Dec 6, 2017 at 4:20 PM, Ryan Blue <[email protected]>
>>> > wrote:
>>> >
>>> >> Hi everyone,
>>> >>
>>> >> I mentioned in the sync-up this morning that I’d send out an
>>> >> introduction to the table format we’re working on, which we’re
>>> >> calling Iceberg.
>>> >>
>>> >> For anyone who wasn’t around, here’s the background: there are several
>>> >> problems with how we currently manage the data files that make up a
>>> >> table in the Hadoop ecosystem. The one that came up today was that you
>>> >> can’t actually update a table atomically to, for example, rewrite a
>>> >> file and safely delete records. That’s because Hive tables track what
>>> >> files are currently visible by listing partition directories, and we
>>> >> don’t have (or want) transactions for changes in Hadoop file systems.
>>> >> This means that you can’t actually have isolated commits to a table,
>>> >> and the result is that *query results from Hive tables can be wrong*,
>>> >> though rarely in practice.
>>> >>
>>> >> The problems with current tables are caused primarily by keeping state
>>> >> about what files are in or not in a table in the file system. As I
>>> >> said, one problem is that there are no transactions, but you also have
>>> >> to list directories to plan jobs (bad on S3) and rename files from a
>>> >> temporary location to a final location (really, really bad on S3).
>>> >>
>>> >> To avoid these problems, we’ve been building the Iceberg format, which
>>> >> tracks every file in a table instead of tracking directories. Iceberg
>>> >> maintains snapshots of all the files in a dataset and atomically swaps
>>> >> snapshots and other metadata to commit. There are a few benefits to
>>> >> doing it this way:
>>> >>
>>> >>    - *Snapshot isolation*: Readers always use a consistent snapshot of
>>> >>    the table, without needing to hold a lock. All updates are atomic.
>>> >>    - *O(1) RPCs to plan*: Instead of listing O(n) directories in a table
>>> >>    to plan a job, reading a snapshot requires O(1) RPC calls.
>>> >>    - *Distributed planning*: File pruning and predicate push-down are
>>> >>    distributed to jobs, removing the metastore bottleneck.
>>> >>    - *Version history and rollback*: Table snapshots are kept around, so
>>> >>    it is possible to roll back if a job has a bug and commits.
>>> >>    - *Finer-granularity partitioning*: Distributed planning and O(1) RPC
>>> >>    calls remove the current barriers to finer-grained partitioning.
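
To make the snapshot idea above concrete, here is a minimal in-memory sketch in Python. It is not the Iceberg library or its API: the `Table` and `Snapshot` classes, their method names, and the use of a `threading.Lock` to stand in for an atomic metadata swap are all assumptions made for illustration. It shows a reader keeping a consistent view while a writer commits, a stale writer being rejected, and a rollback to an earlier snapshot.

```python
import threading
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Snapshot:
    """Immutable record of every data file visible in one table version."""
    snapshot_id: int
    files: FrozenSet[str]


class Table:
    """Hypothetical in-memory table; an illustration, not the Iceberg API."""

    def __init__(self) -> None:
        # The lock stands in for whatever atomic swap the metadata store offers.
        self._lock = threading.Lock()
        self._snapshots = {0: Snapshot(0, frozenset())}
        self._current_id = 0

    def current(self) -> Snapshot:
        # Readers take one immutable snapshot and never see a partial commit.
        return self._snapshots[self._current_id]

    def commit(self, base: Snapshot, added: Iterable[str] = (),
               removed: Iterable[str] = ()) -> Optional[Snapshot]:
        files = (base.files - frozenset(removed)) | frozenset(added)
        with self._lock:
            if self._current_id != base.snapshot_id:
                return None  # someone else committed first; caller must retry
            new = Snapshot(max(self._snapshots) + 1, files)
            self._snapshots[new.snapshot_id] = new
            self._current_id = new.snapshot_id  # the atomic "swap"
            return new

    def rollback(self, snapshot_id: int) -> Snapshot:
        # Old snapshots are kept, so rollback just re-points the current ID.
        with self._lock:
            if snapshot_id not in self._snapshots:
                raise KeyError(snapshot_id)
            self._current_id = snapshot_id
            return self._snapshots[snapshot_id]


table = Table()
s1 = table.commit(table.current(), added=["a.parquet", "b.parquet"])
reader_view = table.current()            # a reader plans against snapshot 1
s2 = table.commit(s1, added=["c.parquet"], removed=["a.parquet"])
stale = table.commit(s1, added=["d.parquet"])  # based on an outdated snapshot
```

In this sketch, planning a read is a single lookup of the current snapshot rather than a directory listing, and `reader_view` is unaffected by the later commit. A real implementation would persist snapshots and swap a metadata pointer instead of holding a process-local lock.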
>>> >>
>>> >> We’re also taking this opportunity to fix a few other problems:
>>> >>
>>> >>    - Schema evolution: columns are tracked by ID to support
>>> >>    add/drop/rename
>>> >>    - Types: a core set of types, thoroughly tested to work consistently
>>> >>    across all of the supported data formats
>>> >>    - Metrics: cost-based optimization metrics are kept in the snapshots
>>> >>    - Portable spec: tables should not be tied to Java and should have a
>>> >>    simple and clear specification for other implementers
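
As an illustration of why tracking columns by ID supports add/drop/rename, here is a small Python sketch. The `Column`, `Schema`, and `project` names are hypothetical and are not taken from the Iceberg code; the sketch also assumes stored rows are keyed by column ID and that IDs are never reused.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Column:
    field_id: int  # stable identity, never reused
    name: str      # display name, free to change


class Schema:
    """Hypothetical schema that resolves columns by ID rather than by name."""

    def __init__(self, columns: Iterable[Column],
                 last_id: Optional[int] = None) -> None:
        self.columns: Tuple[Column, ...] = tuple(columns)
        highest = max((c.field_id for c in self.columns), default=0)
        # Remember the highest ID ever assigned so dropped IDs are not reused.
        self.last_id = highest if last_id is None else last_id

    def add(self, name: str) -> "Schema":
        new_id = self.last_id + 1
        return Schema(self.columns + (Column(new_id, name),), new_id)

    def drop(self, name: str) -> "Schema":
        kept = tuple(c for c in self.columns if c.name != name)
        return Schema(kept, self.last_id)

    def rename(self, old: str, new: str) -> "Schema":
        renamed = tuple(Column(c.field_id, new) if c.name == old else c
                        for c in self.columns)
        return Schema(renamed, self.last_id)


def project(row_by_id: Dict[int, object], schema: Schema) -> Dict[str, object]:
    """Read a stored row (keyed by column ID) through the given schema."""
    return {c.name: row_by_id.get(c.field_id) for c in schema.columns}


v1 = Schema([Column(1, "id"), Column(2, "ts")])
old_row = {1: 7, 2: "2017-12-06"}  # written under v1

# Rename "ts", drop "id", then add a brand-new column that reuses the name "id".
v2 = v1.rename("ts", "event_time").drop("id").add("id")
result = project(old_row, v2)
```

Because the re-added `id` column gets a fresh ID (3), old data written under ID 1 is not resurrected: `result` maps `event_time` to the old `ts` value and the new `id` to `None`. Name-based resolution would have silently picked up the dropped column's data.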
>>> >>
>>> >> We have the core library to track files done, along with most of a
>>> >> specification, and a Spark datasource (v2) that can read Iceberg
>>> >> tables. I’ll be working on the write path next and we plan to build a
>>> >> Presto implementation soon.
>>> >>
>>> >> I think this should be useful to others and it would be great to
>>> >> collaborate with anyone that is interested.
>>> >>
>>> >> rb
>>> >> 
>>> >> --
>>> >> Ryan Blue
>>> >> Software Engineer
>>> >> Netflix
>>> >>
>>> >
>>> >
>>>
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Netflix
>>>
>>
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>



-- 
Ryan Blue
Software Engineer
Netflix
