Hi guys,

Let me ping again on this thread ;)

I think it would be great to give some visibility to the community, especially about Spec v3 and Iceberg 2.0.0. Any comments about Spec V2 / Iceberg 2.0.0 ?

Thanks !

Regards
JB

On Fri, Feb 16, 2024 at 4:52 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>
> Hi guys,
>
> During the last community meeting, we started to quickly discuss Iceberg 2.0.
> I was quite surprised it came up during the community meeting because I
> don't remember having a previous discussion (on the mailing list)
> about that.
>
> So, I would like to start an open discussion about our
> community driven roadmap.
>
> I see the following topics that should be discussed (maybe as proposed
> by Brian we can have corresponding GitHub issues tagged with
> "discussion" flag). These are open questions, feel free to add points I
> missed:
>
> * Spec v3
>   We have the discussion about ts_nanosecond, and other enhancements
>   in the spec. Do we plan to have Iceberg 2.0 with Spec v3 ? What do we
>   plan to include in spec v3 as a target ?
> * Catalogs
>   We have a consensus that we have too many catalogs, especially
>   with different capabilities/issues. Jack already started the
>   discussion to deprecate DynamoDBCatalog. The discussion is:
>   - Where do we want the catalogs to live (repository) ?
>   - What catalogs do we want to deprecate (HadoopCatalog for instance :)) ?
>   - Do we want to have the REST Catalog as a kind of façade for
>     other catalog/backend ?
> * REST Catalog
>   If we want to use the REST Catalog as a façade, what are the
>   requirements to have it even more pluggable for both backend (other
>   catalogs) and the REST itself (authentication/authorization, runtime,
>   etc) ? Jack also started a discussion about permission on the REST
>   catalog.
> * Engines
>   What engines (and version) do we plan to still support ? What new
>   engines do we plan (for instance I can work on an Apache Beam and an
>   Apache Karaf powered engine) ?
> * Data file formats / Table formats
>   Do we plan to add/remove/update data file formats for 2.0 (Parquet,
>   ORC, ...) ?
>   Same question about table formats ? Do we plan a kind of "tool" to
>   move data from table formats to Iceberg ?
> * Data Ingestion (e.g. Kafka Connect sink)
>   Iceberg 1.5.0 will include the first bricks of Kafka Connect, new
>   ones will come with 1.6+.
>   What do we plan for Iceberg 2.0 on this front ? Do we plan an
>   additional layer next to Kafka Connect (for instance why not provide
>   an Apache Camel for read/write data to Iceberg) ?
> * Rough date: depending on all previous points (and maybe others :)),
>   when do we target 2.0.0 ?
>
> That's a raw discussion start, I propose to create a GitHub
> "Discussion" issue (flagged with 2.0.0 milestone) for each topic where
> we have consensus.
>
> Thoughts ?
>
> Regards
> JB