It seems we have a consensus on the approach. I can take a look at implementing this if no one has any objections.
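
A minimal sketch of the idea, not actual Iceberg code: it assumes a hypothetical catalog property named `default-format-version` (the name is invented here for illustration) and uses plain dictionaries in place of the real catalog and table classes. The table-level `format-version` option mentioned later in the thread still wins when it is set explicitly.

```python
from typing import Mapping

# Hypothetical catalog property name, invented for this illustration.
CATALOG_DEFAULT_PROP = "default-format-version"
# Table-level option referenced in the thread ('format-version'='2').
TABLE_VERSION_PROP = "format-version"
# Stand-in for the hard-coded v1 default that the thread describes.
FALLBACK_VERSION = 1
SUPPORTED_VERSIONS = (1, 2)


def resolve_format_version(
    catalog_props: Mapping[str, str], table_props: Mapping[str, str]
) -> int:
    """Pick the format version for a new table.

    Precedence: explicit table property, then the catalog-level default,
    then the hard-coded fallback.
    """
    if TABLE_VERSION_PROP in table_props:
        version = int(table_props[TABLE_VERSION_PROP])
    elif CATALOG_DEFAULT_PROP in catalog_props:
        version = int(catalog_props[CATALOG_DEFAULT_PROP])
    else:
        version = FALLBACK_VERSION

    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"Unsupported format version: {version}")
    return version
```

With this shape, a catalog configured with the default set to `2` would create v2 tables unless a user asks for v1 on a specific table, and catalogs that set nothing keep today's behavior.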
Gabor

On Fri, Jan 13, 2023 at 11:28 PM Ryan Blue <[email protected]> wrote:

> That sounds like a good idea to me.
>
> On Fri, Jan 13, 2023 at 11:04 AM Jack Ye <[email protected]> wrote:
>
>> > I think the issue is that all of the built-in catalogs currently call
>> > the version of `newTableMetadata` that defaults to v1.
>>
>> Yes, I think this is the key issue for the catalogs that extend
>> BaseMetastoreCatalog. Looks like we should make the default format
>> version a catalog property, instead of hard-coding it in TableMetadata?
>>
>> -Jack
>>
>> On Thu, Jan 12, 2023 at 11:47 PM Jean-Baptiste Onofré <[email protected]> wrote:
>>
>>> Hi Gabor,
>>>
>>> It makes sense to me. AFAIK, as table creation comes from the catalog
>>> "controller", the catalog can "decide" the version. So it would be up
>>> to each catalog to deal with the way/version it wants to create tables.
>>>
>>> Regards
>>> JB
>>>
>>> On Wed, Jan 11, 2023 at 11:11 PM Gabor Kaszab <[email protected]> wrote:
>>> >
>>> > Naively asking, can't we add some property to tell Iceberg which
>>> > version to use as the default when creating tables? (If there is no
>>> > such setting currently.)
>>> >
>>> > Gabor
>>> >
>>> > On Wed, Jan 11, 2023 at 20:04, Jack Ye <[email protected]> wrote:
>>> >>
>>> >> Should we start a community vote on this?
>>> >>
>>> >> I remember that in today's community sync meeting Russell briefly
>>> >> discussed some compaction support that is not there yet, and that
>>> >> some users are struggling with the small delete files issue, which
>>> >> is to some extent why Spark still defaults to v1.
>>> >>
>>> >> On the feature side, changelog scan is mostly there in Spark, and
>>> >> there will likely be movement on the Trino side for it very soon.
>>> >>
>>> >> Overall, I think it would be beneficial to move the default to v2,
>>> >> which could incentivize the completion of those missing parts
>>> >> across engines.
>>> >>
>>> >> Best,
>>> >> Jack Ye
>>> >>
>>> >> On Wed, Jan 11, 2023 at 5:47 AM Piotr Findeisen <[email protected]> wrote:
>>> >>>
>>> >>> Hi,
>>> >>>
>>> >>> FWIW Trino already creates v2 tables by default.
>>> >>> Thought it's worth sharing for context.
>>> >>>
>>> >>> Best
>>> >>> PF
>>> >>>
>>> >>> On Tue, Jan 10, 2023 at 10:09 AM Manu Zhang <[email protected]> wrote:
>>> >>>>
>>> >>>> Hi all,
>>> >>>>
>>> >>>> We've maintained a forked Iceberg internally, and all our use
>>> >>>> cases involve v2 tables with row-level updates and deletes. Our
>>> >>>> users need to remember to create tables with the
>>> >>>> `'format-version'='2'` option or alter the table afterwards.
>>> >>>>
>>> >>>> I'm thinking about changing the default format-version of our
>>> >>>> forked Iceberg to v2. Is there any concern about this change? Any
>>> >>>> hidden issues I've missed?
>>> >>>>
>>> >>>> Thanks,
>>> >>>> Manu
>
> --
> Ryan Blue
> Tabular
