I believe the conclusion here was that a catalog-level property for
setting table defaults already exists. It can be used to make v2 the
default table format version for a particular catalog. See my last email
on this thread. One thing I haven't checked is whether this property works
for all catalog types or just a subset of them, but I think it's worth
trying it in your environment.
The setting is "table-default.<TABLE_PARAM>".
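
For illustration only, here is a minimal Python sketch of how I understand a
prefix-based catalog default to behave. It is not Iceberg's code: the function
names are made up for this example, and the precedence (properties given at
table creation win over catalog defaults) is my assumption.

```python
# Prefix as quoted further down in this thread ("table-default.").
TABLE_DEFAULT_PREFIX = "table-default."


def table_defaults(catalog_props):
    """Return the catalog properties under the prefix, with the prefix stripped."""
    return {
        key[len(TABLE_DEFAULT_PREFIX):]: value
        for key, value in catalog_props.items()
        if key.startswith(TABLE_DEFAULT_PREFIX)
    }


def effective_table_props(catalog_props, create_props=None):
    """Start from catalog defaults, then let create-time properties override them.

    The override order is an assumption made for this sketch.
    """
    merged = table_defaults(catalog_props)
    merged.update(create_props or {})
    return merged


# Hypothetical catalog configuration: only the prefixed key becomes a table default.
catalog_props = {
    "type": "hive",
    "table-default.format-version": "2",
}

print(effective_table_props(catalog_props))
print(effective_table_props(catalog_props, {"format-version": "1"}))
```

With a catalog configured like this, a table created without an explicit
format-version would pick up "2" from the catalog, while one created with
'format-version'='1' would keep "1".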

On Mon, Mar 20, 2023 at 5:41 AM Manu Zhang <[email protected]> wrote:

> Is there any progress to make default format version a catalog property?
>
> Thanks,
> Manu
>
> On Wed, Jan 18, 2023 at 5:43 PM Gabor Kaszab
> <[email protected]> wrote:
>
>> I also ran into this "table-default." setting
>> <https://github.com/apache/iceberg/blob/35151fe17b47c0af22787db4e4964b0cfcfdb215/core/src/main/java/org/apache/iceberg/CatalogProperties.java#L30>
>> prefix. For me it seems that it's a catalog level config so it's enough to
>> provide e.g. "table-default.format-version" = "2" to each catalog as a
>> startup flag. For me it seems that catalogs derived from
>> BaseMetastoreCatalog use this table default prefix
>> <https://github.com/apache/iceberg/blob/35151fe17b47c0af22787db4e4964b0cfcfdb215/core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java#L148>
>> .
>>
>> Gabor
>>
>> On Wed, Jan 18, 2023 at 12:00 AM Yufei Gu <[email protected]> wrote:
>>
>>> The functionality has been there if we are talking about setting the
>>> default format at the Iceberg catalog.  For example, we can set a catalog
>>> like this. All tables created will be v2 tables.
>>> spark.sql.catalog.hive_prod.table-default.format-version = "2"
>>>
>>> Of course, we need to set it for each Spark App. Setting Trino would be
>>> easier. It would be one catalog level change.
>>>
>>> Best,
>>>
>>> Yufei
>>>
>>> `This is not a contribution`
>>>
>>>
>>> On Mon, Jan 16, 2023 at 1:34 AM Gabor Kaszab
>>> <[email protected]> wrote:
>>>
>>>> It seems we have a consensus on the approach. I can take a look at
>>>> implementing this if no one has any objections.
>>>>
>>>> Gabor
>>>>
>>>> On Fri, Jan 13, 2023 at 11:28 PM Ryan Blue <[email protected]> wrote:
>>>>
>>>>> That sounds like a good idea to me.
>>>>>
>>>>> On Fri, Jan 13, 2023 at 11:04 AM Jack Ye <[email protected]> wrote:
>>>>>
>>>>>> > I think the issue is that all of the built-in catalogs currently
>>>>>> call the version of `newTableMetadata` that defaults to v1.
>>>>>>
>>>>>> Yes I think this seems like the key issue for the catalogs that
>>>>>> extend BaseMetastoreCatalog. Looks like we should make changes to make
>>>>>> the default format version a catalog property, instead of hard-coded in
>>>>>> TableMetadata?
>>>>>>
>>>>>> -Jack
>>>>>>
>>>>>> On Thu, Jan 12, 2023 at 11:47 PM Jean-Baptiste Onofré <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi Gabor,
>>>>>>>
>>>>>>> It makes sense to me. AFAIK, as table creation comes from the catalog
>>>>>>> "controller", they can "decide" the version. So, it would be up to each
>>>>>>> catalog to deal with the way/version they want to create tables.
>>>>>>>
>>>>>>> Regards
>>>>>>> JB
>>>>>>>
>>>>>>> On Wed, Jan 11, 2023 at 11:11 PM Gabor Kaszab <
>>>>>>> [email protected]> wrote:
>>>>>>> >
>>>>>>> > Naively asking, can't we add some property to tell Iceberg which
>>>>>>> version to use as default when creating tables? (If there is no such
>>>>>>> setting currently)
>>>>>>> >
>>>>>>> > Gabor
>>>>>>> >
>>>>>>> > On Wed, Jan 11, 2023 at 8:04 PM Jack Ye <[email protected]> wrote:
>>>>>>> >>
>>>>>>> >> Should we start a community vote on this?
>>>>>>> >>
>>>>>>> >> I remember in today's community sync meeting Russell briefly
>>>>>>> discussed some compaction support that is not there yet, and some
>>>>>>> users are struggling with the small delete files issue, which was to some
>>>>>>> extent why Spark is still defaulting to v1.
>>>>>>> >>
>>>>>>> >> Regarding feature side, changelog scan is mostly there in Spark,
>>>>>>> and there will also likely be movements on Trino side for it very soon.
>>>>>>> >>
>>>>>>> >> Overall, I think it would be beneficial to move default to v2,
>>>>>>> which could incentivize the completion of those missing parts across
>>>>>>> engines.
>>>>>>> >>
>>>>>>> >> Best,
>>>>>>> >> Jack Ye
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On Wed, Jan 11, 2023 at 5:47 AM Piotr Findeisen <
>>>>>>> [email protected]> wrote:
>>>>>>> >>>
>>>>>>> >>> Hi,
>>>>>>> >>>
>>>>>>> >>> FWIW Trino already creates v2 tables by default.
>>>>>>> >>> Thought it's worth sharing for context.
>>>>>>> >>>
>>>>>>> >>> Best
>>>>>>> >>> PF
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>> On Tue, Jan 10, 2023 at 10:09 AM Manu Zhang <
>>>>>>> [email protected]> wrote:
>>>>>>> >>>>
>>>>>>> >>>> Hi all,
>>>>>>> >>>>
>>>>>>> >>>> We've maintained a forked Iceberg internally and all our use
>>>>>>> cases involve v2 tables with row-level updates and deletes. Our users need
>>>>>>> to remember to create table with the `'format-version'='2'` option or alter
>>>>>>> table afterwards.
>>>>>>> >>>>
>>>>>>> >>>> I'm thinking about changing the default format-version of our
>>>>>>> forked Iceberg to v2. Are there any concerns about this change? Any hidden
>>>>>>> issues I've missed?
>>>>>>> >>>>
>>>>>>> >>>> Thanks,
>>>>>>> >>>> Manu
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Ryan Blue
>>>>> Tabular
>>>>>
>>>>
