> reuse the 'Snapshot#timeMillis' field

Don't do this. A tag is just a snapshot reference; it cannot alter snapshot fields.

> the TTL has higher priority

We should maintain a behavior similar to snapshot expiration, as long as one
of the conditions hits, then delete it without setting any priority

Best,
Jingsong

On Wed, Apr 3, 2024 at 10:33 AM wj wang <[email protected]> wrote:
>
> Thanks jingsong li and yu zelin for reply.
>
> > I think it's similar to the snapshot expire, where both the number and
> > time are used to determine whether it should be deleted. This is
> > reasonable, and the hit should be deleted.
>
> OK, I will do.
>
> > Java API 'createTag': Use 'Duration' as parameter instead of 'String'. I
> > think it's better.
>
> OK
>
> > For the field 'tagCreateTime' in class 'Tag': I think we can just use the
> > 'Snapshot#timeMillis' field. The 'timeMillis' is the create time of the
> > snapshot, I think the time won't be used when we read the corresponding
> > tag. So I think we can just reuse the field, what do you think? And if do
> > so,
>
> I think it's possible to reuse the 'Snapshot#timeMillis' field for auto
> created tags, but I don't think 'Snapshot#timeMillis' field can be used for
> non-auto created tags.
> what do you think?
>
> > in the tags system table, 'commit_time' can be renamed to 'create_time'
> > or 'tag_create_time' or other name.
>
> I think create-time and time-retained is good.
>
> > Should we add TTL to auto-created tags? I think we should. Users can set
> > the same TTL for all auto-created tags by table options. My suggestion of
> > how to handle `tag.num-retained-max` and TTL is: the TTL has higher
> > priority. When we try to expire an auto-created tag, we first found
> > candidates by `tag.num-retained-max`, then if the candidate's survival time
> > is less than TTL, we don't expire it.
>
> OK, I will do.
>
> On Tue, Apr 2, 2024 at 5:26 PM yu zelin <[email protected]> wrote:
>
> > Thanks wj for driving this! I'd like to give some inputs:
> >
> > 1. Java API 'createTag': Use 'Duration' as parameter instead of 'String'.
> > I think it's better.
> >
> > 2. For the field 'tagCreateTime' in class 'Tag': I think we can just use
> > the 'Snapshot#timeMillis' field. The 'timeMillis' is the create time of
> > the snapshot, I think the time won't be used when we read the
> > corresponding tag. So I think we can just reuse the field, what do you
> > think? And if do so, in the tags system table, 'commit_time' can be
> > renamed to 'create_time' or 'tag_create_time' or other name.
> >
> > 3. Should we add TTL to auto-created tags? I think we should. Users can
> > set the same TTL for all auto-created tags by table options. My suggestion
> > of how to handle `tag.num-retained-max` and TTL is: the TTL has higher
> > priority. When we try to expire auto-created tag, we first found
> > candidates by `tag.num-retained-max`, then if the candidate's survival
> > time is less than TTL, we don't expire it.
> >
> > Best regards,
> > Zelin Yu
> >
> >
> > On Mon, Apr 1, 2024 at 9:54 AM <[email protected]> wrote:
> >
> > > Hi devs:
> > >
> > > I would like to start a discussion of PIP-20: Introduce TTL for tags
> > > which are not auto-created. [1]. Currently, Paimon has automatic
> > > clearing mechanisms for tags created by TagAutoCreation, but not for
> > > other tags. It can't meet our demands. For example:
> > > 1. The current tag cleanup mechanism may lead to resource-wasting.
> > > 2. Tag does not support TTL, so it is not flexible to use.
> > > This PIP aims to support each Tag has its own TTL, so that the user can
> > > use the tag more flexibly and reduce the probability of resource waste.
> > > And Paimon keep up with other data lake products such as Iceberg.
> > > Looking forward to your feedback, thanks.
> > > [1]
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=300026341
> > >
> > >
> > > Best,
> > > wangwj
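
For illustration, the rule agreed on at the top of this thread (expire a tag as soon as either the retained-count limit or its TTL is hit, with no priority between the two) could be sketched as below. This is only a rough sketch in Python for brevity, not Paimon code: the `Tag` class, its `create_time` and `time_retained` fields, and the `tags_to_expire` function are hypothetical names chosen for this example, and `num_retained_max` merely stands in for the `tag.num-retained-max` option.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Tag:
    """Hypothetical tag record; the field names are assumptions for this sketch."""
    name: str
    create_time: datetime
    time_retained: Optional[timedelta] = None  # None means the tag has no TTL


def tags_to_expire(tags: List[Tag],
                   num_retained_max: Optional[int],
                   now: datetime) -> List[str]:
    """Return names of tags to delete: a tag expires if EITHER rule hits."""
    # Newest first, so the first num_retained_max tags survive the count rule.
    newest_first = sorted(tags, key=lambda t: t.create_time, reverse=True)
    expired = []
    for index, tag in enumerate(newest_first):
        over_count = num_retained_max is not None and index >= num_retained_max
        over_ttl = (tag.time_retained is not None
                    and now - tag.create_time >= tag.time_retained)
        # No priority between the two conditions: one hit is enough.
        if over_count or over_ttl:
            expired.append(tag.name)
    return expired
```

Under this sketch a tag that is still within its TTL is nevertheless expired once it falls outside the newest `num_retained_max` tags, which is the difference from the "TTL has higher priority" proposal quoted above.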
