I think you could achieve what you're looking for by setting the max snapshot age to 1 ms and setting the minimum number of snapshots to keep. Snapshot expiration would then always expire every snapshot beyond that minimum number, which is the behavior you want.
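As a rough illustration of that suggestion, the sketch below models expiration on a single linear history: a snapshot is expired only if it is both older than the maximum age and outside the newest `min_to_keep` snapshots. The `Snapshot` class and `snapshots_to_expire` function are hypothetical names made up for this example, not Iceberg's API, and the model ignores branches, tags, and per-ref overrides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical stand-in for a table snapshot; not an Iceberg class.
    snapshot_id: int
    timestamp_ms: int


def snapshots_to_expire(snapshots, now_ms, max_age_ms, min_to_keep):
    """Return the snapshots a simplified retention pass would expire.

    max_age_ms plays the role of `history.expire.max-snapshot-age-ms` and
    min_to_keep the role of `history.expire.min-snapshots-to-keep`.
    Assumes a single linear history, newest snapshot = current state.
    """
    # Walk the history from newest to oldest.
    ordered = sorted(snapshots, key=lambda s: s.timestamp_ms, reverse=True)
    expired = []
    for position, snap in enumerate(ordered):
        protected_by_count = position < min_to_keep
        within_age = now_ms - snap.timestamp_ms <= max_age_ms
        # The minimum count keeps a snapshot even when its age makes it eligible.
        if not (protected_by_count or within_age):
            expired.append(snap)
    return expired


if __name__ == "__main__":
    hour_ms = 3_600_000
    # Five snapshots committed one hour apart, ids 1 (oldest) to 5 (newest).
    history = [Snapshot(i, i * hour_ms) for i in range(1, 6)]
    # Age of 1 ms plus a minimum of 3: everything except the newest 3 expires.
    gone = snapshots_to_expire(history, now_ms=6 * hour_ms, max_age_ms=1, min_to_keep=3)
    print([s.snapshot_id for s in gone])  # prints [2, 1]
```

With the age set that low, the age check never protects anything in practice, so the minimum count alone decides how many snapshots survive.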
It probably wouldn't make sense to set a maximum as well. Right now, the min number of snapshots is a requirement that keeps snapshots around even if they are eligible to be removed because of expiration. A maximum would work differently and would be a second way to consider a snapshot eligible for expiration -- or else we would have to redefine how the min works. I think that would be a bit confusing to configure in practice because we'd need to define these cases for which configuration takes precedence. It seems much simpler to me to use the min snapshots setting with a very short expiration interval if you want to always keep some number of snapshots rather than using the age-based expiration.

On Tue, Jan 21, 2025 at 9:51 AM Daniel Weeks <dwe...@apache.org> wrote:

> Hey Manu,
>
> I think I understand what you're trying to achieve here and I feel like
> the most important part is to have an updated version of the retention
> procedure <https://iceberg.apache.org/spec/#snapshot-retention-policy> to
> clearly state how this interacts with the other settings as part of the PR.
>
> -Dan
>
> On Thu, Jan 16, 2025 at 8:37 PM Yufei Gu <flyrain...@gmail.com> wrote:
>
>> It makes sense to have an option to control the max number of snapshots.
>> Thanks Manu for the proposal.
>>
>> Yufei
>>
>> On Thu, Jan 16, 2025 at 7:46 PM Manu Zhang <owenzhang1...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> Do you have more comments on this feature? Do you have concerns about
>>> adding a new field to SnapshotRef?
>>>
>>> Thanks,
>>> Manu
>>>
>>> On Tue, Jan 7, 2025 at 2:37 PM Manu Zhang <owenzhang1...@gmail.com>
>>> wrote:
>>>
>>>> Hi Ajantha,
>>>>
>>>> `history.expire.min-snapshots-to-keep` is the *minimum number of
>>>> snapshots* we can keep. I'm proposing to decide the *maximum number of
>>>> snapshots* to keep by count rather than by age.
>>>>
>>>> Thanks,
>>>> Manu
>>>>
>>>> On Tue, Jan 7, 2025 at 2:18 PM Ajantha Bhat <ajanthab...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Manu,
>>>>>
>>>>> We already have `retain_last` and
>>>>> `history.expire.min-snapshots-to-keep` to retain the snapshots based on
>>>>> count. Can you please elaborate on why can't we use the same?
>>>>>
>>>>> - Ajantha
>>>>>
>>>>> On Tue, Jan 7, 2025 at 11:33 AM Walaa Eldin Moustafa <
>>>>> wa.moust...@gmail.com> wrote:
>>>>>
>>>>>> Thanks Manu for starting this discussion. That is definitely a valid
>>>>>> feature. I have always found maintaining snapshots by day makes it harder
>>>>>> to provide different types of guarantees/contracts especially when tables
>>>>>> change rates are diverse or irregular. Maintaining by snapshot count makes
>>>>>> a lot of sense and prevents table sizes from growing excessively when
>>>>>> change rate is frequent.
>>>>>>
>>>>>> Thanks,
>>>>>> Walaa.
>>>>>>
>>>>>> On Mon, Jan 6, 2025 at 8:38 PM Manu Zhang <owenzhang1...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> While maintaining Iceberg tables for our customers, I find it's
>>>>>>> difficult to set a default snapshot expiration time
>>>>>>> (`history.expire.max-snapshot-age-ms`) for different workloads. The
>>>>>>> default value of 5 days looks good for daily batch jobs but is too
>>>>>>> long for frequently-updated jobs.
>>>>>>>
>>>>>>> I'm thinking about adding another option like
>>>>>>> `history.expire.max-snapshots-to-keep` to keep at most N snapshots. A
>>>>>>> snapshot will be removed when either its age is larger than
>>>>>>> `history.expire.max-snapshot-age-ms` or it's the oldest in
>>>>>>> `history.expire.max-snapshots-to-keep + 1` snapshots. I've created a
>>>>>>> draft PR to demo the idea[1].
>>>>>>>
>>>>>>> If you agree this is a valid feature request, we also need to update
>>>>>>> SnapshotRef[2] adding a new field `max-snapshots-to-keep`. Will there
>>>>>>> be a compatibility issue or too much cost to maintain compatibility?
>>>>>>> My experiment shows many parsers need to be updated.
>>>>>>>
>>>>>>> I'd like to hear your thoughts on this.
>>>>>>>
>>>>>>> 1. https://github.com/apache/iceberg/pull/11879
>>>>>>> 2. https://iceberg.apache.org/spec/#snapshot-references
>>>>>>>
>>>>>>> Happy New Year!
>>>>>>> Manu