Re: Iceberg old partition gc
Let me paraphrase the use case to make sure I'm getting it right: The idea is to be able to remove expired data and delete the data files associated with it, but without losing the history of other changes to the table. Because new data and old data are modified in the same linear history, physically removing old data (via snapshot expiration) prevents you from keeping history for the new data.

There are a few ways I can think of to work around this.

I think what most people do is remove data a few days ahead of time so that it doesn't need to be physically removed immediately. That's the default behavior, which I think isn't what you want in this case.

Another option is to just delete the expired data files immediately. You'd still have metadata references to them, but those won't cause issues as long as no one tries to read the files. Of course, that runs into issues with full table operations, like `select * limit 10` where you could accidentally try to access a deleted file that's still referenced.

Last, I think you could solve this with branching, while also keeping overhead down. The idea is to create a branch for each version you actually want to keep. That's probably like a daily branch so you don't keep every version of the table. Then you can apply deletes to all of the historical branches and keep just the latest snapshot for each branch. That allows you to select the table states you want to keep and still delete within that set of states. Deleting data would be a bit more difficult, but you would probably be able to reuse the same metadata changes for all the deletes.

It sounds like the last option is probably the one that makes the most sense for you. Customizing history is a great use for tagging and branching.

Ryan
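The branching option described in this message (one branch per table state to keep, the same delete applied to every branch, only the latest snapshot kept per branch) can be illustrated with a toy model. This is only a sketch in plain Python, not Iceberg's API: the `Snapshot` class, the `drop_partition` and `removable_files` helpers, the branch names, and the file paths are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    files: frozenset  # data files this table state references


def drop_partition(snap, new_id, partition):
    """Return a new snapshot that no longer references files of `partition`."""
    kept = frozenset(f for f in snap.files if not f.startswith(partition + "/"))
    return Snapshot(new_id, kept)


def removable_files(all_files, retained):
    """Files that no retained snapshot references can be physically deleted."""
    referenced = set().union(*(s.files for s in retained))
    return all_files - referenced


# One branch per day we want to keep; each value is the branch head.
branches = {
    "day-1": Snapshot(1, frozenset({"dt=01/a.parquet"})),
    "day-2": Snapshot(2, frozenset({"dt=01/a.parquet", "dt=02/b.parquet"})),
    "day-3": Snapshot(3, frozenset({"dt=01/a.parquet", "dt=02/b.parquet",
                                    "dt=03/c.parquet"})),
}
all_files = set().union(*(s.files for s in branches.values()))

# Apply the same delete to every branch and keep only the new head of each.
next_id = 100
for name in list(branches):
    branches[name] = drop_partition(branches[name], next_id, "dt=01")
    next_id += 1

garbage = removable_files(all_files, list(branches.values()))
print(sorted(garbage))  # files that are now safe to delete physically
print({name: sorted(s.files) for name, s in branches.items()})
```

In this model all three daily states survive, yet the `dt=01` file is referenced by none of them, which is the property the branching approach is after.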
Re: Iceberg old partition gc
On Sat, Jun 3, 2023 at 5:03 AM Szehon Ho wrote:

> @Szehon, I am wondering if we can create materialized views for metadata
> tables to support infinite history on metadata tables (like snapshots or
> partitions). Obviously, materialized views can't be used for time travel or
> rollback. They are only meant for maintaining long/infinite histories.

Yea, that's a good idea. There are definitely options like building a tool outside Iceberg (dumping the metadata tables to a materialized view from time to time), or building a history-preserving catalog layer that saves old snapshot metadata, rather than building it into the Iceberg spec itself to keep expired metadata files.

Thanks
Szehon
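As a rough illustration of the "tool outside Iceberg" idea, the sketch below merges periodic dumps of a snapshot listing into an append-only archive, so rows survive after the snapshots themselves are expired. It is plain Python with made-up data: `archive_snapshots` is a hypothetical helper, and the row fields (`snapshot_id`, `committed_at`, `operation`) are assumed to loosely mirror what a snapshots metadata table exposes.

```python
def archive_snapshots(archive, current_rows):
    """Merge rows from a snapshot listing into an append-only archive.

    `archive` maps snapshot_id -> row. Rows already archived are never
    overwritten or removed, so they outlive snapshot expiration.
    Returns the number of newly archived rows.
    """
    added = 0
    for row in current_rows:
        if row["snapshot_id"] not in archive:
            archive[row["snapshot_id"]] = dict(row)
            added += 1
    return added


archive = {}

# First dump: the table currently lists snapshots 1 and 2.
first = archive_snapshots(archive, [
    {"snapshot_id": 1, "committed_at": "2023-05-30", "operation": "append"},
    {"snapshot_id": 2, "committed_at": "2023-05-31", "operation": "append"},
])

# Later dump: snapshots 1 and 2 have been expired; 3 is the partition delete.
second = archive_snapshots(archive, [
    {"snapshot_id": 3, "committed_at": "2023-06-01", "operation": "delete"},
])

print(sorted(archive))  # every snapshot id ever seen is still known
```

As noted in the thread, an archive like this only answers historical questions (how many snapshots existed, when a commit happened); it cannot be used for time travel or rollback.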
Re: Iceberg old partition gc
On Sat, Jun 3, 2023 at 10:06 AM Steven Wu wrote:

> the main use case I had was table historical analysis (last update time
> for each partition, how many snapshots did this table ever have, for
> example),

Partition level stats can probably help with questions like "last update time for each partition".

@Szehon, I am wondering if we can create materialized views for metadata tables to support infinite history on metadata tables (like snapshots or partitions). Obviously, materialized views can't be used for time travel or rollback. They are only meant for maintaining long/infinite histories.

> One use case is the user might need to time travel to a certain
> snapshot. However, such a snapshot is expired due to the snapshot
> expiration that only retains the latest snapshot operation, and this
> operation's only intent is to remove the gc partition. It seems a little
> overkill to me.

@Pucheng, usually people keep Iceberg snapshot history (for time travel or rollback) for a few days (like 7). Very long history can burden the metadata system. Tagging can extend the history with selective snapshots.

It seems that you are saying that purging actions of old partitions are creating new snapshots, which are taking up some space in the snapshot history. But if snapshot expiration is time based (like 7 days), this shouldn't be a problem, right?
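The retention policy described in this message (time-based expiration of about 7 days, with tags extending history for selected snapshots) can be sketched as a small filter. This is a simplified model in plain Python, not Iceberg's expiration logic: the `expire` function, the choice to always keep the latest snapshot, and the sample dates are assumptions for illustration.

```python
from datetime import datetime, timedelta


def expire(snapshots, tagged, now, max_age):
    """Keep snapshots newer than the cutoff, plus tagged ones and the latest.

    `snapshots` maps snapshot_id -> commit time.
    """
    cutoff = now - max_age
    latest = max(snapshots, key=lambda sid: snapshots[sid])
    return {
        sid: ts
        for sid, ts in snapshots.items()
        if ts >= cutoff or sid in tagged or sid == latest
    }


now = datetime(2023, 6, 3)

# One commit per day for 30 days; snapshot 30 is committed "now".
snapshots = {i: now - timedelta(days=30 - i) for i in range(1, 31)}

# Snapshot 1 carries a tag, so it survives even though it is 29 days old.
kept = expire(snapshots, tagged={1}, now=now, max_age=timedelta(days=7))
print(sorted(kept))
```

Under these assumptions the tagged snapshot plus the last week of commits remain, while everything in between is dropped from the history.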
Re: Iceberg old partition gc
On Fri, Jun 2, 2023 at 6:17 PM Szehon Ho wrote:

Yea, for the original use case in this thread, agree it's delete (soft) + expire (physical, permanent).

I guess I should have phrased my thought better. I was replying to Ryan's question above

> We don't often have people ask to keep snapshots that can't be read

and had thought it'd be nice to have an ExpireSnapshot mode where we keep older metadata for longer periods of time beyond physical expiration.

But the main use case I had was table historical analysis (last update time for each partition, how many snapshots did this table ever have, for example). It's more a nice-to-have, and I'm definitely not sure it is a very compelling use case. Another option, I guess, is that a custom catalog can keep this historical information around.

Thanks
Szehon
Re: Iceberg old partition gc
On Fri, Jun 2, 2023 at 10:28 PM Russell Spitzer wrote:

I think "soft-mode" is really just doing the delete. You can then recover the snapshot if you happen to have accidentally TTL'd a partition.
Re: Iceberg old partition gc
On Fri, Jun 2, 2023 at 8:51 AM Szehon Ho wrote:

I think this violates Iceberg’s assumption of immutable snapshots. That would require modifying the old snapshot to no longer point to those gc’ed data files; otherwise I'm not sure how you can time-travel to read from that snapshot if some of its files are deleted.

That being said, I also had this thought at some point, to keep snapshot info around longer. I expect most organizations operate in a mode where they expire snapshots after a few days, and reasonably expect any time-travel or snapshot-related operation (like CDC) to happen within this timeframe. And of course, they use tags to keep a snapshot from expiration.

But there are some use cases where keeping snapshot metadata for a period longer than when it could be read could be interesting. For example, if I want to know info about the snapshot that added each data file, we have probably lost most of that snapshot metadata, as the files were added long ago. One example is the frequent ask to find each partition's last modified time (in an earlier email thread).

I haven't thought it completely through, but it crossed my mind that a ‘Soft’ mode of ExpireSnapshot may be useful, where we can delete data files but just mark the snapshot’s metadata files as expired without physically deleting them, and so retain the ability to answer these questions. It could be done by adding an ‘expired-snapshots’ list to metadata.json. That being said, it's a singular use case and I'm not sure if anyone else has interest or another use case. It would add a bit of complexity.

Thanks
Szehon
Re: Iceberg old partition gc
On Fri, Jun 2, 2023 at 7:12 AM Pucheng Yang wrote:

Ryan,

One use case is that a user might need to time travel to a certain snapshot. However, that snapshot has been expired, because the snapshot expiration retains only the latest snapshot, and the only intent of that operation is to remove the partition being GC'd. It seems a little overkill to me.

I hope my explanation makes sense to you.
Re: Iceberg old partition gc
On Thu, Jun 1, 2023 at 3:39 PM Ryan Blue wrote:

Pucheng,

What is the use case around keeping the snapshot longer? We don't often have people ask to keep snapshots that can't be read, so it sounds like you might have something specific in mind?

Ryan

On Wed, May 31, 2023 at 8:19 PM Pucheng Yang wrote:

> Hi community,
>
> In my organization, a big portion of the datasets are partitioned by date.
> Normally we keep the latest X dates of partitions for a given dataset.
>
> One issue that always bothers me is that if I want to delete a partition
> that should be GC'd, I run the SQL query "delete from tbl where dt = ..."
> and then do snapshot expiration, keeping only the latest snapshot, to make
> sure that partition data is physically removed. However, the downside of
> this approach is that the table snapshot history will be completely lost.
>
> I wonder if anyone else in the community has the same pain point? How do
> you solve this? I would love to understand if there is a solution to this;
> otherwise we can brainstorm whether there is a way to solve it.
>
> Thanks!
>
> Pucheng

--
Ryan Blue
Tabular
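To make the pain point in the original question concrete, here is a toy model of a linear snapshot history. It shows why the files of a deleted partition can only be removed physically once every older snapshot that references them is expired. It is plain Python with hypothetical names (`snapshots_blocking_removal`, the file paths, the snapshot ids) and does not use any Iceberg API.

```python
def snapshots_blocking_removal(history, partition):
    """Return ids of snapshots that still reference any file of `partition`."""
    prefix = partition + "/"
    return sorted(
        sid for sid, files in history.items()
        if any(f.startswith(prefix) for f in files)
    )


# Linear history: snapshot_id -> data files visible in that snapshot.
history = {
    1: {"dt=01/a.parquet"},
    2: {"dt=01/a.parquet", "dt=02/b.parquet"},
    3: {"dt=01/a.parquet", "dt=02/b.parquet", "dt=03/c.parquet"},
    # Snapshot 4 is the result of "delete from tbl where dt = ..."
    # applied to the oldest partition.
    4: {"dt=02/b.parquet", "dt=03/c.parquet"},
}

blocking = snapshots_blocking_removal(history, "dt=01")
remaining = sorted(set(history) - set(blocking))
print(blocking)   # snapshots that must be expired before the files can go
print(remaining)  # history that is left afterwards
```

In this model every snapshot before the delete references the old partition, so freeing its files leaves only the delete snapshot itself, which matches the loss of history described above.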
