Hey Russell,

I agree, the Table API already has ExpireSnapshots and RewriteManifests. In that case, the wrappers add two things on top:

1. Result reporting with actual delete counts across the different file types. The current Table API doesn't return a result object.
2. Consistent API: ActionsProvider would aggregate all available local actions in one place for consumers such as CLI tools and tests.

The more interesting actions are the ones without Table API equivalents: DeleteOrphanFiles, RewriteTablePath, RewriteDataFiles. I think it would be useful to be able to run all actions without Spark dependencies.

What do you think?

Cheers,
Max

On Wed, Feb 25, 2026 at 8:43 PM Russell Spitzer <[email protected]> wrote:
>
> So for those first two they already exist in our Table.java API
>
> table.expireSnapshots()
>     .expireOlderThan(tsToExpire)
>     .commit();
>
> table.rewriteManifests()
>     .commit();
>
> Only RewriteTablePath doesn't have a local version yet but I think we could
> possibly add that
>
> What were you thinking of adding to the existing apis?
>
> On Wed, Feb 25, 2026 at 2:17 AM Maximilian Michels <[email protected]> wrote:
>>
>> Hi Russell,
>>
>> Exactly, for many actions this is mostly plumbing to make the existing
>> functionality available.
>>
>> >Which ones would you like to add implementations for?
>>
>> We can start with some simple ones, e.g. ExpireSnapshots,
>> RewriteManifests, RewriteTablePath.
>>
>> -Max
>>
>>
>> On Tue, Feb 24, 2026 at 5:03 PM Russell Spitzer
>> <[email protected]> wrote:
>> >
>> > We already do have non-distributed versions for a bunch of the
>> > functionality in core (that's what the actions were based on) so I don't
>> > think this is a wild idea. Which ones would you like to add
>> > implementations for?
>> >
>> > On Tue, Feb 24, 2026 at 9:23 AM Maximilian Michels <[email protected]> wrote:
>> >>
>> >> Hi everyone,
>> >>
>> >> I've been looking at the Iceberg Actions [1] and noticed many of them
>> >> don't fundamentally require a distributed engine.
>> >>
>> >> Apart from RewriteDataFiles, most of the maintenance tasks are rather
>> >> lightweight in the processing department. Some of them could probably run
>> >> faster and with fewer resources locally, backed by a thread pool.
>> >>
>> >> I wonder whether Iceberg could benefit from a local implementation for
>> >> ActionsProvider [2]. We have a lot of the building blocks for these
>> >> already available in the core.
>> >>
>> >> Granted, there are scalability limitations for large tables. Also, it's
>> >> often more convenient to use existing (distributed) compute
>> >> infrastructure. Yet, there are use cases where distributed computing
>> >> isn't strictly required. For example:
>> >>
>> >> - CLI tooling
>> >> - CI/CD pipelines and automation scripts
>> >> - REST catalog backends which want to run maintenance internally
>> >> - Small tables in general
>> >> - Environments where Flink/Spark are not available
>> >>
>> >> I'm curious to hear your thoughts.
>> >>
>> >> Cheers,
>> >> Max
>> >>
>> >> [1]
>> >> https://github.com/apache/iceberg/tree/501824f0c0032b3225b0fe52b904756f0fe5c589/api/src/main/java/org/apache/iceberg/actions
>> >> [2]
>> >> https://github.com/apache/iceberg/blob/501824f0c0032b3225b0fe52b904756f0fe5c589/api/src/main/java/org/apache/iceberg/actions/ActionsProvider.java#L24
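
To make the result-reporting point from the top of this thread more concrete, here is a minimal sketch of how a local wrapper might count deletes per file type. It is not an existing Iceberg class: `CountingDeleteFunc` and its accessor names are hypothetical, and classifying files by name (`snap-*.avro` as manifest lists, other `.avro` files as manifests) is an assumption made purely for illustration. A real implementation would more likely take the file type from table metadata.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not part of Iceberg): wraps a delete callback and
 * counts the deleted paths per file type so a result object can be built.
 */
final class CountingDeleteFunc implements Consumer<String> {
  private final Consumer<String> delegate;
  // Atomic counters, since deletes may be issued from a thread pool.
  private final AtomicLong manifestLists = new AtomicLong();
  private final AtomicLong manifests = new AtomicLong();
  private final AtomicLong otherFiles = new AtomicLong();

  CountingDeleteFunc(Consumer<String> delegate) {
    this.delegate = delegate;
  }

  @Override
  public void accept(String path) {
    // Run the real delete first; a failed delete throws and is not counted.
    delegate.accept(path);
    String name = path.substring(path.lastIndexOf('/') + 1);
    // ASSUMPTION: file type is inferred from the file name only.
    if (name.startsWith("snap-") && name.endsWith(".avro")) {
      manifestLists.incrementAndGet();
    } else if (name.endsWith(".avro")) {
      manifests.incrementAndGet();
    } else {
      otherFiles.incrementAndGet();
    }
  }

  long deletedManifestListsCount() {
    return manifestLists.get();
  }

  long deletedManifestsCount() {
    return manifests.get();
  }

  long deletedOtherFilesCount() {
    return otherFiles.get();
  }
}
```

A wrapper could then hand such a counter to the existing Table API, for example via `table.expireSnapshots().deleteWith(counter)` with the delegate calling `table.io().deleteFile(path)`, and read the counts after `commit()`. Whether that is the right hook for a local ActionsProvider is open for discussion.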
