We already have non-distributed versions of a bunch of this functionality
in core (that's what the actions were based on), so I don't think this is
a wild idea. Which ones would you like to add implementations for?

On Tue, Feb 24, 2026 at 9:23 AM Maximilian Michels wrote:

> Hi everyone,
>
> I've been looking at the Iceberg Actions [1] and noticed many of them
> don't fundamentally require a distributed engine.
>
> Apart from RewriteDataFiles, most of the maintenance tasks are rather
> lightweight in the processing department. Some of them could probably run
> faster and with fewer resources locally, backed by a thread pool.
>
> I wonder whether Iceberg could benefit from a local implementation for
> ActionsProvider [2]. We have a lot of the building blocks for these already
> available in the core.
>
> Granted, there are scalability limitations for large tables. Also, it's
> often more convenient to use existing (distributed) compute infrastructure.
> Yet, there are use cases where distributed computing isn't strictly
> required. For example:
>
>   - CLI tooling
>   - CI/CD pipelines and automation scripts
>   - REST catalog backends which want to run maintenance internally
>   - Small tables in general
>   - Environments where Flink/Spark are not available
>
> I'm curious to hear your thoughts.
>
> Cheers,
> Max
>
> [1]
> https://github.com/apache/iceberg/tree/501824f0c0032b3225b0fe52b904756f0fe5c589/api/src/main/java/org/apache/iceberg/actions
> [2]
> https://github.com/apache/iceberg/blob/501824f0c0032b3225b0fe52b904756f0fe5c589/api/src/main/java/org/apache/iceberg/actions/ActionsProvider.java#L24
>
