Hi Yufei,

Great questions!

From what I can see in the PR, here are the answers to your questions:
1. The first major scenario is addressing the memory concerns with purge.
That's important for stabilizing a core use case in the service.
2. These are related specifically to file operations. I cannot see a way
that it would be broader than that.
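
To illustrate the memory point in (1): the sketch below is not code from the
PR. It is only a minimal illustration, under my own assumptions, of why
consuming file locations as a stream and deleting them in bounded batches
keeps heap usage flat, with a configurable batch size and a simple pause
between batches as a stand-in for rate limiting. All names here
(BatchedPurgeSketch, purgeInBatches, Stats, the bulkDelete callback) are
hypothetical; the real API is the FileOperations type in Robert's pseudo code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Stream;

// Hypothetical sketch; none of these names come from the PR.
final class BatchedPurgeSketch {

    /** Hypothetical result holder; not the PurgeStats type from the PR. */
    static final class Stats {
        final long files;
        final long batches;

        Stats(long files, long batches) {
            this.files = files;
            this.batches = batches;
        }
    }

    /**
     * Consumes the stream lazily and hands at most batchSize locations at a
     * time to bulkDelete, so only one batch is held on the heap at once.
     * pauseMillis is a crude rate limit applied between batches.
     */
    static Stats purgeInBatches(
            Stream<String> locations,
            int batchSize,
            long pauseMillis,
            Consumer<List<String>> bulkDelete) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        long files = 0;
        long batches = 0;
        List<String> batch = new ArrayList<>(batchSize);
        // Close the stream when done, since it may hold listing resources.
        try (Stream<String> in = locations) {
            Iterator<String> it = in.iterator();
            while (it.hasNext()) {
                batch.add(it.next());
                // Flush when the batch is full or the input is exhausted.
                if (batch.size() == batchSize || !it.hasNext()) {
                    bulkDelete.accept(List.copyOf(batch));
                    files += batch.size();
                    batches++;
                    batch.clear();
                    if (pauseMillis > 0 && it.hasNext()) {
                        try {
                            Thread.sleep(pauseMillis);
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                            throw new IllegalStateException("purge interrupted", e);
                        }
                    }
                }
            }
        }
        return new Stats(files, batches);
    }
}
```

The point of the shape is that the caller never materializes the full file
list; memory is bounded by batchSize regardless of table size.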

Go community,

Adam

On Mon, Dec 15, 2025, 3:20 PM Yufei Gu <[email protected]> wrote:

> Hi Robert,
>
> Thanks for sharing the proposal and the PR. Before diving deeper into the
> API shape, I was hoping to better understand the intended use cases you
> have in mind:
>
> 1. What concrete scenarios are you primarily targeting with these
> long-running object store operations?
> 2. Are these mostly expected to be file/object-level maintenance tasks
> (e.g. purge, cleanup), or do you envision broader categories of operations
> leveraging the same abstraction?
>
> Having a clearer picture of the motivating use cases would help evaluate
> the right level of abstraction and where this should live architecturally.
>
> Looking forward to the discussion.
>
> Yufei
>
>
> On Fri, Dec 12, 2025 at 3:48 AM Robert Stupp <[email protected]> wrote:
>
> > Hi all,
> >
> > I'd like to propose an API and corresponding implementation for
> > (long-running) object store operations.
> >
> > It provides a CPU- and heap-friendly API and implementation to work
> > against object stores. It is built in a way to provide "pluggable"
> > functionality. What I mean is this (Java pseudo code):
> > ---
> > FileOperations fileOps =
> >     fileOperationsFactory.createFileOperations(fileIoInstance);
> > Stream<FileSpec> allIcebergTableFiles =
> >     fileOps.identifyIcebergTableFiles(metadataLocation);
> > PurgeStats purged = fileOps.purge(allIcebergTableFiles);
> > // or simpler:
> > PurgeStats purged = fileOps.purgeIcebergTable(metadataLocation);
> > // or similarly for Iceberg views
> > PurgeStats purged = fileOps.purgeIcebergView(metadataLocation);
> > // or to purge all files underneath a prefix
> > PurgeStats purged = fileOps.purge(fileOps.findFiles(prefix));
> > ---
> >
> > Not mentioned in the pseudo code is the ability to rate-limit the
> > number of purged files or batch-deletions and configure the deletion
> > batch-size.
> >
> > The PR already contains tests against an on-heap object store mock and
> > integration tests against S3/GCS/Azure emulators.
> >
> > More details can be found in the README [2] included in the PR and of
> > course in the code in the PR.
> >
> > Robert
> >
> > [1] https://github.com/apache/polaris/pull/3256
> > [2] https://github.com/snazy/polaris/blob/obj-store-ops/storage/files/README.md
> >
>
