Re: Provide a mechanism to purge the events/metrics table

Nándor Kollár Wed, 10 Jun 2026 12:55:30 -0700

+1 for the Helm chart maintenance section too. Would that create a k8s
cron job that periodically executes the cleanup admin command? And
would users who don't run on Kubernetes handle the scheduling in their
own environment, for example by configuring a cron job on a VM?
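
For illustration only, here is a minimal sketch of the kind of time-based
purge such a scheduled job could run. It is not Polaris code: the function
name, the `events` table and the `ts` column are hypothetical, and SQLite
merely stands in for the real backend. Rows older than the TTL are deleted
in small batches so that each transaction stays short.

```python
import sqlite3
import time


def purge_older_than(conn, table, ts_column, ttl_seconds, now=None, batch_size=1000):
    """Delete rows whose timestamp is older than (now - ttl_seconds).

    Hypothetical helper, not part of any Polaris API. Returns the number
    of rows deleted.
    """
    # Table and column names cannot be bound as SQL parameters, so only
    # plain identifiers are accepted here.
    if not (table.isidentifier() and ts_column.isidentifier()):
        raise ValueError("unexpected table or column name")

    now = time.time() if now is None else now
    cutoff = now - ttl_seconds
    total = 0
    while True:
        # Delete at most batch_size expired rows per transaction.
        cur = conn.execute(
            f"DELETE FROM {table} WHERE rowid IN "
            f"(SELECT rowid FROM {table} WHERE {ts_column} < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()
        total += cur.rowcount
        if cur.rowcount < batch_size:
            return total
```

A cron entry on a VM or a k8s cron job would then only need to invoke
something like this on a schedule with the desired TTL.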


On Tue, Jun 9, 2026 at 5:34 AM Dmitri Bourlatchkov <[email protected]> wrote:
>
> Hi Yong,
>
> +1 to adding a maintenance section to the Helm chart.
>
> Cheers,
> Dmitri.
>
> On Mon, Jun 8, 2026 at 10:13 PM Yong Zheng <[email protected]> wrote:
>
> > Hello Nándor and Dmitri,
> >
> > I agree this is becoming more important as we persist more data in the
> > Polaris backend. Today we have at least the events tables and the persisted
> > Iceberg metrics tables that need some form of cleanup and retention
> > management.
> >
> > The admin tool approach sounds reasonable to me. It gives operators control
> > over when cleanup runs and allows them to use existing scheduling
> > mechanisms such as k8s cron jobs.
> >
> > It would also be nice to avoid building a separate cleanup solution for
> > every feature. If we go down the admin tool route, perhaps we can have a
> > common maintenance framework that supports events cleanup, metrics cleanup,
> > engine-specific maintenance tasks (for example, rebuilding indexes), as
> > well as future maintenance operations.
> >
> > I am pretty open-ended on the implementation details. One thing that I
> > think would be beneficial is introducing a maintenance section in the
> > Polaris Helm chart. That would allow operators to configure and schedule
> > maintenance tasks without having to create separate one-off charts or jobs
> > for each task.
> >
> > Thanks,
> > Yong Zheng
> >
> >
> > On Mon, Jun 8, 2026 at 8:01 PM Dmitri Bourlatchkov <[email protected]>
> > wrote:
> >
> > > Hi Yong,
> > >
> > > Thanks for starting this discussion!
> > >
> > > From my POV the Admin tool does look like a good fit for this capability.
> > > It is similar to the NoSQL maintenance task [3395].
> > >
> > > I believe end users could then schedule the maintenance runs according to
> > > their deployment mechanics, e.g. via k8s jobs.
> > >
> > > I made an attempt at refactoring the Admin CLI for pluggability in terms
> > > > of sub-commands in [3947]. We could revive that PR if there's community
> > > > interest. The Metrics / Events maintenance tasks could then be plugged in
> > > > similarly to NoSQL maintenance.
> > >
> > > [3395] https://github.com/apache/polaris/pull/3395
> > >
> > > [3947] https://github.com/apache/polaris/pull/3947
> > >
> > > Cheers,
> > > Dmitri.
> > >
> > > On Sun, Jun 7, 2026 at 2:34 PM Yong Zheng <[email protected]> wrote:
> > >
> > > > Hello,
> > > >
> > > > > A while back Alex raised https://github.com/apache/polaris/issues/2573
> > > > > requesting a mechanism to purge the events table. Recently, persisted
> > > > > Iceberg metrics were also introduced (
> > > > > https://github.com/apache/polaris/pull/3385), which created two tables
> > > > > (read and write metrics tables) that likewise lack life cycle
> > > > > management, so their size will grow indefinitely. We will likely need a
> > > > > mechanism to handle both.
> > > >
> > > > > I am wondering what the community thinks about this. Should this be part
> > > > > of the admin tool, where admins/ops make the call on when to clean up, or
> > > > > should we have a janitor process that runs automatically (users would need
> > > > > to provide rules on what to clean up, such as a time-based TTL)?
> > > >
> > > > Thanks,
> > > > Yong Zheng
> > > >
> > >
> >
