Hi developers!

Thanks Zelin for bringing up the discussion. The proposal seems good to me
overall. However, I'd also like to raise a few points.

1. As Jingsong mentioned, the Savepoint class should not become a public API,
at least for now. What we need to discuss for the public API is how users
can create or delete savepoints: for example, what the table option looks
like, what commands and options are provided for the Flink action, etc.

2. Currently most Flink actions are related to stream processing, so
only Flink can support them. However, savepoint creation and deletion seem
like features for batch processing. So aside from Flink actions, shall we
also provide something like Spark actions for savepoints?

I would also like to comment on Shammon's views.

> Should we introduce an option for savepoint path which may be different
> from 'warehouse'? Then users can backup the data of savepoint.
>

I don't think this is necessary. To back up a table, the user just needs to
copy all files from the table directory. As far as I understand, savepoints
in Paimon are mainly for users to review historical data, not for backing
up tables.

> Will the savepoint copy data files from snapshot or only save meta files?
>

It would be a heavy burden if a savepoint copied all of its data files. As I
mentioned above, savepoints are not for backing up tables.
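
To make the metadata-only idea concrete, here is a minimal sketch in Python.
It is not Paimon code: `Snapshot`, `create_savepoint` and
`files_safe_to_delete` are hypothetical names used only for illustration. It
assumes a savepoint records just the list of data files of a snapshot, and
that snapshot expiration checks savepoints before deleting a file, as the
sentence quoted from the PIP suggests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical metadata record: an id plus the data files it references."""
    id: int
    data_files: frozenset


def create_savepoint(snapshot):
    """Metadata-only savepoint: keeps the file list, copies no data files."""
    return Snapshot(snapshot.id, snapshot.data_files)


def files_safe_to_delete(expired, live_snapshots, savepoints):
    """Files of an expired snapshot that no live snapshot or savepoint uses."""
    still_used = set()
    for holder in list(live_snapshots) + list(savepoints):
        still_used |= holder.data_files
    return set(expired.data_files) - still_used
```

With this shape, creating a savepoint is cheap, and the cost moves to the
expiration path, which has to consult the savepoints before removing files.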

> How can users create a new table and restore data from the specified
> savepoint?


This reminds me of savepoints in Flink. Still, savepoints are not for backing
up tables, so I guess we don't need to support "restoring data" from a
savepoint.

On Wed, May 17, 2023 at 10:32, Shammon FY <[email protected]> wrote:

> Thanks Zelin for initiating this discussion. I have some comments:
>
> 1. Should we introduce an option for savepoint path which may be different
> from 'warehouse'? Then users can backup the data of savepoint.
>
> 2. Will the savepoint copy data files from snapshot or only save meta
> files? The description in the PIP "After we introduce savepoint, we should
> also check if the data files are used by savepoints." looks like we only
> save meta files for savepoint.
>
> 3. How can users create a new table and restore data from the specified
> savepoint?
>
> Best,
> Shammon FY
>
>
> On Wed, May 17, 2023 at 10:19 AM Jingsong Li <[email protected]>
> wrote:
>
> > Thanks Zelin for driving.
> >
> > Some comments:
> >
> > 1. I think we could move `Proposed Changes` to the top. Public API has
> > no meaning if I don't know how it is done.
> >
> > 2. Regarding Public API: Savepoint and SavepointManager are not public
> > API; only the Flink action or configuration option should be public API.
> >
> > 3. Maybe we can have a separate chapter to describe
> > `savepoint.create-interval`, maybe 'Periodically savepoint'? It is not
> > just an interval, because the true use case is savepoint after 0:00.
> >
> > 4. About 'Interaction with Snapshot', to be continued ...
> >
> > Best,
> > Jingsong
> >
> > On Tue, May 16, 2023 at 7:07 PM yu zelin <[email protected]> wrote:
> > >
> > > Hi, Paimon Devs,
> > >      I’d like to start a discussion about PIP-4[1]. In this PIP, I want
> > > to talk about why we need savepoint, and some thoughts about managing and
> > > using savepoint. Look forward to your questions and suggestions.
> > >
> > > Best,
> > > Yu Zelin
> > >
> > > [1] https://cwiki.apache.org/confluence/x/NxE0Dw
> >
>
