Re: [DISCUSS] PIP-4 Support savepoint

Jingsong Li Sun, 21 May 2023 05:54:48 -0700

Thanks Yun for your information.

We need to be careful to avoid confusion between the Paimon and Flink
concepts of "savepoint".


Maybe we don't have to insist on the name "savepoint". For example,
TAG is also a candidate, as in Iceberg [1].

[1] https://iceberg.apache.org/docs/latest/branching/

Best,
Jingsong

On Sun, May 21, 2023 at 8:51 PM Jingsong Li <[email protected]> wrote:
>
> Thanks Nicholas for your detailed requirements.
>
> We need to add the user requirements to the PIP. It mainly serves
> two purposes:
> 1. Fault recovery from data errors (named: restore or rollback-to)
> 2. Recording versions at, for example, the day level, targeting batch queries
>
> Best,
> Jingsong
>
> On Sat, May 20, 2023 at 2:55 PM Yun Tang <[email protected]> wrote:
> >
> > Hi Guys,
> >
> > Since we use Paimon with Flink in most cases, I think we need to clarify
> > what the same word "savepoint" means in different systems.
> >
> > For Flink, savepoint means:
> >
> >   1.  Triggered by users, not periodically by the system itself.
> > However, this PIP wants to support creating savepoints periodically.
> >   2.  Even a so-called incremental native savepoint [1] does not
> > depend on previous checkpoints or savepoints; it still copies files
> > on DFS to the self-contained savepoint folder. However, judging from the
> > description in this PIP of the deletion of expired snapshot files,
> > a Paimon savepoint will refer to the previously existing files directly.
> >
> > I don't think we need to make the semantics of Paimon totally the same as
> > Flink's. However, we need to add a table that lists the differences
> > from Flink, and discuss those differences.
> >
> > [1] 
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-203%3A+Incremental+savepoints#FLIP203:Incrementalsavepoints-Semantic
> >
> > Best
> > Yun Tang
> > ________________________________
> > From: Nicholas Jiang <[email protected]>
> > Sent: Friday, May 19, 2023 17:40
> > To: [email protected] <[email protected]>
> > Subject: Re: [DISCUSS] PIP-4 Support savepoint
> >
> > Hi Guys,
> >
> > Thanks Zelin for driving the savepoint proposal. Here are some opinions
> > on savepoint:
> >
> > -- About "introduce savepoint for Paimon to persist full data in a time 
> > point"
> >
> > The motivation of the savepoint proposal reads more like snapshot TTL
> > management. However, disaster recovery is mission critical for any
> > software, especially for data systems, where the impact can be serious:
> > delayed business decisions, or at times even wrong ones.
> > Savepoint is proposed to help users recover data from a
> > previous state, through two operations: "savepoint" and "restore".
> >
> > "savepoint" saves the Paimon table as of the commit time, so if
> > there is a savepoint, the data generated in the corresponding commit
> > cannot be cleaned up. Meanwhile, a savepoint lets the user restore the
> > table to this savepoint at a later point in time if need be. Along
> > similar lines, a savepoint cannot be triggered on a commit that is
> > already cleaned up. A savepoint is like taking a backup, except that we
> > don't make a new copy of the table; we just save the state of the table
> > so that we can restore it later when needed.
> >
> > "restore" lets you restore your table to one of the savepoint commits.
> > It cannot be undone (or reversed), so care should be taken
> > before doing a restore. On restore, Paimon would delete all data files
> > and commit files (timeline files) later than the savepoint commit to
> > which the table is being restored.
> >
> > BTW, it would be better to introduce a snapshot view based on savepoint,
> > which could improve query performance on historical data for Paimon tables.
> >
> > -- About Public API of savepoint
> >
> > The savepoint interfaces currently introduced in the Public API are not
> > enough for users; for example, deleteSavepoint, restoreSavepoint, etc.
> > are missing.
> >
> > -- About "Paimon's savepoint need to be combined with Flink's savepoint":
> >
> > If Paimon supports a savepoint mechanism and provides savepoint interfaces,
> > this proposal does not block the integration with Flink's savepoint.
> >
> > In summary, savepoint is not only used to improve the query performance of 
> > historical data, but also used for disaster recovery processing.
> >
> > On 2023/05/17 09:53:11 Jingsong Li wrote:
> > > What Shammon mentioned is interesting. I agree with what he said about
> > > the differences in savepoints between databases and stream computing.
> > >
> > > About "Paimon's savepoint need to be combined with Flink's savepoint":
> > >
> > > I think it is possible, but we may need to deal with this in another
> > > mechanism, because the snapshots after savepoint may expire. We need
> > > to compare data between two savepoints to generate incremental data
> > > for streaming read.
> > >
> > > But this may not need to block the PIP; it looks like the current design
> > > does not break the future combination?
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Wed, May 17, 2023 at 5:33 PM Shammon FY <[email protected]> wrote:
> > > >
> > > > Hi Caizhi,
> > > >
> > > > Thanks for your comments. As you mentioned, I think we may need to discuss
> > > > the role of savepoint in Paimon.
> > > >
> > > > If I understand correctly, the main feature of savepoint in the current PIP
> > > > is that the savepoint will not be expired, and users can perform a query on
> > > > the savepoint according to time-travel. Besides that, savepoints also exist
> > > > in databases and in Flink.
> > > >
> > > > 1. Savepoint in database. The database can roll back table data to the
> > > > specified 'version' based on savepoint. So the key point of savepoint in
> > > > the database is to rollback data.
> > > >
> > > > 2. Savepoint in Flink. Users can trigger a savepoint with a specific
> > > > 'path', and save all state data of the job to the savepoint. Then users can
> > > > create a new job based on the savepoint to continue consuming incremental
> > > > data. I think the core capabilities are: backup for a job, and resume a job
> > > > based on the savepoint.
> > > >
> > > > In addition to the above, Paimon may also face data write corruption and
> > > > need to recover data based on the specified savepoint. So we may need to
> > > > consider what abilities a Paimon savepoint needs besides the ones
> > > > mentioned in the current PIP.
> > > >
> > > > Additionally, as mentioned above, Flink also has a
> > > > savepoint mechanism. During the process of streaming data from Flink to
> > > > Paimon, does Paimon's savepoint need to be combined with Flink's savepoint?
> > > >
> > > >
> > > > Best,
> > > > Shammon FY
> > > >
> > > >
> > > > On Wed, May 17, 2023 at 4:02 PM Caizhi Weng <[email protected]> wrote:
> > > >
> > > > > Hi developers!
> > > > >
> > > > > Thanks Zelin for bringing up the discussion. The proposal seems good to me
> > > > > overall. However I'd also like to bring up a few options.
> > > > >
> > > > > 1. As Jingsong mentioned, Savepoint class should not become a public API,
> > > > > at least for now. What we need to discuss for the public API is how the
> > > > > users can create or delete savepoints. For example, what the table option
> > > > > looks like, what commands and options are provided for the Flink action,
> > > > > etc.
> > > > >
> > > > > 2. Currently most Flink actions are related to streaming processing, so
> > > > > only Flink can support them. However, savepoint creation and deletion seem
> > > > > like features for batch processing. So aside from Flink actions, shall we
> > > > > also provide something like Spark actions for savepoints?
> > > > >
> > > > > I would also like to comment on Shammon's views.
> > > > >
> > > > > > Should we introduce an option for savepoint path which may be different
> > > > > > from 'warehouse'? Then users can backup the data of savepoint.
> > > > >
> > > > > I don't think this is necessary. To back up a table the user just needs to
> > > > > copy all files from the table directory. Savepoint in Paimon, as far as I
> > > > > understand, is mainly for users to review historical data, not for backing
> > > > > up tables.
> > > > >
> > > > > > Will the savepoint copy data files from snapshot or only save meta files?
> > > > >
> > > > > It would be a heavy burden if a savepoint copies all its files. As I
> > > > > mentioned above, savepoint is not for backing up tables.
> > > > >
> > > > > > How can users create a new table and restore data from the specified
> > > > > > savepoint?
> > > > >
> > > > > This reminds me of savepoints in Flink. Still, savepoint is not for backing
> > > > > up tables so I guess we don't need to support "restoring data" from a
> > > > > savepoint.
> > > > >
> > > > > On Wed, May 17, 2023 at 10:32 AM Shammon FY <[email protected]> wrote:
> > > > >
> > > > > > Thanks Zelin for initiating this discussion. I have some comments:
> > > > > >
> > > > > > 1. Should we introduce an option for savepoint path which may be different
> > > > > > from 'warehouse'? Then users can backup the data of savepoint.
> > > > > >
> > > > > > 2. Will the savepoint copy data files from snapshot or only save meta
> > > > > > files? The description in the PIP "After we introduce savepoint, we should
> > > > > > also check if the data files are used by savepoints." looks like we only
> > > > > > save meta files for savepoint.
> > > > > >
> > > > > > 3. How can users create a new table and restore data from the specified
> > > > > > savepoint?
> > > > > >
> > > > > > Best,
> > > > > > Shammon FY
> > > > > >
> > > > > >
> > > > > > On Wed, May 17, 2023 at 10:19 AM Jingsong Li <[email protected]> wrote:
> > > > > >
> > > > > > > Thanks Zelin for driving.
> > > > > > >
> > > > > > > Some comments:
> > > > > > >
> > > > > > > 1. I think we could move `Proposed Changes` to the top; the
> > > > > > > Public API has no meaning if I don't know how it works.
> > > > > > >
> > > > > > > 2. Public API: Savepoint and SavepointManager are not public API; only
> > > > > > > the Flink action or configuration options should be public API.
> > > > > > >
> > > > > > > 3. Maybe we can have a separate chapter to describe
> > > > > > > `savepoint.create-interval`, maybe 'Periodic savepoint'? It is not
> > > > > > > just an interval, because the true use case is a savepoint after 0:00.
> > > > > > >
> > > > > > > 4. About 'Interaction with Snapshot', to be continued ...
> > > > > > >
> > > > > > > Best,
> > > > > > > Jingsong
> > > > > > >
> > > > > > > On Tue, May 16, 2023 at 7:07 PM yu zelin <[email protected]> wrote:
> > > > > > > >
> > > > > > > > Hi, Paimon Devs,
> > > > > > > >      I’d like to start a discussion about PIP-4 [1]. In this PIP, I
> > > > > > > > want to talk about why we need savepoint, and some thoughts about
> > > > > > > > managing and using savepoint. Looking forward to your questions and
> > > > > > > > suggestions.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yu Zelin
> > > > > > > >
> > > > > > > > [1] https://cwiki.apache.org/confluence/x/NxE0Dw
> > > > > > >
> > > > > >
> > > > >
> > >
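
The savepoint semantics discussed in this thread (a savepoint keeps a
snapshot and its data files from being expired, it cannot be created on an
already expired snapshot, and a restore deletes everything newer than the
savepoint) can be illustrated with a minimal sketch. This is a toy
in-memory model in Python, not Paimon's implementation or API: the names
Table, commit, create_savepoint, expire and rollback_to are hypothetical,
and a snapshot is reduced to the set of data file names it references.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass
class Table:
    """Toy model of a table (hypothetical, not Paimon's API)."""
    snapshots: Dict[int, Set[str]] = field(default_factory=dict)  # id -> data files
    savepoints: Set[int] = field(default_factory=set)             # protected ids
    files: Set[str] = field(default_factory=set)                  # files "on disk"

    def commit(self, snapshot_id: int, data_files: Iterable[str]) -> None:
        self.snapshots[snapshot_id] = set(data_files)
        self.files |= self.snapshots[snapshot_id]

    def create_savepoint(self, snapshot_id: int) -> None:
        # A savepoint only records metadata; no data file is copied.
        if snapshot_id not in self.snapshots:
            raise ValueError("snapshot %d is expired or unknown" % snapshot_id)
        self.savepoints.add(snapshot_id)

    def expire(self, retain_latest: int) -> None:
        """Drop all but the newest snapshots, never dropping a savepoint."""
        if retain_latest < 1:
            raise ValueError("retain_latest must be at least 1")
        ids = sorted(self.snapshots)
        keep = set(ids[-retain_latest:]) | self.savepoints
        for sid in ids:
            if sid not in keep:
                del self.snapshots[sid]
        self._delete_unreferenced_files()

    def rollback_to(self, savepoint_id: int) -> None:
        """Irreversibly delete every snapshot newer than the savepoint."""
        if savepoint_id not in self.savepoints:
            raise ValueError("no savepoint for snapshot %d" % savepoint_id)
        for sid in [s for s in self.snapshots if s > savepoint_id]:
            del self.snapshots[sid]
        self.savepoints = {s for s in self.savepoints if s <= savepoint_id}
        self._delete_unreferenced_files()

    def _delete_unreferenced_files(self) -> None:
        # Files still referenced by a surviving snapshot (including
        # savepointed ones) must be kept.
        referenced = set().union(*self.snapshots.values())
        self.files &= referenced
```

In this model, expiring with retain_latest=1 after a savepoint on an older
snapshot keeps that snapshot's files, and rollback_to then removes the newer
snapshots together with files that only they referenced. Whether a real
restore should also remove newer savepoints is a design choice made here
only for the sketch.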
