Re: [DISCUSS] PIP-4 Support savepoint

Jingsong Li Sun, 21 May 2023 05:51:55 -0700

Thanks Nicholas for your detailed requirements.

We need to add the user requirements to the PIP. They mainly serve
two purposes:
1. Fault recovery for data errors (named: restore or rollback-to)
2. Used to record versions at the day level (such as), targeting batch queries
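
To make the two purposes concrete, here is a minimal sketch in Python. It is
a toy in-memory model, not Paimon code: ToyTable, commit, create_savepoint,
expire and rollback_to are all hypothetical names, and a savepoint is modeled
simply as a snapshot id that expiration must skip. It assumes that data files
can be shared between snapshots, so a file is only deletable once no remaining
snapshot references it.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy model (hypothetical): snapshot id -> data files it references."""

    snapshots: dict = field(default_factory=dict)
    savepoints: set = field(default_factory=set)  # ids of pinned snapshots

    def commit(self, snapshot_id, files):
        self.snapshots[snapshot_id] = set(files)

    def create_savepoint(self, snapshot_id):
        # A savepoint can only be taken on a snapshot that still exists.
        if snapshot_id not in self.snapshots:
            raise ValueError(f"snapshot {snapshot_id} is already cleaned up")
        self.savepoints.add(snapshot_id)

    def _drop(self, ids):
        # Remove the given snapshots; return files nothing else references.
        candidates = set()
        for i in ids:
            candidates |= self.snapshots.pop(i)
        still_used = set().union(*self.snapshots.values())
        return candidates - still_used

    def expire(self, retain_latest):
        """Expire all but the newest snapshots, skipping savepointed ones."""
        ids = sorted(self.snapshots)
        cutoff = max(len(ids) - retain_latest, 0)
        return self._drop([i for i in ids[:cutoff] if i not in self.savepoints])

    def rollback_to(self, savepoint_id):
        """Irreversibly discard every snapshot newer than the savepoint."""
        if savepoint_id not in self.savepoints:
            raise ValueError(f"{savepoint_id} is not a savepoint")
        newer = [i for i in self.snapshots if i > savepoint_id]
        self.savepoints -= set(newer)
        return self._drop(newer)
```

In this sketch, expire returns only files that no surviving snapshot or
savepoint references, which covers the day-level version use case, and
rollback_to covers fault recovery by dropping everything committed after the
savepoint.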


Best,
Jingsong

On Sat, May 20, 2023 at 2:55 PM Yun Tang <[email protected]> wrote:
>
> Hi Guys,
>
> Since we use Paimon with Flink in most cases, I think we need to clarify what
> the word "savepoint" means in different systems.
>
> For Flink, savepoint means:
>
>   1.  Triggered by users, not periodically triggered by the system itself.
> However, this PIP wants to support creating savepoints periodically.
>   2.  Even the so-called incremental native savepoint [1] does not depend
> on previous checkpoints or savepoints; it still copies files on DFS to
> the self-contained savepoint folder. However, judging from this PIP's
> description of deleting expired snapshot files, a Paimon savepoint will
> refer to the previously existing files directly.
>
> I don't think we need to make the semantics of Paimon exactly the same as
> Flink's. However, we should add a table that lists the differences from
> Flink and discuss them.
>
> [1] 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-203%3A+Incremental+savepoints#FLIP203:Incrementalsavepoints-Semantic
>
> Best
> Yun Tang
> ________________________________
> From: Nicholas Jiang <[email protected]>
> Sent: Friday, May 19, 2023 17:40
> To: [email protected] <[email protected]>
> Subject: Re: [DISCUSS] PIP-4 Support savepoint
>
> Hi Guys,
>
> Thanks Zelin for driving the savepoint proposal. I have some opinions on
> savepoint:
>
> -- About "introduce savepoint for Paimon to persist full data in a time point"
>
> The motivation of the savepoint proposal reads more like snapshot TTL
> management. However, disaster recovery is mission critical for any software,
> especially for data systems, where the impact can be very serious: delayed
> or even wrong business decisions. Savepoint is proposed to help users
> recover data from a previous state through two operations: "savepoint" and
> "restore".
>
> "savepoint" saves the Paimon table as of the commit time. Therefore, if there
> is a savepoint, the data generated in the corresponding commit cannot be
> cleaned up, and the user can restore the table to this savepoint at a later
> point in time if need be. Along similar lines, a savepoint cannot be
> triggered on a commit that is already cleaned up. A savepoint is like taking
> a backup, except that we don't make a new copy of the table; we only save
> the state of the table so that we can restore it later when needed.
>
> "restore" lets you restore your table to one of the savepoint commits.
> It cannot be undone (or reversed), so care should be taken before doing a
> restore. On restore, Paimon would delete all data files and commit files
> (timeline files) later than the savepoint commit to which the table is
> being restored.
>
> BTW, it would be better to introduce a snapshot view based on savepoint,
> which could improve query performance on historical data for Paimon tables.
>
> -- About Public API of savepoint
>
> The savepoint interfaces currently introduced in the Public API are not
> enough for users; for example, deleteSavepoint, restoreSavepoint, etc. are
> missing.
>
> -- About "Paimon's savepoint need to be combined with Flink's savepoint":
>
> If Paimon supports a savepoint mechanism and provides savepoint interfaces,
> this proposal does not block the integration with Flink's savepoint.
>
> In summary, savepoint is not only used to improve the query performance of 
> historical data, but also used for disaster recovery processing.
>
> On 2023/05/17 09:53:11 Jingsong Li wrote:
> > What Shammon mentioned is interesting. I agree with what he said about
> > the differences in savepoints between databases and stream computing.
> >
> > About "Paimon's savepoint need to be combined with Flink's savepoint":
> >
> > I think it is possible, but we may need to deal with this in another
> > mechanism, because the snapshots after savepoint may expire. We need
> > to compare data between two savepoints to generate incremental data
> > for streaming read.
> >
> > But this may not need to block the PIP; it looks like the current design
> > does not break the future combination?
> >
> > Best,
> > Jingsong
> >
> > On Wed, May 17, 2023 at 5:33 PM Shammon FY <[email protected]> wrote:
> > >
> > > Hi Caizhi,
> > >
> > > Thanks for your comments. As you mentioned, I think we may need to discuss
> > > the role of savepoint in Paimon.
> > >
> > > > If I understand correctly, the main feature of savepoint in the current
> > > > PIP is that the savepoint will not expire, and users can query the
> > > > savepoint via time travel. Besides that, there are savepoints in
> > > > databases and in Flink.
> > >
> > > 1. Savepoint in database. The database can roll back table data to the
> > > specified 'version' based on savepoint. So the key point of savepoint in
> > > the database is to rollback data.
> > >
> > > > 2. Savepoint in Flink. Users can trigger a savepoint with a specific
> > > > 'path', and save all state data of the job to the savepoint. Then users
> > > > can create a new job based on the savepoint to continue consuming
> > > > incremental data. I think the core capabilities are: backing up a job,
> > > > and resuming a job based on the savepoint.
> > >
> > > > In addition to the above, Paimon may also face data write corruption and
> > > > need to recover data based on the specified savepoint. So we may need to
> > > > consider what abilities a Paimon savepoint needs besides the ones
> > > > mentioned in the current PIP.
> > > >
> > > > Additionally, as mentioned above, Flink also has a savepoint mechanism.
> > > > During the process of streaming data from Flink to Paimon, does Paimon's
> > > > savepoint need to be combined with Flink's savepoint?
> > >
> > >
> > > Best,
> > > Shammon FY
> > >
> > >
> > > On Wed, May 17, 2023 at 4:02 PM Caizhi Weng <[email protected]> wrote:
> > >
> > > > Hi developers!
> > > >
> > > > > Thanks Zelin for bringing up the discussion. The proposal seems good to
> > > > > me overall. However, I'd also like to bring up a few options.
> > > >
> > > > > 1. As Jingsong mentioned, the Savepoint class should not become a public
> > > > > API, at least for now. What we need to discuss for the public API is how
> > > > > users can create or delete savepoints. For example, what the table
> > > > > option looks like, what commands and options are provided for the Flink
> > > > > action, etc.
> > > >
> > > > > 2. Currently most Flink actions are related to stream processing, so
> > > > > only Flink can support them. However, savepoint creation and deletion
> > > > > seem like features for batch processing. So aside from Flink actions,
> > > > > shall we also provide something like Spark actions for savepoints?
> > > >
> > > > I would also like to comment on Shammon's views.
> > > >
> > > > > > Should we introduce an option for savepoint path which may be different
> > > > > > from 'warehouse'? Then users can backup the data of savepoint.
> > > > > >
> > > >
> > > > > I don't think this is necessary. To back up a table, the user just needs
> > > > > to copy all files from the table directory. Savepoint in Paimon, as far
> > > > > as I understand, is mainly for users to review historical data, not for
> > > > > backing up tables.
> > > >
> > > > > > Will the savepoint copy data files from snapshot or only save meta
> > > > > > files?
> > > > > >
> > > >
> > > > It would be a heavy burden if a savepoint copies all its files. As I
> > > > mentioned above, savepoint is not for backing up tables.
> > > >
> > > > > > How can users create a new table and restore data from the specified
> > > > > > savepoint?
> > > >
> > > >
> > > > > This reminds me of savepoints in Flink. Still, savepoint is not for
> > > > > backing up tables, so I guess we don't need to support "restoring data"
> > > > > from a savepoint.
> > > >
> > > > > On Wed, May 17, 2023 at 10:32 AM Shammon FY <[email protected]> wrote:
> > > >
> > > > > Thanks Zelin for initiating this discussion. I have some comments:
> > > > >
> > > > > > 1. Should we introduce an option for savepoint path which may be
> > > > > > different from 'warehouse'? Then users can backup the data of savepoint.
> > > > > >
> > > > > > 2. Will the savepoint copy data files from snapshot or only save meta
> > > > > > files? The description in the PIP "After we introduce savepoint, we
> > > > > > should also check if the data files are used by savepoints." looks like
> > > > > > we only save meta files for savepoint.
> > > > > >
> > > > > > 3. How can users create a new table and restore data from the
> > > > > > specified savepoint?
> > > > >
> > > > > Best,
> > > > > Shammon FY
> > > > >
> > > > >
> > > > > On Wed, May 17, 2023 at 10:19 AM Jingsong Li <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > Thanks Zelin for driving.
> > > > > >
> > > > > > Some comments:
> > > > > >
> > > > > > 1. I think it's possible to advance `Proposed Changes` to the top,
> > > > > > Public API has no meaning if I don't know how to do it.
> > > > > >
> > > > > > > 2. Public API: Savepoint and SavepointManager are not Public API;
> > > > > > > only Flink actions or configuration options should be public API.
> > > > > > >
> > > > > > > 3. Maybe we can have a separate chapter to describe
> > > > > > > `savepoint.create-interval`, maybe 'Periodically savepoint'? It is
> > > > > > > not just an interval, because the true use case is a savepoint after
> > > > > > > 0:00.
> > > > > > >
> > > > > > > 4. About 'Interaction with Snapshot', to be continued ...
> > > > > >
> > > > > > Best,
> > > > > > Jingsong
> > > > > >
> > > > > > > On Tue, May 16, 2023 at 7:07 PM yu zelin <[email protected]> wrote:
> > > > > > > >
> > > > > > > > Hi, Paimon Devs,
> > > > > > > >      I’d like to start a discussion about PIP-4 [1]. In this PIP, I
> > > > > > > > want to talk about why we need savepoint, and some thoughts about
> > > > > > > > managing and using savepoint. I look forward to your questions and
> > > > > > > > suggestions.
> > > > > > >
> > > > > > > Best,
> > > > > > > Yu Zelin
> > > > > > >
> > > > > > > [1] https://cwiki.apache.org/confluence/x/NxE0Dw
> > > > > >
> > > > >
> > > >
> >
