Thanks Zelin for bringing up the discussion. I'm thinking about: 1. How to manage the savepoints if there is no expiration mechanism: via the storage's TTL management or an external script? 2. I think the id of the compacted snapshot picked by the savepoint and the manifest file list are also important information for users. Could this information be stored in the system table?
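For question 1, the external-script option could be as simple as a time-based sweep over savepoint metadata. The sketch below is purely illustrative: it assumes savepoints are available as hypothetical `(id, creation_time)` pairs, which is not Paimon's actual metadata layout.

```python
from datetime import datetime, timedelta

# Hypothetical TTL sweep for savepoints. Assumes savepoints are exposed as
# (id, creation_time) pairs; the real Paimon metadata layout may differ.
def expired_savepoints(savepoints, ttl, now):
    """Return ids of savepoints older than the TTL
    (candidates for a delete-savepoint action)."""
    return [sp_id for sp_id, created in savepoints if now - created > ttl]

savepoints = [
    (1, datetime(2023, 4, 1)),
    (2, datetime(2023, 5, 1)),
    (3, datetime(2023, 5, 20)),
]
now = datetime(2023, 5, 22)
# With a 30-day TTL, only savepoint 1 has expired.
print(expired_savepoints(savepoints, timedelta(days=30), now))  # [1]
```

Such a script would then invoke the proposed delete-savepoint action for each expired id; a built-in option like `savepoint.time-retained` (discussed below in the thread) would make this external step unnecessary.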
Best, Guojun On Mon, May 22, 2023 at 9:13 PM Jingsong Li <[email protected]> wrote: > FYI > > The PIP lacks a table to show Discussion thread & Vote thread & ISSUE... > > Best > Jingsong > > On Mon, May 22, 2023 at 4:48 PM yu zelin <[email protected]> wrote: > > > > Hi, all, > > > > Thank you all for your suggestions and questions. After reading your > suggestions, I have adopted some of them and I want to share my opinions here. > > > > To make my statements clearer, I will still use the word `savepoint`. > When we reach a consensus, the name may be changed. > > > > 1. The purposes of savepoint > > > > As Shammon mentioned, Flink and databases also have the concept of > `savepoint`. So it's better to clarify the purposes of our savepoint. > Thanks to Nicholas and Jingsong, I think your explanations are very clear. > I'd like to give my summary: > > > > (1) Fault recovery (or we can say disaster recovery). Users can ROLL > BACK to a savepoint if needed. If a user rolls back to a savepoint, the table > will hold the data in the savepoint and the data committed after the > savepoint will be deleted. In this scenario we need savepoints because > snapshots may have expired; the savepoint lives longer and saves the user's > old data. > > > > (2) Record versions of data at a longer interval (typically daily level > or weekly level). With a savepoint, users can query the old data in batch > mode. Compared to copying records to a new table or merging incremental records > with old records (like using merge into in Hive), the savepoint is more > lightweight because we don't copy data files, we just record the metadata > of them. > > > > As you can see, savepoint is very similar to snapshot. The differences > are: > > > > (1) Savepoint lives longer. In most cases, a snapshot's lifetime is > about several minutes to hours. We expect a savepoint to live several > days, weeks, or even months. > > > > (2) Savepoint is mainly used for batch reading of historical data. 
In > this PIP, we don't introduce streaming reading of savepoints. > > > > 2. Candidates of name > > > > I agree with Jingsong that we can use a new name. Since the purpose and > mechanism (savepoint is very similar to snapshot) of savepoint is similar > to `tag` in Iceberg, maybe we can use `tag`. > > > > In my opinion, an alternative is `anchor`. All the snapshots are like > the navigation path of the streaming data, and an `anchor` can stop it in a > place. > > > > 3. Public table operations and options > > > > We propose to expose some operations and table options for users to > manage savepoints. > > > > (1) Operations (Currently for Flink) > > We provide Flink actions to manage savepoints: > > create-savepoint: To generate a savepoint from the latest snapshot. > Supports creating from a specified snapshot. > > delete-savepoint: To delete a specified savepoint. > > rollback-to: To roll back to a specified savepoint. > > > > (2) Table options > > We propose to provide options for creating savepoints periodically: > > savepoint.create-time: When to create the savepoint. Example: 00:00 > > savepoint.create-interval: Interval between the creation of two > savepoints. Example: 2 d. > > savepoint.time-retained: The maximum time of savepoints to retain. > > > > (3) Procedures (future work) > > Spark supports SQL extensions. After we support the Spark CALL statement, we > can provide procedures to create, delete or roll back to a savepoint for Spark > users. > > > > Support for CALL is on the roadmap of Flink. In a future version, we can > also support savepoint-related procedures for Flink users. > > > > 4. Expiration of data files > > > > Currently, when a snapshot is expired, data files that are not used by > other snapshots will be deleted. After we introduce the savepoint, we must make sure the > data files saved by savepoints will not be deleted. > > > > Conversely, when a savepoint is deleted, the data files that are not > used by existing snapshots and other savepoints will be deleted. 
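The deletion rule in point 4 above can be sketched as a reference check: a data file is removable only when no surviving snapshot and no surviving savepoint still references it. This is a hedged illustration of the rule as stated in the thread, not Paimon's actual implementation; all names are hypothetical.

```python
# Hedged sketch of the expiration rule: when a snapshot expires or a
# savepoint is deleted, a data file may be removed only if no surviving
# snapshot or savepoint still references it. All names are hypothetical.
def files_to_delete(removed_files, live_snapshots, live_savepoints):
    """removed_files: files referenced by the expired snapshot or deleted savepoint.
    live_snapshots / live_savepoints: iterables of file sets still alive."""
    still_referenced = set()
    for referenced in list(live_snapshots) + list(live_savepoints):
        still_referenced |= set(referenced)
    return sorted(set(removed_files) - still_referenced)

# Snapshot s1 expires; f2 is kept alive by another snapshot, f3 by a savepoint.
print(files_to_delete(
    removed_files={"f1", "f2", "f3"},
    live_snapshots=[{"f2", "f4"}],
    live_savepoints=[{"f3"}],
))  # ['f1']
```

The same check applies symmetrically in both directions zelin describes: snapshot expiration must respect savepoint references, and savepoint deletion must respect snapshot references.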
> > > > I have written some POC code to implement it. I will update the mechanism > in the PIP soon. > > > > Best, > > Yu Zelin > > > > > On 21 May 2023, at 20:54, Jingsong Li <[email protected]> wrote: > > > > > > Thanks Yun for your information. > > > > > > We need to be careful to avoid confusion between Paimon and Flink > > > concepts about "savepoint". > > > > > > Maybe we don't have to insist on using this "savepoint"; for example, > > > TAG is also a candidate, just like Iceberg [1]. > > > > > > [1] https://iceberg.apache.org/docs/latest/branching/ > > > > > > Best, > > > Jingsong > > > > > > On Sun, May 21, 2023 at 8:51 PM Jingsong Li <[email protected]> > wrote: > > >> > > >> Thanks Nicholas for your detailed requirements. > > >> > > >> We need to supplement user requirements in the PIP, which is mainly aimed > > >> at two purposes: > > >> 1. Fault recovery for data errors (named: restore or rollback-to) > > >> 2. Used to record versions at the day level (for example), targeting > batch queries > > >> > > >> Best, > > >> Jingsong > > >> > > >> On Sat, May 20, 2023 at 2:55 PM Yun Tang <[email protected]> wrote: > > >>> > > >>> Hi Guys, > > >>> > > >>> Since we use Paimon with Flink in most cases, I think we need to > clarify the same word "savepoint" in different systems. > > >>> > > >>> For Flink, savepoint means: > > >>> > > >>> 1. Triggered by users, not periodically triggered by the system > itself. However, this PIP wants to support creating it periodically. > > >>> 2. Even the so-called incremental native savepoint [1] will > not depend on the previous checkpoints or savepoints; it will still copy > files on DFS to the self-contained savepoint folder. However, from the > description of this PIP about the deletion of expired snapshot files, the > Paimon savepoint will refer to the previously existing files directly. > > >>> > > >>> I don't think we need to make the semantics of Paimon totally the > same as Flink's. 
However, we need to introduce a table to tell the > difference compared with Flink and discuss the difference. > > >>> > > >>> [1] > https://cwiki.apache.org/confluence/display/FLINK/FLIP-203%3A+Incremental+savepoints#FLIP203:Incrementalsavepoints-Semantic > > >>> > > >>> Best > > >>> Yun Tang > > >>> ________________________________ > > >>> From: Nicholas Jiang <[email protected]> > > >>> Sent: Friday, May 19, 2023 17:40 > > >>> To: [email protected] <[email protected]> > > >>> Subject: Re: [DISCUSS] PIP-4 Support savepoint > > >>> > > >>> Hi Guys, > > >>> > > >>> Thanks Zelin for driving the savepoint proposal. I'd like to share some > opinions on savepoint: > > >>> > > >>> -- About "introduce savepoint for Paimon to persist full data in a > time point" > > >>> > > >>> The motivation of the savepoint proposal is more like snapshot TTL > management. Actually, disaster recovery is very much mission critical for > any software. Especially when it comes to data systems, the impact could be > very serious, leading to delays in business decisions or even wrong business > decisions at times. Savepoint is proposed to assist users in recovering > data from a previous state, via two operations: "savepoint" and "restore". > > >>> > > >>> "savepoint" saves the Paimon table as of the commit time; therefore, > if there is a savepoint, the data generated in the corresponding commit > cannot be cleaned. Meanwhile, a savepoint lets users restore the table > to this savepoint at a later point in time if need be. On similar lines, a > savepoint cannot be triggered on a commit that is already cleaned up. > Savepoint is synonymous with taking a backup, just that we don't make a new > copy of the table, but save the state of the table elegantly so that > we can restore it later when needed. > > >>> > > >>> "restore" lets you restore your table to one of the savepoint > commits. Meanwhile, it cannot be undone (or reversed), so care should be > taken before doing a restore. 
At this time, Paimon would delete all data > files and commit files (timeline files) greater than the savepoint commit > to which the table is being restored. > > >>> > > >>> BTW, it's better to introduce a snapshot view based on savepoint, > which could improve query performance of historical data for the Paimon table. > > >>> > > >>> -- About Public API of savepoint > > >>> > > >>> The currently introduced savepoint interfaces in the Public API are not enough > for users; for example, deleteSavepoint, restoreSavepoint, etc. > > >>> > > >>> -- About "Paimon's savepoint need to be combined with Flink's > savepoint": > > >>> > > >>> If Paimon supports the savepoint mechanism and provides savepoint > interfaces, the integration with Flink's savepoint is not blocked by this > proposal. > > >>> > > >>> In summary, savepoint is not only used to improve the query > performance of historical data, but also for disaster recovery > processing. > > >>> > > >>> On 2023/05/17 09:53:11 Jingsong Li wrote: > > >>>> What Shammon mentioned is interesting. I agree with what he said > about > > >>>> the differences in savepoints between databases and stream > computing. > > >>>> > > >>>> About "Paimon's savepoint need to be combined with Flink's > savepoint": > > >>>> > > >>>> I think it is possible, but we may need to deal with this via another > > >>>> mechanism, because the snapshots after a savepoint may expire. We need > > >>>> to compare data between two savepoints to generate incremental data > > >>>> for streaming read. > > >>>> > > >>>> But this may not need to block the PIP; it looks like the current > design > > >>>> does not break the future combination? > > >>>> > > >>>> Best, > > >>>> Jingsong > > >>>> > > >>>> On Wed, May 17, 2023 at 5:33 PM Shammon FY <[email protected]> > wrote: > > >>>>> > > >>>>> Hi Caizhi, > > >>>>> > > >>>>> Thanks for your comments. As you mentioned, I think we may need to > discuss > > >>>>> the role of savepoint in Paimon. 
> > >>>>> > > >>>>> If I understand correctly, the main feature of savepoint in the > current PIP > > >>>>> is that the savepoint will not expire, and users can perform a > query on > > >>>>> the savepoint via time-travel. Besides that, there is > savepoint in > > >>>>> the database and Flink. > > >>>>> > > >>>>> 1. Savepoint in the database. The database can roll back table data to > the > > >>>>> specified 'version' based on a savepoint. So the key point of > savepoint in > > >>>>> the database is to roll back data. > > >>>>> > > >>>>> 2. Savepoint in Flink. Users can trigger a savepoint with a > specific > > >>>>> 'path', and save all state data to the savepoint for the job. Then > users can > > >>>>> create a new job based on the savepoint to continue consuming > incremental > > >>>>> data. I think the core capabilities are: backup for a job, and > resume a job > > >>>>> based on the savepoint. > > >>>>> > > >>>>> In addition to the above, Paimon may also face data write > corruption and > > >>>>> need to recover data based on a specified savepoint. So we may > need to > > >>>>> consider what abilities Paimon's savepoint needs besides the > ones > > >>>>> mentioned in the current PIP. > > >>>>> > > >>>>> Additionally, as mentioned above, Flink also has a > > >>>>> savepoint mechanism. During the process of streaming data from > Flink to > > >>>>> Paimon, does Paimon's savepoint need to be combined with Flink's > savepoint? > > >>>>> > > >>>>> > > >>>>> Best, > > >>>>> Shammon FY > > >>>>> > > >>>>> > > >>>>> On Wed, May 17, 2023 at 4:02 PM Caizhi Weng <[email protected]> > wrote: > > >>>>> > > >>>>>> Hi developers! > > >>>>>> > > >>>>>> Thanks Zelin for bringing up the discussion. The proposal seems > good to me > > >>>>>> overall. However, I'd also like to bring up a few points. > > >>>>>> > > >>>>>> 1. As Jingsong mentioned, the Savepoint class should not become a > public API, > > >>>>>> at least for now. 
What we need to discuss for the public API is > how the > > >>>>>> users can create or delete savepoints. For example, what the > table option > > >>>>>> looks like, what commands and options are provided for the Flink > action, > > >>>>>> etc. > > >>>>>> > > >>>>>> 2. Currently most Flink actions are related to streaming > processing, so > > >>>>>> only Flink can support them. However, savepoint creation and > deletion seem > > >>>>>> like a feature for batch processing. So aside from Flink actions, > shall we > > >>>>>> also provide something like Spark actions for savepoints? > > >>>>>> > > >>>>>> I would also like to comment on Shammon's views. > > >>>>>> > > >>>>>> Should we introduce an option for savepoint path which may be > different > > >>>>>>> from 'warehouse'? Then users can backup the data of savepoint. > > >>>>>>> > > >>>>>> > > >>>>>> I don't see this as necessary. To back up a table, the user just > needs to copy > > >>>>>> all files from the table directory. Savepoint in Paimon, as far > as I > > >>>>>> understand, is mainly for users to review historical data, not > for backing > > >>>>>> up tables. > > >>>>>> > > >>>>>> Will the savepoint copy data files from snapshot or only save > meta files? > > >>>>>>> > > >>>>>> > > >>>>>> It would be a heavy burden if a savepoint copied all its files. > As I > > >>>>>> mentioned above, savepoint is not for backing up tables. > > >>>>>> > > >>>>>> How can users create a new table and restore data from the > specified > > >>>>>>> savepoint? > > >>>>>> > > >>>>>> > > >>>>>> This reminds me of savepoints in Flink. Still, savepoint is not > for backing > > >>>>>> up tables, so I guess we don't need to support "restoring data" > from a > > >>>>>> savepoint. > > >>>>>> > > >>>>>> Shammon FY <[email protected]> wrote on Wed, May 17, 2023 at 10:32: > > >>>>>> > > >>>>>>> Thanks Zelin for initiating this discussion. I have some > comments: > > >>>>>>> > > >>>>>>> 1. 
Should we introduce an option for savepoint path which may be > > >>>>>> different > > >>>>>>> from 'warehouse'? Then users can backup the data of savepoint. > > >>>>>>> > > >>>>>>> 2. Will the savepoint copy data files from snapshot or only save > meta > > >>>>>>> files? The description in the PIP "After we introduce savepoint, > we > > >>>>>> should > > >>>>>>> also check if the data files are used by savepoints." looks like > we only > > >>>>>>> save meta files for savepoint. > > >>>>>>> > > >>>>>>> 3. How can users create a new table and restore data from the > specified > > >>>>>>> savepoint? > > >>>>>>> > > >>>>>>> Best, > > >>>>>>> Shammon FY > > >>>>>>> > > >>>>>>> > > >>>>>>> On Wed, May 17, 2023 at 10:19 AM Jingsong Li < [email protected]> > > >>>>>>> wrote: > > >>>>>>> > > >>>>>>>> Thanks Zelin for driving. > > >>>>>>>> > > >>>>>>>> Some comments: > > >>>>>>>> > > >>>>>>>> 1. I think it's possible to advance `Proposed Changes` to the > top; > > >>>>>>>> the Public API has no meaning if I don't know how to do it. > > >>>>>>>> > > >>>>>>>> 2. Public API: Savepoint and SavepointManager are not Public > API; only > > >>>>>>>> the Flink action or configuration option should be public API. > > >>>>>>>> > > >>>>>>>> 3. Maybe we can have a separate chapter to describe > > >>>>>>>> `savepoint.create-interval`, maybe 'Periodically savepoint'? It > is not > > >>>>>>>> just an interval, because the true use case is a savepoint after > 0:00. > > >>>>>>>> > > >>>>>>>> 4. About 'Interaction with Snapshot', to be continued ... > > >>>>>>>> > > >>>>>>>> Best, > > >>>>>>>> Jingsong > > >>>>>>>> > > >>>>>>>> On Tue, May 16, 2023 at 7:07 PM yu zelin <[email protected] > > > > >>>>>> wrote: > > >>>>>>>>> > > >>>>>>>>> Hi, Paimon Devs, > > >>>>>>>>> I'd like to start a discussion about PIP-4 [1]. In this > PIP, I > > >>>>>> want > > >>>>>>>> to talk about why we need savepoint, and some thoughts about > managing > > >>>>>> and > > >>>>>>>> using savepoint. 
Looking forward to your questions and suggestions. > > >>>>>>>>> > > >>>>>>>>> Best, > > >>>>>>>>> Yu Zelin > > >>>>>>>>> > > >>>>>>>>> [1] https://cwiki.apache.org/confluence/x/NxE0Dw > > >>>>>>>> > > >>>>>>> > > >>>>>> > > >>>> > > >
