+1 to vote
On Tue, May 30, 2023 at 6:22 PM yu zelin <[email protected]> wrote:
>
> Hi, all,
>
> Does anyone have questions or feedback?
>
> I will wait a while for your replies. If there are none, I'd like to start a vote later.
>
> Best,
> Yu Zelin
>
> On May 30, 2023, at 16:19, yu zelin <[email protected]> wrote:
> >
> > I agree with you, @Jingsong.
> >
> > Best,
> > Yu Zelin
> >
> >> On May 30, 2023, at 16:15, Jingsong Li <[email protected]> wrote:
> >>
> >> I think we can just throw an exception for pure numeric tag names.
> >>
> >> Iceberg's behavior looks confusing.
> >>
> >> Best,
> >> Jingsong
> >>
> >> On Tue, May 30, 2023 at 3:40 PM yu zelin <[email protected]> wrote:
> >>>
> >>> Hi, Shammon,
> >>>
> >>> An intuitive way is to use a numeric string to indicate a snapshot and a non-numeric string to indicate a tag. For example:
> >>>
> >>> SELECT * FROM t VERSION AS OF 1           -- to snapshot #1
> >>> SELECT * FROM t VERSION AS OF 'last_year' -- to tag `last_year`
> >>>
> >>> This is also what Iceberg does [1].
> >>>
> >>> However, with this approach the tag name cannot be a numeric string. I think this is acceptable and I will add it to the document.
> >>>
> >>> Best,
> >>> Yu Zelin
> >>>
> >>> [1] https://iceberg.apache.org/docs/latest/spark-queries/#sql
> >>>
> >>>> On May 30, 2023, at 12:17, Shammon FY <[email protected]> wrote:
> >>>>
> >>>> Hi zelin,
> >>>>
> >>>> Thanks for your update. I have one comment about time travel on savepoints.
> >>>>
> >>>> Currently we can use this statement in Spark for snapshot 1:
> >>>> SELECT * FROM t VERSION AS OF 1;
> >>>>
> >>>> My point is: how can we distinguish between a snapshot and a savepoint when users submit a statement as follows?
> >>>> SELECT * FROM t VERSION AS OF <version value>;
> >>>>
> >>>> Best,
> >>>> Shammon FY
> >>>>
> >>>> On Tue, May 30, 2023 at 11:37 AM yu zelin <[email protected]> wrote:
> >>>>
> >>>>> Hi, Jingsong,
> >>>>>
> >>>>> Thanks for your feedback.
> >>>>>
> >>>>> ## TAG ID
> >>>>> It seems the id is useless currently. I'll remove it.
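The numeric-versus-string resolution rule agreed on above (treat a pure numeric `VERSION AS OF` value as a snapshot id, anything else as a tag name, and throw an exception for pure numeric tag names) could look roughly like this. This is an illustrative sketch, not Paimon's actual code; all class and method names are hypothetical:

```java
// Hypothetical sketch of the resolution rule discussed in the thread:
// a pure-numeric version value is treated as a snapshot id, anything else
// as a tag name, and pure-numeric tag names are rejected at creation time
// so the two namespaces can never collide.
public class VersionResolver {

    public enum Kind { SNAPSHOT, TAG }

    /** Returns true if the string is a non-empty run of ASCII digits. */
    static boolean isNumeric(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isDigit(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    /** Resolves the argument of VERSION AS OF to a snapshot id or a tag name. */
    static Kind resolve(String version) {
        return isNumeric(version) ? Kind.SNAPSHOT : Kind.TAG;
    }

    /** Rejects pure-numeric tag names, mirroring the "throw an exception" decision. */
    static void validateTagName(String tagName) {
        if (isNumeric(tagName)) {
            throw new IllegalArgumentException(
                    "Tag name '" + tagName + "' must not be a pure numeric string");
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("1"));         // SNAPSHOT
        System.out.println(resolve("last_year")); // TAG
        validateTagName("last_year");             // ok
        try {
            validateTagName("2023");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```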
> >>>>>
> >>>>> ## Time Travel Syntax
> >>>>> Since the tag id is removed, we can just use:
> >>>>>
> >>>>> SELECT * FROM t VERSION AS OF 'tag-name'
> >>>>>
> >>>>> to travel to a tag.
> >>>>>
> >>>>> ## Tag class
> >>>>> I agree with you that we can reuse the Snapshot class. We can introduce `TagManager` only to manage tags.
> >>>>>
> >>>>> ## Expiring Snapshot
> >>>>>> why not record it in ManifestEntry?
> >>>>> This is because every time Paimon generates a snapshot, it creates new ManifestEntries for the data files. Consider this scenario: if we record it in ManifestEntry, and we commit data file A to snapshot #1, we get manifest entry Entry#1 as [ADD, A, commit at #1]. Then we commit -A to snapshot #2 and get manifest entry Entry#2 as [DELETE, A, ?]. As you can see, we can no longer tell at which snapshot file A was committed. So we have to record this information in the data file meta directly.
> >>>>>
> >>>>>> We should note that "record it in `DataFileMeta`" should be done before "tag" and document version compatibility.
> >>>>>
> >>>>> I will add a note for this.
> >>>>>
> >>>>> Best,
> >>>>> Yu Zelin
> >>>>>
> >>>>>> On May 29, 2023, at 10:29, Jingsong Li <[email protected]> wrote:
> >>>>>>
> >>>>>> Thanks Zelin for the update.
> >>>>>>
> >>>>>> ## TAG ID
> >>>>>>
> >>>>>> Is this useful? We have tag-name, snapshot-id, and now we are introducing a tag id? What is it used for?
> >>>>>>
> >>>>>> ## Time Travel
> >>>>>>
> >>>>>> SELECT * FROM t VERSION AS OF tag-name.<name>
> >>>>>>
> >>>>>> This does not look like SQL standard.
> >>>>>>
> >>>>>> Why do we introduce this `tag-name` prefix?
> >>>>>>
> >>>>>> ## Tag class
> >>>>>>
> >>>>>> Why not just use the Snapshot class? It looks like we don't need to introduce a Tag class. We can just copy the snapshot file to tag/.
> >>>>>>
> >>>>>> ## Expiring Snapshot
> >>>>>>
> >>>>>> We should note that "record it in `DataFileMeta`" should be done before "tag". And document version compatibility.
> >>>>>> And why not record it in ManifestEntry?
> >>>>>>
> >>>>>> Best,
> >>>>>> Jingsong
> >>>>>>
> >>>>>> On Fri, May 26, 2023 at 11:15 AM yu zelin <[email protected]> wrote:
> >>>>>>>
> >>>>>>> Hi, all,
> >>>>>>>
> >>>>>>> FYI, I have updated the PIP [1].
> >>>>>>>
> >>>>>>> Main changes:
> >>>>>>> - Use the new name `tag`
> >>>>>>> - Enrich the Motivation section
> >>>>>>> - New section `Data Files Handling` to describe how to determine whether a data file can be deleted
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Yu Zelin
> >>>>>>>
> >>>>>>> [1] https://cwiki.apache.org/confluence/x/NxE0Dw
> >>>>>>>
> >>>>>>>> On May 24, 2023, at 17:18, yu zelin <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>> Hi, Guojun,
> >>>>>>>>
> >>>>>>>> I'd like to share my thoughts about your questions.
> >>>>>>>>
> >>>>>>>> 1. Expiration of savepoints
> >>>>>>>> In my opinion, savepoints are created at long intervals, so there will not be too many of them. If users create a savepoint per day, there are 365 savepoints a year. So I didn't consider expiration, and I think providing a Flink action like `delete-savepoint id = 1` is enough for now. But if it is really important, we can introduce table options to do so, much like expiring snapshots.
> >>>>>>>>
> >>>>>>>> 2. > id of compacted snapshot picked by the savepoint
> >>>>>>>> My initial idea was to pick a compacted snapshot, or to do a compaction before creating the savepoint. But after discussing with Jingsong, I found it's difficult. So now I propose to directly create the savepoint from the given snapshot. Maybe we can optimize this later.
> >>>>>>>> The changes will be updated soon.
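Zelin's ADD/DELETE scenario above can be made concrete with a small model. This is a hypothetical sketch (class names like `ManifestEntry` and `DataFileMeta` and the fields used here are illustrative, not Paimon's real definitions): a DELETE manifest entry only knows the snapshot that wrote it, so the creation snapshot must live in the file's own metadata:

```java
// Hypothetical model of the problem described above: a DELETE manifest entry
// written at snapshot #2 carries the snapshot it was written in, not the
// snapshot where file A was first ADDed. Storing the creation snapshot in the
// file's own metadata keeps that information available when deciding whether
// a tag still references the file.
import java.util.ArrayList;
import java.util.List;

public class FileLifetime {

    enum Kind { ADD, DELETE }

    /** One manifest entry: it only knows the snapshot that produced *this* entry. */
    record ManifestEntry(Kind kind, String fileName, long writtenAtSnapshot) {}

    /** File metadata that records the snapshot the file was created in. */
    record DataFileMeta(String fileName, long creationSnapshotId) {}

    /** With only manifest entries, the latest entry for A is a DELETE at #2;
     *  the ADD snapshot is not recoverable from that entry alone. */
    static long creationSnapshotFromLatestEntry(List<ManifestEntry> entries, String file) {
        ManifestEntry latest = null;
        for (ManifestEntry e : entries) {
            if (e.fileName().equals(file)) {
                latest = e;
            }
        }
        // A DELETE entry's writtenAtSnapshot is the deleting snapshot, not the creator.
        return latest == null ? -1 : latest.writtenAtSnapshot();
    }

    public static void main(String[] args) {
        List<ManifestEntry> entries = new ArrayList<>();
        entries.add(new ManifestEntry(Kind.ADD, "A", 1));    // commit A at snapshot #1
        entries.add(new ManifestEntry(Kind.DELETE, "A", 2)); // commit -A at snapshot #2

        // Reading only the latest entry yields 2, losing the true creation snapshot 1.
        System.out.println(creationSnapshotFromLatestEntry(entries, "A"));

        // Recording it in the file meta keeps the answer stable.
        DataFileMeta meta = new DataFileMeta("A", 1);
        System.out.println(meta.creationSnapshotId());
    }
}
```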
> >>>>>>>>> manifest file list in system-table
> >>>>>>>> I think the manifest file is not very important for users. Users can find when a savepoint was created, get the savepoint id, and then query the savepoint by that id. I don't see in which scenario users would need the manifest file information. What do you think?
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Yu Zelin
> >>>>>>>>
> >>>>>>>>> On May 24, 2023, at 10:50, Guojun Li <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>> Thanks zelin for bringing up the discussion. I'm thinking about:
> >>>>>>>>> 1. If there is no expiration mechanism, how are savepoints managed, e.g. by the TTL management of the storage or by an external script?
> >>>>>>>>> 2. I think the id of the compacted snapshot picked by the savepoint and the manifest file list are also important information for users. Could this information be stored in the system table?
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Guojun
> >>>>>>>>>
> >>>>>>>>> On Mon, May 22, 2023 at 9:13 PM Jingsong Li <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>>> FYI
> >>>>>>>>>>
> >>>>>>>>>> The PIP lacks a table to show Discussion thread & Vote thread & ISSUE...
> >>>>>>>>>>
> >>>>>>>>>> Best
> >>>>>>>>>> Jingsong
> >>>>>>>>>>
> >>>>>>>>>> On Mon, May 22, 2023 at 4:48 PM yu zelin <[email protected]> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Hi, all,
> >>>>>>>>>>>
> >>>>>>>>>>> Thank you all for your suggestions and questions. After reading them, I have adopted some, and I want to share my opinions here.
> >>>>>>>>>>>
> >>>>>>>>>>> To make my statements clearer, I will still use the word `savepoint`. When we reach a consensus, the name may be changed.
> >>>>>>>>>>>
> >>>>>>>>>>> 1. The purposes of savepoint
> >>>>>>>>>>>
> >>>>>>>>>>> As Shammon mentioned, Flink and databases also have the concept of `savepoint`, so it's better to clarify the purposes of ours. Thanks to Nicholas and Jingsong; I think your explanations are very clear. I'd like to give my summary:
> >>>>>>>>>>>
> >>>>>>>>>>> (1) Fault recovery (or we can say disaster recovery). Users can ROLL BACK to a savepoint if needed. If a user rolls back to a savepoint, the table will hold the data in the savepoint, and the data committed after the savepoint will be deleted. In this scenario we need savepoints because snapshots may have expired; a savepoint can be kept longer and preserves the user's old data.
> >>>>>>>>>>>
> >>>>>>>>>>> (2) Recording versions of data at a longer interval (typically daily or weekly). With a savepoint, users can query the old data in batch mode. Compared to copying records to a new table or merging incremental records with old records (like using MERGE INTO in Hive), a savepoint is more lightweight: we don't copy data files, we just record their metadata.
> >>>>>>>>>>>
> >>>>>>>>>>> As you can see, a savepoint is very similar to a snapshot. The differences are:
> >>>>>>>>>>>
> >>>>>>>>>>> (1) A savepoint lives longer. In most cases, a snapshot's lifetime is several minutes to hours. We expect a savepoint to live for several days, weeks, or even months.
> >>>>>>>>>>>
> >>>>>>>>>>> (2) A savepoint is mainly used for batch reading of historical data. In this PIP, we don't introduce streaming reading for savepoints.
> >>>>>>>>>>>
> >>>>>>>>>>> 2. Candidates for the name
> >>>>>>>>>>>
> >>>>>>>>>>> I agree with Jingsong that we can use a new name. Since the purpose and mechanism of savepoint (it is very similar to snapshot) are similar to `tag` in Iceberg, maybe we can use `tag`.
> >>>>>>>>>>>
> >>>>>>>>>>> In my opinion, an alternative is `anchor`. All the snapshots are like the navigation path of the streaming data, and an `anchor` can stop it in a place.
> >>>>>>>>>>>
> >>>>>>>>>>> 3. Public table operations and options
> >>>>>>>>>>>
> >>>>>>>>>>> We propose to expose some operations and table options for users to manage savepoints.
> >>>>>>>>>>>
> >>>>>>>>>>> (1) Operations (currently for Flink)
> >>>>>>>>>>> We provide Flink actions to manage savepoints:
> >>>>>>>>>>> create-savepoint: generate a savepoint from the latest snapshot; creating from a specified snapshot is also supported.
> >>>>>>>>>>> delete-savepoint: delete a specified savepoint.
> >>>>>>>>>>> rollback-to: roll back to a specified savepoint.
> >>>>>>>>>>>
> >>>>>>>>>>> (2) Table options
> >>>>>>>>>>> We propose to provide options for creating savepoints periodically:
> >>>>>>>>>>> savepoint.create-time: when to create the savepoint. Example: 00:00
> >>>>>>>>>>> savepoint.create-interval: interval between the creation of two savepoints. Example: 2 d
> >>>>>>>>>>> savepoint.time-retained: the maximum time to retain savepoints.
> >>>>>>>>>>>
> >>>>>>>>>>> (3) Procedures (future work)
> >>>>>>>>>>> Spark supports SQL extensions. After we support the Spark CALL statement, we can provide procedures to create, delete, or roll back to savepoints for Spark users.
> >>>>>>>>>>>
> >>>>>>>>>>> Support for CALL is on the roadmap of Flink. In a future version, we can also support savepoint-related procedures for Flink users.
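One possible reading of the proposed table options, sketched with `java.time`. The trigger and retention semantics here are assumptions for illustration (anchor the first trigger at `savepoint.create-time`, space later ones by `savepoint.create-interval`, expire after `savepoint.time-retained`), not the PIP's final behavior:

```java
// Illustrative sketch (not actual Paimon code) of how the proposed table
// options could drive periodic savepoint creation: `savepoint.create-time`
// anchors the first trigger (e.g. 00:00), `savepoint.create-interval` spaces
// the following ones, and `savepoint.time-retained` bounds a savepoint's
// lifetime.
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;

public class SavepointSchedule {

    /** Next trigger at or after `now`, anchored at `createTime` and spaced by `interval`. */
    static LocalDateTime nextTrigger(LocalDateTime now, LocalTime createTime, Duration interval) {
        LocalDateTime t = now.toLocalDate().atTime(createTime);
        while (t.isBefore(now)) {
            t = t.plus(interval);
        }
        return t;
    }

    /** A savepoint is expired once it is older than `timeRetained`. */
    static boolean isExpired(LocalDateTime createdAt, LocalDateTime now, Duration timeRetained) {
        return Duration.between(createdAt, now).compareTo(timeRetained) > 0;
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.parse("2023-05-30T10:15:00");

        // create-time = 00:00, create-interval = 2 d: today's anchor already
        // passed, so the next trigger is two days later at midnight.
        System.out.println(nextTrigger(now, LocalTime.MIDNIGHT, Duration.ofDays(2)));

        // time-retained = 7 d: a savepoint from May 1 is long expired.
        System.out.println(isExpired(LocalDateTime.parse("2023-05-01T00:00:00"), now, Duration.ofDays(7)));
    }
}
```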
> >>>>>>>>>>>
> >>>>>>>>>>> 4. Expiration of data files
> >>>>>>>>>>>
> >>>>>>>>>>> Currently, when a snapshot expires, the data files not used by other snapshots are deleted. After we introduce the savepoint, we must make sure the data files kept by a savepoint will not be deleted.
> >>>>>>>>>>>
> >>>>>>>>>>> Conversely, when a savepoint is deleted, the data files not used by existing snapshots and other savepoints will be deleted.
> >>>>>>>>>>>
> >>>>>>>>>>> I have written some POC code to implement this. I will update the mechanism in the PIP soon.
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Yu Zelin
> >>>>>>>>>>>
> >>>>>>>>>>>> On May 21, 2023, at 20:54, Jingsong Li <[email protected]> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks Yun for your information.
> >>>>>>>>>>>>
> >>>>>>>>>>>> We need to be careful to avoid confusion between the Paimon and Flink concepts of "savepoint".
> >>>>>>>>>>>>
> >>>>>>>>>>>> Maybe we don't have to insist on using "savepoint"; for example, TAG is also a candidate, just like in Iceberg [1].
> >>>>>>>>>>>>
> >>>>>>>>>>>> [1] https://iceberg.apache.org/docs/latest/branching/
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best,
> >>>>>>>>>>>> Jingsong
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Sun, May 21, 2023 at 8:51 PM Jingsong Li <[email protected]> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks Nicholas for your detailed requirements.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> We need to supplement the user requirements in the FLIP, which are mainly aimed at two purposes:
> >>>>>>>>>>>>> 1. Fault recovery for data errors (named: restore or rollback-to)
> >>>>>>>>>>>>> 2. Recording versions at, for example, the day level, targeting batch queries
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Best,
> >>>>>>>>>>>>> Jingsong
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Sat, May 20, 2023 at 2:55 PM Yun Tang <[email protected]> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi Guys,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Since we use Paimon with Flink in most cases, I think we need to disambiguate the same word "savepoint" in the two systems.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> For Flink, savepoint means:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 1. Triggered by users, not periodically by the system itself. However, this FLIP wants to support creating it periodically.
> >>>>>>>>>>>>>> 2. Even the so-called incremental native savepoint [1] does not depend on previous checkpoints or savepoints; it still copies files on DFS into a self-contained savepoint folder. However, from this FLIP's description of the deletion of expired snapshot files, a Paimon savepoint will refer to the previously existing files directly.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I don't think we need to make Paimon's semantics totally the same as Flink's. However, we need to introduce a table telling the difference compared with Flink and discuss the difference.
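The data-file expiration rule Zelin describes earlier in the thread (a data file may be physically removed only when no surviving snapshot or savepoint still references it) amounts to a set difference. A minimal, hypothetical sketch with illustrative names:

```java
// A minimal sketch (hypothetical names, not Paimon's actual code) of the
// deletion rule discussed in the thread: a data file may be physically
// deleted only when no remaining snapshot and no remaining savepoint still
// references it. Dropping a snapshot or a savepoint therefore deletes
// "files referenced by the dropped one, minus files referenced by
// everything that survives".
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FileExpiration {

    /** Files safe to delete when `removed` is dropped and `survivors` remain. */
    static Set<String> deletableFiles(Set<String> removed, List<Set<String>> survivors) {
        Set<String> deletable = new HashSet<>(removed);
        for (Set<String> live : survivors) {
            deletable.removeAll(live); // still referenced somewhere: keep
        }
        return deletable;
    }

    public static void main(String[] args) {
        Set<String> expiredSnapshot = Set.of("a", "b", "c");
        Set<String> newerSnapshot = Set.of("b", "d");
        Set<String> savepoint = Set.of("c");

        // Only "a" becomes unreferenced once the expired snapshot is gone;
        // "b" is held by the newer snapshot and "c" by the savepoint.
        System.out.println(deletableFiles(expiredSnapshot, List.of(newerSnapshot, savepoint)));
    }
}
```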
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-203%3A+Incremental+savepoints#FLIP203:Incrementalsavepoints-Semantic
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Best
> >>>>>>>>>>>>>> Yun Tang
> >>>>>>>>>>>>>> ________________________________
> >>>>>>>>>>>>>> From: Nicholas Jiang <[email protected]>
> >>>>>>>>>>>>>> Sent: Friday, May 19, 2023 17:40
> >>>>>>>>>>>>>> To: [email protected] <[email protected]>
> >>>>>>>>>>>>>> Subject: Re: [DISCUSS] PIP-4 Support savepoint
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi Guys,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks Zelin for driving the savepoint proposal. I'd like to offer some opinions on savepoints:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> -- About "introduce savepoint for Paimon to persist full data at a point in time"
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The motivation of the savepoint proposal reads more like snapshot TTL management. Actually, disaster recovery is mission critical for any software. Especially when it comes to data systems, the impact could be very serious, leading to delayed or even wrong business decisions at times. Savepoint is proposed to assist users in recovering data from a previous state, via two operations: "savepoint" and "restore".
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> "savepoint" saves the Paimon table as of the commit time; therefore, if there is a savepoint, the data generated in the corresponding commit cannot be cleaned. Meanwhile, a savepoint lets the user restore the table to that savepoint at a later point in time if need be. On similar lines, a savepoint cannot be triggered on a commit that has already been cleaned up.
> >>>>>>>>>>>>>> Savepoint is synonymous with taking a backup, just that we don't make a new copy of the table; we simply save the state of the table elegantly so that we can restore it later when needed.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> "restore" lets you restore your table to one of the savepoint commits. It cannot be undone (or reversed), so care should be taken before doing a restore. At that time, Paimon would delete all data files and commit files (timeline files) greater than the savepoint commit to which the table is being restored.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> BTW, it would be better to introduce a snapshot view based on savepoints, which could improve query performance for historical data in Paimon tables.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> -- About the public API of savepoints
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The savepoint interfaces currently introduced in the public API are not enough for users; for example, deleteSavepoint, restoreSavepoint, etc.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> -- About "Paimon's savepoint needs to be combined with Flink's savepoint":
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> If Paimon supports a savepoint mechanism and provides savepoint interfaces, the integration with Flink's savepoint is not blocked by this proposal.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> In summary, savepoint is not only used to improve the query performance of historical data, but also for disaster recovery processing.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 2023/05/17 09:53:11 Jingsong Li wrote:
> >>>>>>>>>>>>>>> What Shammon mentioned is interesting. I agree with what he said about the differences in savepoints between databases and stream computing.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> About "Paimon's savepoint needs to be combined with Flink's savepoint":
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I think it is possible, but we may need to handle this with another mechanism, because the snapshots after a savepoint may expire. We would need to compare data between two savepoints to generate incremental data for streaming reads.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> But this may not need to block the FLIP; it looks like the current design does not prevent that future combination?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>> Jingsong
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Wed, May 17, 2023 at 5:33 PM Shammon FY <[email protected]> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hi Caizhi,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks for your comments. As you mentioned, I think we may need to discuss the role of savepoint in Paimon.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> If I understand correctly, the main feature of savepoint in the current PIP is that the savepoint will not expire, and users can query the savepoint via time travel. Besides that, there is the savepoint in databases and in Flink.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> 1. Savepoint in a database. The database can roll back table data to a specified 'version' based on a savepoint. So the key point of a database savepoint is rolling back data.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> 2. Savepoint in Flink. Users can trigger a savepoint with a specific 'path', and save all state data to the savepoint for a job.
> >>>>>>>>>>>>>>>> Then users can create a new job based on the savepoint to continue consuming incremental data. I think the core capabilities are: backing up a job, and resuming a job based on the savepoint.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> In addition to the above, Paimon may also face data write corruption and need to recover data based on a specified savepoint. So we may need to consider: what abilities should a Paimon savepoint have besides the ones mentioned in the current PIP?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Additionally, as mentioned above, Flink also has a savepoint mechanism. During the process of streaming data from Flink to Paimon, does Paimon's savepoint need to be combined with Flink's savepoint?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>> Shammon FY
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Wed, May 17, 2023 at 4:02 PM Caizhi Weng <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hi developers!
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thanks Zelin for bringing up the discussion. The proposal seems good to me overall. However, I'd also like to bring up a few points.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 1. As Jingsong mentioned, the Savepoint class should not become a public API, at least for now. What we need to discuss for the public API is how users can create or delete savepoints.
> >>>>>>>>>>>>>>>>> For example, what the table option looks like, what commands and options are provided for the Flink action, etc.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 2. Currently most Flink actions are related to streaming processing, so only Flink can support them. However, savepoint creation and deletion seems like a feature for batch processing. So aside from Flink actions, shall we also provide something like Spark actions for savepoints?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I would also like to comment on Shammon's views.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Should we introduce an option for the savepoint path which may be different from 'warehouse'? Then users can back up the data of a savepoint.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I don't see that this is necessary. To back up a table, the user just needs to copy all files from the table directory. Savepoint in Paimon, as far as I understand, is mainly for users to review historical data, not for backing up tables.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Will the savepoint copy data files from the snapshot or only save meta files?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> It would be a heavy burden if a savepoint copied all its files. As I mentioned above, savepoint is not for backing up tables.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> How can users create a new table and restore data from the specified savepoint?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> This reminds me of savepoints in Flink. Still, savepoint is not for backing up tables, so I guess we don't need to support "restoring data" from a savepoint.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Wed, May 17, 2023, at 10:32, Shammon FY <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Thanks Zelin for initiating this discussion. I have some comments:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 1. Should we introduce an option for the savepoint path, which may be different from 'warehouse'? Then users can back up the data of a savepoint.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 2. Will the savepoint copy data files from the snapshot or only save meta files? The description in the PIP, "After we introduce savepoint, we should also check if the data files are used by savepoints.", suggests we only save meta files for a savepoint.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 3. How can users create a new table and restore data from a specified savepoint?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>> Shammon FY
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Wed, May 17, 2023 at 10:19 AM Jingsong Li <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Thanks Zelin for driving this.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Some comments:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> 1. I think it's better to move `Proposed Changes` to the top; the Public API section has no meaning if I don't know how it works.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> 2. Public API: Savepoint and SavepointManager are not public API; only the Flink action or a configuration option should be public API.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> 3. Maybe we can have a separate chapter to describe `savepoint.create-interval`, maybe 'Periodic savepoint'? It is not just an interval, because the real user case is a savepoint after 0:00.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> 4. About 'Interaction with Snapshot', to be continued...
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>> Jingsong
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Tue, May 16, 2023 at 7:07 PM yu zelin <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Hi, Paimon Devs,
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I'd like to start a discussion about PIP-4 [1]. In this PIP, I want to talk about why we need savepoint, and share some thoughts about managing and using savepoints. I look forward to your questions and suggestions.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>> Yu Zelin
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/x/NxE0Dw
