Automatic lifecycle management based on a few configurations would be very useful for the community.
I read the description in https://issues.apache.org/jira/browse/HUDI-4677
May I ask the rationale for choosing Hive Metastore to manage the snapshots?
Perhaps, RFC would have more details. Looking forward to it!

Regards,
Sagar

On Wed, Sep 14, 2022 at 8:13 AM 冯健 <fengjian...@gmail.com> wrote:

> Hi Ethan,
>
> Yes, based on the current situation, we still need to do much extra
> work to provide snapshot view feature for the users (or users do this by
> themself). I plan to merge the COW part of this feature to 0.13.0 at least.
> will consider your suggestion if time is tight
> Thanks
>
> On Wed, 14 Sept 2022 at 03:02, Y Ethan Guo <yi...@apache.org> wrote:
>
> > Hi Feng Jian,
> >
> > Looking forward to the RFC! Is the snapshot view management more like
> > managing commits / savepoints in the Hudi timeline and hiding Hudi
> > internals from the users?
> >
> > Do you plan to merge the implementation of snapshot view and lifecycle
> > management for the next major release (0.13.0)? Timeline-wise, if time is
> > tight, you may also consider scoping out a subset of features to target
> > 0.13.0.
> >
> > Best,
> > - Ethan
> >
> > On Mon, Sep 12, 2022 at 10:43 PM Sivabalan <n.siv...@gmail.com> wrote:
> >
> > > Sounds like a nice feature to have. Eagerly looking forward for the RFC.
> > >
> > > On Sat, 27 Aug 2022 at 20:51, 冯健 <fengjian...@gmail.com> wrote:
> > >
> > > > I attached the image in this Jira Epic
> > > > https://issues.apache.org/jira/browse/HUDI-4677, and the RFC is WIP,
> > > > will create a pr in the next few days
> > > > Yeah, the basic idea is to implement lifecycle management based on the
> > > > savepoint and time travel features, providing new ways for the user to
> > > > operate and coordinate. won't propose any new concept
> > > >
> > > > On Sun, 28 Aug 2022 at 02:06, Shiyan Xu <xu.shiyan.raym...@gmail.com>
> > > > wrote:
> > > >
> > > > > The dev email list does not support showing images unfortunately.
> > > > > you may want to put it behind a link.
> > > > >
> > > > > As for the idea itself,
> > > > >
> > > > > > What I plan to do is to let Hudi support release a snapshot view
> > > > > > and lifecycle management out-of-box.
> > > > >
> > > > > Are you planning to extend the savepoint feature to have lifecycle
> > > > > mgmt capabilities? We should consolidate overlapping features
> > > > > properly.
> > > > >
> > > > > On Sun, Aug 21, 2022 at 12:59 PM 冯健 <fengjian...@gmail.com> wrote:
> > > > >
> > > > > > Hi team,
> > > > > > [image: image.png]
> > > > > > for the snapshot view scenario, Hudi already provides two key
> > > > > > features to support it:
> > > > > >
> > > > > > - Time travel: user provides a timestamp to query a specific
> > > > > >   snapshot view of a Hudi table
> > > > > > - Savepoint/restore: "savepoint" saves the table as of the commit
> > > > > >   time so that it lets you restore the table to this savepoint at
> > > > > >   a later point in time if need be. but in this case, the user
> > > > > >   usually uses this to prevent cleaning snapshot view at a
> > > > > >   specific timestamp, only clean unused files
> > > > > >
> > > > > > The situation is there some inconvenience for users if use them
> > > > > > directly
> > > > > >
> > > > > > - Usually users incline to use a meaningful name instead of
> > > > > >   querying Hudi table with a timestamp, using the timestamp in SQL
> > > > > >   may lead to the wrong snapshot view being used. for example, we
> > > > > >   can announce that a new tag of hudi table with
> > > > > >   table_nameYYYYMMDD was released, then the user can use this new
> > > > > >   table name to query.
> > > > > > - Savepoint is not designed for this "snapshot view" scenario in
> > > > > >   the beginning, it is designed for disaster recovery. let's say a
> > > > > >   new snapshot view will be created every day, and it has 7 days
> > > > > >   retention, we should support lifecycle management on top of it.
> > > > > >
> > > > > > What I plan to do is to let Hudi support release a snapshot view
> > > > > > and lifecycle management out-of-box. We have already done some
> > > > > > work when supporting customers' snapshot view requirements in my
> > > > > > company, and hope to land this feature in Community too.
> > > > > >
> > > > > > Please feel free to let me know if you have any idea about this.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jian Feng
> > > > >
> > > > > --
> > > > > Best,
> > > > > Shiyan
> > >
> > > --
> > > Regards,
> > > -Sivabalan