Hi Ethan,
Yes, based on the current situation, we still need to do a lot of extra
work to provide the snapshot view feature for users (or users have to do
this themselves). I plan to merge at least the COW part of this feature
into 0.13.0, and will consider your suggestion if time is tight.
Thanks
On Wed, 14 Sept 2022 at 03:02, Y Ethan Guo wrote:
> Hi Feng Jian,
>
> Looking forward to the RFC! Is the snapshot view management more like
> managing commits / savepoints in the Hudi timeline and hiding Hudi
> internals from the users?
>
> Do you plan to merge the implementation of snapshot view and lifecycle
> management for the next major release (0.13.0)? Timeline-wise, if time is
> tight, you may also consider scoping out a subset of features to target
> 0.13.0.
>
> Best,
> - Ethan
>
> On Mon, Sep 12, 2022 at 10:43 PM Sivabalan wrote:
>
> > Sounds like a nice feature to have. Eagerly looking forward to the RFC.
> >
> > On Sat, 27 Aug 2022 at 20:51, 冯健 wrote:
> >
> > > I attached the image to this Jira Epic:
> > > https://issues.apache.org/jira/browse/HUDI-4677. The RFC is WIP; I will
> > > create a PR in the next few days.
> > >
> > > Yeah, the basic idea is to implement lifecycle management based on the
> > > savepoint and time travel features, providing new ways for the user to
> > > operate and coordinate. I won't propose any new concepts.
> > >
> > > On Sun, 28 Aug 2022 at 02:06, Shiyan Xu
> > > wrote:
> > >
> > > > The dev email list does not support showing images, unfortunately. You
> > > > may want to put it behind a link.
> > > >
> > > > As for the idea itself,
> > > >
> > > > > What I plan to do is to let Hudi support releasing a snapshot view
> > > > > and lifecycle management out of the box.
> > > >
> > > >
> > > > Are you planning to extend the savepoint feature to have lifecycle
> > > > management capabilities? We should consolidate overlapping features
> > > > properly.
> > > >
> > > > On Sun, Aug 21, 2022 at 12:59 PM 冯健 wrote:
> > > >
> > > > > Hi team,
> > > > > [image: image.png]
> > > > > For the snapshot view scenario, Hudi already provides two key
> > > > > features to support it:
> > > > >
> > > > > - Time travel: the user provides a timestamp to query a specific
> > > > >   snapshot view of a Hudi table.
> > > > > - Savepoint/restore: a "savepoint" saves the table as of the commit
> > > > >   time, so that you can restore the table to this savepoint at a
> > > > >   later point in time if need be. In this scenario, however, the
> > > > >   user usually uses a savepoint to keep the snapshot view at a
> > > > >   specific timestamp from being cleaned, so that only unused files
> > > > >   are cleaned.
> > > > >
> > > > > However, using these features directly is inconvenient for users:
> > > > >
> > > > > - Users usually prefer a meaningful name over querying a Hudi table
> > > > >   with a timestamp; using a timestamp in SQL may lead to the wrong
> > > > >   snapshot view being used. For example, we can announce that a new
> > > > >   tag of the Hudi table, table_nameMMDD, has been released, and the
> > > > >   user can then use this new table name to query.
> > > > > - Savepoint was not originally designed for this "snapshot view"
> > > > >   scenario; it was designed for disaster recovery. Say a new
> > > > >   snapshot view is created every day and has a 7-day retention: we
> > > > >   should support lifecycle management on top of it.
> > > > >
> > > > > What I plan to do is to let Hudi support releasing a snapshot view
> > > > > and lifecycle management out of the box. We have already done some
> > > > > work supporting customers' snapshot view requirements at my company,
> > > > > and hope to land this feature in the community too.
> > > > >
> > > > > Please feel free to let me know if you have any ideas about this.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jian Feng
> > > > >
> > > >
> > > >
> > > > --
> > > > Best,
> > > > Shiyan
> > > >
> > >
> >
> >
> > --
> > Regards,
> > -Sivabalan
> >
>