Left some comments. Thanks!

On Fri, 31 Mar 2023 at 00:59, 符其军 <[email protected]> wrote:
> Hi community, we have submitted RFC-65 Partition TTL Management in this
> PR: https://github.com/apache/hudi/pull/8062.
>
> Let me know if you have any questions or concerns with this proposal.
>
> At 2022-10-21 14:42:10, "stream2000" <[email protected]> wrote:
> >Yes, we can have a talk about it. We will try our best to write the RFC,
> >and maybe publish it in a few weeks.
> >
> >> On Oct 21, 2022, at 10:18, JerryYue <[email protected]> wrote:
> >>
> >> Looking forward to the RFC.
> >> It's a good idea; we also need Hudi data TTL in some cases.
> >> Do we have any plan or timeline to do this? We also have some simple
> >> designs to implement it.
> >> Maybe we can have a talk about it.
> >>
> >> On 2022/10/20 at 9:47 AM, "Bingeng Huang" <dev-return-5022-272614347=[email protected] on behalf of [email protected]> wrote:
> >>
> >>     Looking forward to the RFC.
> >>     We can propose an RFC about supporting TTL config using a
> >>     non-partition field afterwards.
> >>
> >> sagar sumit <[email protected]> wrote on Wed, Oct 19, 2022 at 14:42:
> >>
> >>> +1 Very nice idea. Looking forward to the RFC!
> >>>
> >>> On Wed, Oct 19, 2022 at 10:13 AM Shiyan Xu <[email protected]> wrote:
> >>>
> >>>> Great proposal. Partition TTL is a good starting point. We can extend
> >>>> it to other TTL strategies like column-based, and make it customizable
> >>>> and pluggable. Looking forward to the RFC!
> >>>>
> >>>> On Wed, Oct 19, 2022 at 11:40 AM Jian Feng <[email protected]> wrote:
> >>>>
> >>>>> Good idea,
> >>>>> this is definitely worth an RFC.
> >>>>> Btw, should it only depend on Hudi's partition? I feel it should be a
> >>>>> more common feature, since sometimes customers' data cannot update
> >>>>> across partitions.
> >>>>>
> >>>>> On Wed, Oct 19, 2022 at 11:07 AM stream2000 <[email protected]> wrote:
> >>>>>
> >>>>>> Hi all, we have implemented partition-based data TTL management, with
> >>>>>> which we can manage TTL for Hudi partitions by size, expiration time,
> >>>>>> and sub-partition count. When a partition is detected as outdated, we
> >>>>>> use the delete partition interface to delete it, which generates a
> >>>>>> replace commit to mark the data as deleted. The real deletion is then
> >>>>>> done by the clean service.
> >>>>>>
> >>>>>> If the community is interested in this idea, maybe we can propose an
> >>>>>> RFC to discuss it in detail.
> >>>>>>
> >>>>>>> On Oct 19, 2022, at 10:06, Vinoth Chandar <[email protected]> wrote:
> >>>>>>>
> >>>>>>> +1 love to discuss this on a RFC proposal.
> >>>>>>>
> >>>>>>> On Tue, Oct 18, 2022 at 13:11 Alexey Kudinkin <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> That's a very interesting idea.
> >>>>>>>>
> >>>>>>>> Do you want to take a stab at writing a full proposal (in the form
> >>>>>>>> of an RFC) for it?
> >>>>>>>>
> >>>>>>>> On Tue, Oct 18, 2022 at 10:20 AM Bingeng Huang <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> Hi all,
> >>>>>>>>>
> >>>>>>>>> Do we have a plan to integrate data TTL into Hudi, so we don't
> >>>>>>>>> have to schedule an offline Spark job to delete outdated data?
> >>>>>>>>> Just set a TTL config, and then the writer or some offline service
> >>>>>>>>> will delete old data as expected.
> >>>>>>>>>
> >>>>>
> >>>>> --
> >>>>> *Jian Feng,冯健*
> >>>>> Shopee | Engineer | Data Infrastructure
> >>>>>
> >>>>
> >>>> --
> >>>> Best,
> >>>> Shiyan
> >>>>

--
Regards,
-Sivabalan
