Thanks everyone for reviewing the RFC. I will address the comments in the wiki once I am back from vacation. Meanwhile, I have created subtasks for this effort in https://jira.apache.org/jira/browse/HUDI-242

Thanks,
Balaji.V
On Sunday, December 15, 2019, 07:24:08 PM PST, Sivabalan <n.siv...@gmail.com> wrote:

Nice one, Balaji. I have left a few comments. Overall looks good :)

On Sun, Dec 15, 2019 at 9:30 AM Balaji Varadarajan <v.bal...@ymail.com.invalid> wrote:

> Hi Nicholas,
> Once I get high-level comments on the RFC, we can have concrete subtasks
> around this.
> Balaji.V
>
> On Saturday, December 14, 2019, 07:04:52 PM PST, 蒋晓峰 <programg...@163.com> wrote:
>
> Hi Balaji,
> About the plan for "Efficient migration of large parquet tables to Apache
> Hudi", have you split the plan into multiple subtasks?
> Thanks,
> Nicholas
>
> At 2019-12-14 00:18:12, "Vinoth Chandar" <vin...@apache.org> wrote:
> >+1 (per ASF policy)
> >
> >+100 per my own excitement :) .. Happy to review this!
> >
> >On Fri, Dec 13, 2019 at 3:07 AM Balaji Varadarajan <vbal...@apache.org>
> >wrote:
> >
> >> With Apache Hudi growing in popularity, one of the fundamental challenges
> >> for users has been about efficiently migrating their historical datasets to
> >> Apache Hudi. Apache Hudi maintains per record metadata to perform core
> >> operations such as upserts and incremental pull. To take advantage of
> >> Hudi’s upsert and incremental processing support, users would need to
> >> rewrite their whole dataset to make it a Hudi table. This RFC provides a
> >> mechanism to efficiently migrate their datasets without the need to rewrite
> >> the entire dataset.
> >>
> >> Please find the link for the RFC below.
> >>
> >> https://cwiki.apache.org/confluence/display/HUDI/RFC+-+12+%3A+Efficient+Migration+of+Large+Parquet+Tables+to+Apache+Hudi
> >>
> >> Please review and let me know your thoughts.
> >>
> >> Thanks,
> >> Balaji.V

--
Regards,
-Sivabalan