Re: [Discuss] SCD-2 Payload

2022-10-24 Thread 冯健
To Raymond: currently combineAndGetUpdateValue can return only one IndexedRecord, but in the SCD-2 case both the old and the new record need to be stored. To Alexey: yes, this feature should be designed on top of RFC-46. Can HoodieRecordMerger return two HoodieRecords in this case? On Tue, 25 Oct 2022
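To illustrate why a single-record return value is limiting for SCD-2, here is a minimal sketch in plain Python. It is not Hudi code and does not use HoodieRecordPayload or HoodieRecordMerger; the names DimRecord and scd2_merge and the fields start_date, end_date and is_current are hypothetical, chosen only for this illustration. It assumes an update should close the old version and keep the new version as the current one, so the merge step yields two records.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DimRecord:
    """Hypothetical dimension row with SCD-2 bookkeeping columns."""
    key: str
    value: str
    start_date: date
    end_date: Optional[date] = None  # None means the version is still open
    is_current: bool = True


def scd2_merge(old: DimRecord, new: DimRecord) -> List[DimRecord]:
    """Merge an incoming record with the stored one, SCD-2 style.

    Returns one record when nothing changed, and two records (the
    closed old version plus the new current version) when the value
    changed.
    """
    if old.key != new.key:
        raise ValueError("records must share the same key")
    if old.value == new.value:
        # No change: keep the stored version as it is.
        return [old]
    # Close the old version at the moment the new one becomes valid.
    closed = replace(old, end_date=new.start_date, is_current=False)
    return [closed, new]


old = DimRecord("c1", "Berlin", date(2022, 1, 1))
new = DimRecord("c1", "Paris", date(2022, 10, 24))
merged = scd2_merge(old, new)
```

In this sketch merged holds two rows for key c1, which is exactly the shape a merge API limited to one returned record cannot express.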

Re: [Discuss] SCD-2 Payload

2022-10-24 Thread Alexey Kudinkin
Hey, hey, Fengjian! With the landing of RFC-46 we'll be kick-starting the process of phasing out HoodieRecordPayload as an abstraction and migrating to the HoodieRecordMerger interface instead. I'd recommend basing your design considerations on the new HoodieRecordMerger interface instead of

Re: [Discuss] SCD-2 Payload

2022-10-24 Thread Shiyan Xu
Interesting thoughts. I'm not sure I fully understand this part: "generate 2 records in combineAndGetUpdateValue". Isn't the API defined to return just 1 record? On Fri, Oct 21, 2022 at 1:07 AM 冯健 wrote: > Hi guys, > After reading this article with respect to how to implement SCD-2 with > Hudi

[Discuss] SCD-2 Payload

2022-10-20 Thread 冯健
Hi guys, After reading this article on how to implement SCD-2 with Hudi, "Build Slowly Changing Dimensions Type 2 (SCD2) with Apache Spark and Apache Hudi on Amazon EMR"