n3nash commented on issue #3078:
URL: https://github.com/apache/hudi/issues/3078#issuecomment-869213421


   @guanziyue Thanks for the detailed explanation.
   
   @tandonraghav During compaction, `combineAndGetUpdateValue` is never called. To understand how compaction works, see:
   https://github.com/apache/hudi/blob/e99a6b031bf4f2e3037d4cb5307d443cda2d2002/hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieMergedLogRecordScanner.java#L132
 
   
   The issue is that you want a common implementation shared by both `preCombine` and `combineAndGetUpdateValue`. The way I would recommend doing this is as follows:
   
   class YourPayloadImplementation {
   
   Map<String, String> allTheCommonValuesNeededToDetermineHowToMerge2Records;
   
   `private boolean commonMergeLogic(Map<String, String> oldValues, Map<String, String> newValues) {
      <your common implementation>
   }`
   
   `preCombine(YourPayloadImplementation that) {
      // Use allTheCommonValuesNeededToDetermineHowToMerge2Records to merge 2 different records
      boolean pickNew = commonMergeLogic(this.allTheCommonValuesNeededToDetermineHowToMerge2Records, that.allTheCommonValuesNeededToDetermineHowToMerge2Records);
      if (pickNew) {
         ..
      }
      ..
   }`
   
   `combineAndGetUpdateValue(GenericRecord old, Schema schema) {
      // Create the data structure allTheCommonValuesNeededToDetermineHowToMerge2Records from old and new as follows
      allTheCommonValuesNeededToDetermineHowToMerge2RecordsOld = extractValuesfromOld(old);
      allTheCommonValuesNeededToDetermineHowToMerge2RecordsNew = extractValuesfromNew(getInsertValue(schema));
      boolean pickNew = commonMergeLogic(allTheCommonValuesNeededToDetermineHowToMerge2RecordsOld, allTheCommonValuesNeededToDetermineHowToMerge2RecordsNew);
      if (pickNew) {
         ..
      }
      ..
   }`
   
   }
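
To make the first approach concrete, here is a minimal runnable sketch. It uses plain Java maps in place of Avro `GenericRecord` so that it runs without Hudi or Avro on the classpath. The class name `MergeValuesPayloadSketch`, the `ts` ordering field and the "larger `ts` wins" rule are assumptions for illustration only; a real payload would implement `HoodieRecordPayload` and put its own rule inside `commonMergeLogic`. The old/new argument order follows the pseudocode above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a Hudi payload. Map<String, Object> plays the role
// of an Avro GenericRecord so the sketch has no external dependencies.
final class MergeValuesPayloadSketch {
    // Assumed ordering field, chosen only for this example.
    static final String ORDERING_FIELD = "ts";

    private final Map<String, Object> record;       // stands in for the record itself
    private final Map<String, String> mergeValues;  // values needed to decide a merge

    MergeValuesPayloadSketch(Map<String, Object> record) {
        this.record = new HashMap<>(record);
        this.mergeValues = extractMergeValues(this.record);
    }

    // Pulls out only the values the merge decision depends on.
    static Map<String, String> extractMergeValues(Map<String, Object> source) {
        Map<String, String> values = new HashMap<>();
        values.put(ORDERING_FIELD, String.valueOf(source.get(ORDERING_FIELD)));
        return values;
    }

    // The single place where the merge decision lives: true means "pick new".
    private static boolean commonMergeLogic(Map<String, String> oldValues,
                                            Map<String, String> newValues) {
        long oldTs = Long.parseLong(oldValues.get(ORDERING_FIELD));
        long newTs = Long.parseLong(newValues.get(ORDERING_FIELD));
        return newTs >= oldTs;
    }

    // Payload-to-payload merge: both sides already carry their merge values.
    MergeValuesPayloadSketch preCombine(MergeValuesPayloadSketch that) {
        boolean pickNew = commonMergeLogic(this.mergeValues, that.mergeValues);
        return pickNew ? that : this;
    }

    // Merge against a stored record: extract its merge values first,
    // then reuse the same decision logic.
    Optional<Map<String, Object>> combineAndGetUpdateValue(Map<String, Object> old) {
        boolean pickNew = commonMergeLogic(extractMergeValues(old), this.mergeValues);
        return Optional.of(pickNew ? record : old);
    }
}
```

Because both methods delegate to `commonMergeLogic`, they cannot disagree about which record wins, which is the point of this approach.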
   
   Another way is to pass the Schema through the properties file and convert everything to GenericRecord, so that there is no need for `allTheCommonValuesNeededToDetermineHowToMerge2Records`.
   Your methods would look as follows:
   
   class YourPayloadImplementation {
   
   `private boolean commonMergeLogic(GenericRecord oldRecord, GenericRecord newRecord) {
      <your common implementation>
   }`
   
   `preCombine(YourPayloadImplementation that, Properties p) {
      Schema schema = new Schema.Parser().parse(p.getProperty("schema"));
      boolean pickNew = commonMergeLogic(getInsertValue(schema), that.getInsertValue(schema));
      if (pickNew) {
         ..
      }
      ..
   }`
   
   `combineAndGetUpdateValue(GenericRecord old, Schema schema) {
      boolean pickNew = commonMergeLogic(old, getInsertValue(schema));
      if (pickNew) {
         ..
      }
      ..
   }`
   
   }
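
A rough, dependency-free sketch of this second approach follows. To keep it runnable without Avro, the "schema" here is just a comma-separated list of field names and the payload holds a CSV row; both are stand-ins I am assuming for illustration. With Avro the schema string would instead be parsed with `new Schema.Parser().parse(...)` and the payload would decode its bytes into a `GenericRecord`. The class name, the `schema` property key and the `ts` ordering rule are likewise hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

// Hypothetical payload that stores serialized data and needs a schema to decode it.
// Map<String, String> stands in for an Avro GenericRecord.
final class SchemaFromPropsPayloadSketch {
    static final String SCHEMA_PROP = "schema";  // assumed property key
    static final String ORDERING_FIELD = "ts";   // assumed ordering field

    private final String serialized;             // stands in for Avro bytes

    SchemaFromPropsPayloadSketch(String serialized) {
        this.serialized = serialized;
    }

    // Analogue of getInsertValue(schema): decode the stored data using the schema.
    Map<String, String> getInsertValue(String[] schema) {
        String[] cells = serialized.split(",", -1);
        if (cells.length != schema.length) {
            throw new IllegalArgumentException("row does not match schema");
        }
        Map<String, String> decoded = new LinkedHashMap<>();
        for (int i = 0; i < schema.length; i++) {
            decoded.put(schema[i], cells[i]);
        }
        return decoded;
    }

    // Shared decision logic, working directly on decoded records.
    private static boolean commonMergeLogic(Map<String, String> oldRecord,
                                            Map<String, String> newRecord) {
        return Long.parseLong(newRecord.get(ORDERING_FIELD))
                >= Long.parseLong(oldRecord.get(ORDERING_FIELD));
    }

    // The schema arrives through the properties, so both payloads can be decoded here.
    SchemaFromPropsPayloadSketch preCombine(SchemaFromPropsPayloadSketch that, Properties p) {
        String[] schema = p.getProperty(SCHEMA_PROP).split(",");
        boolean pickNew = commonMergeLogic(getInsertValue(schema), that.getInsertValue(schema));
        return pickNew ? that : this;
    }

    // The schema is passed in directly, so only the incoming side needs decoding.
    Optional<Map<String, String>> combineAndGetUpdateValue(Map<String, String> old,
                                                           String[] schema) {
        Map<String, String> incoming = getInsertValue(schema);
        return Optional.of(commonMergeLogic(old, incoming) ? incoming : old);
    }
}
```

The trade-off compared with the first approach is that every `preCombine` call decodes both payloads, in exchange for not having to maintain a separate map of merge values.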
   


