aditiwari01 commented on issue #2756:
URL: https://github.com/apache/hudi/issues/2756#issuecomment-812516535
I think I didn't explain myself clearly. I am using DefaultHoodieRecordPayload
only. I have attached a sample command for the same.
The issue is not with "combineAndGetUpdateValue" but with "preCombine".
As per my understanding, combineAndGetUpdateValue is used to merge a record
from parquet with an in-memory record, whereas preCombine is used to dedupe
multiple in-memory records with the same key. The preCombine function uses
the orderingVal field to order them, and when creating a record from a log
file we do not set this ordering field. Hence the issue.
The constructors are as follows:
1. DefaultHoodieRecordPayload(Option<GenericRecord> record) {this(record, 0);}
2. DefaultHoodieRecordPayload(GenericRecord record, Comparable orderingVal)
{super(record, orderingVal);}
In the read path we only call the 1st constructor and hence lose the
ordering value.
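
To make the effect concrete, here is a minimal sketch of the behaviour described above. SimplePayload is a hypothetical, simplified stand-in and not the real Hudi class: it only assumes that preCombine keeps whichever payload has the larger orderingVal, and that the one-argument constructor defaults orderingVal to 0, as in the two constructors listed above. The record fields (id, ts, value) are made up for illustration.

```java
import java.util.Map;

public class PreCombineSketch {

    // Hypothetical stand-in for a record payload; NOT the actual Hudi class.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static final class SimplePayload {
        final Map<String, Object> record;
        final Comparable orderingVal;

        // Mirrors constructor 2: the caller supplies the ordering value.
        SimplePayload(Map<String, Object> record, Comparable orderingVal) {
            this.record = record;
            this.orderingVal = orderingVal;
        }

        // Mirrors constructor 1: the ordering value silently defaults to 0.
        SimplePayload(Map<String, Object> record) {
            this(record, 0);
        }

        // Assumed preCombine logic: keep the other payload only if its
        // ordering value is strictly greater, otherwise keep this one.
        SimplePayload preCombine(SimplePayload other) {
            return other.orderingVal.compareTo(orderingVal) > 0 ? other : this;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> older = Map.of("id", "k1", "ts", 100L, "value", "old");
        Map<String, Object> newer = Map.of("id", "k1", "ts", 200L, "value", "new");

        // Ordering value set explicitly: the record with the larger ts wins.
        SimplePayload withOrdering = new SimplePayload(older, 100L)
                .preCombine(new SimplePayload(newer, 200L));
        System.out.println(withOrdering.record.get("value")); // prints "new"

        // Ordering value defaulted to 0 on both sides: the comparison is a
        // tie, so the result depends only on which payload calls preCombine.
        SimplePayload withoutOrdering = new SimplePayload(older)
                .preCombine(new SimplePayload(newer));
        System.out.println(withoutOrdering.record.get("value")); // prints "old"
    }
}
```

With both ordering values equal to 0 the comparison always ties, so the surviving record is decided by call order rather than by the ordering field, which matches the symptom reported here.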
Also, if we compact after each commit, we don't see this issue, since
"combineAndGetUpdateValue" works absolutely fine.