wwli05 opened a new issue, #6835:
URL: https://github.com/apache/hudi/issues/6835
HoodieRealtimeRecordReader claims to support merge-on-read record reading, but in my tests it only returns data from the log file.
I looked at RealtimeCompactedRecordReader:
```java
public boolean next(NullWritable aVoid, ArrayWritable arrayWritable) throws IOException {
  while (this.parquetReader.next(aVoid, arrayWritable)) {
    if (!deltaRecordMap.isEmpty()) {
      String key = arrayWritable.get()[recordKeyIndex].toString();
      if (deltaRecordMap.containsKey(key)) {
        this.deltaRecordKeys.remove(key);
        // 1. this method only builds the record from the log file
        Option<GenericRecord> rec = buildGenericRecordwithCustomPayload(deltaRecordMap.get(key));
        if (!rec.isPresent()) {
          continue;
        }
        // 2. this method only copies the log record into the writable; there is no merge logic
        setUpWritable(rec, arrayWritable, key);
        return true;
      }
    }
    return true;
  }
  ....
  return false;
}
```
**3. So I think Hive doesn't actually merge records during merge-on-read reading today. Can someone confirm this?**
**4. To support true MOR reading, buildGenericRecordwithCustomPayload should be passed the current value from the parquet file and invoke combineAndGetUpdateValue instead of getInsertValue. Am I right?**
The current buildGenericRecordwithCustomPayload logic:
```java
private Option<GenericRecord> buildGenericRecordwithCustomPayload(HoodieRecord record) throws IOException {
  if (usesCustomPayload) {
    return ((HoodieAvroRecord) record).getData().getInsertValue(getWriterSchema(), payloadProps);
  } else {
    return ((HoodieAvroRecord) record).getData().getInsertValue(getReaderSchema(), payloadProps);
  }
}
```
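To illustrate the difference in question 4, here is a minimal self-contained toy sketch, not the real Hudi API: `Row`, `insertOnly`, and the ordering-based `combineAndGetUpdateValue` are my own stand-ins, assuming the payload keeps the side with the higher ordering value (as `DefaultHoodieRecordPayload` does). It shows why `getInsertValue`-style behavior drops the parquet row while a combine-style merge can retain it:

```java
import java.util.Optional;

// Toy illustration (NOT the real Hudi API): contrasts returning only the
// log record (getInsertValue-style) with merging it against the existing
// parquet row (combineAndGetUpdateValue-style).
public class MergeSketch {

    // A record here is just (key, value, orderingField).
    record Row(String key, String value, long ordering) {}

    // getInsertValue-style: the base (parquet) row is ignored entirely.
    static Optional<Row> insertOnly(Row logRow) {
        return Optional.of(logRow);
    }

    // combineAndGetUpdateValue-style: keep whichever side has the higher
    // ordering value, mimicking DefaultHoodieRecordPayload semantics.
    static Optional<Row> combineAndGetUpdateValue(Row baseRow, Row logRow) {
        return Optional.of(logRow.ordering() >= baseRow.ordering() ? logRow : baseRow);
    }

    public static void main(String[] args) {
        Row base = new Row("k1", "from-parquet", 2L);
        Row log  = new Row("k1", "from-log", 1L); // a stale update in the log

        // insert-only silently discards the newer parquet value:
        System.out.println(insertOnly(log).get().value());                     // from-log
        // the combine variant keeps the row with the higher ordering field:
        System.out.println(combineAndGetUpdateValue(base, log).get().value()); // from-parquet
    }
}
```

With a stale log record, the insert-only path returns `from-log`, while the combine path correctly keeps `from-parquet`; when the log record is newer, both return the log value.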
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]