the-other-tim-brown commented on code in PR #13559:
URL: https://github.com/apache/hudi/pull/13559#discussion_r2209044808
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java:
##########
@@ -116,19 +127,29 @@ public class HoodieMergeHandle<T, I, K, O> extends HoodieWriteHandle<T, I, K, O>
   protected Option<String[]> partitionFields = Option.empty();
   protected Object[] partitionValues = new Object[0];

+  protected KeyBasedFileGroupRecordBuffer<T> recordBuffer;
+  protected HoodieReaderContext<T> readerContext;
+  protected Option<String> orderingFieldName;
+
   public HoodieMergeHandle(HoodieWriteConfig config, String instantTime, HoodieTable<T, I, K, O> hoodieTable,
                            Iterator<HoodieRecord<T>> recordItr, String partitionPath, String fileId,
-                           TaskContextSupplier taskContextSupplier, Option<BaseKeyGenerator> keyGeneratorOpt) {
+                           TaskContextSupplier taskContextSupplier, Option<BaseKeyGenerator> keyGeneratorOpt,
+                           HoodieReaderContext<T> readerContext) {
     this(config, instantTime, hoodieTable, recordItr, partitionPath, fileId, taskContextSupplier,
-        getLatestBaseFile(hoodieTable, partitionPath, fileId), keyGeneratorOpt);
+        getLatestBaseFile(hoodieTable, partitionPath, fileId), keyGeneratorOpt, readerContext, false);
   }

   public HoodieMergeHandle(HoodieWriteConfig config, String instantTime, HoodieTable<T, I, K, O> hoodieTable,
                            Iterator<HoodieRecord<T>> recordItr, String partitionPath, String fileId,
-                           TaskContextSupplier taskContextSupplier, HoodieBaseFile baseFile, Option<BaseKeyGenerator> keyGeneratorOpt) {
+                           TaskContextSupplier taskContextSupplier, HoodieBaseFile baseFile,
+                           Option<BaseKeyGenerator> keyGeneratorOpt,
+                           HoodieReaderContext<T> readerContex, boolean useFileGroupRecordBuffer) {
Review Comment:
typo: `readerContex` -> `readerContext`
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java:
##########
@@ -406,7 +454,9 @@ public void write(HoodieRecord<T> oldRecord) {
boolean copyOldRecord = true;
String key = oldRecord.getRecordKey(oldSchema, keyGeneratorOpt);
TypedProperties props = config.getPayloadConfig().getProps();
- if (keyToNewRecords.containsKey(key)) {
+ if (recordBuffer != null && recordBuffer.containsLogRecord(key)) {
Review Comment:
Is it possible to refactor `write` to avoid going through `HoodieRecord`
by using the FileGroupReader? We are essentially reading the base file and
returning an iterator that merges in the updates, which is what the FGReader
already does while avoiding extra conversion where possible. That would let us
focus on a single code path, so performance improvements made for compaction
apply to our CoW tables and vice versa.
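
The merge the comment describes can be sketched in plain Java. This is an
illustrative model only: `FileGroupMergeSketch` and `merge` are hypothetical
names, plain strings stand in for Hudi records, and none of this is actual
Hudi API. It shows the single-pass shape the FileGroupReader provides: base
records are emitted with buffered updates applied by key, and leftover
updates are appended as inserts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch (not Hudi API): one pass over a file group that
 *  merges base-file records with buffered updates. */
class FileGroupMergeSketch {
  /** base: records as {key, value} pairs; updates: buffered log records by key.
   *  Returns merged records as "key=value" strings. */
  static List<String> merge(List<String[]> base, Map<String, String> updates) {
    List<String> out = new ArrayList<>();
    Map<String, String> pending = new LinkedHashMap<>(updates);
    for (String[] rec : base) {
      // an update for this key wins over the old base record
      String upd = pending.remove(rec[0]);
      out.add(rec[0] + "=" + (upd != null ? upd : rec[1]));
    }
    // updates with no matching base record are inserts
    for (Map.Entry<String, String> e : pending.entrySet()) {
      out.add(e.getKey() + "=" + e.getValue());
    }
    return out;
  }

  public static void main(String[] args) {
    List<String[]> base = List.of(new String[]{"a", "1"}, new String[]{"b", "2"});
    Map<String, String> updates = new LinkedHashMap<>();
    updates.put("b", "20"); // update to an existing record
    updates.put("c", "3");  // new record (insert)
    System.out.println(merge(base, updates)); // prints [a=1, b=20, c=3]
  }
}
```

The point of the sketch is that both compaction and CoW merging reduce to this
one iterator shape, which is why routing `write` through the FileGroupReader
would leave a single code path to optimize.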
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java:
##########
@@ -116,19 +127,29 @@ public class HoodieMergeHandle<T, I, K, O> extends HoodieWriteHandle<T, I, K, O>
protected Option<String[]> partitionFields = Option.empty();
protected Object[] partitionValues = new Object[0];
+ protected KeyBasedFileGroupRecordBuffer<T> recordBuffer;
Review Comment:
Let's make this the `FileGroupRecordBuffer` interface so it is more flexible
if we want to provide new implementations in the future.
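
The suggestion is to program against the interface rather than the concrete
key-based class. A minimal, self-contained sketch of that design choice (all
names here are illustrative stand-ins, not the actual Hudi interfaces):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the buffer interface the comment refers to.
interface RecordBufferSketch<T> {
  boolean containsLogRecord(String key);
  Optional<T> getLogRecord(String key);
}

// One concrete strategy: buffer log records by key.
class KeyBasedRecordBufferSketch<T> implements RecordBufferSketch<T> {
  private final Map<String, T> byKey = new HashMap<>();
  void put(String key, T record) { byKey.put(key, record); }
  public boolean containsLogRecord(String key) { return byKey.containsKey(key); }
  public Optional<T> getLogRecord(String key) { return Optional.ofNullable(byKey.get(key)); }
}

class MergeHandleSketch<T> {
  // Declared against the interface: a position-based or sorted buffer
  // could be swapped in later without touching the handle.
  protected RecordBufferSketch<T> recordBuffer = new KeyBasedRecordBufferSketch<>();
}
```

Typing the field as the interface keeps `HoodieMergeHandle` decoupled from the
key-based buffering strategy, which is exactly the flexibility the comment asks for.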
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]