yihua commented on code in PR #12390:
URL: https://github.com/apache/hudi/pull/12390#discussion_r1866842950
##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/HoodieSparkCopyOnWriteTable.java:
##########
@@ -264,6 +269,24 @@ public Iterator<List<WriteStatus>> handleInsert(
return Collections.singletonList(createHandle.close()).iterator();
}
+ @Override
+ public boolean supportsFileGroupReader() {
+ return true;
+ }
+
+ @Override
+ public List<WriteStatus> runCompactionUsingFileGroupReader(String instantTime,
Review Comment:
The existing compaction implementation uses
`HoodieSparkCopyOnWriteTable#handleUpdate`, which implements
`HoodieCompactionHandler#handleUpdate`. That's because
`HoodieSparkMergeOnReadTable` uses a `HoodieSparkCopyOnWriteTable` instance for
compaction, as shown in the following logic:
```
@Override
public HoodieWriteMetadata<HoodieData<WriteStatus>> compact(
    HoodieEngineContext context, String compactionInstantTime) {
  RunCompactionActionExecutor<T> compactionExecutor = new RunCompactionActionExecutor<>(
      context, config, this, compactionInstantTime, new HoodieSparkMergeOnReadTableCompactor<>(),
      new HoodieSparkCopyOnWriteTable<>(config, context, getMetaClient()),
      WriteOperationType.COMPACT);
  return compactionExecutor.execute();
}
```
So, for parity, I have to add these overrides here as well, so as not to break
any contract around `HoodieTable`. I think we need to revisit this: using a
`HoodieSparkCopyOnWriteTable` instance inside `HoodieSparkMergeOnReadTable` is
counter-intuitive.
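
To make the dependency concrete, here is a minimal, self-contained sketch of the delegation shape described above. `CompactionHandler`, `CowTable`, and `MorTable` are simplified, hypothetical stand-ins, not the actual Hudi classes, and the `false` default on the capability flag is an assumption made for illustration:

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for HoodieCompactionHandler (not the real Hudi interface).
interface CompactionHandler {
    // Assumed default for this sketch: handlers opt in explicitly.
    default boolean supportsFileGroupReader() {
        return false;
    }

    List<String> runCompactionUsingFileGroupReader(String instantTime);
}

// Stand-in for HoodieSparkCopyOnWriteTable: it is the object that actually
// serves as the compaction handler, so the overrides have to live here.
class CowTable implements CompactionHandler {
    @Override
    public boolean supportsFileGroupReader() {
        return true;
    }

    @Override
    public List<String> runCompactionUsingFileGroupReader(String instantTime) {
        return Collections.singletonList("compacted@" + instantTime);
    }
}

// Stand-in for HoodieSparkMergeOnReadTable: it does not compact by itself,
// it builds a copy-on-write table and delegates to it.
class MorTable {
    List<String> compact(String instantTime) {
        CompactionHandler handler = new CowTable();
        if (!handler.supportsFileGroupReader()) {
            throw new UnsupportedOperationException("file group reader not supported");
        }
        return handler.runCompactionUsingFileGroupReader(instantTime);
    }
}
```

In this sketch, dropping the two overrides from `CowTable` would break compaction for `MorTable` even though `MorTable` never references them directly, which is the coupling I find counter-intuitive.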