yihua commented on code in PR #12390:
URL: https://github.com/apache/hudi/pull/12390#discussion_r1866796877
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/HoodieCompactor.java:
##########
@@ -161,66 +162,70 @@ public List<WriteStatus> compact(HoodieCompactionHandler
compactionHandler,
Option<InstantRange> instantRange,
TaskContextSupplier taskContextSupplier,
CompactionExecutionHelper executionHelper)
throws IOException {
- HoodieStorage storage = metaClient.getStorage();
- Schema readerSchema;
- Option<InternalSchema> internalSchemaOption = Option.empty();
- if (!StringUtils.isNullOrEmpty(config.getInternalSchema())) {
- readerSchema = new Schema.Parser().parse(config.getSchema());
- internalSchemaOption = SerDeHelper.fromJson(config.getInternalSchema());
- // its safe to modify config here, since we are running in task side.
- ((HoodieTable) compactionHandler).getConfig().setDefault(config);
+    if (config.getBooleanOrDefault(HoodieReaderConfig.FILE_GROUP_READER_ENABLED)
+        && compactionHandler.supportsFileGroupReader()) {
Review Comment:
I agree with @vinothchandar that the major use cases of snapshot queries
already work with the file group reader. There are cases that might have
issues, e.g., MDT, schema on read, etc., for which I have explicitly turned
off the new feature so that they go through the old compaction flow.
@danny0405 yes, if there is any regression in Spark, users can turn off
file-group-reader-based compaction by disabling
`HoodieReaderConfig.FILE_GROUP_READER_ENABLED` /
`hoodie.file.group.reader.enabled`.
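For context, the opt-out described above amounts to setting that key to
`false` in the writer configuration. A minimal sketch: the class and helper
method below are hypothetical illustration only; the
`hoodie.file.group.reader.enabled` key is the one quoted in this thread.

```java
import java.util.HashMap;
import java.util.Map;

public class DisableFgReaderCompaction {
    // Builds a hypothetical writer-options map, e.g. as passed to a
    // Spark datasource write; only the config key comes from the thread.
    static Map<String, String> buildOptions() {
        Map<String, String> hudiOptions = new HashMap<>();
        // Fall back to the legacy compaction flow by disabling the
        // file group reader.
        hudiOptions.put("hoodie.file.group.reader.enabled", "false");
        return hudiOptions;
    }

    public static void main(String[] args) {
        System.out.println(buildOptions());
    }
}
```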
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]