BalaMahesh commented on issue #9758:
URL: https://github.com/apache/hudi/issues/9758#issuecomment-1792328313
hoodie.clean.async=false
After setting this to false, compaction is being triggered for the metadata
table. Earlier there were always pendingInstants of delta commits, because async
clean kicks in and creates a new delta commit on the metadata table timeline
that is earlier than the one on the data timeline.
```
List<HoodieInstant> pendingInstants = dataMetaClient.reloadActiveTimeline()
    .filterInflightsAndRequested()
    .findInstantsBeforeOrEquals(latestDeltaCommitTimeInMetadataTable)
    .getInstants();
if (!pendingInstants.isEmpty()) {
  LOG.info(String.format(
      "Cannot compact metadata table as there are %d inflight instants in data table "
          + "before latest deltacommit in metadata table: %s. Inflight instants in data table: %s",
      pendingInstants.size(), latestDeltaCommitTimeInMetadataTable,
      Arrays.toString(pendingInstants.toArray())));
  return;
}
```
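To make the blocking condition concrete, here is a minimal standalone sketch of the same check. It does not use Hudi classes: `CompactionGateSketch`, `Instant`, `pendingBeforeOrEquals` and `canCompact` are hypothetical names, and it assumes instant times are fixed-width timestamp strings that can be compared lexicographically.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class CompactionGateSketch {

  // Hypothetical stand-in for a timeline instant: a timestamp plus a completed flag.
  static final class Instant {
    final String timestamp;
    final boolean completed;

    Instant(String timestamp, boolean completed) {
      this.timestamp = timestamp;
      this.completed = completed;
    }
  }

  // Mirrors the shape of filterInflightsAndRequested().findInstantsBeforeOrEquals(ts):
  // keep instants that are not completed and whose timestamp is <= ts.
  // Assumption: timestamps are fixed-width strings, so compareTo orders them by time.
  static List<Instant> pendingBeforeOrEquals(List<Instant> dataTimeline, String ts) {
    return dataTimeline.stream()
        .filter(i -> !i.completed)
        .filter(i -> i.timestamp.compareTo(ts) <= 0)
        .collect(Collectors.toList());
  }

  // Compaction of the metadata table is allowed only when nothing is pending.
  static boolean canCompact(List<Instant> dataTimeline, String latestDeltaCommitInMetadata) {
    return pendingBeforeOrEquals(dataTimeline, latestDeltaCommitInMetadata).isEmpty();
  }

  public static void main(String[] args) {
    List<Instant> dataTimeline = Arrays.asList(
        new Instant("20231101100000", true),
        new Instant("20231101100500", false)); // still inflight on the data timeline

    // A later delta commit already exists on the metadata timeline: blocked.
    System.out.println(canCompact(dataTimeline, "20231101101000")); // false
    // Latest metadata delta commit is older than the inflight instant: allowed.
    System.out.println(canCompact(dataTimeline, "20231101100000")); // true
  }
}
```

In this model, any delta commit that lands on the metadata timeline after a still-pending data instant keeps `canCompact` false, which matches the log message quoted above.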
Also relevant is this piece of code in `HoodieBackedTableMetadataWriter`:
```
if (lastCompletedCompactionInstant.isPresent()
    && metadataMetaClient.getActiveTimeline().filterCompletedInstants()
        .findInstantsAfter(lastCompletedCompactionInstant.get().getTimestamp())
        .countInstants() < 3) {
```
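The condition in that fragment can be modelled in isolation as below. This is only a sketch under assumptions: `CleanGuardSketch`, `completedAfter` and `guardTriggers` are hypothetical names, timestamps are plain strings compared lexicographically, and "after" is taken to mean strictly greater. What the guarded branch does is not shown in the fragment, so the sketch only evaluates the condition.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class CleanGuardSketch {

  // Hypothetical constant name; the quoted fragment hard-codes the literal 3.
  static final int MIN_INSTANTS_AFTER_COMPACTION = 3;

  // Counts completed instants strictly after the last completed compaction.
  // Assumption: fixed-width timestamp strings, so compareTo orders them by time.
  static long completedAfter(List<String> completedTimestamps, String lastCompaction) {
    return completedTimestamps.stream()
        .filter(ts -> ts.compareTo(lastCompaction) > 0)
        .count();
  }

  // True when the quoted condition holds, i.e. the guarded branch is taken.
  static boolean guardTriggers(List<String> completedTimestamps, Optional<String> lastCompaction) {
    return lastCompaction.isPresent()
        && completedAfter(completedTimestamps, lastCompaction.get()) < MIN_INSTANTS_AFTER_COMPACTION;
  }

  public static void main(String[] args) {
    List<String> completed = Arrays.asList("001", "002", "003", "004", "005");
    System.out.println(guardTriggers(completed, Optional.of("003"))); // true: 2 instants after "003"
    System.out.println(guardTriggers(completed, Optional.of("002"))); // false: 3 instants after "002"
    System.out.println(guardTriggers(completed, Optional.empty()));   // false: no compaction yet
  }
}
```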
For the metadata files to be cleaned, we should always keep:
hoodie.metadata.compact.max.delta.commits=3
otherwise the files are never cleaned and pile up.
After making these config changes, I can see a cleaner metadata files
partition; earlier it was piled up with old files.