[
https://issues.apache.org/jira/browse/HUDI-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444216#comment-17444216
]
Ethan Guo commented on HUDI-2735:
---------------------------------
The archival process is triggered after every commit. Deltacommits are archived
based on the following config:
{code:java}
"hoodie.keep.max.commits": 8,
"hoodie.keep.min.commits": 6,
"hoodie.cleaner.commits.retained": 4
{code}
However, because of HUDI-2672, rollback instants keep being added to the
timeline and are not archived on their own, so their number keeps growing. Only
when more deltacommits are added and the number of deltacommits hits the
threshold are some rollbacks archived. It looks like the archival process does
not count the number of rollback instants.
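To make the observed behavior concrete, here is a minimal sketch of an archival trigger that counts only deltacommits. It is a simplified model of the symptom described above, not Hudi's actual archival code: the class ArchivalSketch, the Instant type, the instantsToArchive method, and the exact trigger rule (archive once deltacommits exceed the max, keeping the min) are all assumptions made for illustration. Only the two threshold values come from the config above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical model of the behavior described above; not Hudi's implementation.
class ArchivalSketch {
    // Thresholds copied from the config quoted in this comment.
    static final int KEEP_MAX_COMMITS = 8;
    static final int KEEP_MIN_COMMITS = 6;

    /** Hypothetical stand-in for a timeline instant: an action name plus an instant time. */
    static final class Instant {
        final String action;
        final long time;

        Instant(String action, long time) {
            this.action = action;
            this.time = time;
        }
    }

    /**
     * Assumed rule: archival fires only when the deltacommit count exceeds the max,
     * then archives every instant older than the oldest retained deltacommit.
     * Rollbacks never count toward the trigger.
     */
    static List<Instant> instantsToArchive(List<Instant> timeline) {
        List<Instant> sorted = timeline.stream()
            .sorted(Comparator.comparingLong((Instant i) -> i.time))
            .collect(Collectors.toList());
        List<Instant> deltaCommits = sorted.stream()
            .filter(i -> i.action.equals("deltacommit"))
            .collect(Collectors.toList());
        if (deltaCommits.size() <= KEEP_MAX_COMMITS) {
            return new ArrayList<>(); // threshold not hit: nothing is archived
        }
        // Oldest deltacommit that stays on the active timeline.
        long cutoff = deltaCommits.get(deltaCommits.size() - KEEP_MIN_COMMITS).time;
        return sorted.stream().filter(i -> i.time < cutoff).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Instant> timeline = new ArrayList<>();
        for (int i = 1; i <= 8; i++) timeline.add(new Instant("deltacommit", i * 10));
        for (int t = 11; t <= 13; t++) timeline.add(new Instant("rollback", t));
        for (int t = 81; t <= 110; t++) timeline.add(new Instant("rollback", t));
        System.out.println("archived with 8 deltacommits: " + instantsToArchive(timeline).size());
        timeline.add(new Instant("deltacommit", 200));
        System.out.println("archived with 9 deltacommits: " + instantsToArchive(timeline).size());
    }
}
```

In this model, 33 rollbacks next to 8 deltacommits archive nothing. Adding a ninth deltacommit archives the three oldest deltacommits and the three rollbacks that precede the cutoff, while the 30 most recent rollbacks stay on the active timeline.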
I filed a separate ticket to track the fix since the issue is not specific to
Kafka Connect:
[https://issues.apache.org/jira/projects/HUDI/issues/HUDI-2765?filter=allissues].
Once the rollbacks triggered when there are no Kafka messages are fixed in
https://issues.apache.org/jira/browse/HUDI-2672, this issue will no longer be
severe.
> Fix archival of commits in Java client for Kafka Connect
> --------------------------------------------------------
>
> Key: HUDI-2735
> URL: https://issues.apache.org/jira/browse/HUDI-2735
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Writer Core
> Reporter: Ethan Guo
> Assignee: Ethan Guo
> Priority: Blocker
> Fix For: 0.10.0
>
>