bvaradar commented on issue #1852:
URL: https://github.com/apache/hudi/issues/1852#issuecomment-663427905
What do you mean by "runs serially with ingestion"? My understanding was
that inline compaction happened in the same flow as writing so an inline
compaction would simply slow down ingestion.
===> Yes, that is what I meant. Inline Compaction would run after ingestion
but not in parallel. You can use #1752 to have it run concurrently.
Does INLINE_COMPACT_NUM_DELTA_COMMITS_PROP refer to the number of commits
retained in general, or the number of commits for a record?
==> INLINE_COMPACT_NUM_DELTA_COMMITS_PROP refers to number of ingestion
(deltacommits) between 2 compaction runs.
I see in the timeline I have several clean.requested and clean.inflight, how
can I get these to actually complete?
==> If it is in inflight state alone, there could be errors when Hudi is
trying to cleanup. Please look for exceptions in driver logs. Cleaner run
should be run automatically by default. Also, any pending clean operations will
automatically get picked up in next ingestion. So, it must have been failing
for some reasons. You can turn on logs to see what is happening.
Is it possible to force a compaction of the existing log files.
===> Yes, by configuring INLINE_COMPACT_NUM_DELTA_COMMITS_PROP. You can set
it to 1 to have aggressive compaction.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]