[ https://issues.apache.org/jira/browse/BOOKKEEPER-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yiming Zang updated BOOKKEEPER-1014:
------------------------------------
    Summary: User a separate log file for compaction  (was: Make BK Compaction use a separate log file)

> User a separate log file for compaction
> ---------------------------------------
>
>                 Key: BOOKKEEPER-1014
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-1014
>             Project: Bookkeeper
>          Issue Type: Task
>          Components: bookkeeper-server
>    Affects Versions: 4.4.0
>            Reporter: Yiming Zang
>            Assignee: Yiming Zang
>
> The current compaction causes a few issues:
> 1. BK can't reclaim disk space when the disk is full
> If the disks are almost full, major/minor compactions are suspended and
> only GC keeps running. This was intended to keep disk usage from growing
> further, and also reflects the fact that the EntryLogger cannot allocate
> any new entry logs due to NoWritableDirs. However, if every entry log holds
> a mix of short-lived and long-lived ledgers, GC cannot delete any entry
> log, and since compaction is disabled, the bookie cannot release any disk
> space at all. Having a separate allocation logic for compaction would
> address this problem: we can allocate a new file for compaction as long as
> the remaining disk space is greater than logSizeLimit.
> 2. Compaction might generate dirty data and fill the BK disk
> Currently, compaction is not transactional. In the current
> CompactionScannerFactory, if it fails to flush the entry log file, or fails
> to flush the ledgerCache, the "already flushed data" is not deleted, and
> compaction retries on the next run because the source log is still there
> after the failure. This generates duplicate data. If the data being
> compacted is long-lived and compaction keeps failing for some reason (e.g.
> a corrupted entry or a corrupted index), BK disk usage keeps growing.
> Making compaction transactional would address this issue. For example, if
> compaction fails for log1, we should roll back the compaction by deleting
> the data copied from log1, which becomes possible once we use a separate
> file for compaction.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
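
The two proposals in the description can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not BookKeeper's implementation: the class CompactionSketch, its methods, the ".compacting" and ".compacted" file suffixes, and the use of an empty entry to stand in for a corrupted one are all hypothetical. Only logSizeLimit is a name taken from the issue text.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical sketch of the two proposals in BOOKKEEPER-1014.
// None of these names are BookKeeper APIs.
final class CompactionSketch {
    private final Path dir;
    private final long logSizeLimit;

    CompactionSketch(Path dir, long logSizeLimit) {
        this.dir = dir;
        this.logSizeLimit = logSizeLimit;
    }

    // Proposal 1: compaction gets its own allocation check. It may create a
    // file as long as the remaining space is greater than logSizeLimit.
    static boolean canAllocateCompactionLog(long usableBytes, long logSizeLimit) {
        return usableBytes > logSizeLimit;
    }

    // Proposal 2: copy live entries into a separate file and roll back by
    // deleting that file if anything fails before the copy is committed.
    Path compact(String logName, List<byte[]> liveEntries) throws IOException {
        long usable = Files.getFileStore(dir).getUsableSpace();
        if (!canAllocateCompactionLog(usable, logSizeLimit)) {
            throw new IOException("not enough space for a compaction log");
        }
        Path inProgress = dir.resolve(logName + ".compacting");
        try {
            try (OutputStream out = Files.newOutputStream(inProgress)) {
                for (byte[] entry : liveEntries) {
                    // Assumption for the sketch: an empty entry models a
                    // corrupted entry that makes compaction fail.
                    if (entry.length == 0) {
                        throw new IOException("corrupted entry in " + logName);
                    }
                    out.write(entry);
                }
            }
            // Commit: only a fully written file receives the final name.
            Path committed = dir.resolve(logName + ".compacted");
            return Files.move(inProgress, committed, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException | RuntimeException e) {
            // Rollback: drop the partial copy so a retry cannot duplicate data.
            Files.deleteIfExists(inProgress);
            throw e;
        }
    }
}
```

Because the partial copy lives in its own file, a failed run leaves nothing behind, and the source log is untouched for the next attempt.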