Sounds good.

On Wed, Apr 19, 2017 at 10:54 PM, Venkateswara Rao Jujjuri <
jujj...@gmail.com> wrote:

> We have addressed this problem in slightly different way. I would like to
> add it to the conversation tomorrow.
>
> On Wed, Apr 19, 2017 at 10:46 PM, Yiming Zang (JIRA) <j...@apache.org>
> wrote:
>
> >
> >      [ https://issues.apache.org/jira/browse/BOOKKEEPER-1040?
> > page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> >
> > Yiming Zang updated BOOKKEEPER-1040:
> > ------------------------------------
> >     Description:
> > Bookkeeper is not able to reclaim disk space when it's full
> > If all disks are full or almost full, both major and minor compactions
> > are suspended and only GC keeps running. In the current design this is
> > the right thing to do: when disks are full, the EntryLogger cannot
> > allocate any new entry logs, and the intention is to keep disk usage
> > from growing further.
> > However, if every entry log contains a mix of short-lived and
> > long-lived ledgers, then when disks are full GC cannot delete any entry
> > logs, and with compaction suspended the bookie can no longer reclaim
> > any disk space by itself.
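As a rough illustration of why mixed entry logs defeat GC here (the class and method names below are hypothetical, not BookKeeper's actual API): an entry log can only be reclaimed once every ledger stored in it has been deleted, so a single long-lived ledger pins the whole file on disk.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the GC deletability rule: an entry log is
// reclaimable only when none of the ledgers with entries in it are still
// live. One long-lived ledger therefore keeps the entire log undeletable.
class EntryLogGcSketch {
    /** entry log id -> ids of ledgers that have entries in that log */
    private final Map<Long, Set<Long>> ledgersPerLog;
    /** ledgers that still exist (not yet deleted by clients) */
    private final Set<Long> liveLedgers;

    EntryLogGcSketch(Map<Long, Set<Long>> ledgersPerLog, Set<Long> liveLedgers) {
        this.ledgersPerLog = ledgersPerLog;
        this.liveLedgers = liveLedgers;
    }

    /** True iff every ledger in this log has been deleted. */
    boolean canDelete(long logId) {
        for (long ledgerId : ledgersPerLog.get(logId)) {
            if (liveLedgers.contains(ledgerId)) {
                return false; // one live ledger pins the whole log
            }
        }
        return true;
    }
}
```

With compaction suspended on full disks, only logs whose ledgers are all deleted can go, which is exactly the case that never arises when every log holds at least one long-lived ledger.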
> >
> > Compaction might keep generating duplicated data, which can fill the
> > disk
> > Currently, compaction is not transactional. In the current
> > CompactionScannerFactory, if flushing the entry log file or the
> > ledgerCache fails, the data that has already been flushed is not
> > deleted, and the entry log being compacted is retried the next time,
> > generating duplicated data.
> > Moreover, if the entry log being compacted holds long-lived data and
> > compaction keeps failing for some reason (e.g. a corrupted entry or a
> > corrupted index), disk usage keeps growing until either the entry log
> > can be garbage collected or the disk is full.
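A minimal sketch of the "separate log for compaction" idea the subject line proposes (all names below are illustrative, not the actual BookKeeper implementation): live entries are copied into a dedicated, throwaway compaction log, and flushing plus the index update form the commit point. On any failure before the commit the partial log is discarded, so a retried compaction cannot accumulate duplicated entries in the shared entry logs.

```java
import java.io.IOException;
import java.util.List;

// Hedged sketch: compaction writes into its own log file instead of the
// shared entry logs, making the whole pass abortable. Nothing from a
// failed attempt survives, so retries stay idempotent.
class SeparateCompactionLogSketch {

    /** Minimal view of a dedicated compaction log (illustrative only). */
    interface CompactionLog {
        void append(byte[] entry) throws IOException; // copy one live entry
        void flush() throws IOException;              // durably persist the copies
        void commitIndex() throws IOException;        // repoint ledger index entries
        void abortAndDelete();                        // discard the partial log
    }

    /** Returns true iff the old entry log may now be deleted. */
    static boolean compact(CompactionLog log, List<byte[]> liveEntries) {
        try {
            for (byte[] entry : liveEntries) {
                log.append(entry);
            }
            log.flush();
            log.commitIndex(); // commit point: copies become authoritative
            return true;       // caller may garbage-collect the old entry log
        } catch (IOException e) {
            log.abortAndDelete(); // nothing from the failed attempt survives
            return false;
        }
    }
}
```

The key design choice is that duplication can only exist transiently inside the compaction log, never in the entry logs that GC scans.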
> >
> >
> >
> > > Use separate log for compaction
> > > -------------------------------
> > >
> > >                 Key: BOOKKEEPER-1040
> > >                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-1040
> > >             Project: Bookkeeper
> > >          Issue Type: Bug
> > >          Components: bookkeeper-server
> > >    Affects Versions: 4.3.2
> > >            Reporter: Yiming Zang
> > >            Assignee: Yiming Zang
> > >
> >
> >
> >
> > --
> > This message was sent by Atlassian JIRA
> > (v6.3.15#6346)
> >
>
>
>
> --
> Jvrao
> ---
> First they ignore you, then they laugh at you, then they fight you, then
> you win. - Mahatma Gandhi
>
