[
https://issues.apache.org/jira/browse/BOOKKEEPER-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506332#comment-13506332
]
Flavio Junqueira commented on BOOKKEEPER-249:
---------------------------------------------
I think your interpretation of what I wrote is correct, but I haven't
properly expressed what I was trying to achieve, so let me step back.
When we delete a ledger L, we notify a set B of bookies through ZooKeeper
that they need to garbage-collect L. The set B is the union of all ensembles of
L as recorded in its metadata. There are two possible reasons for getting
spurious or zombie entries:
# A bookie b is not added to set B originally (bookie missing);
# A bookie b writes entries of L after garbage-collecting it (bookie race).
According to your examples, I think the former can happen if a bookie writes
entries of L but ends up not forming part of the ensemble of L. I don't see a
way of detecting it other than:
* Having a confirmation from the client that a bookie is actually part of the
ledger ensemble, which in some sense "commits" the ledger fragment the bookie
wrote. We don't have such a confirmation today, so it would be necessary to add
this mechanism.
* Having bookies periodically check whether the ledger metadata still exists (a sketch of such a polling check follows below).
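As a rough sketch of that second option (purely illustrative; the metadata path layout and the garbageCollectLedger hook are assumptions for this example, not the actual bookie code), a bookie could scan the ledgers it stores and ask ZooKeeper whether each metadata znode still exists:
{code:java}
import java.util.Set;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;

// Hypothetical sketch: periodically ask ZooKeeper whether a ledger's
// metadata znode still exists; if it is gone, the ledger was deleted
// and the bookie can garbage-collect its local fragments.
public class MetadataPollingGc {
    private final ZooKeeper zk;

    public MetadataPollingGc(ZooKeeper zk) {
        this.zk = zk;
    }

    // Assumed layout for this sketch: one metadata znode per ledger.
    private static String metadataPath(long ledgerId) {
        return String.format("/ledgers/L%010d", ledgerId);
    }

    // localLedgers: ids of ledgers this bookie stores fragments for.
    public void scan(Set<Long> localLedgers) throws InterruptedException {
        for (long ledgerId : localLedgers) {
            try {
                if (zk.exists(metadataPath(ledgerId), false) == null) {
                    // Metadata is gone, so the ledger was deleted.
                    garbageCollectLedger(ledgerId);
                }
            } catch (KeeperException e) {
                // Transient ZK error: skip and retry on the next scan.
            }
        }
    }

    private void garbageCollectLedger(long ledgerId) {
        // Placeholder: reclaim entry-log space for this ledger.
    }
}
{code}
The obvious cost is the extra read load this polling puts on ZooKeeper, which is exactly what the mechanism below tries to avoid for the bookie race case.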
The mechanism I was proposing was for the bookie race case, to avoid the extra
polling mechanism you suggested. I was essentially trying to maintain a
greatest lower bound for the ledgers that have already been deleted. I
understand that we don't delete them in order, although my example did give
that impression.
To maintain such a greatest lower bound, I was thinking that we could delete
only entire prefixes, with no holes. Let me go back to the example. If a bookie
has entries for ledgers L1, L5, and L6, and L5 is deleted, then we wouldn't
remove L5 or move the greatest lower bound until L1 is deleted. Once L1 is
deleted, we have a prefix formed by L1 and L5, and we remove the
corresponding ledger fragments, also setting the greatest lower bound to 5.
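Here is an illustrative sketch of that prefix rule (not actual bookie code; the class and method names are made up, and it assumes the bookie keeps a sorted view of the ledger ids it stores):
{code:java}
import java.util.NavigableSet;
import java.util.TreeSet;

// Illustrative sketch of the "delete whole prefixes only" rule.
// Delete notifications are buffered until they form a contiguous
// prefix of the locally stored ledger ids; only then does the bookie
// advance the greatest lower bound (glb) and reclaim space.
public class PrefixGc {
    private final NavigableSet<Long> localLedgers = new TreeSet<>();
    private final NavigableSet<Long> pendingDeletes = new TreeSet<>();
    private long glb = 0; // all ledgers <= glb are known deleted

    public void addLedger(long id) {
        localLedgers.add(id);
    }

    // Called when a delete notification for ledger `id` arrives.
    public void onDelete(long id) {
        pendingDeletes.add(id);
        // Reclaim while the smallest local ledger is pending deletion,
        // i.e. while the deleted ledgers form a prefix with no holes.
        while (!localLedgers.isEmpty()
                && pendingDeletes.contains(localLedgers.first())) {
            long head = localLedgers.pollFirst();
            pendingDeletes.remove(head);
            glb = head;              // advance the greatest lower bound
            reclaimFragments(head);
        }
    }

    public long getGlb() {
        return glb;
    }

    private void reclaimFragments(long ledgerId) {
        // Placeholder: delete this ledger's fragments from disk.
    }
}
{code}
Running the example through it: with local ledgers L1, L5, and L6, deleting L5 reclaims nothing; deleting L1 then reclaims L1 and L5 in one step and advances the greatest lower bound to 5.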
One main drawback of this approach is having to wait until L1 is actually
deleted, which in principle can happen at an arbitrary time. If ledgers are
short-lived, this works fine; otherwise, it could prevent bookies from
reclaiming space for arbitrarily long periods.
> Revisit garbage collection algorithm in Bookie server
> -----------------------------------------------------
>
> Key: BOOKKEEPER-249
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-249
> Project: Bookkeeper
> Issue Type: Improvement
> Components: bookkeeper-server
> Reporter: Sijie Guo
> Fix For: 4.2.0
>
> Attachments: gc_revisit.pdf
>
>
> Per discussion in BOOKKEEPER-181, it would be better to revisit the garbage
> collection algorithm in the bookie server, so this subtask was created to focus on it.