[ https://issues.apache.org/jira/browse/BOOKKEEPER-246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429031#comment-13429031 ]
Rakesh R commented on BOOKKEEPER-246:
-------------------------------------

bq. The worse case scenario here will be another replicator comes along, and sees that the ledger is already fully replicated, so it does nothing.

There can be cases of partial replication: after finishing its work, a replicator gives others a chance by releasing the lock. Assume the following ledger metadata:

0   BK1, BK2, BK3
10  BK1, BK4, BK3

Say BK1 shuts down and BK4 acquires the lock. BK4 would be able to replicate only the first fragment, as it already holds a copy of the second fragment. Now assume that, while BK4 is releasing the lock, there is a slight ZK fluctuation and it gets a connection loss exception (but the ZK session is still alive). Since the BK4 lock still exists, others cannot acquire it.

Here I feel that just recreating the LedgerManager won't fully work; instead we need to either close the ZK session or keep forcing the lock release until session expiry (timeout). Actually, I'm afraid of orphan locks, which would create situations where locks are held indefinitely.

Also, LedgerUnderreplicationManager presently doesn't have any close APIs, and it takes the zkclient externally?

> Recording of underreplication of ledger entries
> -----------------------------------------------
>
>                 Key: BOOKKEEPER-246
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-246
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-client, bookkeeper-server
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-246.diff, BOOKKEEPER-246.diff,
>                      BOOKKEEPER-246.diff, BOOKKEEPER-246.diff
>
>
> This JIRA is to decide how to record that entries in a ledger are
> underreplicated.
> I think there is a common understanding (correct me if I'm wrong), that
> rereplication can be broken into two logically distinct phases. A) Detection
> of entry underreplication & B) Rereplication.
> This subtask is to handle the interaction between these two stages.
> Stage B needs to know what to rereplicate; how should Stage A inform it?
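
The orphan-lock concern raised in the comment can be illustrated with a small, self-contained sketch. It does not use the ZooKeeper or BookKeeper APIs: `LockStore`, `ConnectionLossException`, `expireSession` and `releaseOrExpire` are all hypothetical names invented for this illustration, and the in-memory map merely stands in for session-bound lock znodes. The idea shown is one possible reading of the suggestion above: retry the release on a transient connection loss, and if the release still cannot be confirmed, give up the session so that the lock disappears with it.

```java
import java.util.HashMap;
import java.util.Map;

class OrphanLockSketch {
    /** Stand-in for a transient connection loss while the session stays alive. */
    static class ConnectionLossException extends Exception {}

    /** Hypothetical in-memory store standing in for session-bound lock nodes. */
    static class LockStore {
        private final Map<Long, String> locks = new HashMap<>();
        private int failuresToInject; // number of release calls that will fail

        LockStore(int failuresToInject) {
            this.failuresToInject = failuresToInject;
        }

        synchronized boolean tryAcquire(long ledgerId, String owner) {
            return locks.putIfAbsent(ledgerId, owner) == null;
        }

        synchronized void release(long ledgerId, String owner)
                throws ConnectionLossException {
            if (failuresToInject > 0) {
                failuresToInject--;
                throw new ConnectionLossException(); // lock is NOT removed
            }
            locks.remove(ledgerId, owner);
        }

        /** Models closing the session: every lock held by the owner vanishes. */
        synchronized void expireSession(String owner) {
            locks.values().removeIf(owner::equals);
        }

        synchronized boolean isLocked(long ledgerId) {
            return locks.containsKey(ledgerId);
        }
    }

    /**
     * Retries the release up to maxAttempts times. If every attempt fails,
     * falls back to expiring the owner's session so no orphan lock remains.
     * Returns true if the lock was released normally, false on fallback.
     */
    static boolean releaseOrExpire(LockStore store, long ledgerId,
                                   String owner, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                store.release(ledgerId, owner);
                return true;
            } catch (ConnectionLossException e) {
                // Transient failure: the lock may still exist, so retry.
            }
        }
        store.expireSession(owner);
        return false;
    }

    public static void main(String[] args) {
        LockStore store = new LockStore(1); // first release attempt fails
        store.tryAcquire(1L, "BK4");
        boolean released = releaseOrExpire(store, 1L, "BK4", 3);
        System.out.println("released normally: " + released);
        System.out.println("still locked: " + store.isLocked(1L));
    }
}
```

In a real deployment the fallback would correspond to closing the ZooKeeper session held by the replicator, which is why the question of who owns and closes the zkclient matters.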