[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429031#comment-13429031
 ] 

Rakesh R commented on BOOKKEEPER-246:
-------------------------------------

bq. The worst-case scenario here will be that another replicator comes along 
and sees that the ledger is already fully replicated, so it does nothing.

There can be cases of partial replication: after finishing its work, a 
replicator gives others a chance by releasing the lock. Assume the following 
is the ledger metadata:
0  BK1, BK2, BK3
10 BK1, BK4, BK3
Say BK1 shuts down and BK4 acquires the lock. BK4 would be able to replicate 
only the first fragment, as it already holds a copy of the second fragment. 
Now assume that while releasing the lock there is a slight ZK fluctuation and 
BK4 gets a connection loss exception (but the ZK session is still alive). 
Since BK4's lock still exists, others cannot acquire it. Here I feel that just 
recreating the LedgerManager won't fully work; instead, we need to either 
close the ZK session or keep forcing the release of the lock until session 
expiry (timeout).
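
To make the partial-replication case concrete, here is a minimal, 
self-contained sketch. It is not BookKeeper code: the class and method names 
are hypothetical, and it only models the rule described above, that a 
replicator can take over a fragment only if it is not already in that 
fragment's ensemble.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of the scenario; not part of the BookKeeper API.
public class PartialReplicationSketch {

    /**
     * Returns the start entry ids of the fragments that lost a copy on
     * `failed` and that `replicator` can re-replicate, i.e. fragments whose
     * ensemble contains `failed` but does not already contain `replicator`.
     */
    public static List<Long> replicableBy(TreeMap<Long, List<String>> ensembles,
                                          String failed, String replicator) {
        List<Long> result = new ArrayList<>();
        for (Map.Entry<Long, List<String>> e : ensembles.entrySet()) {
            List<String> ensemble = e.getValue();
            if (ensemble.contains(failed) && !ensemble.contains(replicator)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Ledger metadata from the example: start entry id -> ensemble.
        TreeMap<Long, List<String>> ensembles = new TreeMap<>();
        ensembles.put(0L, Arrays.asList("BK1", "BK2", "BK3"));
        ensembles.put(10L, Arrays.asList("BK1", "BK4", "BK3"));

        // BK1 is down; BK4 holds the lock.
        System.out.println(replicableBy(ensembles, "BK1", "BK4")); // prints [0]
    }
}
```

The fragment starting at entry 10 stays underreplicated, so another bookie 
must be able to acquire the lock after BK4 is done.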

Actually, I'm afraid of orphan locks, which could end up being held 
indefinitely. Also, LedgerUnderreplicationManager presently doesn't have any 
close API, and it takes the zkclient externally?
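
One way to avoid an orphan lock could look roughly like the sketch below. 
Everything in it is an assumption for illustration: LockSession, 
ConnectionLossException and FlakySession are hypothetical stand-ins rather 
than ZooKeeper or BookKeeper classes, and it assumes the lock is an ephemeral 
node that goes away when the session is closed. The idea is to retry the 
release a few times and, if the connection loss persists, close the session 
instead of leaving the lock behind.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; none of these types exist in ZooKeeper or BookKeeper.
public class LockReleaseSketch {

    /** Stand-in for a transient connection loss while the session is alive. */
    public static class ConnectionLossException extends Exception {}

    /** Minimal view of a coordination session that owns ephemeral locks. */
    public interface LockSession {
        void deleteLock(String path) throws ConnectionLossException;
        void close(); // ends the session, which drops its ephemeral locks
    }

    /**
     * Tries to delete the lock up to maxAttempts times. Returns true if the
     * delete went through, false if the session had to be closed instead.
     */
    public static boolean releaseOrClose(LockSession session, String path,
                                         int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                session.deleteLock(path);
                return true;
            } catch (ConnectionLossException e) {
                // Transient failure: fall through and retry.
            }
        }
        session.close(); // last resort so the lock cannot be orphaned
        return false;
    }

    /** In-memory fake whose first `failures` delete calls lose the connection. */
    public static class FlakySession implements LockSession {
        public final Set<String> locks = new HashSet<>();
        public boolean closed = false;
        private int failures;

        public FlakySession(int failures) { this.failures = failures; }

        @Override
        public void deleteLock(String path) throws ConnectionLossException {
            if (failures > 0) {
                failures--;
                throw new ConnectionLossException();
            }
            locks.remove(path);
        }

        @Override
        public void close() {
            closed = true;
            locks.clear();
        }
    }

    public static void main(String[] args) {
        FlakySession session = new FlakySession(1);
        session.locks.add("/locks/ledger-1"); // hypothetical lock path
        boolean deleted = releaseOrClose(session, "/locks/ledger-1", 3);
        System.out.println("deleted=" + deleted + " closed=" + session.closed);
    }
}
```

A real implementation would also have to handle the case where the delete 
actually succeeded on the server before the connection was lost, so that a 
retry finding no node is treated as success.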

                
> Recording of underreplication of ledger entries
> -----------------------------------------------
>
>                 Key: BOOKKEEPER-246
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-246
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-client, bookkeeper-server
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-246.diff, BOOKKEEPER-246.diff, 
> BOOKKEEPER-246.diff, BOOKKEEPER-246.diff
>
>
> This JIRA is to decide how to record that entries in a ledger are 
> underreplicated. 
> I think there is a common understanding (correct me if I'm wrong), that 
> rereplication can be broken into two logically distinct phases. A) Detection 
> of entry underreplication & B) Rereplication. 
> This subtask is to handle the interaction between these two stages. Stage B 
> needs to know what to rereplicate; how should Stage A inform it?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
