[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439426#comment-13439426
 ] 

Uma Maheswara Rao G commented on BOOKKEEPER-248:
------------------------------------------------

While writing a test with multiple recovery workers, I found an issue in the 
ZKLedgerUnderreplicationManager#getLedgerToRereplicateFromHierarchy API.

The issue is: 
1) Two workers start and try to get the lock for the same ledger.
2) Both workers find that the lock node does not exist.
3) Both go ahead and try to create the lock node.
4) One worker fails with a NodeExists exception.

The worker that failed then just removes the children from the list and waits 
on the latch for the watch notification.

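Unfortunately, the watch on lockPath was added by the exists() check, and at 
that time lockPath did not yet exist. So the watch may not be valid for the 
lock node that the other worker created afterwards (ZooKeeper watches are 
one-shot, so a watch set on a missing node is consumed by the NodeCreated 
event). The current worker will then never get a notification when the other 
worker cleans up the lock.
In this case the other worker has only partly replicated the ledger, and the 
current worker should now take the lock. But it cannot get that notification, 
because it added the watch when the node did not exist.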

Shall I handle that bug along with this JIRA?

A possible solution would be to add the watcher again on 
KeeperException.NodeExistsException, right?
Or we could simply handle NodeCreated in the watcher as well and notify, so 
that the worker tries again (I have not thought through many scenarios with 
this option).
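
To make the race and the first option concrete, here is a minimal sketch. It 
does not use the real ZooKeeper client: ToyZk is a hypothetical in-memory 
stand-in that only models one-shot exists() watches, and the lock path and 
method names are made up for illustration. It is not the code from the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical toy model of one-shot exists() watches; not the ZooKeeper client. */
class ToyZk {
    enum Event { NODE_CREATED, NODE_DELETED }

    private final Set<String> nodes = new HashSet<>();
    private final Map<String, List<Consumer<Event>>> watches = new HashMap<>();

    /** Registers a one-shot watch whether or not the node exists. */
    boolean exists(String path, Consumer<Event> watcher) {
        watches.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
        return nodes.contains(path);
    }

    /** Returns false where the real client would throw NodeExistsException. */
    boolean create(String path) {
        if (!nodes.add(path)) {
            return false;
        }
        fire(path, Event.NODE_CREATED);
        return true;
    }

    void delete(String path) {
        if (nodes.remove(path)) {
            fire(path, Event.NODE_DELETED);
        }
    }

    private void fire(String path, Event event) {
        // One-shot: every watch on the path is consumed by the first event.
        List<Consumer<Event>> pending = watches.remove(path);
        if (pending != null) {
            pending.forEach(w -> w.accept(event));
        }
    }
}

class WatchRaceSketch {
    static final String LOCK = "/locks/ledger-1"; // hypothetical path

    /** Returns true if worker B learns that worker A released the lock. */
    static boolean workerBSeesRelease(boolean rewatchOnNodeExists) {
        ToyZk zk = new ToyZk();
        boolean[] released = {false};
        Consumer<ToyZk.Event> watcher = e -> {
            if (e == ToyZk.Event.NODE_DELETED) {
                released[0] = true;
            }
        };

        zk.exists(LOCK, watcher);            // B checks: node missing, watch set
        zk.create(LOCK);                     // A wins; B's watch is consumed here
        boolean created = zk.create(LOCK);   // B loses (NodeExists)

        if (!created && rewatchOnNodeExists) {
            // Option 1: set the watch again now that the node exists. If the
            // node is already gone, treat the lock as released right away.
            if (!zk.exists(LOCK, watcher)) {
                released[0] = true;
            }
        }

        zk.delete(LOCK);                     // A releases the lock
        return released[0];
    }

    public static void main(String[] args) {
        System.out.println("without re-watch: " + workerBSeesRelease(false));
        System.out.println("with re-watch:    " + workerBSeesRelease(true));
    }
}
```

In this model the worker misses the release unless it sets the watch again 
after the failed create. The second option (reacting to NodeCreated inside the 
watcher) is not shown here.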


                
> Rereplicating of under replicated data
> --------------------------------------
>
>                 Key: BOOKKEEPER-248
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-248
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-client, bookkeeper-server
>            Reporter: Ivan Kelly
>            Assignee: Uma Maheswara Rao G
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-248.patch
>
>
> This subtask discusses how we will rereplicate underreplicated entries.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
