[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722726#comment-13722726
 ] 

Matteo Merli commented on BOOKKEEPER-665:
-----------------------------------------

If the bookie machine is reachable but the process is down, the read fails 
fast and the client retries on the next bookie. But if the bookie is 
partitioned (or the process is unresponsive), the client has to wait for the 
speculative read timeout, which is 2s by default.

If we already know the bookie is down (it is not registered in ZooKeeper), we 
should avoid reading from it, or at least leave it as the last replica to be 
tried, after all the other replicas have failed the read operation too.
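
A minimal sketch of the "try it last" option described above, assuming 
bookies are identified by plain strings and that the client holds a set of 
the bookies currently registered in ZooKeeper. The names ReadOrder and 
reorder are hypothetical and are not part of the BookKeeper client API; the 
attached patch may do this differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not BookKeeper code.
final class ReadOrder {

    // Returns the replicas in their original relative order, but with every
    // bookie that is missing from the available set moved to the end, so it
    // is only tried after all known-available replicas have failed.
    static List<String> reorder(List<String> replicas, Set<String> available) {
        List<String> up = new ArrayList<>();
        List<String> down = new ArrayList<>();
        for (String bookie : replicas) {
            if (available.contains(bookie)) {
                up.add(bookie);
            } else {
                down.add(bookie);
            }
        }
        up.addAll(down);
        return up;
    }
}
```

Keeping the unavailable bookies at the end of the list, rather than dropping 
them, still allows a read to succeed if the availability information is stale.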
                
> BK client should not try to read entries from non-available bookies
> -------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-665
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-665
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Matteo Merli
>            Assignee: Matteo Merli
>            Priority: Minor
>         Attachments: BOOKKEEPER-665.patch
>
>
> If a bookie is not in the available list, we shouldn't try to read from it 
> but just treat the read from that replica as failed.
> This matters especially when the bookie node is partitioned, because that 
> could mean waiting for the connection timeout. Also, during the 
> auto-replication of ledgers, most of the log output consists of errors 
> saying that it was not possible to read from the failed bookie.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
