Yep, we are thinking along the same lines. One idea is to add the failed bookie to a blacklist; from the next entry onwards, we would first try the healthy bookies in the quorum and only then try the blacklisted bookie. The reason for still retrying blacklisted bookies is that such a bookie may come back, or may serve later reads well. But it would get a chance only after the remaining bookies have failed the read. How does that sound? Brahma has already filed a JIRA, BK-336, and Rakesh has signed up for it.
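A minimal sketch of that read order, just to make the idea concrete. `ReadOrderPlanner` and its methods are hypothetical names for illustration only; they are not part of the BookKeeper client, and the real change would presumably live in or near PendingReadOp.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of the BookKeeper client API. */
final class ReadOrderPlanner {
    // Bookies that failed a recent read, kept in the order they failed.
    private final Set<String> blacklist = new LinkedHashSet<>();

    /** Record a bookie whose read attempt failed or timed out. */
    void markFailed(String bookie) {
        blacklist.add(bookie);
    }

    /** Forget a failure once the bookie has served a read again. */
    void markHealthy(String bookie) {
        blacklist.remove(bookie);
    }

    /**
     * Returns the bookies to try for one entry: healthy ones first, in their
     * original quorum order, then blacklisted ones as a last resort.
     */
    List<String> readOrder(List<String> quorum) {
        List<String> healthy = new ArrayList<>();
        List<String> suspect = new ArrayList<>();
        for (String bookie : quorum) {
            if (blacklist.contains(bookie)) {
                suspect.add(bookie);
            } else {
                healthy.add(bookie);
            }
        }
        healthy.addAll(suspect);
        return healthy;
    }
}
```

With a quorum of [BK1, BK2] and BK1 marked as failed, `readOrder` gives [BK2, BK1], so the unreachable bookie only costs a connection timeout when BK2 also fails the read.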
Regards,
Uma
________________________________________
From: Ivan Kelly [[email protected]]
Sent: Tuesday, July 10, 2012 4:22 PM
To: [email protected]
Subject: Re: Read entry performance when bookie in the ensemble in unreachable.....

Hmmm, perhaps we should add a blacklist to the PendingReadOp, or we could decide to only read from bookies which are in /ledgers/available.

Could you open a JIRA for this?

Cheers
Ivan

On Fri, Jul 06, 2012 at 04:55:10PM +0000, Rakesh R wrote:
> Hi All,
>
> Tested the following scenario with Bookkeeper-4.1 release:
> ----------------------------------------------------------
> Scenario:
> Prerequisites -> Say BK1, BK2, BK3
> 1) Create a ledger with say ensemble=3 quorum=2 and add say 100 entries.
> 2) After this, the BK1 machine is isolated by unplugging the network cable of BK1.
>
> What I've observed is that the bkclient waits nearly 10 seconds for the
> connection timeout from the failed bookie, and only after that retries
> the other bookie. I'm thinking this will affect read entry performance.
>
> Also, I could see that if the machine is reachable (and only the bookie
> server is shut down), the timeout is much shorter and the client immediately
> retries the other bookie in the ensemble.
>
> Is there anything I'm missing to configure in the netty channels or on the
> bookie server side for the bookie connection timeout?
>
> Following is the sample log from the test environment:
> ------------------------------------------------------
>
> 2012-07-06 22:07:59,734 ERROR hidden.bkjournal.org.apache.bookkeeper.client.PendingReadOp: Bookie handle is not available while reading entry: 18 ledgerId: 3 from bookie: /HOST1:3181
>
> 2012-07-06 22:08:10,234 ERROR hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: Could not connect to bookie: /HOST1:3181
> 2012-07-06 22:08:10,234 ERROR hidden.bkjournal.org.apache.bookkeeper.client.PendingReadOp: Bookie handle is not available while reading entry: 21 ledgerId: 3 from bookie: /HOST1:3181
> 2012-07-06 22:08:21,234 ERROR hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: Could not connect to bookie: /HOST1:3181
> 2012-07-06 22:08:21,234 ERROR hidden.bkjournal.org.apache.bookkeeper.client.PendingReadOp: Bookie handle is not available while reading entry: 24 ledgerId: 3 from bookie: /HOST1:3181
> 2012-07-06 22:08:31,734 ERROR hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: Could not connect to bookie: /HOST1:3181
> 2012-07-06 22:08:31,734 ERROR hidden.bkjournal.org.apache.bookkeeper.client.PendingReadOp: Bookie handle is not available while reading entry: 27 ledgerId: 3 from bookie: /HOST1:3181
> 2012-07-06 22:08:42,734 ERROR hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: Could not connect to bookie: /HOST1:3181
> 2012-07-06 22:08:42,734 ERROR hidden.bkjournal.org.apache.bookkeeper.client.PendingReadOp: Bookie handle is not available while reading entry: 30 ledgerId: 3 from bookie: /HOST1:3181
> 2012-07-06 22:08:53,234 ERROR hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: Could not connect to bookie: /HOST1:3181
>
> Thanks & Regards
> Rakesh R
