I'm not sure blacklisting is a good idea; the bookie could simply be a little slow at that point for whatever reason. It might be better to have a separate timeout for reads. We send read requests to the quorum sequentially to optimize the use of bandwidth, but it is clear that slow or dead bookies can hurt performance. We have essentially optimized for what we believed to be the common case.
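
To make the separate read timeout concrete, here is a minimal sketch in Python. It is not the BookKeeper client code, only an illustration of the idea: bookies are still tried one at a time, as in the sequential read, but each attempt is bounded by its own read timeout instead of waiting for the connect timeout. The names read_entry, read_timeout_s, unreachable_bookie and healthy_bookie are made up for this example, and the 0.5 s sleep stands in for the roughly 10 s connect timeout in the log below.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def read_entry(bookies, entry_id, read_timeout_s):
    """Try each bookie in order; skip one that fails or exceeds read_timeout_s.

    `bookies` is a list of (name, read_fn) pairs, where read_fn(entry_id)
    returns the entry bytes or raises. All names here are hypothetical.
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(bookies)))
    try:
        for name, read_fn in bookies:
            future = pool.submit(read_fn, entry_id)
            try:
                return name, future.result(timeout=read_timeout_s)
            except FutureTimeout:
                # Slow or unreachable: skip it for this read only (no blacklist).
                continue
            except Exception:
                # Fast failure, e.g. the machine is up but the bookie is down.
                continue
        raise IOError("no bookie returned entry %d" % entry_id)
    finally:
        # Do not block the caller on a read that is still stalled.
        pool.shutdown(wait=False)


def unreachable_bookie(entry_id):
    time.sleep(0.5)  # simulated connect timeout (assumption for the demo)
    raise ConnectionError("could not connect to bookie")


def healthy_bookie(entry_id):
    return b"entry-%d" % entry_id


start = time.monotonic()
name, data = read_entry(
    [("BK1", unreachable_bookie), ("BK2", healthy_bookie)],
    entry_id=18,
    read_timeout_s=0.05,
)
print(name, data, "elapsed: %.2fs" % (time.monotonic() - start))
```

With a short read timeout the read falls through to BK2 without waiting for BK1's connect timeout, and a bookie that is only temporarily slow is tried again on the next read because nothing is blacklisted.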

-Flavio

On Jul 10, 2012, at 12:52 PM, Ivan Kelly wrote:

> Hmmm, perhaps we should add a blacklist to the PendingReadOp, or we
> could decide to only read from bookies which are in /ledgers/available
>
> Could you open a JIRA for this.
>
> Cheers
> Ivan
>
> On Fri, Jul 06, 2012 at 04:55:10PM +0000, Rakesh R wrote:
>> Hi All,
>>
>> Tested the following scenario with Bookkeeper-4.1 release:
>> ----------------------------------------------------------
>> Scenario:
>> Prerequisites -> Say BK1, BK2, BK3
>> 1) Create a ledger with say ensemble=3 quorum=2 and added say 100 entries.
>> 2) After this, BK1 machine is isolated by unplugging the network cable of BK1.
>>
>> What I've observed is waiting nearly 10 seconds for connection timeout from
>> the failed bookie and only after that bkclient is retrying to the other
>> bookie...I'm thinking this will affect the read entries performance.
>>
>> Also, I could see if the machine is reachable (and just shutdown the bookie
>> server) the timeout is very less and immediately retrying to the other
>> bookie in the ensemble.
>>
>> Is there anything I'm missing to configure in netty channels or in bookie
>> server side for the bookie connection timeout?
>>
>> Following is the sample log from the test environment:
>> ------------------------------------------------------
>>
>> 2012-07-06 22:07:59,734 ERROR
>> hidden.bkjournal.org.apache.bookkeeper.client.PendingReadOp: Bookie handle
>> is not available while reading entry: 18 ledgerId: 3 from bookie: /HOST1:3181
>>
>> 2012-07-06 22:08:10,234 ERROR
>> hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: Could
>> not connect to bookie: /HOST1:3181
>> 2012-07-06 22:08:10,234 ERROR
>> hidden.bkjournal.org.apache.bookkeeper.client.PendingReadOp: Bookie handle
>> is not available while reading entry: 21 ledgerId: 3 from bookie: /HOST1:3181
>> 2012-07-06 22:08:21,234 ERROR
>> hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: Could
>> not connect to bookie: /HOST1:3181
>> 2012-07-06 22:08:21,234 ERROR
>> hidden.bkjournal.org.apache.bookkeeper.client.PendingReadOp: Bookie handle
>> is not available while reading entry: 24 ledgerId: 3 from bookie: /HOST1:3181
>> 2012-07-06 22:08:31,734 ERROR
>> hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: Could
>> not connect to bookie: /HOST1:3181
>> 2012-07-06 22:08:31,734 ERROR
>> hidden.bkjournal.org.apache.bookkeeper.client.PendingReadOp: Bookie handle
>> is not available while reading entry: 27 ledgerId: 3 from bookie: /HOST1:3181
>> 2012-07-06 22:08:42,734 ERROR
>> hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: Could
>> not connect to bookie: /HOST1:3181
>> 2012-07-06 22:08:42,734 ERROR
>> hidden.bkjournal.org.apache.bookkeeper.client.PendingReadOp: Bookie handle
>> is not available while reading entry: 30 ledgerId: 3 from bookie: /HOST1:3181
>> 2012-07-06 22:08:53,234 ERROR
>> hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: Could
>> not connect to bookie: /HOST1:3181
>>
>> Thanks & Regards
>> Rakesh R
