jingych,

inline:

On Wed, Nov 13, 2013 at 7:06 PM, jingych <[email protected]> wrote:

>  Thanks, Esteban and Stack!
>
> As Esteban said, the problem was solved.
>
> My config is below:
> <code>
> conf.setInt("hbase.client.retries.number", 1);
> conf.setInt("zookeeper.session.timeout", 5000);
> conf.setInt("zookeeper.recovery.retry", 1);
> conf.setInt("zookeeper.recovery.retry.intervalmill", 50);
> </code>
> But it still took 46 seconds.
> And the log prints:
> <log>
>
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/root-region-server
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/root-region-server
>
> </log>
> It still tried to build the 4 connections above.
>

The client (via HConnectionManager) needs to set a watcher on each of those
znodes in ZK (3 of them). Each first attempt has a max timeout of 5 seconds
(you have a single ZK server), plus 10 seconds for the second attempt:
3 * (5 * 2^0) + 3 * (5 * 2^1) = 45 seconds, and the extra second should come
from a hardcoded sleep in the RPC implementation during a retry.
Setting zookeeper.recovery.retry=0 can make it fail faster, but in the case
of a transient failure you will then have to handle reconnection in your own
code.
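
For illustration, here is a minimal sketch of that trade-off. It uses the
same property names discussed in this thread; the retry loop around
checkHBaseAvailable is an assumed application-level pattern, not something
the HBase client does for you:

<code>
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

// Sketch; assumes an enclosing method declared `throws Exception`.
Configuration conf = HBaseConfiguration.create();
// Fail fast: no ZK-level recovery retries, short session timeout.
conf.setInt("zookeeper.recovery.retry", 0);
conf.setInt("zookeeper.session.timeout", 5000);
conf.setInt("hbase.client.retries.number", 1);

// Since ZK no longer retries for us, handle transient failures with an
// application-level retry policy instead.
int attempts = 0;
while (true) {
    try {
        HBaseAdmin.checkHBaseAvailable(conf);
        break; // master reachable
    } catch (Exception e) {
        if (++attempts >= 3) {
            throw e; // give up after a few application-level tries
        }
        Thread.sleep(1000L * attempts); // simple linear backoff
    }
}
</code>

With zookeeper.recovery.retry=0 a transient ZK hiccup surfaces immediately,
so whether (and how often) to retry becomes your call.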

>
> Could you please explain why ZK does this? (I'm really new to the HBase
> world.)
> If I set the ZK session timeout to 1 s, is that OK?
>

You *could*, but you don't want clients to overwhelm ZK by re-establishing
connections over and over.


> And what do you mean by "depending on the number of ZK servers you have
> running the socket level timeout in the client to a ZK server will be
> zookeeper.session.timeout/#ZKs"?
> Does it mean that if I have 3 ZooKeepers and zookeeper.session.timeout=5000,
> each connection will time out after 5000/3 ms?
>

That's correct: the timeout to establish a connection to ZK will be around
1.67 seconds (5000 milliseconds / 3) with 3 ZKs.
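
In code form, that is just the arithmetic (the 3-server quorum is the
example from the question):

<code>
int sessionTimeoutMs = 5000; // zookeeper.session.timeout
int quorumSize = 3;          // number of ZK servers in the quorum
int perServerConnectTimeoutMs = sessionTimeoutMs / quorumSize; // 1666 ms, ~1.67 s
</code>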


> I'm running ZK and the HBase Master on one node in pseudo-distributed mode.
>

> Best Regards!
>
> ------------------------------
>
> jingych
>
> 2013-11-14
>
>  *From:* Esteban Gutierrez <[email protected]>
> *Sent:* 2013-11-14 06:10
> *To:* Stack <[email protected]>
> *Cc:* Hbase-User <[email protected]>; jingych <[email protected]>
> *Subject:* Re: Re: HBaseAdmin#checkHBaseAvailable COST ABOUT 1 MINUTE TO CHECK
> A DEAD(OR NOT EXISTS) HBASE MASTER
>
> jingych,
>
> That timeout comes from ZooKeeper. Are you running ZK on the same node you
> are running the HBase Master on? If your environment requires failing fast
> even for ZK connection timeouts, then you need to reduce
> zookeeper.recovery.retry.intervalmill and zookeeper.recovery.retry, since
> the retries are done via an exponential backoff (1 second, 2 seconds, 4
> seconds, ...). Also, depending on the number of ZK servers you have running,
> the socket-level timeout in the client to a ZK server will be
> zookeeper.session.timeout/#ZKs.
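>
> As a sketch of that arithmetic (not actual HBase client code, just the
> worst-case wait implied by the settings above):
>
> <code>
> // Each backoff round doubles the per-round timeout (sessionTimeout / #ZKs),
> // with the configured retry interval slept in between rounds.
> int sessionTimeoutMs = 5000, zks = 1, retries = 1, intervalMs = 50;
> long totalMs = 0;
> for (int r = 0; r <= retries; r++) {
>     totalMs += (long) (sessionTimeoutMs / zks) * (1L << r); // 5 s, then 10 s
>     if (r < retries) totalMs += intervalMs;
> }
> // totalMs is about 15 s per znode; multiply by the number of znodes watched.
> </code>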
>
> cheers,
> esteban.
>
>
>
>
>
>
>  --
> Cloudera, Inc.
>
>
>
> On Wed, Nov 13, 2013 at 7:21 AM, Stack <[email protected]> wrote:
>
>> More of the log and the version of HBase involved please.  Thanks.
>> St.Ack
>>
>>
>>  On Wed, Nov 13, 2013 at 1:07 AM, jingych <[email protected]> wrote:
>>
>>> Thanks, Esteban!
>>>
>>> I've tried, but it did not work.
>>>
>>> I first load the custom hbase-site.xml and then try to check the
>>> HBase server.
>>> So my code is like this:
>>> <code>
>>> conf.setInt("hbase.client.retries.number", 1);
>>> conf.setInt("hbase.client.pause", 5);
>>> conf.setInt("ipc.socket.timeout", 5000);
>>> conf.setInt("hbase.rpc.timeout", 5000);
>>> </code>
>>>
>>> The log prints: Sleeping 4000ms before retry #2...
>>>
>>> If the ZooKeeper quorum points at a wrong address, the process takes a
>>> very long time.
>>>
>>>
>>>
>>>
>>> Jing Yucheng
>>>
>>> Fundamental Software Business Unit
>>> Neusoft Corporation
>>> Mobile: 13889491801
>>> Tel: 0411-84835702
>>>
>>> Room 217, Building D1, No. 901 Huangpu Road, Ganjingzi District, Dalian
>>> Postcode: 116085
>>> Email: [email protected]
>>>
>>> From: Esteban Gutierrez
>>> Date: 2013-11-13 11:12
>>> To: [email protected]; jingych
>>> Subject: Re: HBaseAdmin#checkHBaseAvailable COST ABOUT 1 MINUTE TO CHECK
>>> A DEAD(OR NOT EXISTS) HBASE MASTER
>>>  jingych,
>>>
>>> The behavior is driven by the number of retries
>>> (hbase.client.retries.number), the length of the pause between retries
>>> (hbase.client.pause), the timeout to establish a connection
>>> (ipc.socket.timeout), and the time to get some data back from HBase
>>> (hbase.rpc.timeout). Lowering the RPC timeout and the IPC socket timeout
>>> should help the operation fail fast when the HBase Master is not
>>> responsive.
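>>>
>>> (A sketch of those four knobs together; the values are only an example of
>>> trading retries for fast failure:)
>>>
>>> <code>
>>> conf.setInt("hbase.client.retries.number", 1); // client retries
>>> conf.setInt("hbase.client.pause", 100);        // ms pause between retries
>>> conf.setInt("ipc.socket.timeout", 5000);       // ms to establish a connection
>>> conf.setInt("hbase.rpc.timeout", 5000);        // ms to get data back
>>> </code>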
>>>
>>> cheers,
>>> esteban.
>>>
>>>
>>>
>>>
>>> --
>>> Cloudera, Inc.
>>>
>>>
>>>
>>> On Tue, Nov 12, 2013 at 6:49 PM, jingych <[email protected]> wrote:
>>>
>>> > Hi,
>>> >
>>> > I wonder whether there is any way to limit the time cost of the
>>> > HBaseAdmin#checkHBaseAvailable method.
>>> >
>>> > I use HBaseAdmin#checkHBaseAvailable to test whether the HBase
>>> > master is reachable.
>>> > But if the target master is dead or does not exist at all, this method
>>> > takes 1 minute to return a result.
>>> >
>>> >
>>> >
>>> >
>>> > jingych
>>> > 2013-11-13
>>> >
>>> >
>>>
>>
>>
>
>
