RE: A question about Connection timed out and operation timeout

2009-08-20 Thread Benjamin Reed
are you using the single threaded or multithreaded C library? the exceeded 
deadline message means that our thread was supposed to get control after a 
certain period, but we got control that many milliseconds late. what is your 
session timeout?

ben


From: Qian Ye [yeqian@gmail.com]
Sent: Thursday, August 20, 2009 3:17 AM
To: zookeeper-user
Subject: A question about Connection timed out and operation timeout

Hi guys:

I met the problem again: an ephemeral node disappeared, and I found it
because my application got a operation timeout

My application which created an ephemeral node at the zookeeper server,
printed the following log

*WARNING: 08-20 03:09:20:  auto * 182894118176
[logid:][reqip:][auto_exchanger_zk_basic.cpp:605]get children
fail.[/forum/elect_nodes][-7][operation timeout]*

and the Zookeeper client printed the following log (the log level is INFO)

2009-08-19 21:36:18,067:3813(0x9556c520):zoo_i...@log_env@545: Client
environment:zookeeper.version=zookeeper C client 3.2.0
606 2009-08-19 21:36:18,067:3813(0x9556c520):zoo_i...@log_env@549:
Client environment:host.name=jx-ziyuan-test00.jx.baidu.com
607 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@557:
Client environments.name=Linux
608 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@558:
Client environments.arch=2.6.9-52bs
609 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@559:
Client environments.version=#2 SMP Fri Jan 26 13:34:38 CST 2007
610 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@567:
Client environment:user.name=club
611 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@577:
Client environment:user.home=/home/club
612 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@589:
Client environment:user.dir=/home/club/user/luhongbo/auto-exchanger
613 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@zookeeper_init@613:
Initiating client connection,
host=127.0.0.1:2181,127.0.0.1:2182sessionTimeout=2000 wa
tcher=0x408c56 sessionId=0x0
sessionPasswd=null context=(nil) flags=0
614 2009-08-19 21:36:18,069:3813(0x41401960):zoo_i...@check_events@1439:
initiated connection to server [127.0.0.1:2181]
615 2009-08-19 21:36:18,070:3813(0x41401960):zoo_i...@check_events@1484:
connected to server [127.0.0.1:2181] with session id=1232c1688a20093
616 2009-08-20
02:48:01,780:3813(0x41401960):zoo_w...@zookeeper_interest@1335:
Exceeded deadline by 520ms
617 2009-08-20
03:08:52,332:3813(0x41401960):zoo_w...@zookeeper_interest@1335:
Exceeded deadline by 14ms
618 2009-08-20
03:09:04,666:3813(0x41401960):zoo_w...@zookeeper_interest@1335:
Exceeded deadline by 48ms
619 2009-08-20
03:09:09,733:3813(0x41401960):zoo_w...@zookeeper_interest@1335:
Exceeded deadline by 24ms
620 *2009-08-20
03:09:20,289:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded
deadline by 264ms*
621 *2009-08-20
03:09:20,295:3813(0x41401960):zoo_er...@handle_socket_error_msg@1388: Socket
[127.0.0.1:2181] zk retcode=-7, errno=110(Connection timed out): conn
ection timed out (exceeded timeout by 264ms)*
622 *2009-08-20
03:09:20,309:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded
deadline by 284ms*
623 *2009-08-20
03:09:20,309:3813(0x41401960):zoo_er...@handle_socket_error_msg@1433: Socket
[127.0.0.1:2182] zk retcode=-4, errno=111(Connection refused):
server refused to accept the client*
624 *2009-08-20 03:09:20,353:3813(0x41401960):zoo_i...@check_events@1439:
initiated connection to server [127.0.0.1:2181]*
625 *2009-08-20 03:09:20,552:3813(0x41401960):zoo_i...@check_events@1484:
connected to server [127.0.0.1:2181] with session id=1232c1688a20093*

I don't know why the connection timed out happened at *2009-08-20
03:09:20,295:3813, *and the server refuse to accept the client. Could some
one give me any hints? And I'm not sure the meaning of Exceeded deadline by
xxms, need some help too.


P.S. I used the Zookeeper 3.2.0 (Server and C Client API) and run a
stand-alone instance

Thx all~

--
With Regards!

Ye, Qian
Made in Zhejiang University


Re: A question about Connection timed out and operation timeout

2009-08-20 Thread Qian Ye
Hi Ben:

I used multi-thread library, and the session timeout is set to 2000 when the
zookeeper handler was initialized.

On Thu, Aug 20, 2009 at 9:52 PM, Benjamin Reed br...@yahoo-inc.com wrote:

 are you using the single threaded or multithreaded C library? the exceeded
 deadline message means that our thread was supposed to get control after a
 certain period, but we got control that many milliseconds late. what is your
 session timeout?

 ben

 
 From: Qian Ye [yeqian@gmail.com]
 Sent: Thursday, August 20, 2009 3:17 AM
 To: zookeeper-user
 Subject: A question about Connection timed out and operation timeout

 Hi guys:

 I met the problem again: an ephemeral node disappeared, and I found it
 because my application got a operation timeout

 My application which created an ephemeral node at the zookeeper server,
 printed the following log

 *WARNING: 08-20 03:09:20:  auto * 182894118176
 [logid:][reqip:][auto_exchanger_zk_basic.cpp:605]get children
 fail.[/forum/elect_nodes][-7][operation timeout]*

 and the Zookeeper client printed the following log (the log level is INFO)

 2009-08-19 21:36:18,067:3813(0x9556c520):zoo_i...@log_env@545: Client
 environment:zookeeper.version=zookeeper C client 3.2.0
606 2009-08-19 21:36:18,067:3813(0x9556c520):zoo_i...@log_env@549:
 Client environment:host.name=jx-ziyuan-test00.jx.baidu.com
607 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@557:
 Client environments.name=Linux
608 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@558:
 Client environments.arch=2.6.9-52bs
609 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@559:
 Client environments.version=#2 SMP Fri Jan 26 13:34:38 CST 2007
610 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@567:
 Client environment:user.name=club
611 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@577:
 Client environment:user.home=/home/club
612 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@589:
 Client environment:user.dir=/home/club/user/luhongbo/auto-exchanger
613 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@zookeeper_init
 @613:
 Initiating client connection,
 host=127.0.0.1:2181,127.0.0.1:2182sessionTimeout=2000 wa
 tcher=0x408c56 sessionId=0x0
 sessionPasswd=null context=(nil) flags=0
614 2009-08-19 21:36:18,069:3813(0x41401960):zoo_i...@check_events
 @1439:
 initiated connection to server [127.0.0.1:2181]
615 2009-08-19 21:36:18,070:3813(0x41401960):zoo_i...@check_events
 @1484:
 connected to server [127.0.0.1:2181] with session id=1232c1688a20093
616 2009-08-20
 02:48:01,780:3813(0x41401960):zoo_w...@zookeeper_interest@1335:
 Exceeded deadline by 520ms
617 2009-08-20
 03:08:52,332:3813(0x41401960):zoo_w...@zookeeper_interest@1335:
 Exceeded deadline by 14ms
618 2009-08-20
 03:09:04,666:3813(0x41401960):zoo_w...@zookeeper_interest@1335:
 Exceeded deadline by 48ms
619 2009-08-20
 03:09:09,733:3813(0x41401960):zoo_w...@zookeeper_interest@1335:
 Exceeded deadline by 24ms
620 *2009-08-20
 03:09:20,289:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded
 deadline by 264ms*
621 *2009-08-20
 03:09:20,295:3813(0x41401960):zoo_er...@handle_socket_error_msg@1388:
 Socket
 [127.0.0.1:2181] zk retcode=-7, errno=110(Connection timed out): conn
 ection timed out (exceeded timeout by 264ms)*
622 *2009-08-20
 03:09:20,309:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded
 deadline by 284ms*
623 *2009-08-20
 03:09:20,309:3813(0x41401960):zoo_er...@handle_socket_error_msg@1433:
 Socket
 [127.0.0.1:2182] zk retcode=-4, errno=111(Connection refused):
 server refused to accept the client*
624 *2009-08-20 03:09:20,353:3813(0x41401960):zoo_i...@check_events
 @1439:
 initiated connection to server [127.0.0.1:2181]*
625 *2009-08-20 03:09:20,552:3813(0x41401960):zoo_i...@check_events
 @1484:
 connected to server [127.0.0.1:2181] with session id=1232c1688a20093*

 I don't know why the connection timed out happened at *2009-08-20
 03:09:20,295:3813, *and the server refuse to accept the client. Could some
 one give me any hints? And I'm not sure the meaning of Exceeded deadline
 by
 xxms, need some help too.


 P.S. I used the Zookeeper 3.2.0 (Server and C Client API) and run a
 stand-alone instance

 Thx all~

 --
 With Regards!

 Ye, Qian
 Made in Zhejiang University




-- 
With Regards!

Ye, Qian
Made in Zhejiang University


Re: A question about Connection timed out and operation timeout

2009-08-20 Thread Patrick Hunt

Hi Qian, it would good if you could create a jira for this:
https://issues.apache.org/jira/browse/ZOOKEEPER

include both the client logs and the server logs (for overlapping 
client/server time period where you see the problem). also the server 
config if you're using a quorum vs standalone. If you could also include 
some/all of the client side code you have that would be useful for us to 
review.


are you doing anything in particular during this time period where you 
see a problem? perhaps load testing of the server?


2 sec timeout is pretty low. the client needs to send a heartbeat every 
0.6 seconds (1/3 timeout), and if it doesn't hear back from the server 
in another 0.6 seconds it will close the connection and attempt to 
re-connect to a(nother) server. also if the server doesn't hear from the 
client every 2 seconds it will expire the session. in general this is 
ok, but if you have any network issues, or periodic heavy load on the 
server this could cause a problem. typically we suggest timeouts in the 
20-30 second range, but we do have some ppl that run in 5-10 second 
range regularly (but they have solid lans with low latency connections)


re the ephemeral node - did you try the suggestions I had in my previous 
email? (attached) (please include that on the jira as well)


Patrick

Qian Ye wrote:

Hi Ben:

I used multi-thread library, and the session timeout is set to 2000 when the
zookeeper handler was initialized.

On Thu, Aug 20, 2009 at 9:52 PM, Benjamin Reed br...@yahoo-inc.com wrote:


are you using the single threaded or multithreaded C library? the exceeded
deadline message means that our thread was supposed to get control after a
certain period, but we got control that many milliseconds late. what is your
session timeout?

ben


From: Qian Ye [yeqian@gmail.com]
Sent: Thursday, August 20, 2009 3:17 AM
To: zookeeper-user
Subject: A question about Connection timed out and operation timeout

Hi guys:

I met the problem again: an ephemeral node disappeared, and I found it
because my application got a operation timeout

My application which created an ephemeral node at the zookeeper server,
printed the following log

*WARNING: 08-20 03:09:20:  auto * 182894118176
[logid:][reqip:][auto_exchanger_zk_basic.cpp:605]get children
fail.[/forum/elect_nodes][-7][operation timeout]*

and the Zookeeper client printed the following log (the log level is INFO)

2009-08-19 21:36:18,067:3813(0x9556c520):zoo_i...@log_env@545: Client
environment:zookeeper.version=zookeeper C client 3.2.0
   606 2009-08-19 21:36:18,067:3813(0x9556c520):zoo_i...@log_env@549:
Client environment:host.name=jx-ziyuan-test00.jx.baidu.com
   607 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@557:
Client environments.name=Linux
   608 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@558:
Client environments.arch=2.6.9-52bs
   609 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@559:
Client environments.version=#2 SMP Fri Jan 26 13:34:38 CST 2007
   610 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@567:
Client environment:user.name=club
   611 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@577:
Client environment:user.home=/home/club
   612 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@589:
Client environment:user.dir=/home/club/user/luhongbo/auto-exchanger
   613 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@zookeeper_init
@613:
Initiating client connection,
host=127.0.0.1:2181,127.0.0.1:2182sessionTimeout=2000 wa
tcher=0x408c56 sessionId=0x0
sessionPasswd=null context=(nil) flags=0
   614 2009-08-19 21:36:18,069:3813(0x41401960):zoo_i...@check_events
@1439:
initiated connection to server [127.0.0.1:2181]
   615 2009-08-19 21:36:18,070:3813(0x41401960):zoo_i...@check_events
@1484:
connected to server [127.0.0.1:2181] with session id=1232c1688a20093
   616 2009-08-20
02:48:01,780:3813(0x41401960):zoo_w...@zookeeper_interest@1335:
Exceeded deadline by 520ms
   617 2009-08-20
03:08:52,332:3813(0x41401960):zoo_w...@zookeeper_interest@1335:
Exceeded deadline by 14ms
   618 2009-08-20
03:09:04,666:3813(0x41401960):zoo_w...@zookeeper_interest@1335:
Exceeded deadline by 48ms
   619 2009-08-20
03:09:09,733:3813(0x41401960):zoo_w...@zookeeper_interest@1335:
Exceeded deadline by 24ms
   620 *2009-08-20
03:09:20,289:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded
deadline by 264ms*
   621 *2009-08-20
03:09:20,295:3813(0x41401960):zoo_er...@handle_socket_error_msg@1388:
Socket
[127.0.0.1:2181] zk retcode=-7, errno=110(Connection timed out): conn
ection timed out (exceeded timeout by 264ms)*
   622 *2009-08-20
03:09:20,309:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded
deadline by 284ms*
   623 *2009-08-20
03:09:20,309:3813(0x41401960):zoo_er...@handle_socket_error_msg@1433:
Socket
[127.0.0.1:2182] zk retcode=-4, errno=111(Connection refused):
server refused to accept the client*
   624 *2009-08-20