RE: A question about Connection timed out and operation timeout
are you using the single threaded or multithreaded C library? the exceeded deadline message means that our thread was supposed to get control after a certain period, but we got control that many milliseconds late. what is your session timeout? ben From: Qian Ye [yeqian@gmail.com] Sent: Thursday, August 20, 2009 3:17 AM To: zookeeper-user Subject: A question about Connection timed out and operation timeout Hi guys: I met the problem again: an ephemeral node disappeared, and I found it because my application got a operation timeout My application which created an ephemeral node at the zookeeper server, printed the following log *WARNING: 08-20 03:09:20: auto * 182894118176 [logid:][reqip:][auto_exchanger_zk_basic.cpp:605]get children fail.[/forum/elect_nodes][-7][operation timeout]* and the Zookeeper client printed the following log (the log level is INFO) 2009-08-19 21:36:18,067:3813(0x9556c520):zoo_i...@log_env@545: Client environment:zookeeper.version=zookeeper C client 3.2.0 606 2009-08-19 21:36:18,067:3813(0x9556c520):zoo_i...@log_env@549: Client environment:host.name=jx-ziyuan-test00.jx.baidu.com 607 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@557: Client environments.name=Linux 608 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@558: Client environments.arch=2.6.9-52bs 609 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@559: Client environments.version=#2 SMP Fri Jan 26 13:34:38 CST 2007 610 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@567: Client environment:user.name=club 611 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@577: Client environment:user.home=/home/club 612 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@589: Client environment:user.dir=/home/club/user/luhongbo/auto-exchanger 613 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@zookeeper_init@613: Initiating client connection, host=127.0.0.1:2181,127.0.0.1:2182sessionTimeout=2000 wa tcher=0x408c56 sessionId=0x0 sessionPasswd=null context=(nil) flags=0 614 2009-08-19 21:36:18,069:3813(0x41401960):zoo_i...@check_events@1439: initiated connection to server [127.0.0.1:2181] 615 2009-08-19 21:36:18,070:3813(0x41401960):zoo_i...@check_events@1484: connected to server [127.0.0.1:2181] with session id=1232c1688a20093 616 2009-08-20 02:48:01,780:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 520ms 617 2009-08-20 03:08:52,332:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 14ms 618 2009-08-20 03:09:04,666:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 48ms 619 2009-08-20 03:09:09,733:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 24ms 620 *2009-08-20 03:09:20,289:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 264ms* 621 *2009-08-20 03:09:20,295:3813(0x41401960):zoo_er...@handle_socket_error_msg@1388: Socket [127.0.0.1:2181] zk retcode=-7, errno=110(Connection timed out): conn ection timed out (exceeded timeout by 264ms)* 622 *2009-08-20 03:09:20,309:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 284ms* 623 *2009-08-20 03:09:20,309:3813(0x41401960):zoo_er...@handle_socket_error_msg@1433: Socket [127.0.0.1:2182] zk retcode=-4, errno=111(Connection refused): server refused to accept the client* 624 *2009-08-20 03:09:20,353:3813(0x41401960):zoo_i...@check_events@1439: initiated connection to server [127.0.0.1:2181]* 625 *2009-08-20 03:09:20,552:3813(0x41401960):zoo_i...@check_events@1484: connected to server [127.0.0.1:2181] with session id=1232c1688a20093* I don't know why the connection timed out happened at *2009-08-20 03:09:20,295:3813, *and the server refuse to accept the client. Could some one give me any hints? And I'm not sure the meaning of Exceeded deadline by xxms, need some help too. P.S. I used the Zookeeper 3.2.0 (Server and C Client API) and run a stand-alone instance Thx all~ -- With Regards! Ye, Qian Made in Zhejiang University
Re: A question about Connection timed out and operation timeout
Hi Ben: I used multi-thread library, and the session timeout is set to 2000 when the zookeeper handler was initialized. On Thu, Aug 20, 2009 at 9:52 PM, Benjamin Reed br...@yahoo-inc.com wrote: are you using the single threaded or multithreaded C library? the exceeded deadline message means that our thread was supposed to get control after a certain period, but we got control that many milliseconds late. what is your session timeout? ben From: Qian Ye [yeqian@gmail.com] Sent: Thursday, August 20, 2009 3:17 AM To: zookeeper-user Subject: A question about Connection timed out and operation timeout Hi guys: I met the problem again: an ephemeral node disappeared, and I found it because my application got a operation timeout My application which created an ephemeral node at the zookeeper server, printed the following log *WARNING: 08-20 03:09:20: auto * 182894118176 [logid:][reqip:][auto_exchanger_zk_basic.cpp:605]get children fail.[/forum/elect_nodes][-7][operation timeout]* and the Zookeeper client printed the following log (the log level is INFO) 2009-08-19 21:36:18,067:3813(0x9556c520):zoo_i...@log_env@545: Client environment:zookeeper.version=zookeeper C client 3.2.0 606 2009-08-19 21:36:18,067:3813(0x9556c520):zoo_i...@log_env@549: Client environment:host.name=jx-ziyuan-test00.jx.baidu.com 607 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@557: Client environments.name=Linux 608 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@558: Client environments.arch=2.6.9-52bs 609 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@559: Client environments.version=#2 SMP Fri Jan 26 13:34:38 CST 2007 610 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@567: Client environment:user.name=club 611 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@577: Client environment:user.home=/home/club 612 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@589: Client environment:user.dir=/home/club/user/luhongbo/auto-exchanger 613 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@zookeeper_init @613: Initiating client connection, host=127.0.0.1:2181,127.0.0.1:2182sessionTimeout=2000 wa tcher=0x408c56 sessionId=0x0 sessionPasswd=null context=(nil) flags=0 614 2009-08-19 21:36:18,069:3813(0x41401960):zoo_i...@check_events @1439: initiated connection to server [127.0.0.1:2181] 615 2009-08-19 21:36:18,070:3813(0x41401960):zoo_i...@check_events @1484: connected to server [127.0.0.1:2181] with session id=1232c1688a20093 616 2009-08-20 02:48:01,780:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 520ms 617 2009-08-20 03:08:52,332:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 14ms 618 2009-08-20 03:09:04,666:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 48ms 619 2009-08-20 03:09:09,733:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 24ms 620 *2009-08-20 03:09:20,289:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 264ms* 621 *2009-08-20 03:09:20,295:3813(0x41401960):zoo_er...@handle_socket_error_msg@1388: Socket [127.0.0.1:2181] zk retcode=-7, errno=110(Connection timed out): conn ection timed out (exceeded timeout by 264ms)* 622 *2009-08-20 03:09:20,309:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 284ms* 623 *2009-08-20 03:09:20,309:3813(0x41401960):zoo_er...@handle_socket_error_msg@1433: Socket [127.0.0.1:2182] zk retcode=-4, errno=111(Connection refused): server refused to accept the client* 624 *2009-08-20 03:09:20,353:3813(0x41401960):zoo_i...@check_events @1439: initiated connection to server [127.0.0.1:2181]* 625 *2009-08-20 03:09:20,552:3813(0x41401960):zoo_i...@check_events @1484: connected to server [127.0.0.1:2181] with session id=1232c1688a20093* I don't know why the connection timed out happened at *2009-08-20 03:09:20,295:3813, *and the server refuse to accept the client. Could some one give me any hints? And I'm not sure the meaning of Exceeded deadline by xxms, need some help too. P.S. I used the Zookeeper 3.2.0 (Server and C Client API) and run a stand-alone instance Thx all~ -- With Regards! Ye, Qian Made in Zhejiang University -- With Regards! Ye, Qian Made in Zhejiang University
Re: A question about Connection timed out and operation timeout
Hi Qian, it would good if you could create a jira for this: https://issues.apache.org/jira/browse/ZOOKEEPER include both the client logs and the server logs (for overlapping client/server time period where you see the problem). also the server config if you're using a quorum vs standalone. If you could also include some/all of the client side code you have that would be useful for us to review. are you doing anything in particular during this time period where you see a problem? perhaps load testing of the server? 2 sec timeout is pretty low. the client needs to send a heartbeat every 0.6 seconds (1/3 timeout), and if it doesn't hear back from the server in another 0.6 seconds it will close the connection and attempt to re-connect to a(nother) server. also if the server doesn't hear from the client every 2 seconds it will expire the session. in general this is ok, but if you have any network issues, or periodic heavy load on the server this could cause a problem. typically we suggest timeouts in the 20-30 second range, but we do have some ppl that run in 5-10 second range regularly (but they have solid lans with low latency connections) re the ephemeral node - did you try the suggestions I had in my previous email? (attached) (please include that on the jira as well) Patrick Qian Ye wrote: Hi Ben: I used multi-thread library, and the session timeout is set to 2000 when the zookeeper handler was initialized. On Thu, Aug 20, 2009 at 9:52 PM, Benjamin Reed br...@yahoo-inc.com wrote: are you using the single threaded or multithreaded C library? the exceeded deadline message means that our thread was supposed to get control after a certain period, but we got control that many milliseconds late. what is your session timeout? ben From: Qian Ye [yeqian@gmail.com] Sent: Thursday, August 20, 2009 3:17 AM To: zookeeper-user Subject: A question about Connection timed out and operation timeout Hi guys: I met the problem again: an ephemeral node disappeared, and I found it because my application got a operation timeout My application which created an ephemeral node at the zookeeper server, printed the following log *WARNING: 08-20 03:09:20: auto * 182894118176 [logid:][reqip:][auto_exchanger_zk_basic.cpp:605]get children fail.[/forum/elect_nodes][-7][operation timeout]* and the Zookeeper client printed the following log (the log level is INFO) 2009-08-19 21:36:18,067:3813(0x9556c520):zoo_i...@log_env@545: Client environment:zookeeper.version=zookeeper C client 3.2.0 606 2009-08-19 21:36:18,067:3813(0x9556c520):zoo_i...@log_env@549: Client environment:host.name=jx-ziyuan-test00.jx.baidu.com 607 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@557: Client environments.name=Linux 608 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@558: Client environments.arch=2.6.9-52bs 609 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@559: Client environments.version=#2 SMP Fri Jan 26 13:34:38 CST 2007 610 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@567: Client environment:user.name=club 611 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@577: Client environment:user.home=/home/club 612 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@log_env@589: Client environment:user.dir=/home/club/user/luhongbo/auto-exchanger 613 2009-08-19 21:36:18,068:3813(0x9556c520):zoo_i...@zookeeper_init @613: Initiating client connection, host=127.0.0.1:2181,127.0.0.1:2182sessionTimeout=2000 wa tcher=0x408c56 sessionId=0x0 sessionPasswd=null context=(nil) flags=0 614 2009-08-19 21:36:18,069:3813(0x41401960):zoo_i...@check_events @1439: initiated connection to server [127.0.0.1:2181] 615 2009-08-19 21:36:18,070:3813(0x41401960):zoo_i...@check_events @1484: connected to server [127.0.0.1:2181] with session id=1232c1688a20093 616 2009-08-20 02:48:01,780:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 520ms 617 2009-08-20 03:08:52,332:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 14ms 618 2009-08-20 03:09:04,666:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 48ms 619 2009-08-20 03:09:09,733:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 24ms 620 *2009-08-20 03:09:20,289:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 264ms* 621 *2009-08-20 03:09:20,295:3813(0x41401960):zoo_er...@handle_socket_error_msg@1388: Socket [127.0.0.1:2181] zk retcode=-7, errno=110(Connection timed out): conn ection timed out (exceeded timeout by 264ms)* 622 *2009-08-20 03:09:20,309:3813(0x41401960):zoo_w...@zookeeper_interest@1335: Exceeded deadline by 284ms* 623 *2009-08-20 03:09:20,309:3813(0x41401960):zoo_er...@handle_socket_error_msg@1433: Socket [127.0.0.1:2182] zk retcode=-4, errno=111(Connection refused): server refused to accept the client* 624 *2009-08-20