[jira] Commented: (ZOOKEEPER-676) 50%-75% connection loss exceptions using zkpython
[ https://issues.apache.org/jira/browse/ZOOKEEPER-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836533#action_12836533 ]

Josh Fraser commented on ZOOKEEPER-676:
---

Henry -- I've attached the script and also pasted it here in its entirety, in case you have trouble downloading or displaying it. Thank you so much for your attention to this issue! :)

Additionally, this same script, and all the others I'm working on, work flawlessly on our RedHat 5.x boxes, just not on our developers' Macs.

#!/usr/bin/env python

import zookeeper

zkserver='localhost:2181'
zk = zookeeper.init(zkserver)
acl = {"perms":0x1f, "scheme":"world", "id" :"anyone"}
node = "/foo"
value = "bar"

if not zookeeper.exists(zk, node):
    zookeeper.create(zk, node, value, [acl])
    print "created: %s with value: %s" % (node, value)
else:
    zookeeper.set(zk, node, value)
    print "updated: %s with value: %s" % (node, value)

> 50%-75% connection loss exceptions using zkpython
> -
>
> Key: ZOOKEEPER-676
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-676
> Project: Zookeeper
> Issue Type: Bug
> Components: contrib, contrib-bindings
> Affects Versions: 3.2.2
> Environment: Mac OS X 10.5.8, MacBook Air Intel Core 2 Duo @ 1.86GHz, Python 2.5.1, ZooKeeper 3.2.2 Standalone
> Reporter: Josh Fraser
> Attachments: connection_test.py
>
> I get about 50-75% connection loss exceptions and about 10% Bus Error when using the contrib/zkpython zookeeper.so. Below is the exception:
>
> 2010-02-21 16:57:56,138:18481(0xb0081000):zoo_er...@handle_socket_error_msg@1359: Socket [fe80::1002:885:7f00:1:2181] zk retcode=-4, errno=47(Address family not supported by protocol family): connect() call failed
> Traceback (most recent call last):
>   File "./zksh.py", line 63, in
> 2010-02-21 16:57:56,138:18481(0xb0081000):zoo_i...@check_events@1439: initiated connection to server [127.0.0.1:2181]
>     zkcli.dispatch(cmd,*args)
>   File "./zksh.py", line 56, in dispatch
>     returned = run(*args)
>   File "./zksh.py", line 48, in ls
>     print "\n".join(self.cmd.listNode(node))
>   File "/Users/josh/git/zktools/commands.py", line 22, in listNode
>     for path in zookeeper.get_children(self.zk, node):
> zookeeper.ConnectionLossException: connection loss
>
> I've run this in gdb and have this backtrace:
>
> #0 free_pywatcher (pw=0x0) at src/c/zookeeper.c:199
> #1 0x0025ae09 in pyzoo_exists (self=0x0, args=0x0) at src/c/zookeeper.c:765
> #2 0x0018f51e in PyEval_EvalFrameEx ()
> #3 0x00191173 in PyEval_EvalCodeEx ()
> #4 0x0013b488 in PyFunction_SetClosure ()
> #5 0x00121505 in PyObject_Call ()
> #6 0x0018fcd0 in PyEval_EvalFrameEx ()
> #7 0x00191173 in PyEval_EvalCodeEx ()
> #8 0x0013b488 in PyFunction_SetClosure ()
> #9 0x00121505 in PyObject_Call ()
> #10 0x0018fcd0 in PyEval_EvalFrameEx ()
> #11 0x00191173 in PyEval_EvalCodeEx ()
> #12 0x0018f79d in PyEval_EvalFrameEx ()
> #13 0x00191173 in PyEval_EvalCodeEx ()
> #14 0x00191260 in PyEval_EvalCode ()
> #15 0x001a883c in PyErr_Display ()
> #16 0x001aa4ab in PyRun_InteractiveOneFlags ()
> #17 0x001aa5f9 in PyRun_InteractiveLoopFlags ()
> #18 0x001aaa2b in PyRun_AnyFileExFlags ()
> #19 0x001b5a57 in Py_Main ()
> #20 0x1fca in ?? ()
>
> zookeeper.c @ line 199:
>
> void free_pywatcher( pywatcher_t *pw)
> {
>   Py_DECREF(pw->callback);
>   free(pw);
> }
>
> That's as far as I've dug so far -- I ended up just writing a retry decorator to get around it for now. On the same machine, the zkCli.sh test client works flawlessly.
>
> Also, here's the Mac OS X Bus Error trace:
>
> Process: Python [18556]
> Path: /System/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python
> Identifier: Python
> Version: ??? (???)
> Code Type: X86 (Native)
> Parent Process: bash [18436]
> Interval Since Last Report: 3323078 sec
> Crashes Since Last Report: 50
> Per-App Interval Since Last Report: 0 sec
> Per-App Crashes Since Last Report: 38
> Date/Time: 2010-02-21 17:07:27.399 -0800
> OS Version: Mac OS X 10.5.8 (9L31a)
> Report Version: 6
> Anonymous UUID: FA533BDA-50B2-47A9-931C-6F2614C741F0
> Exception Type: EXC_BAD_ACCESS (SIGBUS)
> Exception Codes: KERN_PROTECTION_FAILURE at 0x0004
> Crashed Thread: 0
>
> Thread 0 Crashed:
> 0 zookeeper.so 0x002332bd free_pywatcher + 10 (zookeeper.c:199)
> 1 zookeeper.so 0x00239e09 pyzoo_exists + 984 (zookeeper.c:765)
> 2 org.python.python 0x0018f51e PyEval_EvalFrameEx + 17116
> 3 org.python.python 0x0018f700 PyEval_EvalFrameEx + 17598
> 4 org.python.python 0x00191173 PyEval_EvalCodeEx + 1638
> 5 org.python.python
[jira] Updated: (ZOOKEEPER-676) 50%-75% connection loss exceptions using zkpython
[ https://issues.apache.org/jira/browse/ZOOKEEPER-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Fraser updated ZOOKEEPER-676:
--
Attachment: connection_test.py

Test script demonstrating the issue
[jira] Commented: (ZOOKEEPER-676) 50%-75% connection loss exceptions using zkpython
[ https://issues.apache.org/jira/browse/ZOOKEEPER-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836516#action_12836516 ]

Henry Robinson commented on ZOOKEEPER-676:
--

Josh - I think I've identified a case in which this could happen. Could you share a minimal script that reproduces the problem? I'll test it against my fix to make sure I'm solving the problem you're seeing.

Thanks,
Henry
[jira] Commented: (ZOOKEEPER-676) 50%-75% connection loss exceptions using zkpython
[ https://issues.apache.org/jira/browse/ZOOKEEPER-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836507#action_12836507 ]

Henry Robinson commented on ZOOKEEPER-676:
--

Sorry - I just realised you gave that information. I'll dig into the problem and see what's up.
[jira] Commented: (ZOOKEEPER-676) 50%-75% connection loss exceptions using zkpython
[ https://issues.apache.org/jira/browse/ZOOKEEPER-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836506#action_12836506 ]

Henry Robinson commented on ZOOKEEPER-676:
--

Thanks for the report! Can you let me know what version of zkpython you're using - is it from ZK trunk or an official release?

There is another JIRA related to segfaults on watchers, I wonder if this is related...

Henry
[jira] Updated: (ZOOKEEPER-676) 50%-75% connection loss exceptions using zkpython
[ https://issues.apache.org/jira/browse/ZOOKEEPER-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Fraser updated ZOOKEEPER-676:
--
Description:
I get about 50-75% connection loss exceptions and about 10% Bus Error when using the contrib/zkpython zookeeper.so. Below is the exception:

2010-02-21 16:57:56,138:18481(0xb0081000):zoo_er...@handle_socket_error_msg@1359: Socket [fe80::1002:885:7f00:1:2181] zk retcode=-4, errno=47(Address family not supported by protocol family): connect() call failed
Traceback (most recent call last):
  File "./zksh.py", line 63, in
2010-02-21 16:57:56,138:18481(0xb0081000):zoo_i...@check_events@1439: initiated connection to server [127.0.0.1:2181]
    zkcli.dispatch(cmd,*args)
  File "./zksh.py", line 56, in dispatch
    returned = run(*args)
  File "./zksh.py", line 48, in ls
    print "\n".join(self.cmd.listNode(node))
  File "/Users/josh/git/zktools/commands.py", line 22, in listNode
    for path in zookeeper.get_children(self.zk, node):
zookeeper.ConnectionLossException: connection loss

I've run this in gdb and have this backtrace:

#0 free_pywatcher (pw=0x0) at src/c/zookeeper.c:199
#1 0x0025ae09 in pyzoo_exists (self=0x0, args=0x0) at src/c/zookeeper.c:765
#2 0x0018f51e in PyEval_EvalFrameEx ()
#3 0x00191173 in PyEval_EvalCodeEx ()
#4 0x0013b488 in PyFunction_SetClosure ()
#5 0x00121505 in PyObject_Call ()
#6 0x0018fcd0 in PyEval_EvalFrameEx ()
#7 0x00191173 in PyEval_EvalCodeEx ()
#8 0x0013b488 in PyFunction_SetClosure ()
#9 0x00121505 in PyObject_Call ()
#10 0x0018fcd0 in PyEval_EvalFrameEx ()
#11 0x00191173 in PyEval_EvalCodeEx ()
#12 0x0018f79d in PyEval_EvalFrameEx ()
#13 0x00191173 in PyEval_EvalCodeEx ()
#14 0x00191260 in PyEval_EvalCode ()
#15 0x001a883c in PyErr_Display ()
#16 0x001aa4ab in PyRun_InteractiveOneFlags ()
#17 0x001aa5f9 in PyRun_InteractiveLoopFlags ()
#18 0x001aaa2b in PyRun_AnyFileExFlags ()
#19 0x001b5a57 in Py_Main ()
#20 0x1fca in ?? ()

zookeeper.c @ line 199:

void free_pywatcher( pywatcher_t *pw)
{
  Py_DECREF(pw->callback);
  free(pw);
}

That's as far as I've dug so far -- I ended up just writing a retry decorator to get around it for now. On the same machine, the zkCli.sh test client works flawlessly.

Also, here's the Mac OS X Bus Error trace:

Process: Python [18556]
Path: /System/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python
Identifier: Python
Version: ??? (???)
Code Type: X86 (Native)
Parent Process: bash [18436]
Interval Since Last Report: 3323078 sec
Crashes Since Last Report: 50
Per-App Interval Since Last Report: 0 sec
Per-App Crashes Since Last Report: 38
Date/Time: 2010-02-21 17:07:27.399 -0800
OS Version: Mac OS X 10.5.8 (9L31a)
Report Version: 6
Anonymous UUID: FA533BDA-50B2-47A9-931C-6F2614C741F0
Exception Type: EXC_BAD_ACCESS (SIGBUS)
Exception Codes: KERN_PROTECTION_FAILURE at 0x0004
Crashed Thread: 0

Thread 0 Crashed:
0 zookeeper.so 0x002332bd free_pywatcher + 10 (zookeeper.c:199)
1 zookeeper.so 0x00239e09 pyzoo_exists + 984 (zookeeper.c:765)
2 org.python.python 0x0018f51e PyEval_EvalFrameEx + 17116
3 org.python.python 0x0018f700 PyEval_EvalFrameEx + 17598
4 org.python.python 0x00191173 PyEval_EvalCodeEx + 1638
5 org.python.python 0x0013b488 PyFunction_SetClosure + 2667
6 org.python.python 0x00121505 PyObject_Call + 50
7 org.python.python 0x0018fcd0 PyEval_EvalFrameEx + 19086
8 org.python.python 0x00191173 PyEval_EvalCodeEx + 1638
9 org.python.python 0x0013b488 PyFunction_SetClosure + 2667
10 org.python.python 0x00121505 PyObject_Call + 50
11 org.python.python 0x0018fcd0 PyEval_EvalFrameEx + 19086
12 org.python.python 0x00191173 PyEval_EvalCodeEx + 1638
13 org.python.python 0x00191260 PyEval_EvalCode + 87
14 org.python.python 0x001a883c PyErr_Display + 1896
15 org.python.python 0x001a8e66 PyRun_FileExFlags + 135
16 org.python.python 0x001aa7d2 PyRun_SimpleFileExFlags + 421
17 org.python.python 0x001b5a57 Py_Main + 3095
18 org.python.pythonapp 0x1fca 0x1000 + 4042

Thread 1:
0 libSystem.B.dylib 0x9265fe0e poll$UNIX2003 + 10
1 libSystem.B.dylib 0x9262a155 _pthread_start + 321
2 libSystem.B.dylib 0x9262a012 thread_start + 34

Thread 2:
0 libSystem.B.dylib 0x9260046e __semwait_signal + 10
1 libSystem.B.dylib 0x9262adcd pthread_cond_wait$UNIX2003 + 73
2 libzookeeper_mt.2.dylib 0x00247e9f do_completion +
[jira] Created: (ZOOKEEPER-676) 50%-75% connection loss exceptions using zkpython
50%-75% connection loss exceptions using zkpython
-

Key: ZOOKEEPER-676
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-676
Project: Zookeeper
Issue Type: Bug
Components: contrib, contrib-bindings
Affects Versions: 3.2.2
Environment: Mac OS X 10.5.8, MacBook Air Intel Core 2 Duo @ 1.86GHz, Python 2.5.1, ZooKeeper 3.2.2 Standalone
Reporter: Josh Fraser

I get about 50-75% connection loss exceptions and about 10% Bus Error when using the contrib/zkpython zookeeper.so. Below is the exception:

2010-02-21 16:57:56,138:18481(0xb0081000):zoo_er...@handle_socket_error_msg@1359: Socket [fe80::1002:885:7f00:1:2181] zk retcode=-4, errno=47(Address family not supported by protocol family): connect() call failed
Traceback (most recent call last):
  File "./zksh.py", line 63, in
2010-02-21 16:57:56,138:18481(0xb0081000):zoo_i...@check_events@1439: initiated connection to server [127.0.0.1:2181]
    zkcli.dispatch(cmd,*args)
  File "./zksh.py", line 56, in dispatch
    returned = run(*args)
  File "./zksh.py", line 48, in ls
    print "\n".join(self.cmd.listNode(node))
  File "/Users/josh/git/zktools/commands.py", line 22, in listNode
    for path in zookeeper.get_children(self.zk, node):
zookeeper.ConnectionLossException: connection loss

I've run this in gdb and have this backtrace:

#0 free_pywatcher (pw=0x0) at src/c/zookeeper.c:199
#1 0x0025ae09 in pyzoo_exists (self=0x0, args=0x0) at src/c/zookeeper.c:765
#2 0x0018f51e in PyEval_EvalFrameEx ()
#3 0x00191173 in PyEval_EvalCodeEx ()
#4 0x0013b488 in PyFunction_SetClosure ()
#5 0x00121505 in PyObject_Call ()
#6 0x0018fcd0 in PyEval_EvalFrameEx ()
#7 0x00191173 in PyEval_EvalCodeEx ()
#8 0x0013b488 in PyFunction_SetClosure ()
#9 0x00121505 in PyObject_Call ()
#10 0x0018fcd0 in PyEval_EvalFrameEx ()
#11 0x00191173 in PyEval_EvalCodeEx ()
#12 0x0018f79d in PyEval_EvalFrameEx ()
#13 0x00191173 in PyEval_EvalCodeEx ()
#14 0x00191260 in PyEval_EvalCode ()
#15 0x001a883c in PyErr_Display ()
#16 0x001aa4ab in PyRun_InteractiveOneFlags ()
#17 0x001aa5f9 in PyRun_InteractiveLoopFlags ()
#18 0x001aaa2b in PyRun_AnyFileExFlags ()
#19 0x001b5a57 in Py_Main ()
#20 0x1fca in ?? ()

zookeeper.c @ line 199:

void free_pywatcher( pywatcher_t *pw)
{
  Py_DECREF(pw->callback);
  free(pw);
}

That's as far as I've dug so far -- I ended up just writing a retry decorator to get around it for now. On the same machine, the zkCli.sh test client works flawlessly.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
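The retry decorator mentioned in the report is not included anywhere in the thread. A minimal sketch of what such a workaround might look like follows, written for current Python 3 rather than the reporter's Python 2.5.1. The names `retry`, `ConnectionLoss`, and `flaky_list_children` are hypothetical; `ConnectionLoss` is a stand-in for `zookeeper.ConnectionLossException` so that the example runs without zkpython installed.

```python
import functools
import time


class ConnectionLoss(Exception):
    """Hypothetical stand-in for zookeeper.ConnectionLossException."""


def retry(exceptions, attempts=5, delay=0.0):
    """Retry the wrapped call when one of `exceptions` is raised."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise  # out of attempts: surface the last error
                    time.sleep(delay)
        return wrapper
    return decorator


calls = {"count": 0}


@retry(ConnectionLoss, attempts=5)
def flaky_list_children():
    """Fails twice, then succeeds, to mimic intermittent connection loss."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionLoss("connection loss")
    return ["foo"]


result = flaky_list_children()
print(result, calls["count"])
```

A wrapper like this can only mask the intermittent connection loss. It does nothing for the Bus Error, because that signal terminates the interpreter before any Python exception handler runs.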
[jira] Commented: (ZOOKEEPER-569) Failure of elected leader can lead to never-ending leader election
[ https://issues.apache.org/jira/browse/ZOOKEEPER-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836356#action_12836356 ]

Hudson commented on ZOOKEEPER-569:
--

Integrated in ZooKeeper-trunk #703 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/703/])
. Failure of elected leader can lead to never-ending leader election (henry via flavio)

> Failure of elected leader can lead to never-ending leader election
> --
>
> Key: ZOOKEEPER-569
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-569
> Project: Zookeeper
> Issue Type: Bug
> Reporter: Henry Robinson
> Assignee: Henry Robinson
> Fix For: 3.3.0
> Attachments: zookeeper-569.patch, ZOOKEEPER-569.patch, zookeeper-569.patch, zookeeper-569.patch, zookeeper-569.patch, zookeeper-569.patch
>
> It is possible for basic LeaderElection to enter a situation where it never terminates.
> As an example, consider a three node cluster A, B and C.
> 1. In the first round, A votes for A, B votes for B and C votes for C
> 2. Since C > B > A, all nodes resolve to vote for C in the second round as there is no first round winner
> 3. A, B vote for C, but C fails.
> 4. C is not elected because neither A nor B hear from it, and so votes for it are discarded
> 5. A and B never reset their votes, despite not hearing from C, so continue to vote for it ad infinitum.
> Step 5 is the bug. If A and B reset their votes to themselves in the case where the heard-from vote set is empty, leader election will continue.
> I do not know if this affects running ZK clusters, as it is possible that the out-of-band failure detection protocols may cause leader election to be restarted anyhow, but I've certainly seen this in tests.
> I have a trivial patch which fixes it, but it needs a test (and tests for race conditions are hard to write!)
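The five steps in the ZOOKEEPER-569 description can be traced with a toy model. This is an illustrative sketch only, not ZooKeeper's LeaderElection code and not the attached patch: `run_election`, the `reset_on_empty` flag, the round limit, and the majority-quorum rule are all assumptions made for the example. Node names are compared as strings to mirror the C > B > A ordering.

```python
from collections import Counter


def run_election(votes, alive, cluster_size, reset_on_empty, max_rounds=10):
    """Toy model: return the elected node, or None if no leader emerges."""
    votes = dict(votes)
    live = sorted(alive)
    quorum = cluster_size // 2 + 1  # assumed: majority of the full cluster
    for _ in range(max_rounds):
        # Only live nodes are heard from; votes for silent nodes are discarded.
        valid = [votes[n] for n in live if votes[n] in alive]
        if not valid:
            if reset_on_empty:
                # Fix described in the issue: fall back to voting for yourself.
                votes = {n: n for n in live}
            continue
        candidate, count = Counter(valid).most_common(1)[0]
        if count >= quorum:
            return candidate
        # No winner this round: every node adopts the highest-ranked candidate.
        best = max(valid)
        votes = {n: best for n in live}
    return None


# State after step 3: A and B both vote for C, and C has failed.
votes_after_step_3 = {"A": "C", "B": "C"}
alive = {"A", "B"}

stuck = run_election(votes_after_step_3, alive, 3, reset_on_empty=False)
fixed = run_election(votes_after_step_3, alive, 3, reset_on_empty=True)
print(stuck, fixed)  # prints: None B
```

Without the reset, the set of usable votes stays empty in every round, so the loop only ends because of the artificial `max_rounds` cap. With the reset, A and B vote for themselves, converge on B in the next round, and B reaches the assumed quorum of two.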