[ https://issues.apache.org/jira/browse/ZOOKEEPER-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Botond Hejj updated ZOOKEEPER-1023:
-----------------------------------

    Description: 
If add_auth is called with a callback and another command is executed 
immediately afterwards, the Python API can deadlock.
Example:

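import zookeeper

def deadlock(zh, rc):
    # add_auth completion callback; its body is irrelevant to the bug
    pass

def watcher(zh, type, state, path):
    if state == zookeeper.CONNECTED_STATE:
        zookeeper.add_auth(zh, 'test', 'test', deadlock)
        # synchronous call issued right after add_auth; this is where it hangs
        zookeeper.get_children(zh, '/')

zh = zookeeper.init("host:port", watcher)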

Looking at the code, the problem appears to be the following:
The synchronous get_children call runs on the main thread, which holds the GIL 
and blocks until get_children has finished. Meanwhile, on the other thread, the 
add_auth callback is invoked, and that thread tries to acquire the GIL in order 
to call the Python callback. So this thread is waiting for the main thread to 
release the GIL, while the main thread is waiting for the other thread to 
process the reply to get_children.

I am not an expert on the Python binding, but I think this can be solved by 
releasing the GIL before synchronous C API calls.
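
For illustration only, here is a minimal sketch of the lock ordering described 
above. It does not use zkpython at all: a plain threading.Lock stands in for 
the GIL, and run_scenario, io_thread and reply_ready are hypothetical names 
invented for this example. The timeout exists only so that the stalled case 
returns instead of hanging forever, as the real deadlock would.

```python
import threading

def run_scenario(release_lock_while_waiting, timeout=0.5):
    """Return True if the simulated get_children reply arrives, False if it stalls."""
    gil = threading.Lock()           # stand-in for the interpreter lock (assumption)
    reply_ready = threading.Event()  # set once the get_children reply is processed

    def io_thread():
        # The add_auth completion has to run Python code, so the thread
        # must take the "GIL" before it can do anything else.
        with gil:
            pass  # the Python callback would run here
        # Only afterwards can it process the reply to get_children.
        reply_ready.set()

    gil.acquire()  # the calling thread is executing Python code, so it holds the lock
    worker = threading.Thread(target=io_thread, daemon=True)
    worker.start()

    if release_lock_while_waiting:
        # Proposed fix: drop the lock for the duration of the blocking call.
        gil.release()
        completed = reply_ready.wait(timeout)
        gil.acquire()
    else:
        # Reported behaviour: block on the reply while still holding the lock.
        completed = reply_ready.wait(timeout)

    gil.release()
    worker.join(timeout)
    return completed

if __name__ == "__main__":
    print("lock held while waiting:    ", run_scenario(False))
    print("lock released while waiting:", run_scenario(True))
```

In a C extension the usual way to release the GIL around a blocking call is 
the Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS macro pair; whether that is 
the right change for the zkpython binding is an assumption here, not something 
this sketch verifies.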


  was:
If add_auth is called with a callback and another command is executed 
immediately afterwards, the Python API can deadlock.
Example:

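import zookeeper

def deadlock(zh, rc):
    # add_auth completion callback; its body is irrelevant to the bug
    pass

def watcher(zh, type, state, path):
    if state == zookeeper.CONNECTED_STATE:
        zookeeper.add_auth(zh, 'test', 'test', deadlock)
        # synchronous call issued right after add_auth; this is where it hangs
        zookeeper.get_children(zh, '/')

zh = zookeeper.init("np801c3n10:9221,np801c3n10:9331,np801c3n10:9441", watcher)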

Looking at the code, the problem appears to be the following:
The synchronous get_children call runs on the main thread, which holds the GIL 
and blocks until get_children has finished. Meanwhile, on the other thread, the 
add_auth callback is invoked, and that thread tries to acquire the GIL in order 
to call the Python callback. So this thread is waiting for the main thread to 
release the GIL, while the main thread is waiting for the other thread to 
process the reply to get_children.

I am not an expert on the Python binding, but I think this can be solved by 
releasing the GIL before synchronous C API calls.



> zkpython: add_auth can deadlock the interpreter
> -----------------------------------------------
>
>                 Key: ZOOKEEPER-1023
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1023
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: contrib-bindings
>    Affects Versions: 3.3.2
>            Reporter: Botond Hejj
>            Priority: Minor
>             Fix For: 3.4.0
>
>
> If add_auth is called with a callback and another command is executed 
> immediately afterwards, the Python API can deadlock.
> Example:
> import zookeeper
> def deadlock(zh, rc):
>     # add_auth completion callback; its body is irrelevant to the bug
>     pass
> def watcher(zh, type, state, path):
>     if state == zookeeper.CONNECTED_STATE:
>         zookeeper.add_auth(zh, 'test', 'test', deadlock)
>         # synchronous call issued right after add_auth; this is where it hangs
>         zookeeper.get_children(zh, '/')
> zh = zookeeper.init("host:port", watcher)
> Looking at the code, the problem appears to be the following:
> The synchronous get_children call runs on the main thread, which holds the 
> GIL and blocks until get_children has finished. Meanwhile, on the other 
> thread, the add_auth callback is invoked, and that thread tries to acquire 
> the GIL in order to call the Python callback. So this thread is waiting for 
> the main thread to release the GIL, while the main thread is waiting for the 
> other thread to process the reply to get_children.
> I am not an expert on the Python binding, but I think this can be solved by 
> releasing the GIL before synchronous C API calls.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
