-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53263/
-----------------------------------------------------------

Review request for Ambari and Vitalyi Brodetskyi.


Bugs: AMBARI-18728
    https://issues.apache.org/jira/browse/AMBARI-18728


Repository: ambari


Description
-------

This was caused by a very tricky race condition in the way Python's 
multiprocessing module works, resulting in a deadlock in the 
ambari_agent.ActionQueue thread.
The problem is the flow below. The deadlock occurs if all three of these 
steps happen at the same time (a very rare occasion):
1. Process1 executes queue.get(False)
2. Process2 executes queue.put(largeObjectWhichTakesLongTimeToPut)
3. Someone kills Process2.

This results in a deadlock in Process1's get, because the queue's 
locks/semaphores are not released when Process2 is killed during its put.

I wrote a script, test_race_condition.py, to emulate this behaviour; with it 
I could reproduce the deadlock and test the fix for it.
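
For readers without access to that script, the flow described above can be 
approximated with the sketch below. It is not test_race_condition.py and not 
the fix under review; it is a minimal illustration under several assumptions: 
Python 3, a Unix host where the "fork" start method is available, and the 
hypothetical names producer, consumer, reproduce and PAYLOAD_SIZE. To keep 
the outcome observable, the three steps are run one after another instead of 
truly simultaneously, and a watchdog timeout is used so the sketch itself 
never hangs.

```python
import multiprocessing
import queue as queue_module
import sys
import time

# Assumption: Unix host, so the "fork" start method exists.
ctx = multiprocessing.get_context("fork")

# Hypothetical payload size, chosen to be far larger than an OS pipe buffer
# so that the put cannot complete without a reader.
PAYLOAD_SIZE = 50 * 1024 * 1024


def producer(q):
    # "Process2": put() hands the object to a background feeder thread,
    # which stalls once the underlying pipe is full.
    q.put(b"x" * PAYLOAD_SIZE)


def consumer(q):
    # "Process1": the non-blocking get from the description.
    try:
        q.get(False)
    except queue_module.Empty:
        sys.exit(3)  # nothing was visible in the pipe
    sys.exit(0)      # a complete item was received


def reproduce(wait=2.0):
    """Return "hung", "empty" or "received" for one attempt."""
    q = ctx.Queue()

    p2 = ctx.Process(target=producer, args=(q,))
    p2.start()
    time.sleep(1.0)   # give the feeder thread time to start writing
    p2.terminate()    # kill Process2 in the middle of its put
    p2.join()

    p1 = ctx.Process(target=consumer, args=(q,))
    p1.start()
    p1.join(wait)     # watchdog: do not wait forever for Process1
    if p1.is_alive():
        p1.terminate()
        p1.join()
        return "hung"
    return "empty" if p1.exitcode == 3 else "received"


if __name__ == "__main__":
    print(reproduce())
```

On a typical Linux host I would expect this to report "hung", since the 
consumer sees a partially written item and waits for the rest of it, but the 
result depends on timing and platform, so treat it as an illustration only.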


Diffs
-----

  ambari-agent/src/main/python/ambari_agent/ActionQueue.py bf840e2 
  ambari-agent/src/main/python/ambari_agent/StatusCommandsExecutor.py 20acee4 

Diff: https://reviews.apache.org/r/53263/diff/


Testing
-------

mvn clean test


Thanks,

Andrew Onischuk
