[ 
https://issues.apache.org/jira/browse/AMBARI-18456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hurley updated AMBARI-18456:
-------------------------------------
    Fix Version/s: 2.5.0

> Refactor Unnecessary In-Memory Locks Around Business Objects
> ------------------------------------------------------------
>
>                 Key: AMBARI-18456
>                 URL: https://issues.apache.org/jira/browse/AMBARI-18456
>             Project: Ambari
>          Issue Type: Epic
>          Components: ambari-server
>    Affects Versions: 2.5.0
>            Reporter: Jonathan Hurley
>            Assignee: Jonathan Hurley
>              Labels: branch-feature-AMBARI-18456
>             Fix For: 2.5.0
>
>
> The top 4 business objects in Ambari are:
> - ClusterImpl
> - ServiceImpl
> - ServiceComponentImpl
> - ServiceComponentHostImpl
> All four use {{ReadWriteLock}} implementations to prevent dirty reads and 
> concurrent writes. However, {{ClusterImpl}} exposes a "global" 
> {{ReadWriteLock}} which the other business objects share. This causes 
> serious problems with deadlocks, especially on slow databases.
> Consider the case where you have 3 threads:
> # thread-1 acquires {{ClusterReadLock}}
> # thread-2 acquires {{ServiceComponentWriteLock}}
> # thread-3 tries to get {{ClusterWriteLock}} and is blocked by {{thread-1}}
> # thread-2 tries to get {{ClusterReadLock}} and is blocked by {{thread-3}}
> # thread-1 tries to get {{ServiceComponentReadLock}} and is blocked by 
> {{thread-2}}
> Essentially, exposing the "cluster global lock" causes problems because 
> multiple threads can hold other internal locks while they are blocked 
> waiting on the global lock.
> In general, I don't believe that the read locks help at all. Ambari usually 
> encounters these locks while trying to display web page information. Once 
> the page is displayed, the locks are released and the information is already 
> stale if there were write threads waiting.
> These locks should be investigated and, except in some cases involving 
> concurrent writes, mostly removed.
> Part of the problem is our assumption about how {{ReadWriteLock}} works. 
> The issue in the above scenario is that the {{ClusterWriteLock}} request is 
> pending. A pending write request blocks all subsequent readers even though 
> the lock is not fair.
> FYI, this code shows that a reader, in non-fair mode, will wait when there 
> is a waiting writer:
> {noformat:title=Output}
> Waiting for a read lock...
> Read lock acquired!
> Waiting for a write lock...
> Trying to acquire a second read lock...
> {noformat}
> {code}
> import java.util.concurrent.locks.ReentrantReadWriteLock;
> public class Test {
>   // false selects the non-fair ordering policy
>   private static ReentrantReadWriteLock lock = new ReentrantReadWriteLock(false);
>   public static void main(String[] args) throws InterruptedException {
>     // A reader which takes too long to finish
>     new Thread() {
>       @Override
>       public void run() {
>         System.out.println("Waiting for a read lock...");
>         lock.readLock().lock();
>         System.out.println("Read lock acquired!");
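>         try {
>           // Hold the read lock for an hour
>           Thread.sleep(1000 * 60 * 60);
>         } catch (InterruptedException e) {
>           // Restore the interrupt flag; the read lock is released below
>           Thread.currentThread().interrupt();
>         } finally {
>           lock.readLock().unlock();
>         }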
>       }
>     }.start();
>     Thread.sleep(3000);
>     // A writer which will be waiting
>     new Thread() {
>       @Override
>       public void run() {
>         System.out.println("Waiting for a write lock...");
>         lock.writeLock().lock();
>         try {
>           System.out.println("Write lock acquired!");
>         } finally {
>           lock.writeLock().unlock();
>         }
>       }
>     }.start();
>     Thread.sleep(3000);
>     // Another reader
>     new Thread() {
>       @Override
>       public void run() {
>         System.out.println("Trying to acquire a second read lock...");
>         lock.readLock().lock();
>         try {
>           System.out.println("Second read lock acquired successfully!!");
>         } finally {
>           lock.readLock().unlock();
>         }
>       }
>     }.start();
>   }
> }
> {code}
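
The demo above holds its read lock for an hour, so it never finishes on its own. Below is a minimal self-terminating sketch of the same experiment. The class name {{PendingWriterDemo}}, the helper {{readAcquired}}, and the 500 ms timeout are illustrative choices and are not Ambari code. It makes the extra read attempt with {{tryLock}} instead of {{lock()}} so that the result can be returned. The timed {{tryLock(long, TimeUnit)}} goes through the same queueing path as {{lock()}}, while the untimed {{tryLock()}} is documented to barge, so it serves as a contrast.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class PendingWriterDemo {

  /**
   * Holds a read lock in one thread, queues a writer behind it, then makes
   * one more read-lock attempt from the calling thread.
   *
   * @param timed true to use tryLock(500 ms), false to use untimed tryLock()
   * @return whether the extra read-lock attempt succeeded
   */
  static boolean readAcquired(boolean timed) {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(false); // non-fair
    final CountDownLatch readHeld = new CountDownLatch(1);
    final CountDownLatch releaseRead = new CountDownLatch(1);

    // First reader: keeps the read lock until told to release it
    Thread reader = new Thread(() -> {
      lock.readLock().lock();
      try {
        readHeld.countDown();
        releaseRead.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        lock.readLock().unlock();
      }
    });

    // Writer: blocks behind the first reader
    Thread writer = new Thread(() -> {
      lock.writeLock().lock();
      lock.writeLock().unlock();
    });

    try {
      reader.start();
      readHeld.await();
      writer.start();
      // Wait until the writer is queued on the lock and parked
      while (!lock.hasQueuedThread(writer)
          || writer.getState() != Thread.State.WAITING) {
        Thread.sleep(10);
      }

      // Second read attempt, made while the write request is pending
      boolean acquired = timed
          ? lock.readLock().tryLock(500, TimeUnit.MILLISECONDS)
          : lock.readLock().tryLock();
      if (acquired) {
        lock.readLock().unlock();
      }

      // Let the first reader go so the writer can finish
      releaseRead.countDown();
      reader.join();
      writer.join();
      return acquired;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("timed tryLock acquired:   " + readAcquired(true));
    System.out.println("untimed tryLock acquired: " + readAcquired(false));
  }
}
```

The timed attempt should report {{false}}, matching the blocked second reader in the output above, and the untimed attempt should report {{true}}. The write lock is never actually held while either attempt is made. Only the pending write request keeps the timed reader out.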



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
