Mahadev konar created AMBARI-16146:
--------------------------------------

             Summary: Hive View Synchronized Around Entire Connection Creation Causing Deadlock
                 Key: AMBARI-16146
                 URL: https://issues.apache.org/jira/browse/AMBARI-16146
             Project: Ambari
          Issue Type: Bug
    Affects Versions: ambari-2.4.0
            Reporter: Mahadev konar
            Assignee: Nitiraj Singh Rathore
            Priority: Critical
             Fix For: ambari-2.4.0


Hive View Synchronized Around Entire Connection Creation Causing Deadlock

The Hive view uses three {{synchronized}} methods when creating connections:

[ConnectionFactory|https://github.com/apache/ambari/blob/trunk/contrib/views/hive/src/main/java/org/apache/ambari/view/hive/client/ConnectionFactory.java#L54]
{code}
  public synchronized HdfsApi getHDFSApi() {
    if (hdfsApi == null) {
      try {
        hdfsApi = HdfsUtil.connectToHDFSApi(context);
      } catch (Exception ex) {
        throw new ServiceFormattedException("HdfsApi connection failed. Check \"webhdfs.url\" property", ex);
      }
    }
    return hdfsApi;
  }
{code}

[Connection|https://github.com/apache/ambari/blob/trunk/contrib/views/hive/src/main/java/org/apache/ambari/view/hive/client/Connection.java#L104]
{code}
  public synchronized void openConnection() throws HiveClientException, HiveAuthRequiredException {
    try {
      transport = isHttpTransportMode() ? createHttpTransport() : createBinaryTransport();
      transport.open();
      client = new TCLIService.Client(new TBinaryProtocol(transport));
    } catch (TTransportException e) {
      throw new HiveClientException("H020 Could not establish connection to "
          + host + ":" + port + ": " + e.toString(), e);
    } catch (SQLException e) {
      throw new HiveClientException(e.getMessage(), e);
    }
    LOG.info("Hive connection opened");
  }
{code}

[UserLocalConnection|https://github.com/apache/ambari/blob/trunk/contrib/views/hive/src/main/java/org/apache/ambari/view/hive/client/UserLocalConnection.java#L37]
{code}
  @Override
  protected synchronized Connection initialValue(ViewContext context) {
    ConnectionFactory hiveConnectionFactory = new ConnectionFactory(context, authCredentialsLocal.get(context));
    authCredentialsLocal.remove(context);  // we should not store credentials in memory,
                                          // password is erased after connection established
    return hiveConnectionFactory.create();
  }
{code}

The problem with this approach is that views must share the Jetty thread pool 
with the Ambari Server. When the Hive view is requested, several threads are 
spawned and each waits for a single connection to Hive. One thread enters the 
{{synchronized}} block and attempts to make the connections. All other threads 
are blocked, which means that Ambari's Jetty threads are blocked as well and 
are not able to answer requests.

Between opening connections to HDFS, Ambari, and Hive, these calls can easily 
take anywhere from several seconds to a minute to complete. During that time, 
no other requests can be fulfilled by Ambari on those threads. If several 
users are using Ambari, all available Jetty threads end up waiting for the 
sole Hive thread to complete its {{synchronized}} block.

*This essentially makes Ambari single-threaded*
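
The starvation described above can be reproduced outside Ambari. The following 
is a minimal sketch, not the Hive view's code: {{SharedPoolStarvationDemo}}, 
{{openConnection}}, and {{pingServedWhileConnecting}} are hypothetical names, a 
fixed-size executor stands in for the shared Jetty pool, and a latch stands in 
for the slow backend call. It assumes Java 8 for the lambda syntax.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a view whose connection setup is synchronized.
final class SharedPoolStarvationDemo {
  // Released when the (simulated) backend finally answers.
  private final CountDownLatch backendReplied = new CountDownLatch(1);

  // One caller holds the monitor while waiting on I/O; all others block on it.
  synchronized String openConnection() throws InterruptedException {
    backendReplied.await();
    return "connected";
  }

  // Returns true if an unrelated request completed while connection setup
  // was still pending on every pool thread.
  static boolean pingServedWhileConnecting(int poolSize) {
    SharedPoolStarvationDemo view = new SharedPoolStarvationDemo();
    ExecutorService sharedPool = Executors.newFixedThreadPool(poolSize);
    try {
      Callable<String> connect = view::openConnection;
      for (int i = 0; i < poolSize; i++) {
        sharedPool.submit(connect);  // each request occupies one pool thread
      }
      Callable<String> pingTask = () -> "pong";
      Future<String> ping = sharedPool.submit(pingTask);  // queued, no free thread
      boolean served = ping.isDone();
      view.backendReplied.countDown();  // backend answers, threads are freed
      return served;
    } finally {
      sharedPool.shutdown();  // already queued tasks still run to completion
    }
  }

  public static void main(String[] args) {
    // prints "ping served while connecting: false"
    System.out.println("ping served while connecting: " + pingServedWhileConnecting(3));
  }
}
```

Only one of the pool threads is doing any work; the rest are parked on the 
monitor, so the unrelated request stays queued until the backend call returns.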

AMBARI-16131 is a workaround to alleviate this problem by denying access to the 
view if there are already too many threads being held by various views. 

However, this problem also needs to be fixed in the Hive view. A new workflow 
based on callbacks and/or asynchronous returns with polling while waiting for 
the connection would avoid the need for these {{synchronized}} blocks.
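
One possible shape for such a workflow is sketched below. This is an 
illustration under stated assumptions, not the actual fix: 
{{AsyncConnectionHolder}}, {{connect}}, and {{poll}} are hypothetical names, 
connections are opened on a separate hypothetical pool rather than on Jetty 
threads, and {{CompletableFuture}} requires Java 8.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical holder: at most one connection attempt per user, opened on a
// dedicated pool so that request threads never wait on connection I/O.
final class AsyncConnectionHolder<C> {
  private final ConcurrentMap<String, CompletableFuture<C>> attempts = new ConcurrentHashMap<>();
  private final ExecutorService connectPool;
  private final Function<String, C> opener;  // the slow "open a connection" call

  AsyncConnectionHolder(ExecutorService connectPool, Function<String, C> opener) {
    this.connectPool = connectPool;
    this.opener = opener;
  }

  // Starts connecting if no attempt exists yet and returns immediately.
  // Callers can attach callbacks to the returned future.
  CompletableFuture<C> connect(String user) {
    return attempts.computeIfAbsent(user,
        u -> CompletableFuture.supplyAsync(() -> opener.apply(u), connectPool));
  }

  // Polling variant: returns the connection if it is ready, or null while it
  // is still being opened. Throws CompletionException if the attempt failed.
  C poll(String user) {
    CompletableFuture<C> attempt = connect(user);
    if (attempt.isCompletedExceptionally()) {
      attempts.remove(user, attempt);  // drop the failure so a later call retries
    }
    return attempt.getNow(null);
  }

  public static void main(String[] args) {
    ExecutorService connectPool = Executors.newFixedThreadPool(2);
    AsyncConnectionHolder<String> holder = new AsyncConnectionHolder<>(connectPool, user -> {
      try {
        Thread.sleep(200);  // simulated slow connection setup
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      return "connection-for-" + user;
    });
    System.out.println("first poll: " + holder.poll("alice"));
    System.out.println("when ready: " + holder.connect("alice").join());
    connectPool.shutdown();
  }
}
```

In this sketch a REST handler would call {{poll}} and answer "still 
connecting" when it returns null, letting the client retry, instead of holding 
a Jetty thread for the duration of the connection setup.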

Here's an example of a thread dump showing the problem:

This thread is stuck inside the {{synchronized}} block, trying to make a 
connection back to Ambari:
{code}
"qtp-ambari-client-117" prio=10 tid=0x00007efbbc029800 nid=0x135e runnable [0x00007efb929e5000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:152)
        at java.net.SocketInputStream.read(SocketInputStream.java:122)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
        - locked <0x000000077769e870> (a java.io.BufferedInputStream)
        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:690)
        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1325)
        - locked <0x0000000777692ff8> (a sun.net.www.protocol.http.HttpURLConnection)
        at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
        at org.apache.ambari.server.controller.internal.URLStreamProvider.processURL(URLStreamProvider.java:209)
        at org.apache.ambari.server.view.ViewAmbariStreamProvider.getInputStream(ViewAmbariStreamProvider.java:118)
        at org.apache.ambari.server.view.ViewAmbariStreamProvider.readFrom(ViewAmbariStreamProvider.java:78)
        at org.apache.ambari.view.utils.ambari.URLStreamProviderBasicAuth.readFrom(URLStreamProviderBasicAuth.java:65)
        at org.apache.ambari.view.utils.ambari.AmbariApi.requestClusterAPI(AmbariApi.java:173)
        at org.apache.ambari.view.utils.ambari.AmbariApi.requestClusterAPI(AmbariApi.java:142)
        at org.apache.ambari.view.utils.ambari.AmbariApi.getHostsWithComponent(AmbariApi.java:99)
        at org.apache.ambari.view.hive.client.ConnectionFactory.getHiveHost(ConnectionFactory.java:79)
        at org.apache.ambari.view.hive.client.ConnectionFactory.create(ConnectionFactory.java:68)
        at org.apache.ambari.view.hive.client.UserLocalConnection.initialValue(UserLocalConnection.java:42)
        - locked <0x0000000798772aa8> (a org.apache.ambari.view.hive.client.UserLocalConnection)
{code}

However, that request can't be answered, because all of the available Jetty 
threads are blocked waiting for the above thread to finish its 
{{synchronized}} block:
{code}
"qtp-ambari-client-118" prio=10 tid=0x00007efbbc02b000 nid=0x135f waiting for monitor entry [0x00007efb928e4000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.ambari.view.hive.client.UserLocalConnection.initialValue(UserLocalConnection.java:39)
        - waiting to lock <0x0000000798772aa8> (a org.apache.ambari.view.hive.client.UserLocalConnection)
        at org.apache.ambari.view.hive.client.UserLocalConnection.initialValue(UserLocalConnection.java:26)
        at org.apache.ambari.view.utils.UserLocal.get(UserLocal.java:66)
        at org.apache.ambari.view.hive.resources.browser.HiveBrowserService.databases(HiveBrowserService.java:87)

...

"qtp-ambari-client-25" prio=10 tid=0x00007efc1b235800 nid=0xaab waiting for monitor entry [0x00007efbfb7f7000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.ambari.view.hive.client.UserLocalConnection.initialValue(UserLocalConnection.java:39)
        - waiting to lock <0x0000000798772aa8> (a org.apache.ambari.view.hive.client.UserLocalConnection)
        at org.apache.ambari.view.hive.client.UserLocalConnection.initialValue(UserLocalConnection.java:26)
        at org.apache.ambari.view.utils.UserLocal.get(UserLocal.java:66)
        at org.apache.ambari.view.hive.resources.browser.HiveBrowserService.databases(HiveBrowserService.java:87)

{code}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)