-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44597/
-----------------------------------------------------------

(Updated March 9, 2016, 5:28 p.m.)


Review request for Ambari, Alejandro Fernandez, Robert Levas, and Sid Wagle.


Changes
-------

Reviewboard is being annoying; making me type a comment to get this published.


Bugs: AMBARI-15363
    https://issues.apache.org/jira/browse/AMBARI-15363


Repository: ambari


Description
-------

When executing certain JPA queries, the Ambari Server appears to deadlock inside the 
c3p0 library. Two issues are responsible:

- c3p0 connection management problems (the c3p0 version is upgraded for better 
connection pool handling)
- Cartesian products generated by the EclipseLink JPA CriteriaBuilder (sorts for 
criteria queries are now built against the existing Root<?> instead of adding a 
duplicate one)

Illustrative sketches of both issues follow the logs and stack traces below.

```
com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@7675360a -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!
07 Mar 2016 18:11:01,108  WARN [Timer-0] ThreadPoolAsynchronousRunner:624 - com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@7675360a -- APPARENT DEADLOCK!!! Complete Status:
        Managed Threads: 3
        Active Threads: 1
        Active Tasks:
```
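For context, the "Managed Threads: 3" above is c3p0's helper-thread pool, which runs 
administrative tasks such as closing statements and acquiring connections; when those 
tasks back up, the DeadlockDetector fires. Below is a minimal sketch of the knobs that 
govern that pool. The values, URL, and credentials are placeholders, and the actual fix 
here is a c3p0 version upgrade, not a configuration change.

```java
import javax.sql.DataSource;

import com.mchange.v2.c3p0.ComboPooledDataSource;

public final class C3p0PoolSketch {

  /**
   * Builds a c3p0 pool with the helper-thread settings discussed above.
   * All values and connection details are illustrative placeholders.
   */
  public static DataSource createPool() throws Exception {
    ComboPooledDataSource ds = new ComboPooledDataSource();
    ds.setDriverClass("org.postgresql.Driver");               // PostgreSQL, as in the logs
    ds.setJdbcUrl("jdbc:postgresql://localhost:5432/ambari"); // placeholder URL
    ds.setUser("ambari");                                     // placeholder credentials
    ds.setPassword("bigdata");

    // "Managed Threads: 3" is the default helper-thread pool size; slow
    // administrative tasks (e.g. Statement.close()) can starve it and trip
    // the DeadlockDetector.
    ds.setNumHelperThreads(6);

    // Interrupt administrative tasks that hang longer than this many seconds.
    ds.setMaxAdministrativeTaskTime(30);

    return ds;
  }
}
```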

Looks like the problem is twofold. First, we have a thread performing a very 
heavy operation in the database:

```
"qtp-ambari-client-36" prio=5 tid=0x00007fd6b344f000 nid=0x8203 runnable 
[0x0000700002b3e000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.Thread.currentThread(Native Method)
        at java.lang.ThreadLocal.get(ThreadLocal.java:143)
        at java.lang.StringCoding.deref(StringCoding.java:63)
        at java.lang.StringCoding.decode(StringCoding.java:179)
        at java.lang.String.<init>(String.java:416)
        at org.postgresql.core.Encoding.decode(Encoding.java:191)
        at org.postgresql.core.Encoding.decode(Encoding.java:203)
        at org.postgresql.jdbc2.AbstractJdbc2ResultSet.getString(AbstractJdbc2ResultSet.java:1979)
        at com.mchange.v2.c3p0.impl.NewProxyResultSet.getString(NewProxyResultSet.java:3316)
    ...
        at org.apache.ambari.server.orm.dao.DaoUtils.selectList(DaoUtils.java:62)
```

And then we see this:

```
Internal Exception: org.postgresql.util.PSQLException: Ran out of memory retrieving query results.
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
Mar 08, 2016 9:57:53 PM com.sun.jersey.spi.container.ContainerResponse mapMappableContainerException
SEVERE: The exception contained within MappableContainerException could not be mapped to a response, re-throwing to the HTTP container
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:149)
        at java.lang.StringCoding.decode(StringCoding.java:193)
        at java.lang.String.<init>(String.java:416)
        at org.postgresql.core.Encoding.decode(Encoding.java:191)
        at org.postgresql.core.Encoding.decode(Encoding.java:203)
        at org.postgresql.jdbc2.AbstractJdbc2ResultSet.getString(AbstractJdbc2ResultSet.java:1979)
        at com.mchange.v2.c3p0.impl.NewProxyResultSet.getString(NewProxyResultSet.java:3316)
    ...
        at org.eclipse.persistence.internal.jpa.QueryImpl.executeReadQuery(QueryImpl.java:258)
```
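The out-of-memory error is consistent with the cartesian product: a duplicate Root<?> in 
a CriteriaQuery adds a second FROM entry, so the result set balloons to roughly N x N 
rows. The sketch below shows the anti-pattern and the corrected version; the generic 
helper, entity class, and field name are hypothetical and only illustrate the shape of 
the fix, not the actual Ambari DAO code.

```java
import java.util.List;

import javax.persistence.EntityManager;
import javax.persistence.criteria.CriteriaBuilder;
import javax.persistence.criteria.CriteriaQuery;
import javax.persistence.criteria.Root;

public final class SortedQuerySketch {

  /**
   * Selects all rows of the given entity, sorted ascending on sortField.
   * Hypothetical helper; entityClass and sortField are caller-supplied.
   */
  public static <T> List<T> findAllSorted(EntityManager em, Class<T> entityClass,
      String sortField) {
    CriteriaBuilder cb = em.getCriteriaBuilder();
    CriteriaQuery<T> query = cb.createQuery(entityClass);

    Root<T> root = query.from(entityClass);
    query.select(root);

    // BROKEN: calling query.from(...) again just to build the ORDER BY adds a
    // second Root to the FROM clause, so EclipseLink generates a cartesian
    // product (every row joined against every other row):
    //
    //   Root<T> sortRoot = query.from(entityClass);
    //   query.orderBy(cb.asc(sortRoot.get(sortField)));

    // FIXED: reuse the Root the query already has when building the sort.
    query.orderBy(cb.asc(root.get(sortField)));

    return em.createQuery(query).getResultList();
  }
}
```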


Diffs
-----

  ambari-agent/conf/unix/ambari-agent.ini 05e898a 
  ambari-agent/conf/windows/ambari-agent.ini e490f7c 
  ambari-agent/src/main/python/ambari_agent/AlertSchedulerHandler.py eb9945b 
  ambari-agent/src/main/python/ambari_agent/Controller.py eb2c363 
  ambari-agent/src/main/python/ambari_agent/alerts/base_alert.py fd6b03c 
  ambari-agent/src/main/python/ambari_agent/alerts/metric_alert.py b2f4e33 
  ambari-agent/src/main/python/ambari_agent/alerts/port_alert.py 92d28ad 
  ambari-agent/src/main/python/ambari_agent/alerts/recovery_alert.py 760a737 
  ambari-agent/src/main/python/ambari_agent/alerts/script_alert.py e8d0125 
  ambari-agent/src/main/python/ambari_agent/alerts/web_alert.py 502526c 
  ambari-agent/src/test/python/ambari_agent/TestAlertSchedulerHandler.py 9fd426f 
  ambari-agent/src/test/python/ambari_agent/TestAlerts.py 8344238 
  ambari-agent/src/test/python/ambari_agent/TestBaseAlert.py e67c894 
  ambari-agent/src/test/python/ambari_agent/TestMetricAlert.py 23e9f13 
  ambari-agent/src/test/python/ambari_agent/TestPortAlert.py 195cc63 
  ambari-agent/src/test/python/ambari_agent/TestScriptAlert.py 46c7651 
  ambari-common/src/main/python/resource_management/libraries/functions/curl_krb_request.py 1ccc45f 
  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/alerts/alert_checkpoint_time.py ef389cd 
  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/alerts/alert_ha_namenode_health.py a174cb4 
  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/alerts/alert_metrics_deviation.py 217f3b8 
  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/alerts/alert_upgrade_finalized.py 6e8945c 
  ambari-server/src/main/resources/common-services/HIVE/0.12.0.2.0/package/alerts/alert_webhcat_server.py b49fd6e 
  ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/alerts/alert_nodemanager_health.py ef5e6b3 
  ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/alerts/alert_nodemanagers_summary.py 119a1a1 

Diff: https://reviews.apache.org/r/44597/diff/


Testing (updated)
-------

Executed the problematic queries against a very large database. The problem is resolved 
in the test environment.

Unit Tests Pending


Thanks,

Jonathan Hurley
