Jonathan Hurley created AMBARI-15363: ----------------------------------------
Summary: Postgres And c3p0 Queries Can Hang Ambari On Large Queries Key: AMBARI-15363 URL: https://issues.apache.org/jira/browse/AMBARI-15363 Project: Ambari Issue Type: Bug Components: ambari-server Affects Versions: 2.1.0 Reporter: Jonathan Hurley Assignee: Jonathan Hurley Priority: Critical Fix For: 2.2.2 When executing certain JPA queries, the Ambari Server seems to deadlock in the c3p0 library. {code} 07 Mar 2016 18:03:01,313 ERROR [qtp-ambari-client-12980] BaseProvider:142 - Timed out waiting for metrics. 07 Mar 2016 18:04:36,314 ERROR [qtp-ambari-client-12988] BaseProvider:142 - Timed out waiting for metrics. 07 Mar 2016 18:07:13,653 ERROR [qtp-ambari-client-12990] BaseProvider:142 - Timed out waiting for metrics. 07 Mar 2016 18:07:27,287 ERROR [qtp-ambari-client-12990] BaseProvider:142 - Timed out waiting for metrics. 07 Mar 2016 18:09:13,789 WARN [Timer-0] ThreadPoolAsynchronousRunner:608 - com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@7675360a -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks! 07 Mar 2016 18:11:01,108 WARN [Timer-0] ThreadPoolAsynchronousRunner:624 - com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@7675360a -- APPARENT DEADLOCK!!! Complete Status: Managed Threads: 3 Active Threads: 1 Active Tasks: com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask@207cca8d (com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#1) Pending Tasks: com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask@7d3ee004 com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask@3ac3468f Pool thread stack traces: Thread[com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#2,5,main] java.lang.Object.wait(Native Method) com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:534) Thread[com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#0,5,main] java.lang.Object.wait(Native Method) com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:534) Thread[com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#1,5,main] com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:560) {code} Looks like the problem is twofold. First, we have a thread performing a very heavy operation in the database: {code} "qtp-ambari-client-36" prio=5 tid=0x00007fd6b344f000 nid=0x8203 runnable [0x0000700002b3e000] java.lang.Thread.State: RUNNABLE at java.lang.Thread.currentThread(Native Method) at java.lang.ThreadLocal.get(ThreadLocal.java:143) at java.lang.StringCoding.deref(StringCoding.java:63) at java.lang.StringCoding.decode(StringCoding.java:179) at java.lang.String.<init>(String.java:416) at org.postgresql.core.Encoding.decode(Encoding.java:191) at org.postgresql.core.Encoding.decode(Encoding.java:203) at org.postgresql.jdbc2.AbstractJdbc2ResultSet.getString(AbstractJdbc2ResultSet.java:1979) at com.mchange.v2.c3p0.impl.NewProxyResultSet.getString(NewProxyResultSet.java:3316) at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.getObjectThroughOptimizedDataConversion(DatabaseAccessor.java:1345) at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.getObject(DatabaseAccessor.java:1264) at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.fetchRow(DatabaseAccessor.java:1075) at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.processResultSet(DatabaseAccessor.java:768) at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.basicExecuteCall(DatabaseAccessor.java:655) at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeCall(DatabaseAccessor.java:558) at org.eclipse.persistence.internal.sessions.AbstractSession.basicExecuteCall(AbstractSession.java:2002) at org.eclipse.persistence.sessions.server.ServerSession.executeCall(ServerSession.java:570) at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeCall(DatasourceCallQueryMechanism.java:242) at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeCall(DatasourceCallQueryMechanism.java:228) at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeSelectCall(DatasourceCallQueryMechanism.java:299) at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.selectAllRows(DatasourceCallQueryMechanism.java:694) at org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.selectAllRowsFromTable(ExpressionQueryMechanism.java:2738) at org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.selectAllRows(ExpressionQueryMechanism.java:2691) at org.eclipse.persistence.queries.ReadAllQuery.executeObjectLevelReadQuery(ReadAllQuery.java:495) at org.eclipse.persistence.queries.ObjectLevelReadQuery.executeDatabaseQuery(ObjectLevelReadQuery.java:1168) at org.eclipse.persistence.queries.DatabaseQuery.execute(DatabaseQuery.java:899) at org.eclipse.persistence.queries.ObjectLevelReadQuery.execute(ObjectLevelReadQuery.java:1127) at org.eclipse.persistence.queries.ReadAllQuery.execute(ReadAllQuery.java:403) at org.eclipse.persistence.queries.ObjectLevelReadQuery.executeInUnitOfWork(ObjectLevelReadQuery.java:1215) at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.internalExecuteQuery(UnitOfWorkImpl.java:2896) at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1804) at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1786) at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1751) at org.eclipse.persistence.internal.jpa.QueryImpl.executeReadQuery(QueryImpl.java:258) at org.eclipse.persistence.internal.jpa.QueryImpl.getResultList(QueryImpl.java:469) at org.apache.ambari.server.orm.dao.DaoUtils.selectList(DaoUtils.java:62) {code} And then we see this: {code} Internal Exception: org.postgresql.util.PSQLException: Ran out of memory retrieving query results. Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded Mar 08, 2016 9:57:53 PM com.sun.jersey.spi.container.ContainerResponse mapMappableContainerException SEVERE: The exception contained within MappableContainerException could not be mapped to a response, re-throwing to the HTTP container java.lang.OutOfMemoryError: GC overhead limit exceeded at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:149) at java.lang.StringCoding.decode(StringCoding.java:193) at java.lang.String.<init>(String.java:416) at org.postgresql.core.Encoding.decode(Encoding.java:191) at org.postgresql.core.Encoding.decode(Encoding.java:203) at org.postgresql.jdbc2.AbstractJdbc2ResultSet.getString(AbstractJdbc2ResultSet.java:1979) at com.mchange.v2.c3p0.impl.NewProxyResultSet.getString(NewProxyResultSet.java:3316) at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.getObjectThroughOptimizedDataConversion(DatabaseAccessor.java:1345) at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.getObject(DatabaseAccessor.java:1264) at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.fetchRow(DatabaseAccessor.java:1075) at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.processResultSet(DatabaseAccessor.java:768) at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.basicExecuteCall(DatabaseAccessor.java:655) at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeCall(DatabaseAccessor.java:558) at org.eclipse.persistence.internal.sessions.AbstractSession.basicExecuteCall(AbstractSession.java:2002) at org.eclipse.persistence.sessions.server.ServerSession.executeCall(ServerSession.java:570) at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeCall(DatasourceCallQueryMechanism.java:242) at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeCall(DatasourceCallQueryMechanism.java:228) at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeSelectCall(DatasourceCallQueryMechanism.java:299) at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.selectAllRows(DatasourceCallQueryMechanism.java:694) at org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.selectAllRowsFromTable(ExpressionQueryMechanism.java:2738) at org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.selectAllRows(ExpressionQueryMechanism.java:2691) at org.eclipse.persistence.queries.ReadAllQuery.executeObjectLevelReadQuery(ReadAllQuery.java:495) at org.eclipse.persistence.queries.ObjectLevelReadQuery.executeDatabaseQuery(ObjectLevelReadQuery.java:1168) at org.eclipse.persistence.queries.DatabaseQuery.execute(DatabaseQuery.java:899) at org.eclipse.persistence.queries.ObjectLevelReadQuery.execute(ObjectLevelReadQuery.java:1127) at org.eclipse.persistence.queries.ReadAllQuery.execute(ReadAllQuery.java:403) at org.eclipse.persistence.queries.ObjectLevelReadQuery.executeInUnitOfWork(ObjectLevelReadQuery.java:1215) at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.internalExecuteQuery(UnitOfWorkImpl.java:2896) at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1804) at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1786) at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1751) at org.eclipse.persistence.internal.jpa.QueryImpl.executeReadQuery(QueryImpl.java:258) {code} The deadlock could be caused by how c3p0 is working with the connection pool. It could also be something specific to how it's generating the native query for Postgres. Investigating some various solutions to this... -- This message was sent by Atlassian JIRA (v6.3.4#6332)