Stamatis Zampetakis created HIVE-28839:
------------------------------------------

             Summary: Connection starvation in HMS if datanucleus value 
generation fails
                 Key: HIVE-28839
                 URL: https://issues.apache.org/jira/browse/HIVE-28839
             Project: Hive
          Issue Type: Bug
    Affects Versions: 4.0.1
            Reporter: Stamatis Zampetakis


If the Datanucleus value generation logic (e.g., TableGenerator) fails when its 
about to release the managed connection (specifically during commit) the 
underlying connection is not closed which can cause various problems in 
subsequent operations like connection starvation, deadlock, and lock timeout 
errors.

The stacktrace below demonstrates the origin of the leak that was triggered by 
a database deadlock. 

{noformat}
org.datanucleus.exceptions.NucleusDataStoreException: Deadlock found when 
trying to get lock; try restarting transaction
        at 
org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.release(ConnectionFactoryImpl.java:362)
        at 
org.datanucleus.store.rdbms.RDBMSStoreManager$1.releaseConnection(RDBMSStoreManager.java:2022)
        at 
org.datanucleus.store.rdbms.valuegenerator.TableGenerator.obtainGenerationBlock(TableGenerator.java:404)
        at 
org.datanucleus.store.valuegenerator.AbstractGenerator.obtainGenerationBlock(AbstractGenerator.java:166)
        at 
org.datanucleus.store.valuegenerator.AbstractGenerator.next(AbstractGenerator.java:75)
        at 
org.datanucleus.store.rdbms.RDBMSStoreManager.getNextValueForValueGenerator(RDBMSStoreManager.java:2035)
        at 
org.datanucleus.store.AbstractStoreManager.getValueGenerationStrategyValue(AbstractStoreManager.java:859)
        at 
org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl.java:3864)
        at 
org.datanucleus.state.StateManagerImpl.setIdentity(StateManagerImpl.java:3529)
        at 
org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:559)
        at 
org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:126)
        at 
org.datanucleus.state.ObjectProviderFactoryImpl.newForPersistentNew(ObjectProviderFactoryImpl.java:205)
        at 
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2025)
        at 
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2118)
        at 
org.datanucleus.store.rdbms.mapping.java.PersistableMapping.setObjectAsValue(PersistableMapping.java:635)
        at 
org.datanucleus.store.rdbms.mapping.java.PersistableMapping.setObject(PersistableMapping.java:386)
        at 
org.datanucleus.store.rdbms.fieldmanager.ParameterSetter.storeObjectField(ParameterSetter.java:191)
        at 
org.datanucleus.state.StateManagerImpl.providedObjectField(StateManagerImpl.java:1871)
        at 
org.apache.hadoop.hive.metastore.model.MTable.dnProvideField(MTable.java)
        at 
org.apache.hadoop.hive.metastore.model.MTable.dnProvideFields(MTable.java)
        at 
org.datanucleus.state.StateManagerImpl.provideFields(StateManagerImpl.java:2534)
        at 
org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:358)
        at 
org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObjectInTable(RDBMSPersistenceHandler.java:163)
        at 
org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:139)
        at 
org.datanucleus.state.StateManagerImpl.internalMakePersistent(StateManagerImpl.java:4601)
        at 
org.datanucleus.state.StateManagerImpl.makePersistent(StateManagerImpl.java:4578)
        at 
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2026)
        at 
org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:1869)
        at 
org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1724)
        at 
org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:222)
        at 
org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:715)
        at 
org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:740)
        at 
org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:1178)
        at sun.reflect.GeneratedMethodAccessor194.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
        at com.sun.proxy.$Proxy27.createTable(Unknown Source)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:2288)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_req(HiveMetaStore.java:2494)
        at sun.reflect.GeneratedMethodAccessor193.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
        at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
        at com.sun.proxy.$Proxy28.create_table_req(Unknown Source)
        at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_req.getResult(ThriftHiveMetastore.java:18167)
        at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_req.getResult(ThriftHiveMetastore.java:18146)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
        at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:652)
        at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:647)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1910)
        at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:647)
        at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: 
Deadlock found when trying to get lock; try restarting transaction
        at sun.reflect.GeneratedConstructorAccessor155.newInstance(Unknown 
Source)
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:403)
        at com.mysql.jdbc.Util.getInstance(Util.java:386)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:952)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3933)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3869)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2524)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2675)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2465)
        at com.mysql.jdbc.ConnectionImpl.commit(ConnectionImpl.java:1545)
        at 
com.zaxxer.hikari.pool.ProxyConnection.commit(ProxyConnection.java:353)
        at 
com.zaxxer.hikari.pool.HikariProxyConnection.commit(HikariProxyConnection.java)
        at 
org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.release(ConnectionFactoryImpl.java:357)
        ... 59 more
Nested Throwables StackTrace:
com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock 
found when trying to get lock; try restarting transaction
        at sun.reflect.GeneratedConstructorAccessor155.newInstance(Unknown 
Source)
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:403)
        at com.mysql.jdbc.Util.getInstance(Util.java:386)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:952)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3933)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3869)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2524)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2675)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2465)
        at com.mysql.jdbc.ConnectionImpl.commit(ConnectionImpl.java:1545)
        at 
com.zaxxer.hikari.pool.ProxyConnection.commit(ProxyConnection.java:353)
        at 
com.zaxxer.hikari.pool.HikariProxyConnection.commit(HikariProxyConnection.java)
        at 
org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.release(ConnectionFactoryImpl.java:357)
        at 
org.datanucleus.store.rdbms.RDBMSStoreManager$1.releaseConnection(RDBMSStoreManager.java:2022)
        at 
org.datanucleus.store.rdbms.valuegenerator.TableGenerator.obtainGenerationBlock(TableGenerator.java:404)
        at 
org.datanucleus.store.valuegenerator.AbstractGenerator.obtainGenerationBlock(AbstractGenerator.java:166)
        at 
org.datanucleus.store.valuegenerator.AbstractGenerator.next(AbstractGenerator.java:75)
        at 
org.datanucleus.store.rdbms.RDBMSStoreManager.getNextValueForValueGenerator(RDBMSStoreManager.java:2035)
        at 
org.datanucleus.store.AbstractStoreManager.getValueGenerationStrategyValue(AbstractStoreManager.java:859)
        at 
org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl.java:3864)
        at 
org.datanucleus.state.StateManagerImpl.setIdentity(StateManagerImpl.java:3529)
        at 
org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:559)
        at 
org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:126)
        at 
org.datanucleus.state.ObjectProviderFactoryImpl.newForPersistentNew(ObjectProviderFactoryImpl.java:205)
        at 
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2025)
        at 
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2118)
        at 
org.datanucleus.store.rdbms.mapping.java.PersistableMapping.setObjectAsValue(PersistableMapping.java:635)
        at 
org.datanucleus.store.rdbms.mapping.java.PersistableMapping.setObject(PersistableMapping.java:386)
        at 
org.datanucleus.store.rdbms.fieldmanager.ParameterSetter.storeObjectField(ParameterSetter.java:191)
        at 
org.datanucleus.state.StateManagerImpl.providedObjectField(StateManagerImpl.java:1871)
        at 
org.apache.hadoop.hive.metastore.model.MTable.dnProvideField(MTable.java)
        at 
org.apache.hadoop.hive.metastore.model.MTable.dnProvideFields(MTable.java)
        at 
org.datanucleus.state.StateManagerImpl.provideFields(StateManagerImpl.java:2534)
        at 
org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:358)
        at 
org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObjectInTable(RDBMSPersistenceHandler.java:163)
        at 
org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:139)
        at 
org.datanucleus.state.StateManagerImpl.internalMakePersistent(StateManagerImpl.java:4601)
        at 
org.datanucleus.state.StateManagerImpl.makePersistent(StateManagerImpl.java:4578)
        at 
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2026)
        at 
org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:1869)
        at 
org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1724)
        at 
org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:222)
        at 
org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:715)
        at 
org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:740)
        at 
org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:1178)
        at sun.reflect.GeneratedMethodAccessor194.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
        at com.sun.proxy.$Proxy27.createTable(Unknown Source)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:2288)
        at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_req(HiveMetaStore.java:2494)
        at sun.reflect.GeneratedMethodAccessor193.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
        at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
        at com.sun.proxy.$Proxy28.create_table_req(Unknown Source)
        at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_req.getResult(ThriftHiveMetastore.java:18167)
        at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_req.getResult(ThriftHiveMetastore.java:18146)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
        at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:652)
        at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:647)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1910)
        at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:647)
        at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)

{noformat}

The leak caused connection starvation and eventually production down for the 
Hivemetastore since the connections in the Hikaricp objectstore-secondary 
thread pool were exhausted. Since the size of the objectstore-secondary pool is 
hardcoded to 2 it suffices to have two errors like the one above to bring down 
HMS.

After a while all threads are stuck waiting to obtain a connection with a 
stacktrace similar to the one below:

{noformat}
java.sql.SQLTransientConnectionException: objectstore-secondary - Connection is 
not available, request timed out after 500ms.
        at 
com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:696)
        at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:197)
        at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:162)
        at 
com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:100)
        ...
{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to