Stamatis Zampetakis created HIVE-28839:
------------------------------------------
Summary: Connection starvation in HMS if datanucleus value
generation fails
Key: HIVE-28839
URL: https://issues.apache.org/jira/browse/HIVE-28839
Project: Hive
Issue Type: Bug
Affects Versions: 4.0.1
Reporter: Stamatis Zampetakis
If the Datanucleus value generation logic (e.g., TableGenerator) fails when its
about to release the managed connection (specifically during commit) the
underlying connection is not closed which can cause various problems in
subsequent operations like connection starvation, deadlock, and lock timeout
errors.
The stacktrace below demonstrates the origin of the leak that was triggered by
a database deadlock.
{noformat}
org.datanucleus.exceptions.NucleusDataStoreException: Deadlock found when
trying to get lock; try restarting transaction
at
org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.release(ConnectionFactoryImpl.java:362)
at
org.datanucleus.store.rdbms.RDBMSStoreManager$1.releaseConnection(RDBMSStoreManager.java:2022)
at
org.datanucleus.store.rdbms.valuegenerator.TableGenerator.obtainGenerationBlock(TableGenerator.java:404)
at
org.datanucleus.store.valuegenerator.AbstractGenerator.obtainGenerationBlock(AbstractGenerator.java:166)
at
org.datanucleus.store.valuegenerator.AbstractGenerator.next(AbstractGenerator.java:75)
at
org.datanucleus.store.rdbms.RDBMSStoreManager.getNextValueForValueGenerator(RDBMSStoreManager.java:2035)
at
org.datanucleus.store.AbstractStoreManager.getValueGenerationStrategyValue(AbstractStoreManager.java:859)
at
org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl.java:3864)
at
org.datanucleus.state.StateManagerImpl.setIdentity(StateManagerImpl.java:3529)
at
org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:559)
at
org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:126)
at
org.datanucleus.state.ObjectProviderFactoryImpl.newForPersistentNew(ObjectProviderFactoryImpl.java:205)
at
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2025)
at
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2118)
at
org.datanucleus.store.rdbms.mapping.java.PersistableMapping.setObjectAsValue(PersistableMapping.java:635)
at
org.datanucleus.store.rdbms.mapping.java.PersistableMapping.setObject(PersistableMapping.java:386)
at
org.datanucleus.store.rdbms.fieldmanager.ParameterSetter.storeObjectField(ParameterSetter.java:191)
at
org.datanucleus.state.StateManagerImpl.providedObjectField(StateManagerImpl.java:1871)
at
org.apache.hadoop.hive.metastore.model.MTable.dnProvideField(MTable.java)
at
org.apache.hadoop.hive.metastore.model.MTable.dnProvideFields(MTable.java)
at
org.datanucleus.state.StateManagerImpl.provideFields(StateManagerImpl.java:2534)
at
org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:358)
at
org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObjectInTable(RDBMSPersistenceHandler.java:163)
at
org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:139)
at
org.datanucleus.state.StateManagerImpl.internalMakePersistent(StateManagerImpl.java:4601)
at
org.datanucleus.state.StateManagerImpl.makePersistent(StateManagerImpl.java:4578)
at
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2026)
at
org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:1869)
at
org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1724)
at
org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:222)
at
org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:715)
at
org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:740)
at
org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:1178)
at sun.reflect.GeneratedMethodAccessor194.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
at com.sun.proxy.$Proxy27.createTable(Unknown Source)
at
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:2288)
at
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_req(HiveMetaStore.java:2494)
at sun.reflect.GeneratedMethodAccessor193.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
at
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
at com.sun.proxy.$Proxy28.create_table_req(Unknown Source)
at
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_req.getResult(ThriftHiveMetastore.java:18167)
at
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_req.getResult(ThriftHiveMetastore.java:18146)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
at
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:652)
at
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:647)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1910)
at
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:647)
at
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException:
Deadlock found when trying to get lock; try restarting transaction
at sun.reflect.GeneratedConstructorAccessor155.newInstance(Unknown
Source)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:403)
at com.mysql.jdbc.Util.getInstance(Util.java:386)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:952)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3933)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3869)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2524)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2675)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2465)
at com.mysql.jdbc.ConnectionImpl.commit(ConnectionImpl.java:1545)
at
com.zaxxer.hikari.pool.ProxyConnection.commit(ProxyConnection.java:353)
at
com.zaxxer.hikari.pool.HikariProxyConnection.commit(HikariProxyConnection.java)
at
org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.release(ConnectionFactoryImpl.java:357)
... 59 more
Nested Throwables StackTrace:
com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock
found when trying to get lock; try restarting transaction
at sun.reflect.GeneratedConstructorAccessor155.newInstance(Unknown
Source)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:403)
at com.mysql.jdbc.Util.getInstance(Util.java:386)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:952)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3933)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3869)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2524)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2675)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2465)
at com.mysql.jdbc.ConnectionImpl.commit(ConnectionImpl.java:1545)
at
com.zaxxer.hikari.pool.ProxyConnection.commit(ProxyConnection.java:353)
at
com.zaxxer.hikari.pool.HikariProxyConnection.commit(HikariProxyConnection.java)
at
org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.release(ConnectionFactoryImpl.java:357)
at
org.datanucleus.store.rdbms.RDBMSStoreManager$1.releaseConnection(RDBMSStoreManager.java:2022)
at
org.datanucleus.store.rdbms.valuegenerator.TableGenerator.obtainGenerationBlock(TableGenerator.java:404)
at
org.datanucleus.store.valuegenerator.AbstractGenerator.obtainGenerationBlock(AbstractGenerator.java:166)
at
org.datanucleus.store.valuegenerator.AbstractGenerator.next(AbstractGenerator.java:75)
at
org.datanucleus.store.rdbms.RDBMSStoreManager.getNextValueForValueGenerator(RDBMSStoreManager.java:2035)
at
org.datanucleus.store.AbstractStoreManager.getValueGenerationStrategyValue(AbstractStoreManager.java:859)
at
org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl.java:3864)
at
org.datanucleus.state.StateManagerImpl.setIdentity(StateManagerImpl.java:3529)
at
org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:559)
at
org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:126)
at
org.datanucleus.state.ObjectProviderFactoryImpl.newForPersistentNew(ObjectProviderFactoryImpl.java:205)
at
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2025)
at
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2118)
at
org.datanucleus.store.rdbms.mapping.java.PersistableMapping.setObjectAsValue(PersistableMapping.java:635)
at
org.datanucleus.store.rdbms.mapping.java.PersistableMapping.setObject(PersistableMapping.java:386)
at
org.datanucleus.store.rdbms.fieldmanager.ParameterSetter.storeObjectField(ParameterSetter.java:191)
at
org.datanucleus.state.StateManagerImpl.providedObjectField(StateManagerImpl.java:1871)
at
org.apache.hadoop.hive.metastore.model.MTable.dnProvideField(MTable.java)
at
org.apache.hadoop.hive.metastore.model.MTable.dnProvideFields(MTable.java)
at
org.datanucleus.state.StateManagerImpl.provideFields(StateManagerImpl.java:2534)
at
org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:358)
at
org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObjectInTable(RDBMSPersistenceHandler.java:163)
at
org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:139)
at
org.datanucleus.state.StateManagerImpl.internalMakePersistent(StateManagerImpl.java:4601)
at
org.datanucleus.state.StateManagerImpl.makePersistent(StateManagerImpl.java:4578)
at
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2026)
at
org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:1869)
at
org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1724)
at
org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:222)
at
org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:715)
at
org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:740)
at
org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:1178)
at sun.reflect.GeneratedMethodAccessor194.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
at com.sun.proxy.$Proxy27.createTable(Unknown Source)
at
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:2288)
at
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_req(HiveMetaStore.java:2494)
at sun.reflect.GeneratedMethodAccessor193.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
at
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
at com.sun.proxy.$Proxy28.create_table_req(Unknown Source)
at
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_req.getResult(ThriftHiveMetastore.java:18167)
at
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_req.getResult(ThriftHiveMetastore.java:18146)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
at
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:652)
at
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:647)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1910)
at
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:647)
at
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:250)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
{noformat}
The leak caused connection starvation and eventually production down for the
Hivemetastore since the connections in the Hikaricp objectstore-secondary
thread pool were exhausted. Since the size of the objectstore-secondary pool is
hardcoded to 2 it suffices to have two errors like the one above to bring down
HMS.
After a while all threads are stuck waiting to obtain a connection with a
stacktrace similar to the one below:
{noformat}
java.sql.SQLTransientConnectionException: objectstore-secondary - Connection is
not available, request timed out after 500ms.
at
com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:696)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:197)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:162)
at
com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:100)
...
{noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)