Hi Soam, Can you please provide the value specified for "datanucleus.connectionPool.maxIdle"?
~Rajesh.B On Wed, Jun 11, 2014 at 2:26 AM, Soam Acharya <s...@altiscale.com> wrote: > Hi Vaibhav, > > good question. We're using 0.8.0 RELEASE. Would 0.7.1 be preferable > instead? > > Thanks! > > Soam > > > On Tue, Jun 10, 2014 at 8:00 PM, Vaibhav Gumashta < > vgumas...@hortonworks.com> wrote: > >> Soam, >> >> What version of BoneCP are you using? >> >> Thanks, >> --Vaibhav >> >> >> On Tue, Jun 10, 2014 at 12:41 PM, Soam Acharya <s...@altiscale.com> >> wrote: >> >>> Hi folks, >>> >>> we're seeing an intermittent issue between our Hive 0.13 metastore and >>> mysql instance. After being idle for about 5 minutes or so, any >>> transactions involving the metastore and mysql causes the following error >>> to appear in the metastore log: >>> >>> 2014-06-09 05:13:52,066 ERROR bonecp.ConnectionHandle >>> (ConnectionHandle.java:markPossiblyBroken(388)) - Database access problem. >>> Killing off this connection and all remaining connectio >>> ns in the connection pool. SQL State = 08S01 >>> 2014-06-09 05:13:52,068 ERROR metastore.RetryingHMSHandler >>> (RetryingHMSHandler.java:invoke(157)) - Retrying HMSHandler after 1000 ms >>> (attempt 1 of 1) with error: javax.jdo.JDODataStore >>> Exception: Communications link failure >>> >>> The last packet successfully received from the server was 502,751 >>> milliseconds ago. The last packet sent successfully to the server was 1 >>> milliseconds ago. >>> at >>> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451) >>> at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:275) >>> at >>> org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:900) >>> at >>> org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:832) >>> at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) >>> >>> [full stack trace is below] >>> >>> Subsequent transactions go through fine, however. This only occurs if >>> the metastore has has been idle for a while. >>> >>> We've been looking at the code and the problem seems to lie in the >>> channel between the driver and metastore connection pool. We looked at the >>> wait time configuration settings both mysql and BoneCP. Both of them are >>> set for max value. >>> >>> Our hive 0.12 install uses DBCP for datanucleus.connectionPoolingType, >>> not BoneCP and we don't see these issues there. We are not running Tez. >>> >>> This seems like such a basic issue that we'd thought to check and see if >>> anyone else has encountered it. Any insights would be greatly appreciated. >>> >>> Thanks! >>> >>> Soam >>> >>> ================= >>> >>> Full stack trace: >>> >>> 2014-06-09 05:13:52,066 ERROR bonecp.ConnectionHandle >>> (ConnectionHandle.java:markPossiblyBroken(388)) - Database access problem. >>> Killing off this connection and all remaining connectio >>> ns in the connection pool. SQL State = 08S01 >>> 2014-06-09 05:13:52,068 ERROR metastore.RetryingHMSHandler >>> (RetryingHMSHandler.java:invoke(157)) - Retrying HMSHandler after 1000 ms >>> (attempt 1 of 1) with error: javax.jdo.JDODataStore >>> Exception: Communications link failure >>> >>> The last packet successfully received from the server was 502,751 >>> milliseconds ago. The last packet sent successfully to the server was 1 >>> milliseconds ago. >>> at >>> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451) >>> at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:275) >>> at >>> org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:900) >>> at >>> org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:832) >>> at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) >>> at >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >>> at java.lang.reflect.Method.invoke(Method.java:606) >>> at >>> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108) >>> at com.sun.proxy.$Proxy0.getTable(Unknown Source) >>> at >>> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1559) >>> at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source) >>> at >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >>> at java.lang.reflect.Method.invoke(Method.java:606) >>> at >>> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) >>> at com.sun.proxy.$Proxy11.get_table(Unknown Source) >>> at >>> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table.getResult(ThriftHiveMetastore.java:8146) >>> at >>> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table.getResult(ThriftHiveMetastore.java:8130) >>> at >>> org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) >>> at >>> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110) >>> at >>> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) >>> at java.security.AccessController.doPrivileged(Native Method) >>> at javax.security.auth.Subject.doAs(Subject.java:415) >>> at >>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) >>> at >>> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:500) >>> at >>> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118) >>> at >>> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206) >>> at >>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >>> at >>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >>> at java.lang.Thread.run(Thread.java:744) >>> NestedThrowablesStackTrace: >>> com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications >>> link failure >>> >>> The last packet successfully received from the server was 502,751 >>> milliseconds ago. The last packet sent successfully to the server was 1 >>> milliseconds ago. >>> at >>> sun.reflect.GeneratedConstructorAccessor41.newInstance(Unknown Source) >>> at >>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) >>> at >>> java.lang.reflect.Constructor.newInstance(Constructor.java:526) >>> at com.mysql.jdbc.Util.handleNewInstance(Util.java:411) >>> at >>> com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117) >>> at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3603) >>> at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3492) >>> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4043) >>> at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2503) >>> at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2664) >>> at >>> com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2788) >>> at >>> com.mysql.jdbc.ConnectionImpl.setAutoCommit(ConnectionImpl.java:5282) >>> at >>> com.jolbox.bonecp.ConnectionHandle.setAutoCommit(ConnectionHandle.java:1292) >>> at >>> org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:440) >>> at >>> org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getXAResource(ConnectionFactoryImpl.java:378) >>> at >>> org.datanucleus.store.connection.ConnectionManagerImpl.allocateConnection(ConnectionManagerImpl.java:328) >>> at >>> org.datanucleus.store.connection.AbstractConnectionFactory.getConnection(AbstractConnectionFactory.java:94) >>> at >>> org.datanucleus.store.AbstractStoreManager.getConnection(AbstractStoreManager.java:430) >>> at >>> org.datanucleus.store.AbstractStoreManager.getConnection(AbstractStoreManager.java:396) >>> at >>> org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:621) >>> at >>> org.datanucleus.store.query.Query.executeQuery(Query.java:1786) >>> at >>> org.datanucleus.store.query.Query.executeWithArray(Query.java:1672) >>> at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:266) >>> at >>> org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:900) >>> at >>> org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:832) >>> at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) >>> at >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >>> at java.lang.reflect.Method.invoke(Method.java:606) >>> at >>> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108) >>> at com.sun.proxy.$Proxy0.getTable(Unknown Source) >>> at >>> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1559) >>> at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source) >>> at >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >>> at java.lang.reflect.Method.invoke(Method.java:606) >>> at >>> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) >>> at com.sun.proxy.$Proxy11.get_table(Unknown Source) >>> at >>> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table.getResult(ThriftHiveMetastore.java:8146) >>> at >>> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table.getResult(ThriftHiveMetastore.java:8130) >>> at >>> org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) >>> at >>> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110) >>> at >>> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) >>> at java.security.AccessController.doPrivileged(Native Method) >>> at javax.security.auth.Subject.doAs(Subject.java:415) >>> at >>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) >>> at >>> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:500) >>> at >>> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118) >>> at >>> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206) >>> at >>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >>> at >>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >>> at java.lang.Thread.run(Thread.java:744) >>> Caused by: java.io.EOFException: Can not read response from server. >>> Expected to read 4 bytes, read 0 bytes before connection was unexpectedly >>> lost. >>> at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3052) >>> at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3503) >>> ... 44 more >>> >>> >>> >> >> CONFIDENTIALITY NOTICE >> NOTICE: This message is intended for the use of the individual or entity >> to which it is addressed and may contain information that is confidential, >> privileged and exempt from disclosure under applicable law. If the reader >> of this message is not the intended recipient, you are hereby notified that >> any printing, copying, dissemination, distribution, disclosure or >> forwarding of this communication is strictly prohibited. If you have >> received this communication in error, please contact the sender immediately >> and delete it from your system. Thank You. > > > -- ~Rajesh.B