Hi Carl,
Thanks for the reply. I am currently monitoring file handles via:
/usr/sbin/lsof | grep hive | wc -l
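(For a more targeted count, assuming <pid> stands in for the Hive server's JVM pid, I could count only that process's descriptors:
  /usr/sbin/lsof -p <pid> | wc -l
or, on Linux, ls /proc/<pid>/fd | wc -l, which avoids matching unrelated processes whose names happen to contain "hive".)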
The count is fairly stable, but it spikes intermittently as files like the following get created:
/tmp/aphadke/hive_job_log_aphadke_201008021315_656095069.txt
and so on; they mostly match the pattern below:
/tmp/aphadke/hive_job_log_*
Sample content:
SessionStart SESSION_ID="aphadke_201008021315" TIME="1280780118654"
QueryStart QUERY_STRING="LOAD DATA INPATH 'hdfs://cm-hadoop01.mozilla.org/flume/cm-hadoop01.mozilla.org/2010-08-02/1315/hadooplog.00000018.20100802-131516958-0700.7823351645004687.seq.gz' INTO TABLE logs_temp PARTITION (ds='2010-08-02', ts='13-15', hn='cm-hadoop01.mozilla.org')" QUERY_ID="aphadke_20100802131515" TIME="1280780147562"
TaskStart TASK_NAME="org.apache.hadoop.hive.ql.exec.MoveTask" TASK_ID="Stage-0" QUERY_ID="aphadke_20100802131515" TIME="1280780147564"
TaskEnd TASK_RET_CODE="0" TASK_NAME="org.apache.hadoop.hive.ql.exec.MoveTask" TASK_ID="Stage-0" QUERY_ID="aphadke_20100802131515" TIME="1280780147902"
QueryEnd QUERY_STRING="LOAD DATA INPATH 'hdfs://cm-hadoop01.mozilla.org/flume/cm-hadoop01.mozilla.org/2010-08-02/1315/hadooplog.00000018.20100802-131516958-0700.7823351645004687.seq.gz' INTO TABLE logs_temp PARTITION (ds='2010-08-02', ts='13-15', hn='cm-hadoop01.mozilla.org')" QUERY_ID="aphadke_20100802131515" QUERY_RET_CODE="0" QUERY_NUM_TASKS="0" TIME="1280780147902"
These files look innocuous; they seem to contain only session information. Any idea why we need them, and whether it's possible to reduce the number of open files?
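(If these are Hive's per-session query history logs, then, assuming our Hive build supports the property, hive.querylog.location should control where they are written, e.g.:
  hive --hiveconf hive.querylog.location=/some/other/dir
I haven't verified that against our version, so the property name is an assumption on my part.)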
-Anurag
On 8/2/10 1:37 PM, Carl Steinbach wrote:
Hi Anurag,
It looks like the MetaStore server is not closing JDBC connections properly, or connection pooling is not working. As a first step, I would verify that the process has a large number of open sockets using lsof. Note that lsof ORs its selection options unless you add -a to AND them, so use: 'lsof -a -i -p <jvm process id>'
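For example, to count just the metastore's MySQL connections (assuming the database listens on the default port 3306; -P keeps lsof from converting port numbers to service names):
  lsof -a -i -P -p <jvm process id> | grep -c 3306
If that count grows without bound, it would also be worth confirming that pooling is actually enabled in hive-site.xml. The DataNucleus property is datanucleus.connectionPoolingType, though please double-check the exact name against the DataNucleus version Hive bundles.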
Thanks.
Carl
On Mon, Aug 2, 2010 at 1:14 PM, Anurag Phadke <[email protected]> wrote:
We are running the Hive server with MySQL as the metastore. The server receives a LOAD DATA HQL statement every 30 seconds (from Flume); everything runs fine for about 90 minutes, and then the Hive server throws the error below. (Note: the user running the Hive server has an "unlimited" file handle limit.)
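(Back-of-the-envelope: one LOAD every 30 seconds for about 90 minutes is roughly 180 statements, so a leak of even a handful of descriptors per query would exhaust the common 1024-descriptor default on this timescale; it may be worth double-checking that the daemon's effective ulimit -n, and not just the login shell's, is really unlimited. A quick way to watch the growth, assuming a Linux box where <pid> is the Hive server's JVM pid:
  watch -n 30 'ls /proc/<pid>/fd | wc -l')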
any ideas?
10/08/02 12:34:34 ERROR exec.MoveTask: Failed with exception Unable to fetch table logs_temp
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table logs_temp
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:339)
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:310)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:454)
    at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:135)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:379)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:285)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:115)
    at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.process(ThriftHive.java:340)
    at org.apache.hadoop.hive.service.ThriftHive$Processor.process(ThriftHive.java:328)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: javax.jdo.JDODataStoreException: Communications link failure due to underlying exception:
** BEGIN NESTED EXCEPTION **
java.net.SocketException
MESSAGE: Too many open files
STACKTRACE:
java.net.SocketException: Too many open files
    at java.net.Socket.createImpl(Socket.java:397)
    at java.net.Socket.<init>(Socket.java:371)
    at java.net.Socket.<init>(Socket.java:218)
    at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:256)
    at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:271)
    at com.mysql.jdbc.Connection.createNewIO(Connection.java:2771)
    at com.mysql.jdbc.Connection.<init>(Connection.java:1555)
    at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:285)
    at org.datanucleus.store.rdbms.datasource.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:142)
    at org.datanucleus.store.rdbms.datasource.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:118)
    at org.datanucleus.store.rdbms.ConnectionProviderPriorityList.getConnection(ConnectionProviderPriorityList.java:59)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:458)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getXAResource(ConnectionFactoryImpl.java:418)
    at org.datanucleus.ConnectionManagerImpl.enlistResource(ConnectionManagerImpl.java:329)
    at org.datanucleus.ConnectionManagerImpl.allocateConnection(ConnectionManagerImpl.java:210)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.getConnection(ConnectionFactoryImpl.java:345)
    at org.datanucleus.store.AbstractStoreManager.getConnection(AbstractStoreManager.java:267)
    at org.datanucleus.store.rdbms.query.SQLEvaluator.evaluate(SQLEvaluator.java:91)
    at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:287)
    at org.datanucleus.store.query.Query.executeQuery(Query.java:1475)
    at org.datanucleus.store.rdbms.query.JDOQLQuery.executeQuery(JDOQLQuery.java:244)
    at org.datanucleus.store.query.Query.executeWithArray(Query.java:1357)
    at org.datanucleus.jdo.JDOQuery.execute(JDOQuery.java:265)
    at org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:554)
    at org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:497)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:388)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:431)
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:331)
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:310)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:454)
    at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:135)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:379)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:285)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:115)
    at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.process(ThriftHive.java:340)
    at org.apache.hadoop.hive.service.ThriftHive$Processor.process(ThriftHive.java:328)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
** END NESTED EXCEPTION **
Last packet sent to the server was 1 ms ago.
NestedThrowables:
com.mysql.jdbc.CommunicationsException: Communications link failure due to underlying exception:
** BEGIN NESTED EXCEPTION **
java.net.SocketException
MESSAGE: Too many open files
STACKTRACE:
java.net.SocketException: Too many open files
    at java.net.Socket.createImpl(Socket.java:397)
    at java.net.Socket.<init>(Socket.java:371)
    at java.net.Socket.<init>(Socket.java:218)
    at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:256)
    at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:271)
    at com.mysql.jdbc.Connection.createNewIO(Connection.java:2771)
    at com.mysql.jdbc.Connection.<init>(Connection.java:1555)
    at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:285)
    at org.datanucleus.store.rdbms.datasource.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:142)
    at org.datanucleus.store.rdbms.datasource.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:118)
    at org.datanucleus.store.rdbms.ConnectionProviderPriorityList.getConnection(ConnectionProviderPriorityList.java:59)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:458)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getXAResource(ConnectionFactoryImpl.java:418)
    at org.datanucleus.ConnectionManagerImpl.enlistResource(ConnectionManagerImpl.java:329)
    at org.datanucleus.ConnectionManagerImpl.allocateConnection(ConnectionManagerImpl.java:210)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl.getConnection(ConnectionFactoryImpl.java:345)
    at org.datanucleus.store.AbstractStoreManager.getConnection(AbstractStoreManager.java:267)
    at org.datanucleus.store.rdbms.query.SQLEvaluator.evaluate(SQLEvaluator.java:91)
    at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:287)
    at org.datanucleus.store.query.Query.executeQuery