Hi Anurag,

The hive_job_log_* files are created by the HiveHistory class. This class
creates a PrintWriter for writing to the file but never closes the writer,
so every session leaks one file handle. It looks like we need to add a
cleanup method to HiveHistory that closes the PrintWriter and does any
other necessary teardown. Can you please file a bug for this issue?
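
Something along these lines might work as the cleanup method (untested
sketch; the field and method names below are illustrative, not the actual
HiveHistory members):

    // Hypothetical cleanup hook for HiveHistory (names are illustrative).
    // histStream stands in for the PrintWriter that HiveHistory opens for
    // /tmp/<user>/hive_job_log_<user>_<timestamp>_<id>.txt.
    public void closeStream() {
      if (histStream != null) {
        histStream.flush();   // push any buffered history records to disk
        histStream.close();   // PrintWriter.close() releases the file handle
        histStream = null;    // guard against double-close
      }
    }

The session teardown path would then have to call this so that each
hive_job_log_* handle is released when the client disconnects.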

Thanks.

Carl

On Mon, Aug 2, 2010 at 1:51 PM, Anurag Phadke <[email protected]> wrote:

> Hi Carl,
> Thanks for the reply. I am currently monitoring file handles via:
> /usr/sbin/lsof | grep hive | wc -l
> The number is fairly stable, but spikes intermittently as files like the
> following get created:
> /tmp/aphadke/hive_job_log_aphadke_201008021315_656095069.txt
> and so on; they all match the pattern:
> /tmp/aphadke/hive_job_log_*
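> To count just these handles, a narrower pipeline along these lines should
> work (illustrative, with the Hive server's pid substituted in):
> /usr/sbin/lsof -p <hive server pid> | grep hive_job_log | wc -l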
>
> Sample content:
> SessionStart SESSION_ID="aphadke_201008021315" TIME="1280780118654"
> QueryStart QUERY_STRING="LOAD DATA INPATH 'hdfs://cm-hadoop01.mozilla.org/flume/cm-hadoop01.mozilla.org/2010-08-02/1315/hadooplog.00000018.20100802-131516958-0700.7823351645004687.seq.gz' INTO TABLE logs_temp PARTITION (ds='2010-08-02', ts='13-15', hn='cm-hadoop01.mozilla.org')" QUERY_ID="aphadke_20100802131515" TIME="1280780147562"
> TaskStart TASK_NAME="org.apache.hadoop.hive.ql.exec.MoveTask" TASK_ID="Stage-0" QUERY_ID="aphadke_20100802131515" TIME="1280780147564"
> TaskEnd TASK_RET_CODE="0" TASK_NAME="org.apache.hadoop.hive.ql.exec.MoveTask" TASK_ID="Stage-0" QUERY_ID="aphadke_20100802131515" TIME="1280780147902"
> QueryEnd QUERY_STRING="LOAD DATA INPATH 'hdfs://cm-hadoop01.mozilla.org/flume/cm-hadoop01.mozilla.org/2010-08-02/1315/hadooplog.00000018.20100802-131516958-0700.7823351645004687.seq.gz' INTO TABLE logs_temp PARTITION (ds='2010-08-02', ts='13-15', hn='cm-hadoop01.mozilla.org')" QUERY_ID="aphadke_20100802131515" QUERY_RET_CODE="0" QUERY_NUM_TASKS="0" TIME="1280780147902"
>
>
> These files and their content look innocuous, as they seem to contain only
> session information. Any idea why we need them, and whether it's possible
> to reduce the number of open files?
>
> -Anurag
>
>
> On 8/2/10 1:37 PM, Carl Steinbach wrote:
>
>> Hi Anurag,
>>
>> It looks like the MetaStore server is not properly closing JDBC
>> connections, or connection pooling is not working properly. As a first
>> step I would try to verify that the process has lots of open sockets
>> using lsof: 'lsof -i -a -p <jvm process id>' (the -a is needed to AND
>> the -i and -p selections together; without it lsof ORs them).
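>> To see whether the socket count keeps climbing between LOADs, something
>> like 'watch -n 5 "lsof -i -a -p <jvm process id> | wc -l"' should work
>> (illustrative; adjust the interval as needed).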
>>
>> Thanks.
>>
>> Carl
>>
>> On Mon, Aug 2, 2010 at 1:14 PM, Anurag Phadke <[email protected]> wrote:
>>
>>    We are running the Hive server with MySQL as the metastore. The server
>>    receives a LOAD DATA HQL statement every 30 seconds (from Flume).
>>    Everything runs fine for about 90 minutes, and then the Hive server
>>    throws the following error. (Note: the user running the Hive server
>>    has an "unlimited" file handle limit.)
>>
>>    Any ideas?
>>
>>    10/08/02 12:34:34 ERROR exec.MoveTask: Failed with exception Unable to fetch table logs_temp
>>    org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table logs_temp
>>            at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:339)
>>            at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:310)
>>            at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:454)
>>            at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:135)
>>            at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:379)
>>            at org.apache.hadoop.hive.ql.Driver.run(Driver.java:285)
>>            at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:115)
>>            at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.process(ThriftHive.java:340)
>>            at org.apache.hadoop.hive.service.ThriftHive$Processor.process(ThriftHive.java:328)
>>            at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
>>            at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>            at java.lang.Thread.run(Thread.java:619)
>>    Caused by: javax.jdo.JDODataStoreException: Communications link
>>    failure due to underlying exception:
>>
>>    ** BEGIN NESTED EXCEPTION **
>>
>>    java.net.SocketException
>>    MESSAGE: Too many open files
>>
>>    STACKTRACE:
>>
>>    java.net.SocketException: Too many open files
>>            at java.net.Socket.createImpl(Socket.java:397)
>>            at java.net.Socket.<init>(Socket.java:371)
>>            at java.net.Socket.<init>(Socket.java:218)
>>            at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:256)
>>            at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:271)
>>            at com.mysql.jdbc.Connection.createNewIO(Connection.java:2771)
>>            at com.mysql.jdbc.Connection.<init>(Connection.java:1555)
>>            at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:285)
>>            at org.datanucleus.store.rdbms.datasource.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:142)
>>            at org.datanucleus.store.rdbms.datasource.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:118)
>>            at org.datanucleus.store.rdbms.ConnectionProviderPriorityList.getConnection(ConnectionProviderPriorityList.java:59)
>>            at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:458)
>>            at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getXAResource(ConnectionFactoryImpl.java:418)
>>            at org.datanucleus.ConnectionManagerImpl.enlistResource(ConnectionManagerImpl.java:329)
>>            at org.datanucleus.ConnectionManagerImpl.allocateConnection(ConnectionManagerImpl.java:210)
>>            at org.datanucleus.store.rdbms.ConnectionFactoryImpl.getConnection(ConnectionFactoryImpl.java:345)
>>            at org.datanucleus.store.AbstractStoreManager.getConnection(AbstractStoreManager.java:267)
>>            at org.datanucleus.store.rdbms.query.SQLEvaluator.evaluate(SQLEvaluator.java:91)
>>            at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:287)
>>            at org.datanucleus.store.query.Query.executeQuery(Query.java:1475)
>>            at org.datanucleus.store.rdbms.query.JDOQLQuery.executeQuery(JDOQLQuery.java:244)
>>            at org.datanucleus.store.query.Query.executeWithArray(Query.java:1357)
>>            at org.datanucleus.jdo.JDOQuery.execute(JDOQuery.java:265)
>>            at org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:554)
>>            at org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:497)
>>            at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:388)
>>            at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:431)
>>            at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:331)
>>            at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:310)
>>            at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:454)
>>            at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:135)
>>            at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:379)
>>            at org.apache.hadoop.hive.ql.Driver.run(Driver.java:285)
>>            at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:115)
>>            at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.process(ThriftHive.java:340)
>>            at org.apache.hadoop.hive.service.ThriftHive$Processor.process(ThriftHive.java:328)
>>            at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
>>            at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>            at java.lang.Thread.run(Thread.java:619)
>>
>>    ** END NESTED EXCEPTION **
>>
>>    Last packet sent to the server was 1 ms ago.
>>    NestedThrowables:
>>    com.mysql.jdbc.CommunicationsException: Communications link failure
>>    due to underlying exception:
>>
>>    ** BEGIN NESTED EXCEPTION **
>>
>>    java.net.SocketException
>>    MESSAGE: Too many open files
>>
>>    STACKTRACE:
>>
>>    java.net.SocketException: Too many open files
>>            at java.net.Socket.createImpl(Socket.java:397)
>>            at java.net.Socket.<init>(Socket.java:371)
>>            at java.net.Socket.<init>(Socket.java:218)
>>            at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:256)
>>            at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:271)
>>            at com.mysql.jdbc.Connection.createNewIO(Connection.java:2771)
>>            at com.mysql.jdbc.Connection.<init>(Connection.java:1555)
>>            at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:285)
>>            at org.datanucleus.store.rdbms.datasource.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:142)
>>            at org.datanucleus.store.rdbms.datasource.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:118)
>>            at org.datanucleus.store.rdbms.ConnectionProviderPriorityList.getConnection(ConnectionProviderPriorityList.java:59)
>>            at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:458)
>>            at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getXAResource(ConnectionFactoryImpl.java:418)
>>            at org.datanucleus.ConnectionManagerImpl.enlistResource(ConnectionManagerImpl.java:329)
>>            at org.datanucleus.ConnectionManagerImpl.allocateConnection(ConnectionManagerImpl.java:210)
>>            at org.datanucleus.store.rdbms.ConnectionFactoryImpl.getConnection(ConnectionFactoryImpl.java:345)
>>            at org.datanucleus.store.AbstractStoreManager.getConnection(AbstractStoreManager.java:267)
>>            at org.datanucleus.store.rdbms.query.SQLEvaluator.evaluate(SQLEvaluator.java:91)
>>            at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:287)
>>            at org.datanucleus.store.query.Query.executeQuery
