This is a known issue in Hive Server. The same metastore client is used to issue both queries, and JDBC does not tolerate that kind of sharing. We should use thread-specific or session-specific metastore clients, but I don't think Hive Server does that right now. HIVE-584 is supposed to fix this issue.
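To illustrate what I mean by thread-specific clients, here is a rough sketch. It does not use the real Hive classes: `FakeMetaStoreClient` and `PerThreadClientDemo` are hypothetical names made up for this example, and the actual fix in HIVE-584 may look quite different. The only point is that each worker thread gets its own client through a `ThreadLocal` instead of sharing a single instance.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a metastore client; NOT the real Hive class.
class FakeMetaStoreClient {
    private static final AtomicInteger NEXT_ID = new AtomicInteger();
    // Unique id so we can tell client instances apart.
    final int id = NEXT_ID.incrementAndGet();
}

class PerThreadClientDemo {
    // Each thread lazily creates its own client on first use,
    // so no client instance is ever shared between threads.
    private static final ThreadLocal<FakeMetaStoreClient> CLIENT =
        ThreadLocal.withInitial(FakeMetaStoreClient::new);

    static FakeMetaStoreClient client() {
        return CLIENT.get();
    }

    // Starts n threads, records the client id each one sees,
    // and returns how many distinct clients were handed out.
    static int distinctClientsAcrossThreads(int n) {
        Set<Integer> ids = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[n];
        for (int i = 0; i < n; i++) {
            workers[i] = new Thread(() -> ids.add(client().id));
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ids.size();
    }

    public static void main(String[] args) {
        // Four worker threads should end up with four separate clients.
        System.out.println(distinctClientsAcrossThreads(4));
    }
}
```

Within one thread, repeated calls to `client()` return the same instance, so a query running on that thread keeps a consistent client. A session-specific variant would key the client on the session rather than on the thread.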
________________________________
From: Matt Pestritto <[email protected]>
Reply-To: <[email protected]>
Date: Tue, 28 Jul 2009 10:48:24 -0700
To: <[email protected]>
Subject: Problem with Thrift Server Concurrency

Hi all

Does the Thrift server support concurrency? I'm having a problem that only happens if I fire off multiple (2+) DML queries at the same time. Randomly, one of the queries will succeed but the other will fail with the following error, which I pulled from the hiveserver output:

java.io.IOException: cannot find dir = hdfs://mustique:9000/user/hadoop/mantis-output/mantis-job/20090601 in partToPartitionInfo!
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getTableDescFromPath(HiveInputFormat.java:311)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.validateInput(HiveInputFormat.java:288)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:735)
    at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:388)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:357)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:263)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:108)
    at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.process(ThriftHive.java:302)
    at org.apache.hadoop.hive.service.ThriftHive$Processor.process(ThriftHive.java:290)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)

If I execute the queries via Thrift a few seconds apart from each other, they succeed. It only seems to fail if the queries start at about the same time.

When I run the same two queries using hive -e "query 1" & hive -e "query 2" it also works fine.

Any ideas?

Thanks
-Matt
