Thanks, Edward and Ashutosh
Ashutosh,
yes, I do not understand why the service "hiveserver" still uses a Derby
instance even though it should be talking to the service "metastore".
Btw, if I run the hiveserver without having started the metastore
service, the hiveserver complains when I try to let it execute a HiveQL
command through JDBC:
...
org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Could not connect to meta store using any of the URIs provided)
    at org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:919)
...
(full stacktrace at the end of this post)
which is exactly what I expect and which makes me somewhat confident
that I have configured things correctly.
The entire issue came up because the hiveserver service did not work
when started from the same directory from which the metastore service
had been started. It turned out that both services were trying to set up
a Derby instance in the current directory and therefore ran into a
file-locking conflict. I have worked around this by starting the two
services from different directories, but I am worried that I am
missing an important point in my setup.
When I run "pfiles <pid of hiveserver>" it lists these files for the
hiveserver service (which should not need a Derby instance, as far as I
understood):
...tons of jars...
/home/hadoop/hive_admin/derby.log
/home/hadoop/hive_admin/metastore_db/log/log1.dat
/home/hadoop/hive_admin/metastore_db/dbex.lck
/home/hadoop/hive_admin/metastore_db/seg0/c191.dat
/home/hadoop/hive_admin/metastore_db/seg0/c1a1.dat
...
/home/hadoop/hive_admin/metastore_db/seg0/c431.dat
/home/hadoop/hive_admin/metastore_db/seg0/c451.dat
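A similar check can be made without pfiles by looking for Derby lock files in the directory a service was started from. The following is only a minimal Python sketch, not part of the original setup: the function name find_derby_locks is made up for illustration, and db.lck is an assumption based on Derby's usual behaviour (only dbex.lck appears in the listing above).

```python
from pathlib import Path

# Lock files Derby typically keeps inside a database directory while it is booted.
# "dbex.lck" appears in the pfiles listing; "db.lck" is an assumption.
DERBY_LOCK_FILES = ("db.lck", "dbex.lck")


def find_derby_locks(start_dir, db_name="metastore_db"):
    """Return the Derby lock files found in start_dir/db_name, sorted by path."""
    db_dir = Path(start_dir) / db_name
    candidates = (db_dir / name for name in DERBY_LOCK_FILES)
    return sorted(path for path in candidates if path.is_file())


if __name__ == "__main__":
    import sys

    # Check every directory given on the command line, or the current one.
    for directory in sys.argv[1:] or ["."]:
        locks = find_derby_locks(directory)
        state = ", ".join(p.name for p in locks) if locks else "no lock files"
        print(f"{directory}: {state}")
```

Lock files may be left behind after an unclean shutdown, so finding one is a hint rather than proof that a running JVM currently holds the database.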
Any pointers appreciated. If anybody thinks this is a bug, I can file one.
Thanks,
Christian
full stacktrace:
Hive history file=/tmp/hadoop/hive_job_log_hadoop_201108242305_155100916.txt
FAILED: Error in semantic analysis: Table not found weblog
org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Could not connect to meta store using any of the URIs provided)
    at org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:919)
    at org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:904)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:7074)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:6573)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:238)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:340)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:736)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:116)
    at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.process(ThriftHive.java:699)
    at org.apache.hadoop.hive.service.ThriftHive$Processor.process(ThriftHive.java:677)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:183)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:151)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:1855)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:1865)
    at org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:917)
    ... 13 more
FAILED: Error in metadata: MetaException(message:Could not connect to meta store using any of the URIs provided)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
On 25.08.2011 01:29, Ashutosh Chauhan wrote:
Edward,
Apart from recommended best practices, what Christian is asking is
why HiveServer is still trying to interact with a local DB instance even
after setting the config variables. AFAIK it should not. Christian,
you found that out by looking at the files opened by the HiveServer JVM.
Can you provide more info, such as how you found that out and which
files these are?
Ashutosh
On Wed, Aug 24, 2011 at 14:20, Edward Capriolo <edlinuxg...@gmail.com> wrote:
On Wed, Aug 24, 2011 at 3:02 PM, Christian Kurz <crk...@gmx.de> wrote:
Thanks for the quick reply, Edward
I am not sure I got you: My HiveService has been started with
hive.metastore.local=false. So shouldn't it use thrift instead
of its own local Derby instance?
Thanks,
Christian
On 24.08.2011 at 19:33, Edward Capriolo <edlinuxg...@gmail.com> wrote:
On Wed, Aug 24, 2011 at 10:53 AM, Christian Kurz <crk...@gmx.de> wrote:
Greetings,
could somebody confirm/correct my understanding of a
fully distributed Hive setup, please?
My setup is as follows:
* Java application using Hive JDBC driver, which connects to
* hive --service hiveserver, which connects to
* hive --service metastore, which uses an embedded
Derby database for metadata storage
Please find more details in the image attached.
The thing I find confusing is that JVM2 (Hive Server)
starts up a Derby database instance. I can see that from
the files the JVM has opened.
Does anybody know why the Hive Server needs a Derby
instance even though hive-site.xml says
hive.metastore.local=false?
Any hints are much appreciated.
Thanks,
Christian
btw,
I have not been able to access the picture on the wiki
<https://cwiki.apache.org/Hive/adminmanual-metastoreadmin.html#AdminManualMetastoreAdmin-MetastoreDeploymentOptionsinPictures>.
("Not permitted"; even though I have registered on the wiki)
hive.metastore.local is really misnamed.
local=true means communicating using DataNucleus/JPOX and
talking directly to the metastore.
local=false means use thrift which is essentially a level of
indirection.
Talking about HiveService can confuse things because HiveService
is a different thrift interface.
You could be set up like this:
HiveServiceClient->HiveService->metastore.local=true->derby
or
HiveServiceClient->HiveService->metastore.local=false->thrift->hive_metastore
Most people are set up like this:
HiveServiceClient->HiveService->metastore.local=true->mysql
cli->metastore.local=true->mysql
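
The two modes described above can be sketched as a small Python helper that reads a hive-site.xml and reports which path a client would be expected to take. This is an illustrative sketch only, not Hive's own logic. It assumes that the remote endpoint is given in hive.metastore.uris, that the direct database connection is given in javax.jdo.option.ConnectionURL, and that hive.metastore.local is treated as true when unset. The function names, the host metastore-host and the port 9083 are placeholders.

```python
import xml.etree.ElementTree as ET


def read_properties(hive_site_xml):
    """Parse the text of a hive-site.xml into a {name: value} dict."""
    root = ET.fromstring(hive_site_xml)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def describe_metastore_path(props):
    """Describe the expected metastore access path for the given properties."""
    # Assumption: an unset hive.metastore.local is treated as "true".
    local = props.get("hive.metastore.local", "true").lower() == "true"
    if local:
        url = props.get("javax.jdo.option.ConnectionURL", "<default database>")
        return f"direct DataNucleus/JPOX access to {url}"
    uris = props.get("hive.metastore.uris", "")
    if not uris:
        return "remote metastore requested, but hive.metastore.uris is not set"
    return f"Thrift client to {uris}"


if __name__ == "__main__":
    # Placeholder configuration; host and port are hypothetical.
    sample = """<configuration>
      <property>
        <name>hive.metastore.local</name>
        <value>false</value>
      </property>
      <property>
        <name>hive.metastore.uris</name>
        <value>thrift://metastore-host:9083</value>
      </property>
    </configuration>"""
    print(describe_metastore_path(read_properties(sample)))
```

Running the same check against the hive-site.xml actually visible to each service can help confirm that hiveserver and metastore are not reading different configurations.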