Hello,

[I posted the question below to Cloudera's getsatisfaction site but am 
cross-posting here in case hive-users folks have debugging suggestions. I'm 
really stuck on this one.]

I recently upgraded to CDH3 Beta. Under an earlier Hadoop 0.20 release, I had 
Hive code that created a table and then loaded data into it with LOAD DATA 
LOCAL INPATH, and it worked well. In CDH3, the same LOAD command now fails 
with a semantic analysis error.

The table is created by

CREATE TABLE TOMCAT(identifier STRING, datestamp STRING, time_stamp STRING,
    seq STRING, server STRING, logline STRING)
PARTITIONED BY(filedate STRING, app STRING, filename STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\011'
STORED AS TEXTFILE;

and the load command used is:

LOAD DATA LOCAL INPATH '/var/www/petrify/mw.log.trustejb1'
INTO TABLE TOMCAT
PARTITION (filedate='2010-06-25', app='trustdomain', filename='mw.log.trustejb1');

The file is simple tab-delimited log data.
If I create the table without the PARTITIONED BY clause, the data loads fine. 
But when I set up the partitions, I get the stack trace below during the load.

I tried copying the data into HDFS and using LOAD DATA INPATH instead, but got 
the same error:

FAILED: Error in semantic analysis: line 1:110 Partition not found 'mw.log.trustejb1'

where 110 is the character position just after the word PARTITION in the query.
Hive seems to think the table is not partitioned, even though the partition 
keys are listed when I run DESCRIBE EXTENDED on the table. (That output is 
below the error.) There were no errors in the logs or at the Thrift server 
console when I created the table.

Strangely, when I run SHOW PARTITIONS TOMCAT, it doesn't list anything.
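
One check that might narrow this down, sketched here as an untested assumption rather than something already tried above, is to register the partition explicitly with ALTER TABLE ... ADD PARTITION and then list it. If the metastore rejects the ADD as well, that would point toward the metastore rather than LOAD itself. The partition values are the ones from the LOAD command above.

```sql
-- Hypothetical diagnostic; not run on the cluster described in this post.
-- Register the partition explicitly instead of relying on LOAD to create it.
ALTER TABLE TOMCAT ADD PARTITION (filedate='2010-06-25', app='trustdomain', filename='mw.log.trustejb1');

-- If the ADD succeeded, the new partition should now be listed.
SHOW PARTITIONS TOMCAT;
```

If the partition does show up, re-running the original LOAD command would show whether loading into an already existing partition behaves differently.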

Any help with this would be most welcome.

Thanks
Ken

10/08/12 15:11:40 INFO service.HiveServer: Running the query: LOAD DATA LOCAL INPATH '/var/www/petrify/trustdomain-rewritten/mw.log.trustejb1' INTO TABLE TOMCAT PARTITION (filedate='2010-06-25', app='trustdomain', filename='mw.log.trustejb1')
10/08/12 15:11:40 INFO parse.ParseDriver: Parsing command: LOAD DATA LOCAL INPATH '/var/www/petrify/trustdomain-rewritten/mw.log.trustejb1' INTO TABLE TOMCAT PARTITION (filedate='2010-06-25', app='trustdomain', filename='mw.log.trustejb1')
10/08/12 15:11:40 INFO parse.ParseDriver: Parse Completed
10/08/12 15:11:40 INFO hive.log: DDL: struct tomcat { string identifier, string datestamp, string time_stamp, string seq, string server, string logline}
10/08/12 15:11:40 ERROR metadata.Hive: org.apache.thrift.TApplicationException: get_partition failed: unknown result
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partition(ThriftHiveMetastore.java:831)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partition(ThriftHiveMetastore.java:799)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartition(HiveMetaStoreClient.java:418)
    at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:620)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer$tableSpec.<init>(BaseSemanticAnalyzer.java:397)
    at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:178)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:105)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:275)
    at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:120)
    at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.process(ThriftHive.java:378)
    at org.apache.hadoop.hive.service.ThriftHive$Processor.process(ThriftHive.java:366)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)

FAILED: Error in semantic analysis: line 1:110 Partition not found 'mw.log.trustejb1'
10/08/12 15:11:40 ERROR ql.Driver: FAILED: Error in semantic analysis: line 1:110 Partition not found 'mw.log.trustejb1'
org.apache.hadoop.hive.ql.parse.SemanticException: line 1:110 Partition not found 'mw.log.trustejb1'
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer$tableSpec.<init>(BaseSemanticAnalyzer.java:403)
    at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:178)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:105)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:275)
    at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:120)
    at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.process(ThriftHive.java:378)
    at org.apache.hadoop.hive.service.ThriftHive$Processor.process(ThriftHive.java:366)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)

DESCRIBE TABLE EXTENDED TOMCAT;
identifier string
datestamp string
time_stamp string
seq string
server string
logline string
filedate string
app string
filename string

Detailed Table Information Table(tableName:tomcat, dbName:default, owner:root, 
createTime:1281661047, lastAccessTime:0, retention:0, 
sd:StorageDescriptor(cols:[FieldSchema(name:identifier, type:string, 
comment:null), FieldSchema(name:datestamp, type:string, comment:null), 
FieldSchema(name:time_stamp, type:string, comment:null), FieldSchema(name:seq, 
type:string, comment:null), FieldSchema(name:server, type:string, 
comment:null), FieldSchema(name:logline, type:string, comment:null)], 
location:hdfs://hadoop-vm1/user/hive/warehouse/tomcat, 
inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
parameters:{field.delim= , serialization.format=9}), bucketCols:[], 
sortCols:[], parameters:{}), partitionKeys:[FieldSchema(name:filedate, 
type:string, comment:null), FieldSchema(name:app, type:string, comment:null), 
FieldSchema(name:filename, type:string, comment:null)], 
parameters:{transient_lastDdlTime=1281661047})
Time taken: 0.086 seconds
10/08/12 18:53:08 INFO CliDriver: Time taken: 0.086 seconds



Ken Barclay

Integration Engineer

Wells Fargo Bank - ISD | 45 Fremont Street, 10th Floor | San Francisco, CA 94105
MAC A0194-100
Tel 415-222-6491