I can confirm the HBase table is populated, using either "SELECT *" in Hive
or the hbase shell.
But when I read or copy the table via a MapReduce job, no rows are
returned.

I'm hoping someone will recognize this as some sort of configuration
problem.
The stack is: Hadoop 0.20.2, HBase 0.20.3, and Hive from the trunk ~8/20.

Here are the statements that show the problem:



hive> select * from hbase_table_1 limit 5;
OK
500184511 033ee0111f22bbf5786f80df3d163834
500184512 030c23751e42fa5e01d05daf5a028e8b
500184516 01945892c252a55da843c692f4b1bd77
500184542 0078d187207d1f1777524b027f826b19
500184662 036e9bd88dba12bfc6943f417d29302f
Time taken: 0.087 seconds


hive> select key, value from hbase_table_1 limit 5;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201009041301_0030, Tracking URL = http://pos01n:50030/jobdetails.jsp?jobid=job_201009041301_0030
Kill Command = /hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=pos01n:9001 -kill job_201009041301_0030
2010-09-04 19:04:34,673 Stage-1 map = 0%,  reduce = 0%
2010-09-04 19:04:37,685 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201009041301_0030
OK
Time taken: 8.386 seconds


hive> describe extended hbase_table_1;
OK
key int from deserializer
value string from deserializer

Detailed Table Information Table(tableName:hbase_table_1, dbName:default,
owner:root, createTime:1283637617, lastAccessTime:0, retention:0,
sd:StorageDescriptor(cols:[FieldSchema(name:key, type:int, comment:null),
FieldSchema(name:value, type:string, comment:null)],
location:hdfs://pos01n:54310/user/hive/warehouse/hbase_table_1,
inputFormat:org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat,
outputFormat:org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat,
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null,
serializationLib:org.apache.hadoop.hive.hbase.HBaseSerDe,
parameters:{serialization.format=1, hbase.columns.mapping=:key,cf1:val}),
bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[], parameters:{
hbase.table.name=xyz, transient_lastDdlTime=1283637617,
storage_handler=org.apache.hadoop.hive.hbase.HBaseStorageHandler},
viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)
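
For anyone reading the output above: a minimal sketch, in Python and not part
of Hive, of how the hbase.columns.mapping value lines up with the Hive columns
by position. The function name pair_columns is hypothetical; the column names
and the mapping string are copied from the describe output. It only
illustrates the pairing and does not diagnose the empty result.

```python
def pair_columns(hive_columns, mapping):
    """Pair each Hive column with its HBase target, in declaration order.

    hive_columns: Hive column names, e.g. ["key", "value"]
    mapping:      the hbase.columns.mapping string, e.g. ":key,cf1:val"
    """
    targets = [t.strip() for t in mapping.split(",")]
    if len(targets) != len(hive_columns):
        raise ValueError(
            "mapping has %d entries but the table has %d columns"
            % (len(targets), len(hive_columns))
        )

    pairs = {}
    for col, target in zip(hive_columns, targets):
        if target == ":key":
            # The special entry ":key" refers to the HBase row key.
            pairs[col] = ("row key", None, None)
        else:
            # Everything else is "family:qualifier".
            family, _, qualifier = target.partition(":")
            pairs[col] = ("cell", family, qualifier)
    return pairs


if __name__ == "__main__":
    # Values taken from the describe extended output above.
    for col, target in pair_columns(["key", "value"], ":key,cf1:val").items():
        print(col, "->", target)
```

Read this way, the Hive column "key" is the row key of HBase table xyz, and
"value" is the cell at family cf1, qualifier val.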


Of course, I appreciate the help. Hopefully I'll find HBase can solve my
problem, become a user, and be able to return the favor some day ;)
