I can confirm that the HBase table is populated, using either "SELECT *" in Hive or the hbase shell. But when I read or copy the table via a MapReduce job, no rows are returned.
I'm hoping someone will recognize this as some sort of configuration problem. The stack is: Hadoop 0.20.2, HBase 0.20.3, and Hive from the trunk ~8/20.

Here are the statements that show the problem...

hive> select * from hbase_table_1 limit 5;
OK
500184511 033ee0111f22bbf5786f80df3d163834
500184512 030c23751e42fa5e01d05daf5a028e8b
500184516 01945892c252a55da843c692f4b1bd77
500184542 0078d187207d1f1777524b027f826b19
500184662 036e9bd88dba12bfc6943f417d29302f
Time taken: 0.087 seconds

hive> select key, value from hbase_table_1 limit 5;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201009041301_0030, Tracking URL = http://pos01n:50030/jobdetails.jsp?jobid=job_201009041301_0030
Kill Command = /hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=pos01n:9001 -kill job_201009041301_0030
2010-09-04 19:04:34,673 Stage-1 map = 0%, reduce = 0%
2010-09-04 19:04:37,685 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201009041301_0030
OK
Time taken: 8.386 seconds

hive> describe extended hbase_table_1;
OK
key int from deserializer
value string from deserializer

Detailed Table Information Table(tableName:hbase_table_1, dbName:default, owner:root, createTime:1283637617, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:key, type:int, comment:null), FieldSchema(name:value, type:string, comment:null)], location:hdfs://pos01n:54310/user/hive/warehouse/hbase_table_1, inputFormat:org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat, outputFormat:org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.hbase.HBaseSerDe, parameters:{serialization.format=1, hbase.columns.mapping=:key,cf1:val}), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[], parameters:{ hbase.table.name=xyz, transient_lastDdlTime=1283637617, storage_handler=org.apache.hadoop.hive.hbase.HBaseStorageHandler}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)

Of course, I appreciate the help. Hopefully I'll find HBase can solve my problem, become a user, and be able to return the favor some day ;)