Hi Tariq
From the stack trace, I believe the issue is that you are providing only
column families, with no qualifiers, in hbase.columns.mapping. If you don't
specify a qualifier for a column family, the Hive column is mapped to all the
qualifiers in that HBase column family. So each column family is exposed as a
map of qualifier to value, and that map has to be stored in a Hive column of a
map type. In your query you are mapping these maps to primitives (string),
which results in the exception. The Hive wiki lists such a mapping as illegal;
please refer to:
https://cwiki.apache.org/Hive/hbaseintegration.html#HBaseIntegration-ColumnMapping
https://cwiki.apache.org/Hive/hbaseintegration.html#HBaseIntegration-Illegal%253AHivePrimitivetoHBaseColumnFamily
You can get your query working by changing the data type of the Hive columns
that map to HBase column families to map<string,string>. It is also better to
add :key to your mapping:
CREATE EXTERNAL TABLE employee(key string, no map<string,string>,
  name map<string,string>, address map<string,string>)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,no:,name:,address:")
TBLPROPERTIES ("hbase.table.name" = "employee");
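To make the rule behind the error concrete, here is a rough Python sketch of the check. It is not Hive's actual SerDe code: check_mapping is a hypothetical helper, and it assumes (as the error in this thread suggests) that a mapping without :key treats the first Hive column as the row key.

```python
# Hypothetical sketch of the mapping rule; this is NOT Hive's actual SerDe code.

def check_mapping(columns_mapping, hive_types):
    """Return a list of problems for a columns mapping and its Hive column types."""
    entries = columns_mapping.split(",")
    if ":key" not in entries:
        # Assumption: with no explicit :key, the first Hive column is the row key.
        entries = [":key"] + entries
    if len(entries) != len(hive_types):
        return ["%d mapping entries but %d Hive columns"
                % (len(entries), len(hive_types))]
    problems = []
    for entry, hive_type in zip(entries, hive_types):
        is_map = hive_type.lower().startswith("map<")
        if entry == ":key":
            # The row key is a single value, so a map type makes no sense here.
            if is_map:
                problems.append("row key mapped to %s" % hive_type)
            continue
        family, _, qualifier = entry.partition(":")
        if qualifier == "" and not is_map:
            # A family with no qualifier means "all qualifiers", so it needs a map.
            problems.append("column family '%s' should be a map but is %s"
                            % (family, hive_type))
    return problems


# The mapping from the original question, then the corrected one.
original = check_mapping("no:,name:,address:",
                         ["string", "string", "string", "string"])
fixed = check_mapping(":key,no:,name:,address:",
                      ["string", "map<string,string>",
                       "map<string,string>", "map<string,string>"])
print(original)
print(fixed)
```

The first call reports one problem per column family, starting with 'no' as in the stack trace; the second call reports none.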
For your second question: in HBase every row is uniquely identified by its
row key, and with :key in the mapping we map this row key to one of the Hive
table columns. From your output, the two row key values in the HBase employee
table are emp1 and emp2. I believe your confusion comes from the HBase shell
output. In an RDBMS or Hive query a record appears on one line, but in the
HBase shell each line represents a single cell (one column of a column
family), not an entire record. If a row has values in 10 columns, a scan
prints 10 lines for that one record (by record in HBase I mean all the
attributes belonging to a row key). Since you have 3 column families with one
column each, 3 lines together represent one record (the attributes of emp*).
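As an illustration only, this small Python sketch collapses the six lines of scan output into two records keyed by row key. The cell values are copied from the scan in this thread; group_by_row_key is a hypothetical helper, not part of HBase or Hive.

```python
from collections import defaultdict

# Cells copied from the scan output in this thread: (row key, column, value).
cells = [
    ("emp1", "address:", "#12-bangalore"),
    ("emp1", "name:", "tariq"),
    ("emp1", "no:", "001"),
    ("emp2", "address:", "#13-bangalore"),
    ("emp2", "name:", "vishal"),
    ("emp2", "no:", "002"),
]


def group_by_row_key(cells):
    """Collapse one-cell-per-line scan output into one record per row key."""
    records = defaultdict(dict)
    for row_key, column, value in cells:
        family, _, qualifier = column.partition(":")
        # Each family becomes a map of qualifier -> value, like a Hive map column.
        records[row_key].setdefault(family, {})[qualifier] = value
    return dict(records)


records = group_by_row_key(cells)
for key, families in sorted(records.items()):
    print(key, families)
```

Each row key ends up with three entries (address, name, no), each a small map from qualifier to value. Since the qualifiers in this table are empty, the only key in each map is the empty string.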
Hope it helps!
Regards
Bejoy.K.S
________________________________
From: Mohammad Tariq <[email protected]>
To: user <[email protected]>
Sent: Saturday, December 17, 2011 11:32 PM
Subject: Hive-Hbase integration
Hello list,
I have a small demo table in HBase and I want to work with it
through Hive. Here is my table in HBase -
hbase(main):021:0> scan 'employee'
ROW      COLUMN+CELL
 emp1    column=address:, timestamp=1324119715536, value=#12-bangalore
 emp1    column=name:, timestamp=1324119698581, value=tariq
 emp1    column=no:, timestamp=1324119688511, value=001
 emp2    column=address:, timestamp=1324120893996, value=#13-bangalore
 emp2    column=name:, timestamp=1324120883612, value=vishal
 emp2    column=no:, timestamp=1324120866981, value=002
2 row(s) in 0.0260 seconds
I have 2 rows in the employee table, each corresponding to a
particular user. And I have 3 column families (each having only 1
column) - no, name and address.
For this table I have created an external table in Hive using the
following command -
hive> CREATE EXTERNAL TABLE employee(key string,no string,name string,address string)
    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > WITH SERDEPROPERTIES ("hbase.columns.mapping" = "no:,name:,address:")
    > TBLPROPERTIES("hbase.table.name" = "employee");
But I am getting the following error -
FAILED: Error in metadata: java.lang.RuntimeException:
MetaException(message:org.apache.hadoop.hive.serde2.SerDeException
org.apache.hadoop.hive.hbase.HBaseSerDe: hbase column family 'no'
should be mapped to Map<String,?> but is mapped to string)
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask
Could someone point out my mistake? Also, I would like to know whether
the field "key" corresponds to each row in the HBase table, i.e. emp1
and emp2, or am I getting the concept wrong? I was going through the
wiki but could not find a proper explanation there. Sorry if my
question seems childish.
Many thanks.
Regards,
Mohammad Tariq