On Thu, Apr 5, 2012 at 2:45 PM, shashwat shriparv
<[email protected]> wrote:
> I am able to create external tables in hive of HBase, now i have a
> requirement to create an external table which is having variable columns,
> which means the columns in HBase are not fixed for the particular table,
> the no of columns and can be created dynamically at the time of data
> insertion, what should be the approach for handling such kind of situation.
>
> Summary : How to create external tables in hive when the no of columns are
> not fixed in HBase table.

Maybe this is more a question for the Hive user mailing list:

  http://hive.apache.org/mailing_lists.html#Users

If your "variable" columns (qualifiers in that case) are all in a column
family that is known upfront, you could use the

  map<string, string>

structure in the definition of the EXTERNAL table.

E.g. in a table with a column family 'demo' you could do in Hive:

hive> CREATE EXTERNAL TABLE lrug(key int, value map<string, string>)
    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,demo:")
    > TBLPROPERTIES("hbase.table.name" = "LRUG_Mobile");

(Note the specific SERDESPROPERTIES
:key   for the default key
demo:    to take all qualifiers in the 'demo' column family

With this test data:

hive> SELECT * from lrug;
OK
1234    {"country_code":"UK","name":"Peter"}
2345    {"country_code":"US","name":"Shawn","state":"CA"}

If initially you would only have thought of a qualifier
"country_code" you could write:

hive> SELECT value['name'] FROM lrug
    > WHERE value['country_code'] = 'UK';
Total MapReduce jobs = 1
...
Ended Job = job_201204051132_0017
OK
1234    Peter
Time taken: 7.644 seconds

If then later, you also have a column 'state' when,
you could change the query to

hive> SELECT key, value['name'] FROM lrug
    > WHERE value['country_code'] = 'US' AND value['state'] = 'CA';
Total MapReduce jobs = 1

...

Ended Job = job_201204051132_0018
OK
2345    Shawn
Time taken: 6.864 seconds

If the qualifiers of the columns are really the data itself, you
can use a lateral view and explode(map_keys()) to convert the
qualifiers into new data rows.

HTH,

Peter

Reply via email to