Thanks alot peter, you saved my life. what i understood that i need to explore hive and hbase and hadoop in more detail :) thanx a ton...
On Fri, Apr 6, 2012 at 3:43 AM, Peter Vandenabeele <[email protected]>wrote: > On Thu, Apr 5, 2012 at 2:45 PM, shashwat shriparv > <[email protected]> wrote: > > I am able to create external tables in hive of HBase, now i have a > > requirement to create an external table which is having variable columns, > > which means the columns in HBase are not fixed for the particular table, > > the no of columns and can be created dynamically at the time of data > > insertion, what should be the approach for handling such kind of > situation. > > > > Summary : How to create external tables in hive when the no of columns > are > > not fixed in HBase table. > > Maybe this is more a question for the Hive user mailing list: > > http://hive.apache.org/mailing_lists.html#Users > > If your "variable" columns (qualifiers in that case) are all in a column > family that is known upfront, you could use the > > map<string, string> > > structure in the definition of the EXTERNAL table. > > E.g. in a table with a column family 'demo' you could do in Hive: > > hive> CREATE EXTERNAL TABLE lrug(key int, value map<string, string>) > > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' > > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,demo:") > > TBLPROPERTIES("hbase.table.name" = "LRUG_Mobile"); > > (Note the specific SERDESPROPERTIES > :key for the default key > demo: to take all qualifiers in the 'demo' column family > > With this test data: > > hive> SELECT * from lrug; > OK > 1234 {"country_code":"UK","name":"Peter"} > 2345 {"country_code":"US","name":"Shawn","state":"CA"} > > If initially you would only have thought of a qualifier > "country_code" you could write: > > hive> SELECT value['name'] FROM lrug > > WHERE value['country_code'] = 'UK'; > Total MapReduce jobs = 1 > ... > Ended Job = job_201204051132_0017 > OK > 1234 Peter > Time taken: 7.644 seconds > > If then later, you also have a column 'state' when, > you could change the query to > > hive> SELECT key, value['name'] FROM lrug > > WHERE value['country_code'] = 'US' AND value['state'] = 'CA'; > Total MapReduce jobs = 1 > > ... > > Ended Job = job_201204051132_0018 > OK > 2345 Shawn > Time taken: 6.864 seconds > > If the qualifiers of the columns are really the data itself, you > can use a lateral view and explode(map_keys()) to convert the > qualifiers into new data rows. > > HTH, > > Peter > -- ∞ Shashwat Shriparv
