That looks cool. On a different note, it looks like the HiveStorageHandler is based on the old Hadoop "mapred" interface. Any idea when you plan to migrate to the "mapreduce" interface?
-Sanjit

On Tue, Jun 1, 2010 at 12:26 PM, John Sichi <[email protected]> wrote:

> On May 28, 2010, at 3:49 PM, Sanjit Jhala wrote:
>
> > John, theres some logic in the helper serialize method to serialize lists
> > and structs. Is this used currently? I was under the impression that maps
> > and primitives are the only types currently supported by the connector.
>
> Yes, this logic is working. I just now tested it interactively (see below)
> and will add a corresponding unit test when I work on HIVE-1245.
>
> I'm not sure what is going on with the JSON-vs-delimited stuff; in my test
> it looks like it is coming out as delimited based on what I see from the
> HBase side. There is a setUseJSONSerialize method but currently nothing
> invokes it; it would make sense to include this in the HIVE-1245 work as
> part of controlling how values are stored within HBase.
>
> JVS
>
> ----
>
> hive> CREATE TABLE complex(
>     > key string,
>     > a array<string>,
>     > s struct<col1 : int, col2 : int>)
>     > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>     > WITH SERDEPROPERTIES (
>     > "hbase.columns.mapping" = "cf:a, cf:s"
>     > );
> OK
> hive>
>     > INSERT OVERWRITE TABLE complex
>     > SELECT bar, array('x', 'y', 'z'), struct(100, 200)
>     > FROM pokes
>     > WHERE foo=497;
> ...
> OK
> hive>
>     > SELECT * FROM complex;
> OK
> val_497 ["x","y","z"] {"col1":100,"col2":200}
>
> hbase(main):003:0> scan 'complex'
> ROW COLUMN+CELL
> val_497 column=cf:s, timestamp=1275419258650, value=100\x02200
> val_497 column=cf:a, timestamp=1275419258650, value=x\x02y\x02z
> 1 row(s) in 1.0250 seconds
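
For anyone reading the scan output above: the array and the struct both appear in HBase as their element values joined by the \x02 byte, not as JSON. Here is a rough Python sketch of that delimited layout. It is not the handler's actual code; the names `ITEM_SEP`, `serialize_delimited`, and `deserialize_delimited` are made up for illustration, and the sketch only assumes that a single \x02 separator is used at this nesting level, as the scan output suggests.

```python
# Illustrative only: mimics the delimited cell values seen in the HBase scan.
# ITEM_SEP and both helper names are hypothetical, not part of Hive.
ITEM_SEP = "\x02"


def serialize_delimited(values):
    """Join the string form of each element with the \\x02 separator."""
    return ITEM_SEP.join(str(v) for v in values)


def deserialize_delimited(cell):
    """Split a delimited cell back into its element strings."""
    return cell.split(ITEM_SEP)


if __name__ == "__main__":
    array_cell = serialize_delimited(["x", "y", "z"])  # column cf:a
    struct_cell = serialize_delimited([100, 200])      # column cf:s
    print(repr(array_cell))
    print(repr(struct_cell))
    print(deserialize_delimited(struct_cell))
```

Note that the struct's field names (col1, col2) are not stored in the cell in this layout; they would have to come from the table schema when the value is read back.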
