That's correct. To map the data into the relational world, the storage handler needs to gather references to all of the cells for a given row and return a reference to that collection. Take a look at the HBase handler to see how to do that lazily, if that makes sense for you.
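To make the cell-to-row grouping concrete, here is a minimal sketch in plain Java. It is not the HBase handler's code and it uses neither the Hive nor the Hypertable API: `Cell` and `CellRowGrouper` are hypothetical names invented for illustration, and the sketch assumes the store's scanner returns cells ordered by row key. The grouper only holds references to the cells of one row at a time and never decodes their values, so any deserialization can still be deferred until a column is actually read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Groups a row-key-ordered stream of cells into rows, one row per call to next(). */
public class CellRowGrouper implements Iterator<List<CellRowGrouper.Cell>> {

    /** Hypothetical stand-in for a store cell: row key + column + value. */
    public static final class Cell {
        public final String rowKey;
        public final String column;
        public final String value;

        public Cell(String rowKey, String column, String value) {
            this.rowKey = rowKey;
            this.column = column;
            this.value = value;
        }
    }

    private final Iterator<Cell> cells;
    // First cell of the next row; it has been read from the scanner but not yet returned.
    private Cell pending;

    /** Assumes the iterator yields cells sorted by row key. Reads one cell ahead. */
    public CellRowGrouper(Iterator<Cell> cells) {
        this.cells = cells;
        this.pending = cells.hasNext() ? cells.next() : null;
    }

    @Override
    public boolean hasNext() {
        return pending != null;
    }

    /** Returns references to all cells that share the next row key. */
    @Override
    public List<Cell> next() {
        if (pending == null) {
            throw new NoSuchElementException();
        }
        List<Cell> row = new ArrayList<>();
        String key = pending.rowKey;
        // Consume cells until the row key changes or the scan ends.
        while (pending != null && pending.rowKey.equals(key)) {
            row.add(pending);
            pending = cells.hasNext() ? cells.next() : null;
        }
        return row;
    }

    public static void main(String[] args) {
        Iterator<Cell> scan = Arrays.asList(
                new Cell("r1", "a", "1"),
                new Cell("r1", "b", "2"),
                new Cell("r2", "a", "3")).iterator();

        CellRowGrouper rows = new CellRowGrouper(scan);
        while (rows.hasNext()) {
            List<Cell> row = rows.next();
            System.out.println(row.get(0).rowKey + " -> " + row.size() + " cell(s)");
        }
        // prints:
        // r1 -> 2 cell(s)
        // r2 -> 1 cell(s)
    }
}
```

In a real storage handler the list returned by `next()` would presumably be wrapped in whatever row object the input format hands to the SerDe; that wiring is left out here because it depends on the Hypertable client API.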
JVS

On May 19, 2010, at 6:00 PM, Sanjit Jhala wrote:

Thinking about this a bit more I realize the Input and Output formats have to have some notion of rows for any kind of filtering, grouping etc to work.

-Sanjit

On Wed, May 19, 2010 at 4:37 PM, Sanjit Jhala <[email protected]> wrote:

Hi,

I'm trying to write a StorageHandler for Hypertable, to facilitate Hive-Hypertable integration. Looking at the documentation, it looks like the SerDe interface deals with reading and writing abstract objects which are the external data store's equivalent of (Hive) rows. Is this correct, or can the interface be used to deal with sub-row objects (ie a rowkey + column)? The reason I ask is that currently the Hypertable API only exposes Cells (a row is essentially a collection of Cells with the same rowkey) and has no explicit notion of a row.

-Sanjit
