Hi Jean-Daniel,

Thanks for the info on the HBase transaction model.

I'm not sure I quite understand your subsequent question though, and I'm interested in learning whether you are suggesting a preferred alternative to what we are doing.

To set the background, I should begin by noting that by the nature of our application, the table in this case will always be dense. We probably are not using HBase optimally in that regard, since such data could be stored in an RDBMS. However, we mainly want to use HBase because of its scale, and to run map-reduce across the data set for different computations where RDBMS data-mining tools would be overkill in some cases and inefficient in others.

We have used the transaction model described for several reasons. By creating a column family ("key") with numerous column members ("key:c1", "key:c2", etc.), we have a complex key for a second column family ("value") that we can filter on in the map phase of map-reduce computations. Each row in the HBase table corresponds to a data set for an entity in our system, which it seems would be stored relatively efficiently in HDFS. Timestamps and versions become relevant as indices for data whose temporal course is of interest. (Figure 1 in the seminal Google paper "Bigtable: A Distributed Storage System for Structured Data" is in fact a good picture of what we are doing --- except our rows are dense in the sense that every column has an entry for every timestamp, because of the nature of our data.)
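To make that layout concrete, here is a minimal sketch using plain Java collections rather than the HBase API; the column names and timestamps are hypothetical stand-ins for our actual schema:

```java
import java.util.*;

// Sketch of one dense row: every column has a value at every timestamp,
// with versions ordered newest-first as HBase returns them.
public class DenseRowSketch {
    static Map<String, NavigableMap<Long, String>> buildRow(String[] columns, long[] timestamps) {
        Map<String, NavigableMap<Long, String>> row = new TreeMap<>();
        for (String col : columns) {
            // newest timestamp first, as with versioned reads
            NavigableMap<Long, String> versions = new TreeMap<>(Collections.reverseOrder());
            for (long ts : timestamps) {
                versions.put(ts, col + "@" + ts);  // dense: an entry at every timestamp
            }
            row.put(col, versions);
        }
        return row;
    }

    public static void main(String[] args) {
        String[] columns = {"key:c1", "key:c2", "key:timestamp", "value:v"};
        long[] timestamps = {1000L, 2000L, 3000L};
        Map<String, NavigableMap<Long, String>> row = buildRow(columns, timestamps);
        for (Map.Entry<String, NavigableMap<Long, String>> e : row.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().size()
                + " versions, newest ts=" + e.getValue().firstKey());
        }
    }
}
```

The point of the sketch is just that every (column, timestamp) pair is populated, which is what makes the per-row filtering in the map phase straightforward.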

The ability to ensure that all columns written in a single transaction are returned together also allows us to retrieve data sequences of interest to us in non-map-reduce environments. Here we have used the transaction model described because of the apparent difficulty of retrieving all versions of two different columns in a row, along with their timestamps, using the get(), getRow(), and obtainScanner() methods of the HTable v0.1.x native client. By including a "key:timestamp" column in the "key" column-family, we get an explicit timestamp value that can be combined with the row key as a key for fast retrieval of column values in the "value" column-family. In that sense we are using a column key (the "timestamp" column), which just happens to be a timestamp, to access other column values in the row.
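A rough sketch of that composite-key idea, with hypothetical names and a plain Java map standing in for the actual table:

```java
import java.util.*;

// Sketch: combine the row key with the explicit timestamp stored in a
// "key:timestamp" column to form a lookup key into the "value" family.
// All names here are hypothetical illustrations, not our real schema.
public class ExplicitTimestampKey {
    static String lookupKey(String rowKey, long explicitTs) {
        return rowKey + "/" + explicitTs;   // composite key: row + explicit timestamp
    }

    public static void main(String[] args) {
        // Store "value:" cells under composite keys built from explicit timestamps.
        Map<String, String> valueFamily = new HashMap<>();
        long[] explicitTimestamps = {1000L, 2000L};  // contents of "key:timestamp"
        for (long ts : explicitTimestamps) {
            valueFamily.put(lookupKey("row_num", ts), "v@" + ts);
        }
        // Fast retrieval: read "key:timestamp", then index directly into "value:".
        System.out.println(valueFamily.get(lookupKey("row_num", 2000L)));
    }
}
```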

Which actually brings me to another question. I think our model highlights something about "timestamps" and "versions" that is not quite clear. Namely, the Google figure and the model I've described implicitly assume that row key + timestamp forms a unique key for an entry in each column. Everywhere else I can find, "versions" generally refers to a set of entries in a column (row-column cell) where each value has a unique timestamp. But if the timestamp is not stored with sufficient resolution, e.g. 1 sec resolution, one can postulate a situation in which two put()s occur close enough in time that two entries in a cell could in theory have the same row key + timestamp. This suggests that the set of versions of a cell is not 1-1 with the set of unique timestamps on the cell's entries. Put another way, could one retrieve X versions of a row-column cell, but find only Y < X unique timestamps on those X versions? Is there any additional explanation of versions vis-a-vis timestamps you can point me to that would help sort this out?
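To illustrate the concern (this models the ambiguity in the abstract, not HBase's actual internals), here is a toy version store keyed by a 1-second-resolution timestamp; two writes landing in the same second collapse to one surviving version:

```java
import java.util.*;

// Illustration only: if the version store is keyed by a coarse timestamp,
// two put()s within the same second collide on the same key, so the count
// of distinct timestamps can undercount the intended number of versions.
public class TimestampCollision {
    static NavigableMap<Long, String> putAll(long[] millisTimes, String[] values) {
        NavigableMap<Long, String> cell = new TreeMap<>();
        for (int i = 0; i < millisTimes.length; i++) {
            long coarseTs = millisTimes[i] / 1000;   // truncate to 1 sec resolution
            cell.put(coarseTs, values[i]);           // same key => overwrite
        }
        return cell;
    }

    public static void main(String[] args) {
        // Three writes, the first two within the same second.
        long[] times = {1000L, 1500L, 3000L};
        String[] vals = {"a", "b", "c"};
        NavigableMap<Long, String> cell = putAll(times, vals);
        System.out.println("writes=3, surviving versions=" + cell.size());
        // prints "writes=3, surviving versions=2"
    }
}
```

Whether real HBase cells behave this way (overwrite, or keep both entries under one timestamp) is exactly the question.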

So finally, given all this, are you thinking about other ways to use column keys rather than timestamps we should be considering?

Thanks,
Rick

On Jul 21, 2008, at 8:23 AM, Jean-Daniel Cryans wrote:

Rick,

Yes and yes, but why not use the column keys instead of the timestamps?

J-D

On Sat, Jul 19, 2008 at 1:47 AM, Rick Hangartner <[EMAIL PROTECTED] >
wrote:

Hi,

This is a question about the HBase transaction model.

Suppose I have a table with two columns, "c1" and "c2". Now assume that for each timestamp in each row, I have an entry in each column. That is, assume the table is ALWAYS written such that it is "dense" (i.e. a complete relation)
rather than sparse (i.e. a partial relation), using the transaction
semantics:

  HTable table = new HTable(conf, new Text("test"));
  static final Text rowId = new Text("row_num");
  static final Text col1Id = new Text("c1");
  static final Text col2Id = new Text("c2");

  long lockid = table.startUpdate(rowId);

  // always write all columns in a row
  table.put( lockid, col1Id, val );
  table.put( lockid, col2Id, val );

  table.commit( lockid, timestamp );

Also assume that the two columns are read as something like:

  byte[][] c1Vals = table.get( rowId, col1Id, versions );
  byte[][] c2Vals = table.get( rowId, col2Id, versions );


Is it guaranteed that for each index value i, c1Vals[i][] and c2Vals[i][] are the two column entries originally written with the same timestamp?

Also, is something like:

  byte[][] c1Vals = table.get( rowId, col1Id, MAX_VALUE );

sufficient to guarantee all versions are returned in the "get" operations?

Thanks,
Rick



