Re: what is considered as best / worst practice?

Dru Jensen Mon, 22 Dec 2008 10:10:02 -0800

JSON+

Question: Is it an acceptable design to use the timestamp as a data element?

I am currently adding the date to the column name and setting the number of versions in the table to 1.


Current:  htable.put('table','family:date', 'JSON');

What I would like to do is use the timestamp as a data element to store the date of the entry and set the number of versions to infinite.


Proposed: htable.put ('table', 'family:', 'JSON', 'date');

Is this a good approach? Are there any gotchas? Is there a way to get all of the versions for a row/column in a single call? I need to graph the results over time.
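
To make the proposed layout concrete, here is a toy in-memory sketch in Python. It is not the HBase client API: VersionedTable, get_versions, the row key "sensor-1", and the integer date encoding are all hypothetical names made up for illustration. It assumes a cell is addressed by row, column, and timestamp, so a second write with the same timestamp replaces the first, which is one gotcha to keep in mind when the timestamp carries a date.

```python
import json


class VersionedTable:
    """Toy stand-in for a table that keeps every version of a cell.

    Hypothetical helper for illustration only; not the HBase client API.
    """

    def __init__(self):
        # (row, column) -> {timestamp: value}
        self._cells = {}

    def put(self, row, column, value, timestamp):
        # A cell version is keyed by timestamp, so writing the same
        # (row, column, timestamp) again replaces the earlier value.
        self._cells.setdefault((row, column), {})[timestamp] = value

    def get_versions(self, row, column):
        """Return every (timestamp, value) pair for one cell, oldest first."""
        return sorted(self._cells.get((row, column), {}).items())


table = VersionedTable()
# The date is stored as the timestamp (assumed encoding: YYYYMMDD as an int).
table.put("sensor-1", "family:", json.dumps({"temp": 20.5}), 20081220)
table.put("sensor-1", "family:", json.dumps({"temp": 21.0}), 20081221)
# Same date again: this replaces the 21.0 reading rather than adding a version.
table.put("sensor-1", "family:", json.dumps({"temp": 19.0}), 20081221)

# One call returns the whole history, ready to plot over time.
series = [(ts, json.loads(v)["temp"])
          for ts, v in table.get_versions("sensor-1", "family:")]
print(series)
```

In this sketch a day-granularity timestamp holds at most one entry per row/column per day, so the timestamp resolution would need to match how often entries are written.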


On Dec 21, 2008, at 8:11 AM, Andrew Purtell wrote:

I use JSON for exactly this. A simple row/column/timestamp
key leads to a compound structure encoding all of the object
attributes, or maybe arrays of objects, etc. At the scale
where HBase is an effective solution you need to
denormalize ("insert time join") for query efficiency anyhow,
and I can serve the results out as is. Most of the work then
is done in the mapreduce tasks that produce and store the
JSON encodings in batch. I also build several views of the
data into multiple tables -- materialized views basically.
At Hadoop/HBase scale, disk space is cheap, seek time is not.

Because of this, query processing time is low enough that I
can serve them right out of HBase without needing an
intermediate caching layer such as memcached or Tokyo
Cabinet (jgray's favorite).
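
A rough sketch of the "insert time join" idea described above, in plain Python with dictionaries standing in for tables. The names users, posts, by_post, and by_user are hypothetical and the data is made up; in the setup described, this work would happen in batch mapreduce tasks rather than in a single loop.

```python
import json

# Two normalized inputs (hypothetical example data).
users = {"u1": {"name": "Ann"}}
posts = [
    {"id": "p1", "user": "u1", "title": "Hello", "day": "2008-12-20"},
    {"id": "p2", "user": "u1", "title": "Again", "day": "2008-12-21"},
]

# Two "materialized views" of the same data, each keyed for one query.
by_post = {}   # post id -> JSON document
by_user = {}   # user id -> list of JSON documents

for post in posts:
    # Join at write time: copy the author's name into the stored document.
    doc = dict(post, author=users[post["user"]]["name"])
    encoded = json.dumps(doc, sort_keys=True)
    # Store the same encoding under each view's key (space traded for seeks).
    by_post[post["id"]] = encoded
    by_user.setdefault(post["user"], []).append(encoded)

# At query time the stored JSON is served as is, with no join and no re-encoding.
print(by_post["p1"])
```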
