Thanks for the explanation.

So if I don't care whether the newest row comes first when doing a Scan,
then I don't need to bother putting a reverse timestamp in the row key?

For example, I am collecting news articles on a daily basis.
Each article is stored in HBase, using "YearMonthDate + Title Hash" as
the row key.
I don't care how the articles are sorted as long as they are grouped by
YearMonthDate.
In this case, I don't need Reverse Timestamp.
Am I right on this one?
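
To make the two layouts concrete, here is a rough Python sketch that only
simulates HBase's byte-wise row key ordering with a sorted list. It does not
use the HBase client API. The helper names (reverse_ts_key, article_key,
prefix_scan), the choice of MD5 for the title hash, and the sample values are
all assumptions for illustration.

```python
import hashlib
import struct

# Java's Long.MAX_VALUE; assumed here as the base for the reverse timestamp.
LONG_MAX = 2**63 - 1


def reverse_ts_key(thing: str, ts_millis: int) -> bytes:
    """Build <thing><rev-timestamp>; a newer timestamp gives a smaller key."""
    return thing.encode() + struct.pack(">q", LONG_MAX - ts_millis)


def article_key(yyyymmdd: str, title: str) -> bytes:
    """Build <YearMonthDate><title hash>; MD5 is only an illustrative choice."""
    return yyyymmdd.encode() + hashlib.md5(title.encode()).digest()


def prefix_scan(sorted_keys, prefix: bytes):
    """Stand-in for a Scan restricted to one key prefix, in key order."""
    return [k for k in sorted_keys if k.startswith(prefix)]


# Rows are kept sorted by key bytes, so sorting a list mimics the table order.
group_rows = sorted(reverse_ts_key("groupA", t) for t in (1000, 3000, 2000))
newest_first = prefix_scan(group_rows, b"groupA")[0]

article_rows = sorted([
    article_key("20110721", "Older story"),
    article_key("20110722", "Story one"),
    article_key("20110722", "Story two"),
])
articles_for_day = prefix_scan(article_rows, b"20110722")
```

With the reverse timestamp, the first row returned for "groupA" is the one
written at t=3000. With the date-prefixed key, the two articles for 20110722
sit next to each other, but their order within the day follows the hash, not
the insertion time.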

Ed

2011/7/22 Doug Meil <[email protected]>

>
> It's so that you can get the most recent entry with a Scan, assuming that
> the key structure (as explained in the book) is something like
> <thing><rev-timestamp> and you are trying to quickly find the most
> recent entry for <thing>.
>
> On 7/22/11 3:18 AM, "edward choi" <[email protected]> wrote:
>
> >Hi,
> >I was studying HBase with "Hadoop: The Definitive Guide".
> >There was a schema example that used "Group Id + Reverse Timestamp"
> >as the row key.
> >This way rows of the same group are located near one another in the table.
> >Plus, within the same group, rows are sorted so that the most recently
> >inserted row comes first.
> >
> >The part I don't understand is, what is the advantage of using "Reverse
> >Timestamp" instead of just "Timestamp"?
> >Why place the newest row on the top?
> >I thought that in HBase keys are located by binary search, and that in
> >binary search chronological order has no effect (at least that's how I
> >understand it).
> >So why add the extra step of reversing the timestamp?
> >
> >Any explanation will be much appreciated.
> >
> >Ed.
>
>
