I'm not sure why you have the timestamp in the key. Assuming the message id is
incremented, the rows would already be in time order anyway.

But to answer your question... 
You will want to use a separate table.

In both cases you will end up doing a full table scan; however, a table that
holds only the distinct user ids will have far fewer rows than your "users"
table, so that scan is much cheaper.
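
A rough sketch of the idea, in case it helps. This is plain Python with dicts
standing in for the two tables, not the HBase client API; the names
put_message, user_index and distinct_userids are made up for illustration, and
it assumes the hashed userid never contains an underscore.

```python
# In-memory stand-ins for two HBase tables (plain dicts, NOT the HBase API).
users = {}       # rowkey "${userid}_${messageid}_${timestamp}" -> message body
user_index = {}  # rowkey "${userid}" -> empty marker, one row per distinct user


def put_message(userid, messageid, timestamp, body):
    """Write the index row, then the message row.

    The index write is idempotent: writing the same userid again leaves a
    single row, so retrying a failed insert does not create duplicates.
    """
    user_index[userid] = b""
    users[f"{userid}_{messageid}_{timestamp}"] = body


def distinct_userids():
    """Full scan of the small index table, returned in sorted key order."""
    return sorted(user_index)


def distinct_userids_from_users():
    """Same answer from the big table: visit every row, keep the key prefix."""
    return sorted({rowkey.split("_", 1)[0] for rowkey in users})


put_message("a1f3", 1, 1355990100, "hello")
put_message("a1f3", 2, 1355990160, "again")
put_message("b7c2", 3, 1355990220, "hi")
print(distinct_userids())
```

Both functions scan a whole table, but distinct_userids() touches one row per
user while distinct_userids_from_users() touches one row per message.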


HTH

-Mike

On Dec 20, 2012, at 8:55 AM, Shengjie Min <[email protected]> wrote:

> I have a hbase table called "users", rowkey consists of three parts:
> 
>   1. userid
>   2. messageid
>   3. timestamp
> 
> rowkey looks like: ${userid}_${messageid}_${timestamp}
> 
> Given I can hash the userid and make the length of the field fixed, is
> there any way I can run a query like this SQL query:
> 
> select distinct(userid) from users
> 
> If the rowkey doesn't allow me to query like this, does that mean I need to
> create a separate table that just contains all the user ids? I guess if I do
> something like that, it won't be atomic anymore when I insert a record,
> because I am dealing with two tables without a transaction.
> -- 
> All the best,
> Shengjie Min
