Re: Help in designing row key

Ted Yu Tue, 02 Jul 2013 09:26:54 -0700

bq. Using timestamp in row-keys is discouraged

The above is true.
Prefixing the row key with a timestamp would create a hot region, since
all new writes would go to the region holding the most recent keys.
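
One common way to keep a time component in the key without sending every
write to the same region is to put a small "salt" bucket in front of the
timestamp. The following is only a sketch under assumptions: the class name,
the bucket count of 16, and the choice to hash source|title|type are all
hypothetical, and it uses plain Java strings rather than any HBase API.

```java
import java.util.Locale;

final class SaltedRowKey {
    // Hypothetical bucket count; an assumption for illustration only.
    static final int BUCKETS = 16;

    // Derive a stable bucket from the non-time part of the key, so that
    // consecutive timestamps are spread over several key ranges.
    static int bucketFor(String source, String title, String type) {
        int h = (source + "|" + title + "|" + type).hashCode();
        return Math.floorMod(h, BUCKETS); // always in [0, BUCKETS)
    }

    // Layout: bucket|timestamp|source|title|type, with the bucket and the
    // timestamp zero-padded so that string order matches numeric order.
    static String rowKey(long timestamp, String source, String title, String type) {
        return String.format(Locale.ROOT, "%02d|%013d|%s|%s|%s",
                bucketFor(source, title, type), timestamp, source, title, type);
    }

    public static void main(String[] args) {
        System.out.println(rowKey(1372776400441L, "feedA", "some title", "news"));
        System.out.println(rowKey(1372776400442L, "feedB", "other title", "blog"));
    }
}
```

The trade-off is on the read side: selecting "everything after timestamp T"
then takes one range scan per bucket instead of a single scan.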


bq. should I filter by a simpler row-key plus a filter on timestamp?

You can do the above.
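
On the "numeric" versus string comparison raised in point 2 of the quoted
message: HBase orders row keys as raw bytes, so a timestamp compares
numerically as long as it is stored with a fixed width. The two timestamps
in the quoted example both have 13 digits, so they already sort correctly
as strings; the ordering only breaks when widths differ. A small sketch in
plain Java (the class and method names are hypothetical, and no HBase
classes are used):

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Locale;

final class TimestampKeyOrder {
    // Encode a non-negative timestamp as 8 big-endian bytes
    // (ByteBuffer writes big-endian by default).
    static byte[] encode(long ts) {
        return ByteBuffer.allocate(Long.BYTES).putLong(ts).array();
    }

    // Alternative for string keys: zero-pad to a fixed width of 13 digits.
    static String pad(long ts) {
        return String.format(Locale.ROOT, "%013d", ts);
    }

    public static void main(String[] args) {
        long older = 1372776400441L;
        long newer = 1372778470913L;

        // Unsigned lexicographic byte comparison agrees with numeric order.
        System.out.println(Arrays.compareUnsigned(encode(older), encode(newer)) < 0);

        // Variable-width decimal strings do not: "9" sorts after "10".
        System.out.println("9".compareTo("10") > 0);

        // Fixed-width padding restores the numeric order.
        System.out.println(pad(9).compareTo(pad(10)) < 0);
    }
}
```

With either encoding, a scan whose start row is built from the encoded
timestamp would select the newer keys, though a leading timestamp still
has the hot region problem mentioned above.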

On Tue, Jul 2, 2013 at 9:13 AM, Flavio Pompermaier <[email protected]> wrote:

> Hi to everybody,
>
> in my use case I have to perform batch analysis skipping old data.
> For example, I want to process all rows created after a certain timestamp,
> passed as parameter.
>
> What is the most effective way to do this?
> Should I design my row-key to embed timestamp?
> Or is filtering by the timestamp of the row just as fast? Or what else?
>
> Initially I was thinking of composing my key as:
> timestamp|source|title|type
>
> but:
>
> 1) Using timestamp in row-keys is discouraged
> 2) If this design is ok, using this approach I still have problems
> filtering by timestamp because I cannot find a way to filter numerically
> (instead of alphanumerically/by string). Example:
> 1372776400441|something has a smaller timestamp
> than 1372778470913|somethingelse, but I cannot filter all rows whose key is
> "numerically" greater than 1372776400441. Is it possible to overcome this
> issue?
> 3) If this design is not ok, should I filter by a simpler row-key plus a
> filter on timestamp? Or what else?
>
> Best,
> Flavio
>
