Help in designing row key

Flavio Pompermaier Tue, 02 Jul 2013 09:15:31 -0700

Hi everybody,

In my use case I have to run batch analyses that skip old data.
For example, I want to process all rows created after a certain timestamp,
passed as a parameter.


What is the most effective way to do this?
Should I design my row-key to embed the timestamp?
Or is filtering by the timestamp of the row just as fast? Or something else?

Initially I was thinking to compose my key as:
timestamp|source|title|type

but:

1) Using a timestamp in row-keys is discouraged.
2) If this design is ok, I still have problems filtering by timestamp,
because I cannot find a way to filter numerically (instead of
alphanumerically/by string). Example:
1372776400441|something has a smaller timestamp
than 1372778470913|somethingelse, but I cannot filter all rows whose key is
"numerically" greater than 1372776400441. Is it possible to overcome this
issue?
3) If this design is not ok, should I use a simpler row-key plus a
filter on the timestamp? Or something else?
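
To make the ordering issue in point 2 concrete, here is a minimal sketch in
plain Python. It is not tied to any particular store's API, and it assumes
keys are sorted byte by byte as unsigned values. The helper names
(padded_key, binary_key) and the widths (13 digits, 8 bytes) are
hypothetical and chosen only for illustration. It shows that string order
and numeric order agree once the timestamp has a fixed width.

```python
import struct


def padded_key(ts_millis: int, rest: str, width: int = 13) -> str:
    # Hypothetical helper: zero-pad the timestamp to a fixed number of
    # digits so that string comparison agrees with numeric comparison.
    return f"{ts_millis:0{width}d}|{rest}"


def binary_key(ts_millis: int, rest: str) -> bytes:
    # Hypothetical helper: encode the timestamp as 8 big-endian bytes.
    # For non-negative values, unsigned byte order matches numeric order.
    return struct.pack(">Q", ts_millis) + b"|" + rest.encode("utf-8")


older = 1372776400441
newer = 1372778470913

# Timestamps with the same number of digits already sort like numbers.
same_width_ok = f"{older}|something" < f"{newer}|somethingelse"

# With different digit counts, plain strings sort "wrongly":
# "999" compares greater than "1372776400441" because '9' > '1'.
unpadded_wrong = str(999) > str(older)

# Fixed-width encodings restore numeric order in both cases.
padded_ok = padded_key(999, "x") < padded_key(older, "x")
binary_ok = binary_key(999, "x") < binary_key(older, "x")
```

Under these assumptions, "all rows with timestamp greater than T" becomes a
plain range over keys starting at the encoding of T, with no numeric
comparison needed. It does not address the concern in point 1.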

Best,
Flavio
