Hi everybody, in my use case I have to perform batch analysis while skipping old data. For example, I want to process all rows created after a certain timestamp, passed as a parameter.
What is the most effective way to do this? Should I design my row-key to embed the timestamp? Or is filtering by the timestamp of the row just as fast? Or is there another option?

Initially I was thinking of composing my key as timestamp|source|title|type, but:

1) Using a timestamp in row-keys is discouraged.
2) If this design is OK, I still have problems filtering by timestamp, because I cannot find a way to filter numerically (instead of alphanumerically, i.e. by string comparison). Example: 1372776400441|something has a timestamp less than 1372778470913|somethingelse, but I cannot filter all rows whose key is "numerically" greater than 1372776400441. Is it possible to overcome this issue?
3) If this design is not OK, should I filter by a simpler row-key plus a filter on the timestamp? Or something else?

Best,
Flavio
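
P.S. To make point 2 concrete, here is a minimal sketch of the comparison problem and of two encodings that might work around it. It is plain Python, not the HBase API: the helper names (padded_key, binary_key, rows_after) are hypothetical, and the sorted list only stands in for a table ordered by row-key. I am assuming that a fixed-width timestamp prefix (zero-padded digits or 8 big-endian bytes) makes byte-wise order agree with numeric order for non-negative values.

```python
import bisect
import struct


def padded_key(ts_millis, rest, width=13):
    # Hypothetical helper: zero-pad the timestamp to a fixed width so that
    # string order and numeric order agree (assumes ts_millis >= 0).
    return f"{ts_millis:0{width}d}|{rest}"


def binary_key(ts_millis, rest):
    # Hypothetical helper: 8-byte big-endian prefix; byte-wise order matches
    # numeric order for non-negative timestamps.
    return struct.pack(">q", ts_millis) + b"|" + rest.encode("utf-8")


def rows_after(sorted_keys, threshold_millis):
    # Stand-in for a range scan: start at the first key whose timestamp
    # prefix is strictly greater than the threshold.
    start_key = padded_key(threshold_millis + 1, "")
    return sorted_keys[bisect.bisect_left(sorted_keys, start_key):]


# Unpadded keys compare as strings, so a shorter (older) timestamp sorts last.
print("999|old" > "1372776400441|something")

keys = sorted([
    padded_key(999, "old"),
    padded_key(1372776400441, "something"),
    padded_key(1372778470913, "somethingelse"),
])
print(rows_after(keys, 1372776400441))
```

With both timestamps at 13 digits, as in my example, string order already matches numeric order; the mismatch only appears when the digit counts differ, which is what the padding is meant to prevent.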
