We have an existing product built on the HBase/Hadoop ecosystem. We have laid our object model on HBase: one table holds a row per object, and a separate table holds composite index rows. This works great. We can efficiently find our objects based on their type, relationships, etc. by scanning the index table. We *never* scan the main table (except when rebuilding the index).
A new requirement just came in: get a list of all objects that have been modified since <timestamp>. This has to happen "quickly" (user time).

If we scan the main table with a timestamp restriction, will that be efficient? Or do we have to introduce a new composite index that has the last modified timestamp as part of it and scan that?

Thanks,
Mark
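
To make the second option concrete, here is a minimal sketch of what a last-modified index could look like. Everything in it is an assumption for illustration, not the actual schema: the key layout (an 8-byte big-endian timestamp followed by the object id), the class and method names, and the use of an in-memory sorted map as a stand-in for HBase's row ordering. It assumes non-negative millisecond timestamps, so that byte-wise key order matches time order.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch: none of these names come from HBase or from the product described above.
public class LastModifiedIndexSketch {

    // Index row key = 8-byte big-endian last-modified timestamp, then the object id.
    // Big-endian encoding keeps byte-wise order equal to numeric order for non-negative values.
    public static byte[] indexKey(long lastModifiedMillis, String objectId) {
        byte[] id = objectId.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(Long.BYTES + id.length)
                .putLong(lastModifiedMillis)
                .put(id)
                .array();
    }

    // Start key for "modified since": the bare timestamp prefix sorts before
    // every full key that carries the same timestamp.
    public static byte[] startKey(long sinceMillis) {
        return ByteBuffer.allocate(Long.BYTES).putLong(sinceMillis).array();
    }

    // In-memory stand-in for an index table: rows sorted by unsigned byte order.
    public static NavigableMap<byte[], String> newIndex() {
        Comparator<byte[]> unsignedByteOrder = Arrays::compareUnsigned;
        return new TreeMap<>(unsignedByteOrder);
    }

    public static void put(NavigableMap<byte[], String> index, long lastModifiedMillis, String objectId) {
        index.put(indexKey(lastModifiedMillis, objectId), objectId);
    }

    // Equivalent of a range scan that starts at the timestamp prefix and runs to the end.
    public static List<String> modifiedSince(NavigableMap<byte[], String> index, long sinceMillis) {
        return new ArrayList<>(index.tailMap(startKey(sinceMillis), true).values());
    }

    public static void main(String[] args) {
        NavigableMap<byte[], String> index = newIndex();
        put(index, 100L, "obj-a");
        put(index, 150L, "obj-b");
        put(index, 200L, "obj-c");
        System.out.println(modifiedSince(index, 150L)); // prints [obj-b, obj-c]
    }
}
```

With a layout like this, the "modified since" query becomes a range scan over only the matching index rows. The cost is on the write path: each modification would have to delete the object's old index row and write a new one, and keys that lead with an ever-increasing timestamp tend to concentrate writes on a single region.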
