This request has come up a few times now. We should develop a solution (I've opened an issue to track it: HBASE-2839).
You could try scanning the whole table, but my guess is this will prove too slow -- if not now, then later when your table grows. If few results will be returned, a scan that ran against all of the table's regions in parallel, rather than stepping through them one after another, might make sense. https://issues.apache.org/jira/browse/HBASE-1935 discusses this and even has a patch, though I'm sure it is well stale at this point.

St.Ack

On Fri, Jul 16, 2010 at 8:16 AM, Mark Laffoon <[email protected]> wrote:
> We have an existing product sitting on the hbase/hadoop ecosystem. We have
> laid our object model on HBase: we have one table with a row per object,
> and a separate table with composite index rows. Works great. We can
> efficiently find our objects based on their type, relationships, etc. by
> scanning the index table. We *never* scan the main table (except when
> rebuilding the index).
>
> A new requirement just came in: get a list of all objects that have been
> modified since <timestamp>. This has to happen "quickly" (user time).
>
> If we scan the main table with a timestamp restriction, will that be
> efficient? Or do we have to introduce a new composite index that has the
> last modified timestamp as part of it and scan that?
>
> Thanks,
> Mark
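
For the second option in the question (a composite index with the last-modified timestamp as part of the row key), here is a rough sketch of the key layout only. It does not use the HBase client API: a TreeMap ordered by unsigned byte comparison stands in for the sorted index table, and the class and method names (ModifiedIndexSketch, indexKey, recordModification, modifiedSince) are made up for illustration. It assumes non-negative millisecond timestamps, written big-endian so that byte order matches numeric order.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in for a "modified since" index table; not HBase API.
class ModifiedIndexSketch {
    // Keys are ordered as unsigned bytes, the way a sorted row-key table would order them.
    private final TreeMap<byte[], String> index =
            new TreeMap<byte[], String>(Arrays::compareUnsigned);

    // Row key = 8-byte big-endian timestamp followed by the object id bytes.
    // Assumes modifiedMillis >= 0 so that byte order equals numeric order.
    static byte[] indexKey(long modifiedMillis, String objectId) {
        byte[] id = objectId.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(Long.BYTES + id.length)
                .putLong(modifiedMillis)
                .put(id)
                .array();
    }

    // Adds one index entry for an object modified at the given time.
    public void recordModification(long modifiedMillis, String objectId) {
        index.put(indexKey(modifiedMillis, objectId), objectId);
    }

    // Equivalent of a scan whose start row is the bare timestamp prefix:
    // returns ids of every entry with timestamp >= sinceMillis, oldest first.
    public List<String> modifiedSince(long sinceMillis) {
        byte[] startKey = indexKey(sinceMillis, "");
        return new ArrayList<>(index.tailMap(startKey, true).values());
    }

    public static void main(String[] args) {
        ModifiedIndexSketch idx = new ModifiedIndexSketch();
        idx.recordModification(1000L, "obj-a");
        idx.recordModification(2000L, "obj-b");
        idx.recordModification(3000L, "obj-c");
        System.out.println(idx.modifiedSince(2000L));
    }
}
```

The point of the layout is that "modified since T" becomes a single range read starting at T, with no need to touch the main table. The sketch only ever adds entries; a real index would also have to remove an object's previous entry when it is modified again.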
