Hello Michael,

Thank you for your response.

By the way, is it possible to set the maximum number of versions (setMaxVersions) per column family on a scan?

/David

On Thu, Nov 22, 2012 at 5:11 PM, Michael Segel <[email protected]> wrote:

> IMHO, the best practice is not to do this.
>
> It's an abuse of versioning and if you really want to store temporal data,
> make it part of the column name.
>
>
> On Nov 22, 2012, at 7:55 AM, David Koch <[email protected]> wrote:
>
> > Hello,
> >
> > I was thinking of using versions with custom timestamps to store the
> > evolution of a column value - as opposed to creating several (time_t,
> > value_at_time_t) qualifier-value pairs. The value to be stored is a single
> > integer. Fast ad-hoc retrieval of multiple versions based on a row key +
> > filter [1] (i.e. through a web service) is important, the number of row keys
> > will be between 10^6 and 10^9.
> >
> > a) If the number of versions (timestamps) is moderate, can I expect
> > read/filtering performance to be better than when using multiple
> > qualifier/value pairs?
> > b) For a larger number of versions, say 365, what if any precautions should
> > I take with respect to the HBase/table setup.
> >
> > I looked around a bit and found the following:
> >
> > The documentation [2] mentions that the maximum number of versions should
> > not be too high ("in the hundreds"). The HBase O'Reilly book [3] on the
> > other hand mentions that Facebook use(d) versions to store inbox messages
> > in order. Clearly, the number of messages may grow quite large (>> 100). Is
> > [1] still valid with more recent versions of HBase?
> >
> > Thank you,
> >
> > /David
> >
> > [1] http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/TimestampsFilter.html
> > [2] http://hbase.apache.org/book/schema.versions.html
> > [3] 1st edition, page 384
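
For readers comparing the two layouts discussed in this thread, here is a rough in-memory sketch in Python. It is not HBase client code and makes no claim about HBase performance. The names `VersionedRow`, `put_as_qualifier` and `get_by_qualifier`, the family `"d"` and the column `"score"` are all hypothetical. The sketch assumes that a per-family version cap silently drops older versions (evicted eagerly here for simplicity), while timestamps stored in the qualifier are ordinary columns and are not subject to that cap.

```python
from collections import defaultdict


class VersionedRow:
    """Toy model of one row: (family, qualifier) -> {timestamp: value}."""

    def __init__(self, max_versions_by_family):
        # Hypothetical per-family cap, standing in for a family's version limit.
        self.max_versions_by_family = dict(max_versions_by_family)
        self.cells = defaultdict(dict)

    def put(self, family, qualifier, timestamp, value):
        versions = self.cells[(family, qualifier)]
        versions[timestamp] = value
        # Simplification: drop everything beyond the newest `cap` versions now.
        cap = self.max_versions_by_family[family]
        for ts in sorted(versions, reverse=True)[cap:]:
            del versions[ts]

    def get(self, family, qualifier, timestamps=None, max_versions=1):
        """Return up to max_versions (timestamp, value) pairs, newest first.

        `timestamps` plays the role of a timestamp filter: only versions
        whose timestamp is in that collection are returned.
        """
        versions = self.cells.get((family, qualifier), {})
        newest_first = sorted(versions, reverse=True)
        if timestamps is not None:
            wanted = set(timestamps)
            newest_first = [ts for ts in newest_first if ts in wanted]
        return [(ts, versions[ts]) for ts in newest_first[:max_versions]]


def put_as_qualifier(row, family, name, timestamp, value):
    """Alternative layout: the time is part of the column name."""
    # Zero-padding keeps the qualifiers in chronological order when sorted.
    row[(family, f"{name}:{timestamp:010d}")] = value


def get_by_qualifier(row, family, name, timestamps):
    """Fetch the values for specific times, newest first."""
    found = []
    for ts in sorted(timestamps, reverse=True):
        key = (family, f"{name}:{ts:010d}")
        if key in row:
            found.append((ts, row[key]))
    return found


if __name__ == "__main__":
    versioned = VersionedRow({"d": 3})  # keep at most 3 versions in family "d"
    wide = {}
    for ts in range(1, 6):
        versioned.put("d", "score", ts, ts * 10)
        put_as_qualifier(wide, "d", "score", ts, ts * 10)

    print(versioned.get("d", "score", timestamps=[4, 2], max_versions=10))
    print(get_by_qualifier(wide, "d", "score", [4, 2]))
```

In this model the versioned layout can no longer return timestamp 2 once five values have been written under a cap of 3, whereas the qualifier layout still holds all five. With versions, the cap therefore has to be at least as large as the history you want to keep (365 in the example from the original question).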
