It seems that such a scheme (encoding the timestamp in the rowkey) works only for newly put rows, not for updated rows, since an update's timestamp cannot be reflected in the timestamp part of an existing rowkey. Right?
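
To make the limitation concrete, here is a small sketch in Python. It is a toy in-memory model, not the HBase client API: the `table` dict, `make_rowkey`, `put` and `scan_first_rowkey` are all hypothetical names invented for illustration, and the rowkey layout (reverse timestamp, then an id) is just one assumed variant of the reverse-timestamp design.

```python
# Toy in-memory model of an HBase-like table: rowkey -> {column: (value, cell_ts)}.
# This is NOT the HBase client API; every name here is hypothetical.
MAX_TS = 2**63 - 1  # same value as Java's Long.MAX_VALUE


def make_rowkey(entity_id: str, created_ts: int) -> str:
    """Build '<reverse_ts>_<entity_id>' so that newer rows sort first."""
    return f"{MAX_TS - created_ts:019d}_{entity_id}"


table = {}


def put(rowkey: str, column: str, value: str, ts: int) -> None:
    """Write one cell; the rowkey itself is never changed by a put."""
    table.setdefault(rowkey, {})[column] = (value, ts)


def scan_first_rowkey() -> str:
    """A scan walks rows in sorted rowkey order; return the first one."""
    return min(table)


# Two new rows: the one created later sorts first, as intended.
row_a = make_rowkey("a", created_ts=100)
row_b = make_rowkey("b", created_ts=200)
put(row_a, "cf:x", "1", ts=100)
put(row_b, "cf:x", "1", ts=200)
first_after_inserts = scan_first_rowkey()

# Now update only one column of the older row at a later time.
put(row_a, "cf:y", "2", ts=300)
first_after_update = scan_first_rowkey()

# Finding the truly latest-updated row means looking at every cell timestamp.
latest_by_cell_ts = max(
    table, key=lambda rk: max(ts for _, ts in table[rk].values())
)
```

After the update, the scan still returns `row_b` first even though `row_a` holds the newest cell; only the full pass over cell timestamps finds `row_a`.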
There is no direct, efficient way to satisfy William's request to find the latest updated rows (a new row put, or an update to only certain columns in some rows, etc.). Maybe a reserved/special row (one that users and applications cannot use) can help here. This row contains a single cell: the cell's timestamp is the latest put/update timestamp, and its value is the rowkey of the latest put/updated row. Each time a put/update occurs, the reserved row is updated at the same time to record the put/updated row as the value, together with the latest put/update timestamp.

But this needs to handle concurrent writes from different clients. It is OK if there is always a single client; otherwise the write to the reserved row should be a special checkAndPut that compares timestamps to decide whether to overwrite. Such a special checkAndPut introduces a read for each write, so it hurts performance and serializes writes from different clients on the reserved/special row...

________________________________________
From: Joshi, Rekha [[email protected]]
Sent: January 21, 2014 15:55
To: [email protected]; hbase-user
Subject: Re: Finding the latest updated rows

Hi William,

The timestamp part of the rowkey schema design caters to this. It is usually efficient, but your SLA may differ.

http://hbase.apache.org/book.html#reverse.timestamp
http://hbase.apache.org/book.html#schema.casestudies
http://hbase.apache.org/book.html#timeseries

Thanks
Rekha

On 21/01/14 9:36 AM, "William Kang" <[email protected]> wrote:

>Hi,
>In HBase, the time stamp is set for each column, not for the entire row.
>If
>somehow I want to find the latest updated (put new row, or update only
>certain columns in some rows, etc) rows, is there an efficient way to do
>it?
>
>Many thanks.
>
>
>William
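
For reference, the reserved-row idea described at the top of this message can be sketched roughly as follows, again in Python as a simulation only. Nothing here calls HBase: `ReservedRow`, `compare_and_set` and `record_update` are hypothetical names, and the lock merely stands in for whatever atomicity a server-side check-and-put would provide. How the timestamp comparison would actually be expressed with HBase's checkAndPut is left open and is an assumption of this sketch.

```python
import threading


class ReservedRow:
    """Toy stand-in for the reserved row: one cell holding (timestamp, rowkey)."""

    def __init__(self):
        # The lock models the atomicity a check-and-put would give us.
        self._lock = threading.Lock()
        self._cell = None  # (timestamp, rowkey), or None before the first write

    def read(self):
        with self._lock:
            return self._cell

    def compare_and_set(self, expected, new):
        """Atomically replace the cell only if it still equals `expected`."""
        with self._lock:
            if self._cell != expected:
                return False
            self._cell = new
            return True


def record_update(reserved, rowkey, ts):
    """Record (ts, rowkey) only if ts is newer; retry if another client raced us."""
    while True:
        current = reserved.read()  # the extra read that every write has to pay
        if current is not None and current[0] >= ts:
            return False  # an equal or newer update is already recorded
        if reserved.compare_and_set(current, (ts, rowkey)):
            return True
        # Lost the race: loop, re-read, and compare timestamps again.


# Four "clients" report updates with out-of-order timestamps.
reserved = ReservedRow()
threads = [
    threading.Thread(target=record_update, args=(reserved, f"row-{ts}", ts))
    for ts in (5, 9, 3, 7)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
latest = reserved.read()
```

Whatever the interleaving, the reserved cell ends up holding the update with the highest timestamp, but every client goes through the same cell, which is the serialization cost mentioned above.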
