Hi HBase users,

We have created an index table (say T2) for another table (say T1). The clients that write to T1 also write an index record to T2 with the same timestamp. Inconsistencies between the two tables may accumulate over time, so we periodically run an MR job that fully scans T1, builds an index, and bulk-loads the result into T2.
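
To make the write path concrete, here is a minimal in-memory sketch in Java. It is not HBase client code: the class name, the `indexKey` layout, and the map-based stand-ins for T1 and T2 are all assumptions for illustration. The only point taken from our setup is that a single timestamp is chosen once and used for both the data write and the index write.

```java
import java.util.Map;
import java.util.TreeMap;

// In-memory stand-in for the two tables; this is NOT the HBase client API.
final class IndexDualWriteSketch {

    // A stored cell: a value plus the explicit timestamp it was written with.
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // T1: data row key -> indexed column value.
    final Map<String, Cell> t1 = new TreeMap<>();
    // T2: index key -> data row key.
    final Map<String, Cell> t2 = new TreeMap<>();

    // Hypothetical index key layout: indexed value, separator, data row key.
    static String indexKey(String value, String rowKey) {
        return value + "|" + rowKey;
    }

    // Client write path: the same timestamp is applied to both writes.
    void write(String rowKey, String value, long timestamp) {
        t1.put(rowKey, new Cell(value, timestamp));
        t2.put(indexKey(value, rowKey), new Cell(rowKey, timestamp));
    }

    public static void main(String[] args) {
        IndexDualWriteSketch tables = new IndexDualWriteSketch();
        long ts = System.currentTimeMillis();
        tables.write("row1", "blue", ts);

        // Both cells carry the timestamp chosen by the client.
        Cell data = tables.t1.get("row1");
        Cell index = tables.t2.get(indexKey("blue", "row1"));
        System.out.println("T1 timestamp: " + data.timestamp);
        System.out.println("T2 timestamp: " + index.timestamp);
    }
}
```

In the real system the two maps would be HBase tables and the two `put` calls would go through the HBase client, each carrying the explicit timestamp.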
Because the MR job may run for a while, any new data written to T2 during that time must be kept and not overwritten. So the MR job creates its puts using the timestamp at which the job started. Once the index has been built successfully, we want all data in T2 with timestamps earlier than a given timestamp to become invisible to reads and eventually be deleted (e.g. during major compaction). For safety, we prefer to set this cutoff explicitly rather than rely on the TTL feature, because we want old data to be deleted only once the new data has been written. Does HBase currently support this kind of operation?

Thanks,
Chao
