Unifying mvcc and seqId would pave the way for resolving such inconsistencies. See HBASE-8763.
Cheers

On Tue, May 27, 2014 at 3:16 AM, 冯宏华 <[email protected]> wrote:

> Not sure whether there has already been a similar discussion on this; sorry
> for re-raising it if so.
>
> 1. Data inconsistency between master and peer clusters:
>    a). Write a keyvalue KV1 at a specific coordinate (row, cf, col, ts)
>        with value V1 to master cluster A. Since there is an active scanner
>        while flushing, KV1's memstoreTS is not set to 0 in the resultant
>        hfile F1.
>    b). Write KV1 once again with the same coordinate (row, cf, col, ts) but
>        with a different value V2. There is no active scanner while flushing
>        this time, so KV1's memstoreTS is set to 0 in the resultant hfile F2.
>    c). The two KV1 are replicated to the peer cluster serially. There is no
>        active scanner when flushing, and they are flushed to two different
>        hfiles, both with memstoreTS=0.
>
>    Now a client that reads KV1 from the master cluster will find the value
>    is V1 (since its memstoreTS is larger), and when it reads KV1 from the
>    peer cluster it will find the value is V2 (since the memstoreTS are equal
>    but the latter's seqID is larger).
>
> 2. Data inconsistency in different time phases:
>    a). Write a keyvalue KV1 at a specific coordinate (row, cf, col, ts)
>        with value V1 to master cluster A. Since there is an active scanner
>        while flushing, KV1's memstoreTS is not set to 0 in the resultant
>        hfile F1.
>    b). Write KV1 once again with the same coordinate (row, cf, col, ts) but
>        with a different value V2. There is no active scanner while flushing
>        this time, so KV1's memstoreTS is set to 0 in the resultant hfile F2.
>
>    Reading KV1 now will find the value is V1 (since its memstoreTS is
>    larger).
>
>    c). After a while, a compaction including F1 (but not F2) occurs, and
>        KV1's memstoreTS is set to 0 since there is no active scanner.
>
>    Reading KV1 now will find the value is V2 (since the memstoreTS are equal
>    but the latter's seqID is larger).
>
> Keeping mvcc untouched during a keyvalue's whole lifecycle (during
> flush/compact, or failover/HLog-replay) can avoid the above two kinds of data
> inconsistency. Any opinions?
>
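
To make the second scenario concrete, here is a minimal Java sketch of the tie-breaking rule as the quoted message describes it: the version with the larger memstoreTS wins, and the seqId only decides when the memstoreTS are equal. It is not HBase code. The `Version` class, the `read` helper, and the numbers (memstoreTS 5 and 6, seqId 1 and 2) are all hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;

class MvccOrderingSketch {
    // Hypothetical, simplified stand-in for one stored copy of KV1.
    static final class Version {
        final String value;
        final long memstoreTS; // mvcc number; 0 once a flush/compaction resets it
        final long seqId;      // sequence id of the hfile that holds this copy

        Version(String value, long memstoreTS, long seqId) {
            this.value = value;
            this.memstoreTS = memstoreTS;
            this.seqId = seqId;
        }

        // Models a flush or compaction that runs with no active scanner.
        Version withMemstoreTSReset() {
            return new Version(value, 0L, seqId);
        }
    }

    // Rule from the message: larger memstoreTS wins, then larger seqId.
    static final Comparator<Version> VISIBILITY_ORDER =
        Comparator.<Version>comparingLong(v -> v.memstoreTS)
                  .thenComparingLong(v -> v.seqId);

    // Returns the value a reader would see among copies at the same coordinate.
    static String read(Version... versions) {
        return Collections.max(Arrays.asList(versions), VISIBILITY_ORDER).value;
    }

    // Step 2a: F1 keeps its memstoreTS. Step 2b: F2 has memstoreTS reset to 0.
    static final Version V1_IN_F1 = new Version("V1", 5L, 1L);
    static final Version V2_IN_F2 = new Version("V2", 0L, 2L);

    static String readBeforeCompaction() {
        return read(V1_IN_F1, V2_IN_F2);
    }

    // Step 2c: compacting F1 alone resets V1's memstoreTS to 0.
    static String readAfterCompaction() {
        return read(V1_IN_F1.withMemstoreTSReset(), V2_IN_F2);
    }

    // Proposal: never reset mvcc, so the later write keeps its larger number
    // and a compaction of F1 changes nothing about the ordering.
    static String readWithMvccPreserved() {
        return read(V1_IN_F1, new Version("V2", 6L, 2L));
    }

    public static void main(String[] args) {
        System.out.println(readBeforeCompaction());  // V1
        System.out.println(readAfterCompaction());   // V2
        System.out.println(readWithMvccPreserved()); // V2
    }
}
```

Under these assumptions the read flips from V1 to V2 purely because of the compaction, while the variant that keeps mvcc untouched returns V2 both before and after. The first scenario is the same comparison with both copies at memstoreTS=0 on the peer cluster.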
