Hi,

I have the problem that some keys in our cluster exist twice.
The keys have different timestamps.

I noticed these keys because we are replicating the data to another cluster,
and in the target cluster we see only the keys with the newer timestamp.

We run a major compaction on a regular basis in both clusters.

The table has VERSIONS => '1'

get 't1', "\x98\x04......", {COLUMN => 'd', VERSIONS => 5 }
timestamp=1442848394860, value=@\x83

get 't1', "\x98\x04......", {COLUMN => 'd', VERSIONS => 5, TIMESTAMP => 1442569821452 }
timestamp=1442569821452, value=@\x83

I thought that after a 

flush 't1'
major_compact 't1'

the key with the old timestamp would be deleted, because we have VERSIONS => '1'.

http://blog.cloudera.com/blog/2012/06/hbase-io-hfile-input-output/
"In a major compaction, deleted key/values are removed, this new file 
doesn’t contain the tombstone markers and all the duplicate key/values
(replace value operations) are removed."
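
To make my expectation concrete, here is a toy model in Python of how I understand that rule. It is not HBase code: the function name `major_compact`, the tuple layout and the row key "row1" are made up for illustration, and it ignores deletes, TTL and any other setting that might matter.

```python
from collections import defaultdict


def major_compact(cells, max_versions=1):
    """Toy model: keep only the newest `max_versions` cells per (row, column).

    `cells` is an iterable of (row, column, timestamp, value) tuples.
    """
    by_key = defaultdict(list)
    for row, column, timestamp, value in cells:
        by_key[(row, column)].append((timestamp, value))

    kept = []
    for (row, column), versions in sorted(by_key.items()):
        # Newest timestamp first, then cut off everything beyond max_versions.
        versions.sort(key=lambda v: v[0], reverse=True)
        for timestamp, value in versions[:max_versions]:
            kept.append((row, column, timestamp, value))
    return kept


# The two cells from the gets above; "row1" stands in for the real row key.
cells = [
    ("row1", "d", 1442569821452, b"@\x83"),
    ("row1", "d", 1442848394860, b"@\x83"),
]
print(major_compact(cells, max_versions=1))
```

With VERSIONS => '1' I would expect only the cell with timestamp 1442848394860 to be left.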

But this does not happen.

After the flush and major compaction of the table, the keys are still there.

We use HBase 0.94.2-cdh4.2.0, rUnknown

Why are they still there? Do I have to delete them manually?

Regards
