Hey Shawn,

How exactly did you delete the column? There are three types of delete markers: family, column, and version. Your observation would be consistent with having used a version delete marker, which marks only one specific version (the latest by default) for deletion, so any older version of the cell becomes visible again.
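To illustrate the difference between a version marker and a column marker, here is a minimal, self-contained sketch. It is a toy model and not HBase code: the class DeleteMarkerSketch and all of its methods are hypothetical names made up for this example, it tracks a single column only, and it leaves family markers out. It also assumes that a version marker with no explicit timestamp lands on the newest stored version. In the Java client, Delete.deleteColumn (singular) corresponds to the version marker and Delete.deleteColumns (plural) to the column marker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the versions of ONE column. Illustrative only, not HBase code. */
public class DeleteMarkerSketch {
    enum MarkerType { VERSION, COLUMN }

    static final class Marker {
        final MarkerType type;
        final long timestamp;
        Marker(MarkerType type, long timestamp) {
            this.type = type;
            this.timestamp = timestamp;
        }
    }

    // Stored versions keyed by timestamp; deletes never remove entries here,
    // they only add markers that hide entries at read time.
    private final TreeMap<Long, String> versions = new TreeMap<>();
    private final List<Marker> markers = new ArrayList<>();

    public void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    /** Version marker: hides exactly one version (here, the newest stored one). */
    public void deleteLatestVersion() {
        if (!versions.isEmpty()) {
            markers.add(new Marker(MarkerType.VERSION, versions.lastKey()));
        }
    }

    /** Column marker: hides every version at or before the given timestamp. */
    public void deleteAllVersions(long timestamp) {
        markers.add(new Marker(MarkerType.COLUMN, timestamp));
    }

    private boolean isMasked(long ts) {
        for (Marker m : markers) {
            if (m.type == MarkerType.VERSION && m.timestamp == ts) return true;
            if (m.type == MarkerType.COLUMN && ts <= m.timestamp) return true;
        }
        return false;
    }

    /** Returns the newest value not hidden by a marker, or null if none is visible. */
    public String read() {
        for (Map.Entry<Long, String> e : versions.descendingMap().entrySet()) {
            if (!isMasked(e.getKey())) return e.getValue();
        }
        return null;
    }

    public static void main(String[] args) {
        DeleteMarkerSketch column = new DeleteMarkerSketch();
        column.put(1332795701976L, "s");   // original value from the report
        column.put(1332866682295L, "u");   // updated value written by the Put
        column.deleteLatestVersion();      // version marker hides only "u"
        System.out.println(column.read()); // the older value is visible again
        column.deleteAllVersions(Long.MAX_VALUE); // column marker hides all versions
        System.out.println(column.read());
    }
}
```

In this model the version marker reproduces the behaviour from the report: the newest value disappears and the previous one resurfaces. Only the column marker hides the whole column.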
Check out the HBase Reference Guide:
http://hbase.apache.org/book.html#version.delete

Also, if you don't mind the plug, see a more detailed discussion here:
http://hadoop-hbase.blogspot.com/2011/12/deletion-in-hbase.html

-- Lars

----- Original Message -----
From: Shawn Quinn <[email protected]>
To: [email protected]
Cc:
Sent: Tuesday, March 27, 2012 10:01 AM
Subject: Still Seeing Old Data After a Delete

Hello,

In a couple of situations we were noticing some odd problems with old data appearing in the application, and I finally found a reproducible scenario. Here's what we're seeing in one basic case:

1. Using a scan in hbase shell, one of our column cells (both the column name and value are simple longs) looks like so:

   column=thing:\x00\x00\x00\x00\x00\x00\x00\x02, timestamp=1332795701976, value=\x00\x00\x00\x00\x00\x00\x00s

2. If we then use a "Put" to update that cell to a new value, it looks as we'd expect, like so:

   column=thing:\x00\x00\x00\x00\x00\x00\x00\x02, timestamp=1332866682295, value=\x00\x00\x00\x00\x00\x00\x00u

3. If we then use a "Delete" to remove that column, instead of the column no longer being included in the scan, we see the following again:

   column=thing:\x00\x00\x00\x00\x00\x00\x00\x02, timestamp=1332795701976, value=\x00\x00\x00\x00\x00\x00\x00s

So, for some reason, at least in this case, the tombstone/delete marker doesn't appear to be preventing new scans from seeing the old data.

Note that this is a small development cluster of HBase (version: hbase-0.90.4-cdh3u2) which contains one master and three region servers, and I have confirmed that the clocks are synchronized properly between the four machines. Also note that we're using the Java client API to run the Put/Delete commands noted above.

Any ideas on how old data could still appear in a Get/Scan like this, and are there any workarounds we could try? I saw HBASE-4536, but after reading that thread it didn't seem pertinent to this more basic scenario.

Thanks in advance for any pointers!

-Shawn
