Jean-Daniel Cryans created HBASE-7774:
-----------------------------------------

             Summary: RegionObserver.prePut() cannot rely on the Put's 
timestamps, can even cause data loss
                 Key: HBASE-7774
                 URL: https://issues.apache.org/jira/browse/HBASE-7774
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.94.4, 0.92.2, 0.96.0
            Reporter: Jean-Daniel Cryans
            Priority: Critical


A user had code like this in a coprocessor's {{prePut()}}:

{code}
if (put.has(expectedKv))
  put.add(kvSayingIFoundIt);
else
  put.add(kvSayingNotFound);
{code}

If you have MSLAB turned *off*, and you have the {{expectedKv}} in your 
{{Put}}, doing a {{Get}} following your insert will only return 
{{kvSayingIFoundIt}} and not the KV you were actually inserting.

Worse, if you only call {{put.has(expectedKv)}} and add nothing to the 
{{Put}}, the {{Get}} returns nothing at all. Your data appears to be gone.

The reason is that in {{prePut()}} the timestamp hasn't been set yet, so 
calling {{kv.getTimestamp()}} during the comparisons in {{put.has()}} 
populates {{kv.timestampCache}} with {{Long.MAX_VALUE}}. The KV then sits in 
the {{MemStore}} reporting that cached timestamp, and reads filter it out 
because {{TimeRange}} compares {{Long.MAX_VALUE}} >= {{Long.MAX_VALUE}} and 
returns {{SKIP}}.
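
To make the mechanism concrete, here is a minimal, self-contained sketch. It does not use HBase: {{CachedTsCell}} and {{PrePutTimestampDemo}} are hypothetical stand-ins, and it assumes the server later writes the real timestamp into the cell's backing data without refreshing the cached value, and that the read-side range check excludes its upper bound.

```java
// Hypothetical stand-ins for illustration only; NOT HBase's KeyValue or TimeRange.
final class CachedTsCell {
    // Placeholder timestamp meaning "let the server assign the time".
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    private long rawTimestamp;        // stands in for the bytes backing the cell
    private long timestampCache = -1; // -1 means "nothing cached yet"

    CachedTsCell(long ts) {
        this.rawTimestamp = ts;
    }

    // Lazily caches whatever timestamp is in the backing data on first call.
    long getTimestamp() {
        if (timestampCache == -1) {
            timestampCache = rawTimestamp;
        }
        return timestampCache;
    }

    // Assumption: the server stamps the backing data but leaves the cache alone.
    void updateLatestStamp(long now) {
        if (rawTimestamp == LATEST_TIMESTAMP) {
            rawTimestamp = now;
        }
    }
}

final class PrePutTimestampDemo {
    // Half-open range [min, max): a timestamp equal to max is outside the range.
    static boolean withinTimeRange(long ts, long min, long max) {
        return min <= ts && ts < max;
    }

    // Returns whether a default "all time" read would see the cell.
    static boolean visibleAfterPut(boolean readTimestampInPrePut) {
        CachedTsCell cell = new CachedTsCell(CachedTsCell.LATEST_TIMESTAMP);
        if (readTimestampInPrePut) {
            cell.getTimestamp(); // what a comparison in put.has() would trigger
        }
        cell.updateLatestStamp(1_360_000_000_000L); // arbitrary "now" in millis
        return withinTimeRange(cell.getTimestamp(), 0L, Long.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println("untouched in prePut: visible = " + visibleAfterPut(false));
        System.out.println("timestamp read in prePut: visible = " + visibleAfterPut(true));
    }
}
```

In this sketch the cell is visible when nothing reads its timestamp early, and invisible once the placeholder value has been cached, which mirrors the symptom described above.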

And the reason it works correctly with MSLAB *on* is that the KV is cloned in 
{{maybeCloneWithAllocator()}} and the cache is reset.
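
The effect of cloning can be sketched the same way. The class below is again a hypothetical stand-in rather than HBase code; it assumes that the clone copies only the backing data and starts with an empty cache, which is the behavior described above.

```java
// Hypothetical stand-in for illustration only; NOT HBase code.
final class CloneResetsCacheDemo {
    static final long LATEST = Long.MAX_VALUE;  // "server assigns the time"
    static final long NOW = 1_360_000_000_000L; // arbitrary server time in millis

    static final class Cell {
        long raw;        // stands in for the bytes backing the cell
        long cache = -1; // -1 means "nothing cached yet"

        Cell(long raw) {
            this.raw = raw;
        }

        // Lazily caches the timestamp found in the backing data.
        long getTimestamp() {
            if (cache == -1) {
                cache = raw;
            }
            return cache;
        }

        // Assumption: a clone copies the backing data and starts with an empty cache.
        Cell copyOfBackingData() {
            return new Cell(raw);
        }
    }

    // Returns the timestamp a reader would see for the stored cell.
    static long timestampSeenByReader(boolean cloneBeforeStore) {
        Cell cell = new Cell(LATEST);
        cell.getTimestamp(); // early read caches Long.MAX_VALUE
        cell.raw = NOW;      // server stamps the backing data afterwards
        Cell stored = cloneBeforeStore ? cell.copyOfBackingData() : cell;
        return stored.getTimestamp();
    }

    public static void main(String[] args) {
        System.out.println("stored as-is: " + timestampSeenByReader(false));
        System.out.println("stored as clone: " + timestampSeenByReader(true));
    }
}
```

Storing the original object keeps the stale cached value, while storing a clone yields the server-assigned time, because the clone reads the timestamp fresh from the copied data.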

Now, I think this has bigger implications. Basically, you can't rely on the 
timestamp at all in {{prePut()}}. I'm sure this can screw someone else in a 
creative way later.
