John Vines created ACCUMULO-2232:
------------------------------------
Summary: Combiners can cause deleted data to come back
Key: ACCUMULO-2232
URL: https://issues.apache.org/jira/browse/ACCUMULO-2232
Project: Accumulo
Issue Type: Bug
Components: client, tserver
Reporter: John Vines
The case: 3 files, where
* 1 has a key, k, with timestamp 0, value 3
* 1 has a delete of k with timestamp 1
* 1 has k with timestamp 2, value 2
The column of k has a summing combiner set on it. The issue here is that,
depending on how the major compactions play out, differing values will result.
If all 3 files compact together, the correct value of 2 will result. However,
if the first and third files compact first, they will aggregate to 5 at
timestamp 2. The delete at timestamp 1 then falls after the combined value and
no longer masks it, so the incorrect value 5 persists.
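The ordering dependence can be reproduced with a small toy model. This is only a sketch, not Accumulo code: the `compact` function, the `(timestamp, value)` tuple layout, and the use of `None` as a delete marker are all assumptions made for illustration. It assumes a delete masks every entry with an equal or older timestamp, that the combiner emits the sum at the newest surviving timestamp, and that delete markers are dropped only by a full compaction.

```python
from typing import List, Optional, Tuple

# Hypothetical model of one key k: (timestamp, value); value None marks a delete.
Entry = Tuple[int, Optional[int]]


def compact(files: List[List[Entry]], full: bool) -> List[Entry]:
    """Merge files for a single key, apply deletes, then sum what survives."""
    entries = sorted((e for f in files for e in f), key=lambda e: e[0], reverse=True)
    deletes = [ts for ts, v in entries if v is None]
    newest_delete = max(deletes) if deletes else None
    # A delete masks every entry whose timestamp is not newer than its own.
    live = [
        (ts, v)
        for ts, v in entries
        if v is not None and (newest_delete is None or ts > newest_delete)
    ]
    out: List[Entry] = []
    if live:
        # The summing combiner collapses survivors into one entry at the newest timestamp.
        out.append((live[0][0], sum(v for _, v in live)))
    if newest_delete is not None and not full:
        # A partial compaction must keep the delete marker for files it did not see.
        out.append((newest_delete, None))
    return out


def value_of(entries: List[Entry]) -> Optional[int]:
    """Return the single live value, or None if nothing survives."""
    return entries[0][1] if entries else None


file1 = [(0, 3)]     # k, timestamp 0, value 3
file2 = [(1, None)]  # delete of k, timestamp 1
file3 = [(2, 2)]     # k, timestamp 2, value 2

# All three files in one full compaction: the delete masks the value 3.
all_at_once = value_of(compact([file1, file2, file3], full=True))

# Files 1 and 3 first (partial), then the result with file 2 (full).
merged = compact([file1, file3], full=False)
partial_first = value_of(compact([merged, file2], full=True))

print(all_at_once, partial_first)  # prints "2 5"
```

In the second ordering, the value 3 has already been folded into an entry stamped with timestamp 2 by the time the delete is seen, so the delete at timestamp 1 has nothing left to mask.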
First and foremost, this should be documented. To remedy this, I think
combiners should only be applied on full MajCs, not on partial ones. This may
necessitate a special flag or a new combiner that implements the proper
semantics.
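A rough sketch of that remedy, again as a toy model rather than Accumulo code: the function name `compact_full_only` and the `(timestamp, value)` layout with `None` as a delete marker are hypothetical. The assumption is that a partial compaction passes entries through uncombined, so each value keeps its own timestamp until a full compaction can see the delete.

```python
from typing import List, Optional, Tuple

# Hypothetical model of one key k: (timestamp, value); value None marks a delete.
Entry = Tuple[int, Optional[int]]


def compact_full_only(files: List[List[Entry]], full: bool) -> List[Entry]:
    """Merge files for a single key, combining values only on a full compaction."""
    entries = sorted((e for f in files for e in f), key=lambda e: e[0], reverse=True)
    if not full:
        # Partial compaction: keep every entry and its timestamp, no summing.
        return entries
    deletes = [ts for ts, v in entries if v is None]
    cutoff = max(deletes) if deletes else None
    # A delete masks every entry whose timestamp is not newer than its own.
    live = [
        (ts, v)
        for ts, v in entries
        if v is not None and (cutoff is None or ts > cutoff)
    ]
    # Full compaction: sum the survivors and drop the delete marker.
    return [(live[0][0], sum(v for _, v in live))] if live else []


file1 = [(0, 3)]     # k, timestamp 0, value 3
file2 = [(1, None)]  # delete of k, timestamp 1
file3 = [(2, 2)]     # k, timestamp 2, value 2

# Files 1 and 3 compact first (partial), then the result compacts with file 2 (full).
partial = compact_full_only([file1, file3], full=False)
final = compact_full_only([partial, file2], full=True)

print(partial)  # prints "[(2, 2), (0, 3)]"
print(final)    # prints "[(2, 2)]"
```

Under these assumptions the result no longer depends on compaction order, at the cost of carrying uncombined entries until a full compaction runs.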