[ 
https://issues.apache.org/jira/browse/ACCUMULO-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879052#comment-13879052
 ] 

Keith Turner commented on ACCUMULO-2232:
----------------------------------------

bq.  I think to remedy this, combiners should only be used on full MajC, not 
partial ones.

+1

bq.  This may necessitate a special flag or a new combiner that implemented the 
proper semantics.

Why not just modify the current combiner to run only on full MajC?
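
As a rough illustration of that suggestion, here is a toy model in plain Java (no Accumulo classes) of a combiner that sums only on a full major compaction and passes versions through untouched on a partial one. The class name, the `Entry` type, and the `fullMajc` flag are all made up for this sketch; the flag stands in for whatever the real iterator environment reports. It replays the three-file case from the report quoted below.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a single key k; not Accumulo's actual iterator stack.
final class FullMajcOnlyCombinerSketch {
    /** One version of k: either a value or a delete marker at a timestamp. */
    static final class Entry {
        final long ts; final boolean delete; final long value;
        Entry(long ts, boolean delete, long value) {
            this.ts = ts; this.delete = delete; this.value = value;
        }
    }

    /** Compacts the given files into one; sums only when fullMajc is true. */
    static List<Entry> compact(boolean fullMajc, List<List<Entry>> files) {
        List<Entry> merged = new ArrayList<>();
        for (List<Entry> f : files) merged.addAll(f);
        merged.sort((a, b) -> Long.compare(b.ts, a.ts)); // newest first

        List<Entry> live = new ArrayList<>();
        Entry marker = null;
        for (Entry e : merged) {
            if (e.delete) { marker = e; break; } // older versions are masked
            live.add(e);
        }

        List<Entry> out = new ArrayList<>();
        if (fullMajc) {
            // Every file is in view, so summing is safe; the marker is dropped.
            if (!live.isEmpty()) {
                long sum = 0;
                for (Entry e : live) sum += e.value;
                out.add(new Entry(live.get(0).ts, false, sum));
            }
        } else {
            // Partial compaction: keep versions separate and keep the marker.
            out.addAll(live);
            if (marker != null) out.add(marker);
        }
        return out;
    }

    static final List<Entry> F1 = Arrays.asList(new Entry(0, false, 3));
    static final List<Entry> F2 = Arrays.asList(new Entry(1, true, 0));
    static final List<Entry> F3 = Arrays.asList(new Entry(2, false, 2));

    /** All three files in one full compaction. */
    static long allAtOnce() {
        return compact(true, Arrays.asList(F1, F2, F3)).get(0).value;
    }

    /** Files 1 and 3 compact first (partial), then a full compaction with file 2. */
    static long partialThenFull() {
        List<Entry> partial = compact(false, Arrays.asList(F1, F3));
        return compact(true, Arrays.asList(partial, F2)).get(0).value;
    }

    public static void main(String[] args) {
        System.out.println("all at once:       " + allAtOnce());       // 2
        System.out.println("partial then full: " + partialThenFull()); // 2
    }
}
```

In this model both orderings give 2, because the partial compaction never folds the timestamp-0 value into a newer entry, so the delete can still mask it later.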

If there is a special flag that allows running the combiner for partial 
compactions, the combiner could fail if it detects a delete.  However, there 
are other ways to delete, and this sanity check would not cover the case of 
row deletion, row filtering, etc.

Seems like this should be fixed in 1.4, 1.5, and 1.6.

> Combiners can cause deleted data to come back
> ---------------------------------------------
>
>                 Key: ACCUMULO-2232
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2232
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client, tserver
>            Reporter: John Vines
>
> The case-
> 3 files with-
> * 1 with a key, k, with timestamp 0, value 3
> * 1 with a delete of k with timestamp 1
> * 1 with k with timestamp 2, value 2
> The column of k has a summing combiner set on it. The issue here is that 
> depending on how the major compactions play out, differing values will 
> result. If all 3 files compact, the correct value of 2 will result. However, 
> if 1 & 3 compact first, they will aggregate to 5. The delete will then 
> fall after the combined value, so the value 5 persists.
> First and foremost, this should be documented. I think to remedy this, 
> combiners should only be used on full MajC, not partial ones. This may 
> necessitate a special flag or a new combiner that implemented the proper 
> semantics.
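
The three-file case from the report can be replayed with a small stand-alone model. This is only a sketch in plain Java, not Accumulo code: `CombinerDeleteDemo`, `Entry`, and `compact` are invented names, and the model assumes that a combined value takes the timestamp of the newest version it absorbed and that a delete masks everything at or below its own timestamp.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a single key k with a summing combiner on its column.
final class CombinerDeleteDemo {
    /** One version of k: either a value or a delete marker at a timestamp. */
    static final class Entry {
        final long ts; final boolean delete; final long value;
        Entry(long ts, boolean delete, long value) {
            this.ts = ts; this.delete = delete; this.value = value;
        }
    }

    /** Compacts the given files into one, summing on every compaction. */
    static List<Entry> compact(boolean full, List<List<Entry>> files) {
        List<Entry> merged = new ArrayList<>();
        for (List<Entry> f : files) merged.addAll(f);
        merged.sort((a, b) -> Long.compare(b.ts, a.ts)); // newest first

        long sum = 0;
        long newestTs = 0;
        boolean sawValue = false;
        Entry marker = null;
        for (Entry e : merged) {
            if (e.delete) { marker = e; break; } // older versions are masked
            if (!sawValue) { newestTs = e.ts; sawValue = true; }
            sum += e.value;
        }

        List<Entry> out = new ArrayList<>();
        if (sawValue) out.add(new Entry(newestTs, false, sum));
        // A delete marker can only be dropped when every file was in view.
        if (marker != null && !full) out.add(marker);
        return out;
    }

    static final List<Entry> F1 = Arrays.asList(new Entry(0, false, 3));
    static final List<Entry> F2 = Arrays.asList(new Entry(1, true, 0));
    static final List<Entry> F3 = Arrays.asList(new Entry(2, false, 2));

    /** All three files compact together. */
    static long allAtOnce() {
        return compact(true, Arrays.asList(F1, F2, F3)).get(0).value;
    }

    /** Files 1 and 3 compact first, then the result compacts with file 2. */
    static long partialFirst() {
        List<Entry> partial = compact(false, Arrays.asList(F1, F3));
        return compact(true, Arrays.asList(partial, F2)).get(0).value;
    }

    public static void main(String[] args) {
        System.out.println("all three files: " + allAtOnce());    // 2
        System.out.println("1 & 3 first:     " + partialFirst()); // 5
    }
}
```

In the second ordering the partial compaction emits a single entry with value 5 at timestamp 2, which sits above the delete at timestamp 1, so the deleted value 3 is back in the total.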



