[ 
https://issues.apache.org/jira/browse/ACCUMULO-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879098#comment-13879098
 ] 

Sean Busbey commented on ACCUMULO-2232:
---------------------------------------

bq. If there is a special flag that allows running the combiner for partial 
compactions, it could fail if it detects a delete. However there are other ways 
to delete and this sanity check would not cover the case of row deletion, row 
filtering, etc.

I say a flag to allow running on partial compactions is reasonable, provided it 
comes with a strong warning about not seeing all rows (calling out this 
specific issue).

In that case, I think it should be up to the combiner to handle logic around 
deletes. I can think of combiner applications where it would make sense not to 
fail when there's a delete.

bq. Seems like this should be fixed in 1.4, 1.5, and 1.6.

+1, seems like an oversight that combiners didn't default to running only on full majc

> Combiners can cause deleted data to come back
> ---------------------------------------------
>
>                 Key: ACCUMULO-2232
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2232
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client, tserver
>            Reporter: John Vines
>
> The case-
> 3 files with-
> * 1 with a key, k, with timestamp 0, value 3
> * 1 with a delete of k with timestamp 1
> * 1 with k with timestamp 2, value 2
> The column of k has a summing combiner set on it. The issue here is that, 
> depending on how the major compactions play out, differing values will 
> result. If all 3 files compact together, the correct value of 2 will result. 
> However, if files 1 & 3 compact first, they will aggregate to 5. The delete 
> then falls after the combined value, so the incorrect result of 5 persists.
> First and foremost, this should be documented. I think to remedy this, 
> combiners should only be used on full MajC, not on partial ones. This may 
> necessitate a special flag or a new combiner that implements the proper 
> semantics.
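
The scenario in the description can be reproduced with a toy model. The sketch below is not Accumulo code and uses no Accumulo API: `Entry`, `compact`, and `read` are hypothetical names, and the model assumes only that a delete hides entries with a timestamp at or below its own, that a summing combiner collapses the surviving versions into one entry carrying the newest timestamp, and that delete markers are dropped only by a full compaction.

```python
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    """One version of key k. value=None marks a delete (hypothetical model)."""
    timestamp: int
    value: Optional[int]


def compact(files: List[List[Entry]], full: bool) -> List[Entry]:
    """Merge files for key k, apply deletes, then sum what survives.

    Assumption: a delete hides every entry whose timestamp is <= its own.
    Delete markers are kept unless this is a full compaction.
    """
    entries = [e for f in files for e in f]
    deletes = [e.timestamp for e in entries if e.value is None]
    newest_delete = max(deletes, default=None)

    live = [
        e for e in entries
        if e.value is not None
        and (newest_delete is None or e.timestamp > newest_delete)
    ]

    out = []
    if live:
        # Summing combiner: one entry, newest timestamp, summed value.
        out.append(Entry(max(e.timestamp for e in live),
                         sum(e.value for e in live)))
    if newest_delete is not None and not full:
        out.append(Entry(newest_delete, None))
    return out


def read(files: List[List[Entry]]) -> Optional[int]:
    """Value a scan would see: same merge logic as a full compaction."""
    result = compact(files, full=True)
    return result[0].value if result else None


# The three files from the issue description.
file_1 = [Entry(0, 3)]
file_2 = [Entry(1, None)]   # delete of k at timestamp 1
file_3 = [Entry(2, 2)]

# All three files compacted together.
all_at_once = read([file_1, file_2, file_3])

# Files 1 and 3 compacted first, without the delete in view.
partial = compact([file_1, file_3], full=False)
after_partial = read([partial, file_2])

print(all_at_once, after_partial)
```

In this model the two compaction orders disagree: the partial compaction folds the timestamp-0 value into an entry stamped 2, which the timestamp-1 delete can no longer hide.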



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
