[ 
https://issues.apache.org/jira/browse/HBASE-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783517#comment-13783517
 ] 

Alexandre Normand commented on HBASE-8521:
------------------------------------------

I'll jump in late to the discussion to add a personal story. 

We rely heavily on this patch: we bulk load everything, and our use case 
depends on this behavior. We're running with the patch at the moment and hope 
not to lose it when upgrading. We have automated tests for the feature that 
relies on it, and we spent a lot of time looking at the internals (fun times 
with the HFile tool) to check that everything was as expected. The test can't 
fully guarantee that the behavior is correct rather than a combination of 
non-deterministic behavior and luck, but it runs every day, and each day it 
doesn't fail increases our confidence.

I'm very much +1. If there's one patch we ever needed from HBase, this is it. 

> Cells cannot be overwritten with bulk loaded HFiles
> ---------------------------------------------------
>
>                 Key: HBASE-8521
>                 URL: https://issues.apache.org/jira/browse/HBASE-8521
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: Jonathan Natkins
>            Assignee: Jean-Marc Spaggiari
>         Attachments: HBASE-8521.diff, HBASE-8521-v0-0.94.patch, 
> HBASE-8521-v1-0.94.patch, HBASE-8521-v2-0.94.patch, HBASE-8521-v3-0.94.patch, 
> HBASE-8521-v4-0.94.patch, hfileDirs.tar.gz
>
>
> Let's say you have a pre-built HFile that contains a cell:
> ('rowkey1', 'family1', 'qual1', 1234L, 'value1')
> We bulk load this first HFile. Now, let's create a second HFile that contains 
> a cell that overwrites the first:
> ('rowkey1', 'family1', 'qual1', 1234L, 'value2')
> That gets bulk loaded into the table, but the value that HBase bubbles up is 
> still 'value1'.
> It seems that there's no way to overwrite a cell for a particular timestamp 
> without an explicit put operation. This seems to be the case even after minor 
> and major compactions happen.
> My guess is that this is pretty closely related to the sequence number work 
> being done on the compaction algorithm via HBASE-7842, but I'm not sure if 
> one would fix the other.
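
For readers skimming the description above, here is a toy model of the scenario. It is not HBase code. It assumes, purely for illustration, that when two store files hold a cell with identical coordinates and timestamp, the read picks the file with the higher sequence number. The names `Cell`, `StoreFile`, `seq_id`, and `read_cell` are hypothetical and do not correspond to HBase classes or APIs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Cell:
    row: str
    family: str
    qualifier: str
    timestamp: int
    value: str


@dataclass
class StoreFile:
    seq_id: int        # hypothetical per-file sequence number (an assumption)
    cells: List[Cell]


def read_cell(files: List[StoreFile], coords: Tuple[str, str, str, int]) -> Optional[str]:
    """Return the value stored at exact coordinates, preferring the highest seq_id.

    On a seq_id tie the first file encountered wins, so the result
    depends on the order in which the files happen to be scanned.
    """
    best: Optional[Tuple[int, str]] = None
    for store_file in files:
        for cell in store_file.cells:
            key = (cell.row, cell.family, cell.qualifier, cell.timestamp)
            if key == coords and (best is None or store_file.seq_id > best[0]):
                best = (store_file.seq_id, cell.value)
    return None if best is None else best[1]


COORDS = ("rowkey1", "family1", "qual1", 1234)
cell_v1 = Cell("rowkey1", "family1", "qual1", 1234, "value1")
cell_v2 = Cell("rowkey1", "family1", "qual1", 1234, "value2")

# Distinct sequence numbers: the later load wins regardless of scan order.
first, second = StoreFile(1, [cell_v1]), StoreFile(2, [cell_v2])
ordered_result = read_cell([first, second], COORDS)
reversed_result = read_cell([second, first], COORDS)

# Identical sequence numbers: the winner depends only on scan order.
tied_a, tied_b = StoreFile(0, [cell_v1]), StoreFile(0, [cell_v2])
tied_result = read_cell([tied_a, tied_b], COORDS)
tied_reversed_result = read_cell([tied_b, tied_a], COORDS)
```

In this sketch, the tied case is the one that resembles the reported symptom: nothing in the data itself says which of the two bulk loaded cells is newer, so the outcome is decided by an incidental ordering.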



--
This message was sent by Atlassian JIRA
(v6.1#6144)
