[ https://issues.apache.org/jira/browse/HBASE-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653527#comment-13653527 ]

Jonathan Natkins commented on HBASE-8521:
-----------------------------------------

Basically, my process for reproducing is this:

{noformat}
hadoop fs -put familyDir1 familyDir1
hadoop fs -put familyDir2 familyDir2

create 'test', {NAME => 'myfam', VERSIONS => 100000, TTL => 1000000000}

hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles hdfs://localhost:8020/user/natty/familyDir1 test
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles hdfs://localhost:8020/user/natty/familyDir2 test
{noformat}
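The two family directories (from the attached hfileDirs.tar.gz) presumably contain HFiles whose cells share the same row, column, and timestamp and differ only in value. A minimal model of why HBase then treats them as the same version; the names here are mine, not the real HBase comparator classes:

```python
# Hypothetical model (not the actual HBase KeyValue comparator): HBase
# orders cells by row, family, and qualifier ascending, then timestamp
# descending. The value is NOT part of the key, so two cells that agree
# on all four key components occupy the same version slot.
def cell_key(row, family, qualifier, timestamp):
    return (row, family, qualifier, -timestamp)

old = ("aaaa", "myfam", "myqual", 1368157470713, "oldVal")
new = ("aaaa", "myfam", "myqual", 1368157470713, "newVal")

# Identical keys: nothing in the key says which cell is "newer", so the
# second bulk load cannot displace the first.
print(cell_key(*old[:4]) == cell_key(*new[:4]))  # True
```

With an explicit put the cell would at least carry a fresh memstore sequence, which is exactly what a bulk-loaded file lacks here.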

The result after this set of operations is:

{noformat}
1.9.2p320 :001 > scan 'test'
ROW                                           COLUMN+CELL
 aaaa                                         column=myfam:myqual, timestamp=1368157470713, value=oldVal
1 row(s) in 0.6260 seconds
{noformat}

If I only load familyDir2, the output is this:

{noformat}
1.9.2p320 :001 > scan 'test'
ROW                                           COLUMN+CELL
 aaaa                                         column=myfam:myqual, timestamp=1368157470713, value=newVal
1 row(s) in 0.5930 seconds
{noformat}
                
> Cells cannot be overwritten with bulk loaded HFiles
> ---------------------------------------------------
>
>                 Key: HBASE-8521
>                 URL: https://issues.apache.org/jira/browse/HBASE-8521
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: Jonathan Natkins
>         Attachments: hfileDirs.tar.gz
>
>
> Let's say you have a pre-built HFile that contains a cell:
> ('rowkey1', 'family1', 'qual1', 1234L, 'value1')
> We bulk load this first HFile. Now, let's create a second HFile that contains 
> a cell that overwrites the first:
> ('rowkey1', 'family1', 'qual1', 1234L, 'value2')
> That gets bulk loaded into the table, but the value that HBase bubbles up is 
> still 'value1'.
> It seems that there's no way to overwrite a cell for a particular timestamp 
> without an explicit put operation. This seems to be the case even after minor 
> and major compactions happen.
> My guess is that this is pretty closely related to the sequence number work 
> being done on the compaction algorithm via HBASE-7842, but I'm not sure if 
> one would fix the other.
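A sketch of the tie-break that sequence-number tracking along the lines of HBASE-7842 would provide, under my assumptions (the function and field names are hypothetical, not HBase APIs): each store file carries a sequence id, and when two cells have equal keys the one from the file with the higher sequence id should win. If bulk-loaded files all carry no meaningful sequence id, the tie is unresolved and the older value can surface:

```python
# Hypothetical tie-break model: cells is a list of (key, value, seq_id)
# triples, one per store file, all sharing the same cell key. The cell
# from the file with the highest sequence id should be the visible one.
def pick_visible(cells):
    return max(cells, key=lambda c: c[2])[1]

key = ("rowkey1", "family1", "qual1", 1234)

# Both bulk loads carry seq_id 0: max() keeps the first maximal entry,
# so whichever file happens to sort first ('value1') stays visible.
print(pick_visible([(key, "value1", 0), (key, "value2", 0)]))  # value1

# If real sequence ids were assigned at bulk-load time, the later load
# would win the tie as expected.
print(pick_visible([(key, "value1", 1), (key, "value2", 2)]))  # value2
```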

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
