[ https://issues.apache.org/jira/browse/HBASE-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653527#comment-13653527 ]
Jonathan Natkins commented on HBASE-8521:
-----------------------------------------
Basically, my process for reproducing is this:
{noformat}
hadoop fs -put familyDir1 familyDir1
hadoop fs -put familyDir2 familyDir2
create 'test', {NAME => 'myfam', VERSIONS => 100000, TTL => 1000000000}
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles hdfs://localhost:8020/user/natty/familyDir1 test
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles hdfs://localhost:8020/user/natty/familyDir2 test
{noformat}
The result after this set of operations is:
{noformat}
1.9.2p320 :001 > scan 'test'
ROW COLUMN+CELL
aaaa column=myfam:myqual, timestamp=1368157470713, value=oldVal
1 row(s) in 0.6260 seconds
{noformat}
If I only load familyDir2, the output is this:
{noformat}
1.9.2p320 :001 > scan 'test'
ROW COLUMN+CELL
aaaa column=myfam:myqual, timestamp=1368157470713, value=newVal
1 row(s) in 0.5930 seconds
{noformat}
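The two scans above are consistent with a reader that, when it sees two cells with identical row, family, qualifier, and timestamp, returns whichever copy it encounters first. Here is a toy Python sketch of that idea. It is not HBase code: `Cell`, `read_latest`, and the per-file `seq_id` field are hypothetical names made up for illustration, and the tie-breaking rule is an assumption rather than a confirmed description of the store scanner.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    row: str
    family: str
    qualifier: str
    timestamp: int
    value: str
    seq_id: int = 0  # hypothetical per-file sequence number (assumption)


def read_latest(files, use_seq_id=False):
    """Return the visible value for each (row, family, qualifier)."""
    cells = [cell for hfile in files for cell in hfile]

    def sort_key(cell):
        # Newest timestamp first; optionally, higher seq_id wins a tie.
        tie_break = -cell.seq_id if use_seq_id else 0
        return (cell.row, cell.family, cell.qualifier, -cell.timestamp, tie_break)

    # list.sort is stable, so without a tie-break the first-loaded cell stays first.
    cells.sort(key=sort_key)

    visible = {}
    for cell in cells:
        visible.setdefault((cell.row, cell.family, cell.qualifier), cell.value)
    return visible


# Same coordinates and timestamp in both files, as in the scans above.
old_file = [Cell("aaaa", "myfam", "myqual", 1368157470713, "oldVal", seq_id=1)]
new_file = [Cell("aaaa", "myfam", "myqual", 1368157470713, "newVal", seq_id=2)]
coord = ("aaaa", "myfam", "myqual")

print(read_latest([old_file, new_file])[coord])                   # oldVal
print(read_latest([old_file, new_file], use_seq_id=True)[coord])  # newVal
print(read_latest([new_file])[coord])                             # newVal
```

In this model the timestamp alone cannot separate the two cells, so the later load only wins if something like a per-file sequence number takes part in the comparison.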
> Cells cannot be overwritten with bulk loaded HFiles
> ---------------------------------------------------
>
> Key: HBASE-8521
> URL: https://issues.apache.org/jira/browse/HBASE-8521
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.1
> Reporter: Jonathan Natkins
> Attachments: hfileDirs.tar.gz
>
>
> Let's say you have a pre-built HFile that contains a cell:
> ('rowkey1', 'family1', 'qual1', 1234L, 'value1')
> We bulk load this first HFile. Now, let's create a second HFile that contains
> a cell that overwrites the first:
> ('rowkey1', 'family1', 'qual1', 1234L, 'value2')
> That gets bulk loaded into the table, but the value that HBase bubbles up is
> still 'value1'.
> It seems that there's no way to overwrite a cell for a particular timestamp
> without an explicit put operation. This seems to be the case even after minor
> and major compactions happen.
> My guess is that this is pretty closely related to the sequence number work
> being done on the compaction algorithm via HBASE-7842, but I'm not sure if
> one would fix the other.