[
https://issues.apache.org/jira/browse/HBASE-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jean-Marc Spaggiari updated HBASE-8521:
---------------------------------------
Attachment: HBASE-8521-v1-0.94.patch
Small update.
I have defaulted assignSeqIds to false in LoadIncrementalHFiles, so if no
additional parameter is used, it will behave the same way as before.
Also, if assignSeqIds is false, I'm using the previously existing method
signatures in the serverCallable to make sure new client -> old server is still
compatible.
If someone specifically turns assignSeqIds to true, then they are most probably
aware of this new feature and will most probably also know that they need to
upgrade the server side too.
What I can add is a message in the console when this is turned to true, to
inform the user that this works only with a server >= 0.94.8.
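To make the fallback concrete, here is a rough sketch in plain Java. It is not the patch itself and uses no HBase classes: the class, enum and method names are hypothetical, and the warning text is only one possible wording. The only things taken from this comment are the configuration key and the default of false.

```java
import java.util.Properties;

/**
 * Simplified model of the compatibility fallback described above.
 * All names here are hypothetical; only the configuration key and the
 * default value (false) come from this comment.
 */
class BulkLoadCompatSketch {
    // Key as passed with -D on the command line in this comment.
    static final String ASSIGN_SEQ_IDS =
        "hbase.mapreduce.bulkload.assign.sequenceNumbers";

    /** Which server call the client would use (hypothetical labels). */
    enum CallStyle { LEGACY_SIGNATURE, SEQ_ID_SIGNATURE }

    /** Reads the flag; a missing key means false, so old behavior is kept. */
    static boolean assignSeqIds(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(ASSIGN_SEQ_IDS, "false"));
    }

    /** Picks the legacy call unless the user explicitly opted in. */
    static CallStyle chooseCall(Properties conf) {
        return assignSeqIds(conf)
            ? CallStyle.SEQ_ID_SIGNATURE   // needs an upgraded server
            : CallStyle.LEGACY_SIGNATURE;  // new client -> old server still works
    }

    /** Returns the console notice to print, or null when none is needed. */
    static String warningFor(Properties conf) {
        return assignSeqIds(conf)
            ? "assignSeqIds is enabled: this requires a server >= 0.94.8"
            : null;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(chooseCall(conf));

        conf.setProperty(ASSIGN_SEQ_IDS, "true");
        System.out.println(chooseCall(conf));
        System.out.println(warningFor(conf));
    }
}
```

The point of keeping the legacy branch as the default is that a client upgraded on its own never sends a call the old server does not know about.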
Again, here are the test results for this change:
{code}
Tests in error:
testBasic(org.apache.hadoop.hbase.regionserver.TestRSStatusServlet): Unresolved compilation problems: (..)
testWithRegions(org.apache.hadoop.hbase.regionserver.TestRSStatusServlet): Unresolved compilation problems: (..)
testGetEmpty(org.apache.hadoop.hbase.avro.TestAvroUtil): Unresolved compilation problems: (..)
Tests run: 682, Failures: 0, Errors: 3, Skipped: 0
{code}
I have compilation errors in my repository for those 2 classes, not related to
the changes I made.
{code}
bin/hbase -Dhbase.mapreduce.bulkload.assign.sequenceNumbers=true org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles familyDir1 test
hbase(main):001:0> scan 'test'
ROW COLUMN+CELL
aaaa column=myfam:myqual, timestamp=1368157470713, value=oldVal
1 row(s) in 0.3860 seconds
bin/hbase -Dhbase.mapreduce.bulkload.assign.sequenceNumbers=true org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles familyDir2 test
hbase(main):001:0> scan 'test'
ROW COLUMN+CELL
aaaa column=myfam:myqual, timestamp=1368157470713, value=newVal
1 row(s) in 0.3800 seconds
{code}
And without the parameter...
{code}
echo "create 'test', {NAME => 'myfam', VERSIONS => 100000, TTL => 1000000000}" | bin/hbase shell
bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles familyDir1 test
bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles familyDir2 test
echo "scan 'test'" | bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.94.8-SNAPSHOT, r, Thu May 16 08:43:22 EDT 2013
scan 'test'
ROW COLUMN+CELL
aaaa column=myfam:myqual, timestamp=1368157470713, value=oldVal
{code}
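Both HFiles in the runs above hold the same row, column and timestamp, so something other than the timestamp has to decide which value a scan returns. Below is a toy model in plain Java of how a per-file sequence number could act as that tie-breaker. It is an illustration only, not HBase's read path: the class and method names are hypothetical, and the rule that files without a sequence number fall back to load order is an assumption chosen to match the output shown above.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy model: several bulk-loaded files each hold one value for the SAME
 * row/column/timestamp. Names and tie-breaking rules are assumptions.
 */
class SeqIdTieBreakSketch {
    /** Marker for a file loaded without an assigned sequence number. */
    static final long NO_SEQ_ID = -1L;

    static final class LoadedFile {
        final String value;
        final long seqId;
        LoadedFile(String value, long seqId) {
            this.value = value;
            this.seqId = seqId;
        }
    }

    private final List<LoadedFile> files = new ArrayList<>();
    private long nextSeqId = 1L;

    /** Registers a file; a fresh, increasing id is given only on request. */
    void bulkLoad(String value, boolean assignSeqIds) {
        long seqId = assignSeqIds ? nextSeqId++ : NO_SEQ_ID;
        files.add(new LoadedFile(value, seqId));
    }

    /**
     * Value a scan would show in this model: the file with the strictly
     * highest sequence id wins; on a tie the earliest-loaded file is kept.
     */
    String scan() {
        LoadedFile winner = null;
        for (LoadedFile f : files) {
            if (winner == null || f.seqId > winner.seqId) {
                winner = f;
            }
        }
        return winner == null ? null : winner.value;
    }

    public static void main(String[] args) {
        SeqIdTieBreakSketch withIds = new SeqIdTieBreakSketch();
        withIds.bulkLoad("oldVal", true);
        withIds.bulkLoad("newVal", true);
        System.out.println(withIds.scan());    // newVal

        SeqIdTieBreakSketch withoutIds = new SeqIdTieBreakSketch();
        withoutIds.bulkLoad("oldVal", false);
        withoutIds.bulkLoad("newVal", false);
        System.out.println(withoutIds.scan()); // oldVal
    }
}
```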
> Cells cannot be overwritten with bulk loaded HFiles
> ---------------------------------------------------
>
> Key: HBASE-8521
> URL: https://issues.apache.org/jira/browse/HBASE-8521
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.1
> Reporter: Jonathan Natkins
> Attachments: HBASE-8521.diff, HBASE-8521-v0-0.94.patch,
> HBASE-8521-v1-0.94.patch, hfileDirs.tar.gz
>
>
> Let's say you have a pre-built HFile that contains a cell:
> ('rowkey1', 'family1', 'qual1', 1234L, 'value1')
> We bulk load this first HFile. Now, let's create a second HFile that contains
> a cell that overwrites the first:
> ('rowkey1', 'family1', 'qual1', 1234L, 'value2')
> That gets bulk loaded into the table, but the value that HBase bubbles up is
> still 'value1'.
> It seems that there's no way to overwrite a cell for a particular timestamp
> without an explicit put operation. This seems to be the case even after minor
> and major compactions happen.
> My guess is that this is pretty closely related to the sequence number work
> being done on the compaction algorithm via HBASE-7842, but I'm not sure if
> one would fix the other.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira