[ 
https://issues.apache.org/jira/browse/HBASE-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Marc Spaggiari updated HBASE-8521:
---------------------------------------

    Attachment: HBASE-8521-v1-0.94.patch

Small update.

I have defaulted assignSeqIds to false in LoadIncrementalHFiles, so if no 
additional parameter is used, it behaves the same way as before.

Also, if assignSeqIds is false, I'm using the previously existing method 
signatures in the serverCallable to make sure new client -> old server is still 
compatible.
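
To make the fallback concrete, here is a minimal sketch of the dispatch described above. The interface and class names (BulkLoadServer, BulkLoadDispatchSketch) are hypothetical stand-ins and not the actual HBase types or the code in the patch; only the configuration key is taken from the command line used further down.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the region server interface; NOT the real HBase API.
interface BulkLoadServer {
    // Signature assumed to exist on both old and new servers.
    boolean bulkLoadHFiles(List<String> familyPaths, byte[] regionName);

    // Signature assumed to exist only on servers that include the patch.
    boolean bulkLoadHFiles(List<String> familyPaths, byte[] regionName, boolean assignSeqIds);
}

final class BulkLoadDispatchSketch {
    // Configuration key as used on the command line in this comment.
    static final String ASSIGN_SEQ_IDS = "hbase.mapreduce.bulkload.assign.sequenceNumbers";

    // Reads the flag; it defaults to false so existing callers keep the old behavior.
    static boolean readAssignSeqIds(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(ASSIGN_SEQ_IDS, "false"));
    }

    // Uses the old signature unless the caller explicitly asked for sequence ids,
    // so a new client can still talk to an old server in the default case.
    static boolean load(BulkLoadServer server, List<String> familyPaths,
                        byte[] regionName, boolean assignSeqIds) {
        if (!assignSeqIds) {
            return server.bulkLoadHFiles(familyPaths, regionName);
        }
        return server.bulkLoadHFiles(familyPaths, regionName, true);
    }
}
```

With this shape, only a client that sets the flag to true ever reaches the new three-argument call.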

If someone specifically turns assignSeqIds to true, then they are most probably 
aware of this new feature and will most probably also know that they need to 
upgrade the server side too.

What I can add is a message in the console when this is set to true, to 
inform the user that it only works with a server >= 0.94.8.
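
A rough sketch of what such a console message could look like. The class name, method name and message wording are all hypothetical; nothing like this is in the attached patch yet.

```java
import java.util.Optional;

final class SeqIdWarningSketch {
    // Minimum server version mentioned in this comment.
    static final String MIN_SERVER_VERSION = "0.94.8";

    // Returns a warning only when the flag is on; the default path stays silent.
    static Optional<String> consoleWarning(boolean assignSeqIds) {
        if (!assignSeqIds) {
            return Optional.empty();
        }
        return Optional.of("assignSeqIds is enabled: this only works with a server >= "
            + MIN_SERVER_VERSION);
    }
}
```

A caller could print it with consoleWarning(flag).ifPresent(System.err::println) before starting the load.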

Again, here are the test results for this change:
{code}
Tests in error: 
  testBasic(org.apache.hadoop.hbase.regionserver.TestRSStatusServlet): Unresolved compilation problems: (..)
  testWithRegions(org.apache.hadoop.hbase.regionserver.TestRSStatusServlet): Unresolved compilation problems: (..)
  testGetEmpty(org.apache.hadoop.hbase.avro.TestAvroUtil): Unresolved compilation problems: (..)

Tests run: 682, Failures: 0, Errors: 3, Skipped: 0
{code}
I have compilation errors in my repository for those 2 classes, not related to 
the changes I made.

{code}
bin/hbase -Dhbase.mapreduce.bulkload.assign.sequenceNumbers=true org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles familyDir1 test

hbase(main):001:0> scan 'test'
ROW                                                   COLUMN+CELL
 aaaa                                                 column=myfam:myqual, timestamp=1368157470713, value=oldVal
1 row(s) in 0.3860 seconds


bin/hbase -Dhbase.mapreduce.bulkload.assign.sequenceNumbers=true org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles familyDir2 test

hbase(main):001:0> scan 'test'
ROW                                                   COLUMN+CELL
 aaaa                                                 column=myfam:myqual, timestamp=1368157470713, value=newVal
1 row(s) in 0.3800 seconds
{code}


And without the parameter...

{code}
echo "create 'test', {NAME => 'myfam', VERSIONS => 100000, TTL => 1000000000}" | bin/hbase shell

bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles familyDir1 test
bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles familyDir2 test

echo "scan 'test'" | bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.94.8-SNAPSHOT, r, Thu May 16 08:43:22 EDT 2013

scan 'test'
ROW                                                   COLUMN+CELL
 aaaa                                                 column=myfam:myqual, timestamp=1368157470713, value=oldVal
{code}
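
The two scans above are consistent with a tie-break on the sequence number of the file each cell came from. Below is a simplified model of that idea, not the actual HBase read path: the class names are hypothetical, and the rule that an equal sequence number leaves the earlier-loaded cell visible is an assumption chosen to match the output shown here.

```java
import java.util.List;

final class SeqIdTieBreakSketch {
    // One cell version with identical row, column and timestamp,
    // tagged with the sequence id of the file it was loaded from.
    static final class Versioned {
        final String value;
        final long seqId;

        Versioned(String value, long seqId) {
            this.value = value;
            this.seqId = seqId;
        }
    }

    // Picks the visible value among cells that share the same coordinates.
    // Assumption: a strictly higher sequence id wins; on a tie the cell
    // that was loaded first (earlier in the list) stays visible.
    static String visibleValue(List<Versioned> cellsInLoadOrder) {
        Versioned best = null;
        for (Versioned v : cellsInLoadOrder) {
            if (best == null || v.seqId > best.seqId) {
                best = v;
            }
        }
        return best == null ? null : best.value;
    }
}
```

In this model, loading oldVal then newVal with increasing sequence ids exposes newVal, while loading both with the same sequence id keeps oldVal visible.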

                
> Cells cannot be overwritten with bulk loaded HFiles
> ---------------------------------------------------
>
>                 Key: HBASE-8521
>                 URL: https://issues.apache.org/jira/browse/HBASE-8521
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: Jonathan Natkins
>         Attachments: HBASE-8521.diff, HBASE-8521-v0-0.94.patch, 
> HBASE-8521-v1-0.94.patch, hfileDirs.tar.gz
>
>
> Let's say you have a pre-built HFile that contains a cell:
> ('rowkey1', 'family1', 'qual1', 1234L, 'value1')
> We bulk load this first HFile. Now, let's create a second HFile that contains 
> a cell that overwrites the first:
> ('rowkey1', 'family1', 'qual1', 1234L, 'value2')
> That gets bulk loaded into the table, but the value that HBase bubbles up is 
> still 'value1'.
> It seems that there's no way to overwrite a cell for a particular timestamp 
> without an explicit put operation. This seems to be the case even after minor 
> and major compactions happen.
> My guess is that this is pretty closely related to the sequence number work 
> being done on the compaction algorithm via HBASE-7842, but I'm not sure if 
> one would fix the other.
