[ 
https://issues.apache.org/jira/browse/HBASE-26451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17446094#comment-17446094
 ] 

Peter Somogyi commented on HBASE-26451:
---------------------------------------

Hi [~xdosis],

Import adds all the exported data back to HBase with the timestamps it was 
originally inserted with. HBase _"merges"_ all the HFiles when you run a 
request and returns the _latest_ data. Since you had the same data twice, the 
only difference was the delete tombstone for a single rowkey. Its timestamp 
was the latest, so HBase did not give you back that row.

It could look something like this:

1. Original state in table:
||rowkey||column family||timestamp||value||
|r1|f1|1|a|
|r2|f1|2|b|
|r3|f1|3|c|

2. Run export
3. Delete r2
||rowkey||column family||timestamp||value||
|r1|f1|1|a|
|r2|f1|2|b|
|r3|f1|3|c|
|r2|f1|4|<delete tombstone>|

4. Run import
||rowkey||column family||timestamp||value||
|r1|f1|1|a|
|r2|f1|2|b|
|r3|f1|3|c|
|r2|f1|4|<delete tombstone>|
|r1|f1|1|a|
|r2|f1|2|b|
|r3|f1|3|c|

From this, if you run a scan on the table, HBase will give you back only the 
following:
||rowkey||column family||timestamp||value||
|r1|f1|1|a|
|r3|f1|3|c|
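
The steps above can be replayed with a small toy model. This is only a sketch 
in plain Python, not HBase code: the `Cell` type and the `scan` function are 
hypothetical names made up for illustration, and the model assumes the 
simplified rule that a delete tombstone hides every cell of the same rowkey 
and column family whose timestamp is less than or equal to the tombstone's.

```python
from typing import List, NamedTuple, Optional


class Cell(NamedTuple):
    row: str
    family: str
    timestamp: int
    value: Optional[str]  # None stands for a delete tombstone in this toy model


def scan(cells: List[Cell]) -> List[Cell]:
    """Return the latest visible cell per (row, family), honoring tombstones."""
    # Newest tombstone timestamp seen for each (row, family).
    tombstones = {}
    for c in cells:
        if c.value is None:
            key = (c.row, c.family)
            tombstones[key] = max(tombstones.get(key, c.timestamp), c.timestamp)

    latest = {}
    for c in cells:
        if c.value is None:
            continue
        key = (c.row, c.family)
        # Assumed rule: a cell at or below the tombstone timestamp is hidden.
        if key in tombstones and c.timestamp <= tombstones[key]:
            continue
        if key not in latest or c.timestamp > latest[key].timestamp:
            latest[key] = c
    return sorted(latest.values())


# 1. Original state, which is also what the export captures in step 2.
exported = [Cell("r1", "f1", 1, "a"), Cell("r2", "f1", 2, "b"), Cell("r3", "f1", 3, "c")]
table = list(exported)

# 3. Delete r2: a tombstone with a newer timestamp is added.
table.append(Cell("r2", "f1", 4, None))

# 4. Import: the exported cells come back with their ORIGINAL timestamps.
table.extend(exported)

print([(c.row, c.value) for c in scan(table)])  # [('r1', 'a'), ('r3', 'c')]
```

In this model the re-imported r2 cell still carries timestamp 2, which is 
older than the tombstone at timestamp 4, so it stays hidden. A cell for r2 
with a timestamp newer than the tombstone would be visible again.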

 

> Hbase Export/Import via MapReduce job
> -------------------------------------
>
>                 Key: HBASE-26451
>                 URL: https://issues.apache.org/jira/browse/HBASE-26451
>             Project: HBase
>          Issue Type: Bug
>          Components: backup&restore
>    Affects Versions: 2.0.1
>            Reporter: Christos Dosis
>            Priority: Major
>
> Hi Hbase support team,
>  
> While using the MapReduce job with export/import commands we have the below 
> behaviour.
>  
> {+}Step1{+}:  I have a hbase table(tsdb) with 3 rows. and export the table 
> like below:
> /opt/hbase/bin/hbase org.apache.hadoop.hbase.mapreduce.Export tsdb 
> file:///opt/hbase/backup/tsdb
> {+}Step2{+}: Then I delete 1 row.
> {+}Step3{+}: Then I import tsdb table from the exported data from Step 1.
> /opt/hbase/bin/hbase org.apache.hadoop.hbase.mapreduce.Import tsdb 
> file:///opt/hbase/backup/tsdb
>  
> There are still only 2 rows in the table. Is this a valid behaviour?
>  
> Br,
> Chris



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
