[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144865#comment-16144865
 ] 

Jingcheng Du commented on HBASE-18693:
--------------------------------------

Right, an HDFS move doesn't copy the data; it is supposed to be a rename
operation.
My concern is that it is possible to restore the same snapshot twice. How
would we handle that case?

In HBase, regular compaction already compacts the hfile links, so I think
compacting hfile links in MOB compaction is reasonable too.
Alternatively, we could skip the hfile links in most MOB compactions and
compact the links at a longer interval (say, once a month).
I prefer the first option. What do you think? Thanks.
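
To make the second option concrete, here is a minimal sketch of an
interval-based check. It is only an illustration: the class
LinkCompactionPolicy and its methods are hypothetical names, not existing
HBase code, and the 30-day interval is just the "month" mentioned above.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of HBase: decides whether a MOB compaction
// run should also pick up hfile links.
public class LinkCompactionPolicy {
    private final Duration linkInterval;
    private Instant lastLinkCompaction;

    public LinkCompactionPolicy(Duration linkInterval, Instant lastLinkCompaction) {
        this.linkInterval = linkInterval;
        this.lastLinkCompaction = lastLinkCompaction;
    }

    // True once at least linkInterval has passed since links were last compacted.
    public boolean shouldIncludeLinks(Instant now) {
        return !now.isBefore(lastLinkCompaction.plus(linkInterval));
    }

    // Record that a run which included the links has completed.
    public void markLinksCompacted(Instant now) {
        lastLinkCompaction = now;
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2017-08-01T00:00:00Z");
        LinkCompactionPolicy policy =
            new LinkCompactionPolicy(Duration.ofDays(30), start);
        // One week later: links are skipped.
        System.out.println(policy.shouldIncludeLinks(start.plus(Duration.ofDays(7))));  // false
        // Thirty days later: links are included again.
        System.out.println(policy.shouldIncludeLinks(start.plus(Duration.ofDays(30)))); // true
    }
}
```

With a check like this, most runs would never touch the links and so would
not pay the per-link cost; only the occasional long-interval run would.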




> adding an option to restore_snapshot to move mob files from archive dir to 
> working dir
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-18693
>                 URL: https://issues.apache.org/jira/browse/HBASE-18693
>             Project: HBase
>          Issue Type: Improvement
>          Components: mob
>    Affects Versions: 2.0.0-alpha-2
>            Reporter: huaxiang sun
>            Assignee: huaxiang sun
>
> Today, there is a single mob region where mob files for all user regions are 
> saved. There could be many files (one million) in a single mob directory. 
> When one mob table is restored or cloned from snapshot, links are created for 
> these mob files. This creates a scaling issue for mob compaction. In mob 
> compaction's select() logic, for each hFileLink, it needs to call NN's 
> getFileStatus() to get the size of the linked hfile. Assuming one such 
> call takes 20ms, 20ms * 1000000 = 20000 seconds, or about 5.6 hours. 
> To avoid this overhead, we want to add an option so that restore_snapshot can 
> move mob files from archive dir to working dir. clone_snapshot is more 
> complicated as it can clone a snapshot to a different table so moving that 
> can destroy the snapshot. No option will be added for clone_snapshot.
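
The cost estimate in the description can be checked with a few lines of
Java. This is only a sketch of the arithmetic: the class and method names
are made up for illustration, and the 20ms per-call latency and the one
million files are the assumptions stated in the description, not measured
values.

```java
import java.time.Duration;

// Illustrative only: estimates the time spent on one NameNode
// getFileStatus() call per hfile link, made one after another.
public class MobSelectCostEstimate {

    // Total time for fileCount sequential calls of perCallMillis each.
    static Duration estimate(long fileCount, long perCallMillis) {
        return Duration.ofMillis(Math.multiplyExact(fileCount, perCallMillis));
    }

    public static void main(String[] args) {
        Duration total = estimate(1_000_000L, 20L);
        double hours = total.getSeconds() / 3600.0;
        System.out.println(total.getSeconds() + " seconds, " + hours + " hours");
    }
}
```

One million calls at 20ms each come to 20,000 seconds, a little over five
and a half hours, spent in select() alone before any data is rewritten.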



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
