[
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16147677#comment-16147677
]
huaxiang sun commented on HBASE-18693:
--------------------------------------
Hi Jingcheng,
{quote}
Restoring a snapshot to the same table is okay. What if we try to restore the
snapshot in another table? The same MOB file can be in different locations? No,
right?
{quote}
I see your concern. restore_snapshot always restores to the same
table, which is why I am adding the option there. clone_snapshot is a different
story: it can clone to a different table. If the option were added to
clone_snapshot, it would corrupt the snapshot.
{quote}
You are right, this is a problem. How about select files with multiple threads,
each thread handle part of the files selection? Thanks.
{quote}
HBASE-17043 has been created for that effort. I think it is not enough on its
own, and it adds overhead (pressure on the NN). We need to give users an option
in this case.
If this option looks good to you, I will post a patch.
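To illustrate the point about NN pressure, here is a toy model, not HBase code: FakeNameNode and select_files are hypothetical names, and the "size" returned is a placeholder. It splits the link list across threads the way a multi-threaded selection might, and counts the lookups. The wall-clock time can shrink with more threads, but the number of calls hitting the NN stays the same.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


class FakeNameNode:
    """Stand-in for the NameNode; it only counts getFileStatus-style calls."""

    def __init__(self):
        self.calls = 0
        self._lock = Lock()

    def get_file_status(self, path: str) -> int:
        with self._lock:
            self.calls += 1
        return len(path)  # placeholder "size"; a real call returns file metadata


def select_files(nn: FakeNameNode, links: list, num_threads: int) -> int:
    """Split the links into num_threads slices and look up each link once."""
    chunks = [links[i::num_threads] for i in range(num_threads)]

    def scan(chunk):
        return [nn.get_file_status(path) for path in chunk]

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        results = list(pool.map(scan, chunks))
    return sum(len(r) for r in results)


links = [f"mobfile-{i}" for i in range(10_000)]
for threads in (1, 8):
    nn = FakeNameNode()
    select_files(nn, links, threads)
    print(threads, nn.calls)  # the call count is 10000 in both cases
```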
Thanks
> adding an option to restore_snapshot to move mob files from archive dir to
> working dir
> --------------------------------------------------------------------------------------
>
> Key: HBASE-18693
> URL: https://issues.apache.org/jira/browse/HBASE-18693
> Project: HBase
> Issue Type: Improvement
> Components: mob
> Affects Versions: 2.0.0-alpha-2
> Reporter: huaxiang sun
> Assignee: huaxiang sun
>
> Today, there is a single mob region where mob files for all user regions are
> saved. There could be many files (one million) in a single mob directory.
> When one mob table is restored or cloned from snapshot, links are created for
> these mob files. This creates a scaling issue for mob compaction. In mob
> compaction's select() logic, for each hFileLink, it needs to call NN's
> getFileStatus() to get the size of the linked hfile. Assuming one such
> call takes 20ms, 20ms * 1000000 = 20000 seconds, or about 5.5 hours.
> To avoid this overhead, we want to add an option so that restore_snapshot can
> move mob files from archive dir to working dir. clone_snapshot is more
> complicated: it can clone a snapshot to a different table, so moving the
> files could destroy the snapshot. No option will be added for clone_snapshot.
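
The cost estimate in the description can be checked with a few lines of arithmetic. This is only a sketch: the 20ms latency and the one million links are the figures assumed in the description, the calls are taken to be strictly sequential, and estimated_select_time is a made-up helper name.

```python
from datetime import timedelta


def estimated_select_time(num_links: int, latency_ms: float) -> timedelta:
    """Wall-clock time for one sequential getFileStatus() call per link."""
    return timedelta(milliseconds=num_links * latency_ms)


total = estimated_select_time(1_000_000, 20)
print(total.total_seconds())         # 20000.0 seconds
print(total.total_seconds() / 3600)  # a little over 5.5 hours
```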