[ 
https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223598#comment-13223598
 ] 

Karthik Ranganathan commented on HBASE-5509:
--------------------------------------------

@Zhihong Yu:
We use this code as the primary means to back up HFiles inside FB. We have made 
a lot of improvements to the DFS copy underneath, and they have caused some 
bugs, but that's unrelated to this code. There have not been many issues with 
the copier itself, besides tuning the number of mappers so that we don't 
overwhelm a running system.

@Lars:
You are correct about getStoreFileList() - it is passed from the command line 
and is overloaded for a subset of the CFs or all of them. Zhihong - the list 
versus a comma-separated string is a trivial point, since the list construction 
has to happen either in the RS or in the caller, so it should not make much of 
a difference in practice.
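
To illustrate why the choice is minor, here is a small sketch of the list 
construction in Python. The helper name parse_column_families and the 
convention that an empty value means "all CFs" are assumptions made for this 
example; they are not taken from the patch or from the HBase API.

```python
def parse_column_families(arg):
    """Turn a comma-separated command-line value into a list of CF names.

    Returns None when no CF is named, which this sketch treats as
    "all column families" (an assumed convention, not HBase behaviour).
    """
    if arg is None:
        return None
    # Split on commas, trim whitespace, and drop empty entries.
    names = [cf.strip() for cf in arg.split(",") if cf.strip()]
    return names or None
```

Whichever side runs this (the RS or the caller), the work is the same handful 
of lines, which is the point being made above.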
                
> MR based copier for copying HFiles (trunk version)
> --------------------------------------------------
>
>                 Key: HBASE-5509
>                 URL: https://issues.apache.org/jira/browse/HBASE-5509
>             Project: HBase
>          Issue Type: Sub-task
>          Components: documentation, regionserver
>            Reporter: Karthik Ranganathan
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0, 0.96.0
>
>         Attachments: 5509-v2.txt, 5509.txt
>
>
> This copier is a modification of the distcp tool in HDFS. It does the 
> following:
> 1. Lists all the regions in the HBase cluster for the required table
> 2. Writes that list out to a file
> 3. Each mapper 
>    3.1 lists all the HFiles for a given region by querying the regionserver
>    3.2 copies all the HFiles
>    3.3 outputs success if the copy succeeded, failure otherwise. Failed 
>        regions are retried in another loop
> 4. Mappers are placed on nodes which have maximum locality for a given region 
>    to speed up copying
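
The steps above can be sketched as follows. This is a simplified, 
single-process illustration in Python and not the code in the attached patch: 
local directories stand in for regions, plain files stand in for HFiles, and 
the per-region copy is called in a loop instead of being run as a MapReduce 
mapper. The function names and the max_attempts parameter are hypothetical, 
and locality-aware mapper placement (step 4) is not modelled.

```python
import shutil
from pathlib import Path


def list_regions(table_dir):
    """Step 1: each subdirectory of the table directory stands in for a region."""
    return sorted(p.name for p in Path(table_dir).iterdir() if p.is_dir())


def write_region_list(regions, list_file):
    """Step 2: write the region list to a file, one region per line."""
    Path(list_file).write_text("".join(r + "\n" for r in regions))


def copy_region(table_dir, region, dest_dir):
    """Step 3: list a region's files, copy them, and report success or failure."""
    try:
        source = Path(table_dir) / region
        files = sorted(p for p in source.iterdir() if p.is_file())  # 3.1
        target = Path(dest_dir) / region
        target.mkdir(parents=True, exist_ok=True)
        for f in files:
            shutil.copy2(f, target / f.name)  # 3.2
        return True  # 3.3: success
    except OSError:
        return False  # 3.3: failure, the caller will retry this region


def run_copy(table_dir, dest_dir, list_file, max_attempts=3):
    """Drive steps 1 to 3; regions that fail are retried in another loop."""
    write_region_list(list_regions(table_dir), list_file)
    pending = [r for r in Path(list_file).read_text().splitlines() if r]
    for _ in range(max_attempts):
        pending = [r for r in pending if not copy_region(table_dir, r, dest_dir)]
        if not pending:
            break
    return pending  # regions that still failed after all attempts
```

Calling run_copy(src, dst, "regions.txt") returns an empty list when every 
region was copied; any names it returns are the regions that kept failing.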

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
