[
https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396510#comment-13396510
]
Karthik Ranganathan commented on HBASE-5509:
--------------------------------------------
@Lars - I ripped out some code that used hardlinking, which we have implemented
internally. I believe we are planning on open-sourcing this; otherwise you'd
have to wait for native hardlinks. The current copy approach still works,
though, for a few tens of TB.
> MR based copier for copying HFiles (trunk version)
> --------------------------------------------------
>
> Key: HBASE-5509
> URL: https://issues.apache.org/jira/browse/HBASE-5509
> Project: HBase
> Issue Type: Sub-task
> Components: documentation, regionserver
> Reporter: Karthik Ranganathan
> Assignee: Lars Hofhansl
> Attachments: 5509-v2.txt, 5509.txt
>
>
> This copier is a modification of the distcp tool in HDFS. It does the
> following:
> 1. List out all the regions in the HBase cluster for the required table
> 2. Write the above out to a file
> 3. Each mapper
> 3.1 lists all the HFiles for a given region by querying the regionserver
> 3.2 copies all the HFiles
> 3.3 outputs success if the copy succeeded, failure otherwise. Failed
> regions are retried in another loop
> 4. Mappers are placed on nodes which have maximum locality for a given region
> to speed up copying
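
The copy-and-retry flow in steps 3.2 and 3.3 can be sketched roughly as below. This is only an illustration, not the code in the attached patches: it uses local directories as a stand-in for HDFS, runs sequentially instead of as MapReduce mappers, and does not model the locality-based placement in step 4. The names `copy_region`, `copy_table`, and `max_rounds` are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def copy_region(region_dir: Path, dest_root: Path) -> bool:
    """Copy every file in one region directory; return True on success.

    Stands in for one mapper's work on one region (steps 3.1 to 3.3).
    """
    try:
        target = dest_root / region_dir.name
        target.mkdir(parents=True, exist_ok=True)
        # Local files stand in for the region's HFiles (assumption).
        for hfile in sorted(p for p in region_dir.iterdir() if p.is_file()):
            shutil.copy2(hfile, target / hfile.name)
        return True
    except OSError:
        # Report failure so the caller can retry this region later.
        return False


def copy_table(region_dirs, dest_root: Path, max_rounds: int = 3):
    """Copy all regions, retrying failed ones in later rounds.

    Returns the regions that still failed after max_rounds attempts.
    """
    pending = list(region_dirs)
    for _ in range(max_rounds):
        if not pending:
            break
        # Keep only the regions whose copy reported failure.
        pending = [r for r in pending if not copy_region(r, dest_root)]
    return pending


def demo():
    """Build two tiny fake regions, copy them, and report the result."""
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp, "src"), Path(tmp, "dst")
        for region in ("region-a", "region-b"):
            (src / region).mkdir(parents=True)
            (src / region / "hfile-1").write_bytes(b"data")
        failed = copy_table(sorted(src.iterdir()), dst)
        copied = sorted(p.name for p in dst.iterdir())
        return failed, copied
```

Calling `demo()` copies both fake regions and returns an empty list of failed regions. A region whose source directory cannot be read stays in the returned list after the final round, which mirrors the "failed regions are retried in another loop" behaviour described above.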