[ https://issues.apache.org/jira/browse/HDFS-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848745#comment-13848745 ]
Jerry Chen commented on HDFS-5442:
----------------------------------
We are working on this and are committed to getting the initial patch out soon.
> Zero loss HDFS data replication for multiple datacenters
> --------------------------------------------------------
>
> Key: HDFS-5442
> URL: https://issues.apache.org/jira/browse/HDFS-5442
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Avik Dey
> Attachments: Disaster Recovery Solution for Hadoop.pdf
>
>
> Hadoop is architected to operate efficiently at scale for normal hardware
> failures within a datacenter. Hadoop is not designed today to handle
> datacenter failures. Although HDFS is neither designed for nor deployed in
> configurations spanning multiple datacenters, replicating data from one
> location to another is common practice for disaster recovery and global
> service availability. There are current solutions available for batch
> replication using data copy/export tools. However, while providing some
> backup capability for HDFS data, they do not provide the capability to
> recover all your HDFS data from a datacenter failure and be up and running
> again with a fully operational Hadoop cluster in another datacenter in a
> matter of minutes. For disaster recovery from a datacenter failure, we should
> provide a fully distributed, zero-data-loss, low-latency, high-throughput, and
> secure HDFS data replication solution for multi-datacenter setups.
> Design and code for Phase-1 to follow soon.