Avik Dey created HDFS-5442:
------------------------------

             Summary: Zero loss HDFS data replication for multiple datacenters
                 Key: HDFS-5442
                 URL: https://issues.apache.org/jira/browse/HDFS-5442
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Avik Dey


Hadoop is architected to operate efficiently at scale and to tolerate normal 
hardware failures within a datacenter, but it is not designed today to survive 
the failure of an entire datacenter. Although HDFS is neither designed for nor 
deployed in configurations spanning multiple datacenters, replicating data from 
one location to another is common practice for disaster recovery and global 
service availability. Current solutions handle batch replication using data 
copy/export tools such as DistCp. While these provide some backup capability 
for HDFS data, they cannot recover all HDFS data after a datacenter failure and 
bring a fully operational Hadoop cluster back up in another datacenter within 
minutes. For disaster recovery from a datacenter failure, we should provide a 
fully distributed, zero-data-loss, low-latency, high-throughput, and secure 
HDFS data replication solution for multi-datacenter setups.
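
To make the gap concrete, below is a minimal sketch of what today's batch 
copy/export approach looks like at the Hadoop FileSystem API level. It is an 
illustration only, not the proposed solution; the namenode URIs and paths are 
hypothetical placeholders. A real tool such as DistCp runs the copy as a 
parallel MapReduce job, but the point-in-time, pull-based nature is the same: 
any data written after the copy starts is unprotected if the source datacenter 
fails.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

import java.net.URI;

/**
 * Illustration of the current batch-copy approach to cross-datacenter
 * "replication". The result is a point-in-time snapshot; writes that
 * arrive after the copy begins are not protected. The cluster URIs and
 * paths below are hypothetical placeholders.
 */
public class BatchHdfsCopy {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Source cluster in the primary datacenter (hypothetical address).
        FileSystem srcFs = FileSystem.get(URI.create("hdfs://dc1-nn:8020"), conf);
        // Target cluster in the backup datacenter (hypothetical address).
        FileSystem dstFs = FileSystem.get(URI.create("hdfs://dc2-nn:8020"), conf);

        Path srcRoot = new Path("/data");
        Path dstRoot = new Path("/backup/data");

        // Copy everything under /data. Point-in-time only: there is no
        // ordering or durability guarantee for concurrent writes.
        for (FileStatus status : srcFs.listStatus(srcRoot)) {
            FileUtil.copy(srcFs, status.getPath(),
                          dstFs, new Path(dstRoot, status.getPath().getName()),
                          false /* deleteSource */, conf);
        }

        srcFs.close();
        dstFs.close();
    }
}

By contrast, the replication proposed here would ship writes continuously as 
they happen, rather than as after-the-fact bulk copies, which is what makes 
zero data loss and fast failover to another datacenter possible.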



--
This message was sent by Atlassian JIRA
(v6.1#6144)
