[ https://issues.apache.org/jira/browse/HDFS-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855361#comment-13855361 ]
Jerry Chen commented on HDFS-5442:
----------------------------------
Hi LiuLei,
Thanks for your comments.

We also considered a "one cluster" approach built purely on the HA concept when
we were designing this. However, we see a clear drive to evolve to a "multiple
clusters" approach based on the same facilities that HA uses. The core facility
that HA provides for the namespace is journaling and tailing, and these can also
be used for namespace replication across clusters rather than being limited to
a single cluster. As we mentioned, we think it is important to have a clear
communication and collaboration boundary between the datacenters. In addition,
we want to support synchronous namespace journaling, which prevents namespace
data loss when the primary cluster goes down.

By the way, we recently improved the journaling workflow in the design and
updated the document. Please feel free to share your comments if you are
interested.
> Zero loss HDFS data replication for multiple datacenters
> --------------------------------------------------------
>
> Key: HDFS-5442
> URL: https://issues.apache.org/jira/browse/HDFS-5442
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Avik Dey
> Attachments: Disaster Recovery Solution for Hadoop.pdf, Disaster
> Recovery Solution for Hadoop.pdf
>
>
> Hadoop is architected to operate efficiently at scale for normal hardware
> failures within a datacenter. Hadoop is not designed today to handle
> datacenter failures. Although HDFS is not designed for nor deployed in
> configurations spanning multiple datacenters, replicating data from one
> location to another is common practice for disaster recovery and global
> service availability. There are current solutions available for batch
> replication using data copy/export tools. However, while providing some
> backup capability for HDFS data, they do not provide the capability to
> recover all your HDFS data from a datacenter failure and be up and running
> again with a fully operational Hadoop cluster in another datacenter in a
> matter of minutes. For disaster recovery from a datacenter failure, we should
> provide a fully distributed, zero data loss, low latency, high throughput and
> secure HDFS data replication solution for multiple datacenter setup.
> Design and code for Phase-1 to follow soon.