Under-replicated blocks are still consistent from a consumer's point of view. Care to explain the relation of weak consistency to Hadoop?
Thanks,
Rahul

On Wed, Sep 4, 2013 at 9:56 AM, Rahul Bhattacharjee <[email protected]> wrote:

> Adam's response makes more sense to me: offline-replicate generated data
> from one cluster to another across data centers.
>
> I am not sure whether a configurable block placement policy is supported
> in Hadoop. If yes, then along with rack awareness you should be able to
> achieve the same.
>
> I could not follow your question related to weak consistency.
>
> Thanks,
> Rahul
>
>
> On Wed, Sep 4, 2013 at 2:20 AM, Baskar Duraikannu <[email protected]> wrote:
>
>> Rahul,
>> Are you talking about the rack-awareness script?
>>
>> I did go through rack awareness. Here are the problems with rack
>> awareness with respect to my (given) business requirement:
>>
>> 1. By default, Hadoop places two copies on the same rack and one copy on
>> some other rack. This would work as long as we have two data centers; if
>> the business wants three data centers, the data would not be spread
>> across all of them. Separately, there is a question around whether that is
>> the right thing to do or not. The business has promised to buy enough
>> bandwidth that the data centers will be only a few milliseconds apart (in
>> latency).
>>
>> 2. I believe Hadoop automatically re-replicates data if one or more nodes
>> are down. Assume one out of two data centers goes down: there will be a
>> massive data flow to create additional copies. When I say "data center
>> support", I mean I should be able to configure Hadoop to:
>> a) Maintain one copy per data center.
>> b) If any data center goes down, not create additional copies.
>>
>> The requirements I am pointing at would essentially move Hadoop from a
>> strongly consistent to a weak/eventually consistent model. Since this
>> changes the fundamental architecture, it would probably break all sorts
>> of things... it might never be possible in Hadoop.
>>
>> Thoughts?
>>
>> Sadak,
>> Is there a way to implement the above requirement via Federation?
>>
>> Thanks,
>> Baskar
>>
>> ------------------------------
>> Date: Sun, 1 Sep 2013 00:20:04 +0530
>> Subject: Re: Multidata center support
>> From: [email protected]
>> To: [email protected]
>>
>> What do you think, friends? I think Hadoop clusters can run on multiple
>> data centers using federation.
>>
>>
>> On Sat, Aug 31, 2013 at 8:39 PM, Visioner Sadak <[email protected]> wrote:
>>
>> The only problem, I guess, is that Hadoop won't be able to replicate data
>> from one data center to another. But I guess I can identify datanodes or
>> namenodes from another data center; correct me if I am wrong.
>>
>>
>> On Sat, Aug 31, 2013 at 7:00 PM, Visioner Sadak <[email protected]> wrote:
>>
>> Let's say you have some machines in Europe and some in the US. I think
>> you just need the IPs and to configure them in your cluster setup; it
>> will work...
>>
>>
>> On Sat, Aug 31, 2013 at 7:52 AM, Jun Ping Du <[email protected]> wrote:
>>
>> Hi,
>> Although you can set a data center layer in your network topology, it is
>> never enabled in Hadoop due to the lack of replica placement and task
>> scheduling support. There is some work to add layers other than rack and
>> node under HADOOP-8848, but it may not suit your case. I agree with Adam
>> that a cluster spanning multiple data centers does not seem to make
>> sense, even for the DR case. Do you have other cases for such a
>> deployment?
>>
>> Thanks,
>> Junping
>>
>> ------------------------------
>> *From:* "Adam Muise" <[email protected]>
>> *To:* [email protected]
>> *Sent:* Friday, August 30, 2013 6:26:54 PM
>> *Subject:* Re: Multidata center support
>>
>> Nothing has changed. DR best practice is still one (or more) clusters
>> per site, with replication handled via distributed copy (DistCp) or some
>> variation of it. A cluster spanning multiple data centers is a poor idea
>> right now.
>>
>>
>> On Fri, Aug 30, 2013 at 12:35 AM, Rahul Bhattacharjee <[email protected]> wrote:
>>
>> My take on this.
>>
>> Why does Hadoop have to know about the data center at all? I think it
>> can be installed across multiple data centers; however, topology
>> configuration would be required to tell which node belongs to which data
>> center and switch, for block placement.
>>
>> Thanks,
>> Rahul
>>
>>
>> On Fri, Aug 30, 2013 at 12:42 AM, Baskar Duraikannu <[email protected]> wrote:
>>
>> We have a need to set up Hadoop across data centers. Does Hadoop support
>> a multi-data-center configuration? I searched through the archives and
>> found that Hadoop did not support multi-data-center configuration some
>> time back. Just wanted to see whether the situation has changed.
>>
>> Please help.
>>
>>
>> --
>> Adam Muise
>> Solution Engineer, Hortonworks
>> [email protected]
>> 416-417-4037
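[Editor's note] The topology script is wired in through `core-site.xml` via the `net.topology.script.file.name` property. A sketch, assuming the script lives at a path of your choosing (the path below is an example, not a requirement):

```xml
<!-- core-site.xml: point HDFS at the topology script (example path) -->
<property>
  <name>net.topology.script.file.name</name>
  <value>/etc/hadoop/conf/topology.sh</value>
</property>
```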
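[Editor's note] For context on the rack-awareness script discussed above: Hadoop invokes a user-supplied topology script with node addresses as arguments and expects one topology path per argument on stdout. Below is a minimal sketch; the subnet-to-datacenter mapping (10.1.x for dc1, 10.2.x for dc2) is invented for illustration and would need to match your actual network.

```shell
#!/bin/sh
# Hypothetical topology script. Hadoop passes one or more node IPs or
# hostnames as arguments; we must print one /datacenter/rack path per input.
resolve() {
  for node in "$@"; do
    case "$node" in
      10.1.*) echo "/dc1/rack1" ;;       # assumed subnet for data center 1
      10.2.*) echo "/dc2/rack1" ;;       # assumed subnet for data center 2
      *)      echo "/default-rack" ;;    # fall back for unknown nodes
    esac
  done
}
resolve "$@"
```

Note that even with a datacenter-shaped topology like this, the default placement policy only distinguishes "same rack" from "off rack", which is exactly the limitation Jun Ping describes.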
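[Editor's note] Adam's recommended pattern, one independent cluster per site with replication via DistCp, can be sketched as below. The namenode hostnames (`nn-dc1`, `nn-dc2`), port, and path are invented placeholders; the script only prints the command it would run, since no live cluster is assumed here.

```shell
#!/bin/sh
# Hypothetical cross-data-center replication job using DistCp.
# Source and destination are separate clusters, one per data center.
SRC="hdfs://nn-dc1:8020/data"   # assumed namenode in data center 1
DST="hdfs://nn-dc2:8020/data"   # assumed namenode in data center 2

# -update copies only files that differ, so repeated runs are incremental.
CMD="hadoop distcp -update $SRC $DST"

# Print rather than execute, so the sketch is runnable without a cluster.
echo "$CMD"
```

In practice this would run from a scheduler (cron, Oozie, etc.) on whatever cadence the business's recovery-point objective requires.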
