On 11 September 2012 00:29, Sujee Maniyam <su...@sujee.net> wrote:
> Hi devs,
> now that HDFS HA is a reality, how about HDFS spanning multiple
> data centers? Are there any discussions / work going on in this area?
>
> It could be a single cluster spanning multiple data centers or having
> a 'standby cluster' in another data center.
>
> curious, and thanks for your time!
>
> regards
> Sujee Maniyam
> http://sujee.net

What are your goals here?
- store 1 of the 3 replicas off-site for (possible) recovery on a site failure
- store 2+ replicas on each site for better recovery of site+block failure
- be able to back up all of the data to a different site
- be able to back up some of the data to a different site
- stream the metadata/NN log to a remote site (you could get away with that today)

Or do you plan to have different data across sites and then run MR jobs across them? This would be an interesting problem, but it's way above the FS.

There's still a lot of work that could be done for single-site failure tolerance, in particular:
- better failure topology awareness: if you run the site on two external power supplies, as telcos do, then you want at least one copy on each power source
- better partition failure awareness
- distinguish "loss of rack" from "all the machines on the rack have stopped reporting in", which is how it is treated today

-steve
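
To make the "at least one copy on each power source" point concrete, here is a minimal sketch of the check a power-aware placement policy would need. It is not an HDFS API: the node names, the feed labels, and the function `feeds_missing_replica` are all made up for illustration, and it assumes every datanode is fed from exactly one power source.

```python
# Hypothetical datanode -> power feed map (assumption: one feed per node).
# Nothing here is part of HDFS; the names exist only for this sketch.
POWER_FEED = {
    "dn1": "feed-A",
    "dn2": "feed-A",
    "dn3": "feed-B",
    "dn4": "feed-B",
}


def feeds_missing_replica(replica_nodes, power_feed=POWER_FEED):
    """Return the sorted power feeds that hold no replica of a block.

    An empty list means the block survives the loss of any single feed.
    """
    all_feeds = set(power_feed.values())
    covered = {power_feed[node] for node in replica_nodes}
    return sorted(all_feeds - covered)
```

A block with replicas on dn1 and dn2 only would report feed-B as uncovered, while replicas on dn1 and dn3 would report nothing missing. A real policy would fold this into replica placement rather than check after the fact.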