Hi Anu,

That was exactly what I was looking for! Do you have any idea when this will be released?

Thank you,
Stephan

On Fri, 17 Jun 2016 at 02:12 Anu Engineer <[email protected]> wrote:

> Hi Stephan,
>
> AFAIK, this is not a solved problem in the version of Hadoop that you are
> using. There is a JIRA that you might be interested in.
>
> Please look at the attached design documents of
> https://issues.apache.org/jira/browse/HDFS-7541 and see
> if it addresses your problem.
>
> Disclaimer: I have never used the feature proposed/implemented in this
> JIRA; however, most of the developers are from Twitter and hopefully this
> is being used internally.
>
> Thanks
> Anu
>
> From: Stephan Hoermann <[email protected]>
> Date: Wednesday, June 15, 2016 at 7:29 PM
> To: "[email protected]" <[email protected]>
> Subject: Multi node maintenance for HDFS?
>
> Hi,
>
> How do people do multi node maintenance for HDFS without data loss?
>
> We want to apply the ideas of immutable infrastructure to how we manage
> our machines. We prebuild an OS image with the configuration and roll it
> out to our nodes. When we have a patch we build a new image and roll that
> out again. It takes us about 10 to 15 minutes to do that.
>
> For our data nodes we want to keep the data on a separate partition/disks
> so that when we rebuild, we rejoin HDFS with the data and don't start a
> replication storm.
>
> Now, in order to scale this and quickly roll out upgrades, we can't really
> do a one-node-at-a-time upgrade, so we need to be able to take out a
> percentage of the nodes at a time. Ideally we would like to do this while
> keeping the replication count of each block at 2 (so we can still handle
> failure while we are doing an upgrade) and without starting a replication
> storm.
>
> Right now it doesn't look like that is really supported. Is anyone else
> doing multi node upgrades, and how do you solve these problems?
>
> We are considering changing the replication strategy so that we divide all
> our nodes into 3 evenly sized buckets and at maintenance remove a subset
> from one bucket at a time. Does anyone have experience with doing something
> similar?
>
> Regards,
>
> Stephan
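
The three-bucket idea in the quoted question can be illustrated with a small simulation. This is only a sketch under stated assumptions, not HDFS code: the function names (assign_buckets, place_block, min_live_replicas) and the node names are hypothetical, and it assumes a placement policy that puts exactly one replica of every block in each bucket. Under that assumption, taking down nodes from a single bucket should leave every block with at least 2 live replicas, while taking nodes from two buckets at once may not.

```python
def assign_buckets(nodes, num_buckets=3):
    """Split nodes round-robin into num_buckets evenly sized buckets."""
    buckets = [[] for _ in range(num_buckets)]
    for i, node in enumerate(sorted(nodes)):
        buckets[i % num_buckets].append(node)
    return buckets


def place_block(block_id, buckets):
    """Assumed placement policy: one replica per bucket.

    The node inside each bucket is chosen deterministically from the
    block id, purely to keep the illustration reproducible.
    """
    return [bucket[block_id % len(bucket)] for bucket in buckets]


def min_live_replicas(num_blocks, buckets, down_nodes):
    """Smallest number of live replicas over all blocks while down_nodes are offline."""
    down = set(down_nodes)
    return min(
        sum(1 for node in place_block(block_id, buckets) if node not in down)
        for block_id in range(num_blocks)
    )


# Hypothetical 9-node cluster split into 3 buckets of 3 nodes each.
nodes = [f"dn{i:02d}" for i in range(9)]
buckets = assign_buckets(nodes)

# Maintenance on two nodes of the same bucket.
same_bucket = buckets[0][:2]
# Maintenance on one node from each of two different buckets.
cross_bucket = [buckets[0][0], buckets[1][0]]

print("same bucket:", min_live_replicas(100, buckets, same_bucket))
print("cross bucket:", min_live_replicas(100, buckets, cross_bucket))
```

The design point is that the guarantee comes from the placement constraint (one replica per bucket), not from how many nodes are taken down: any subset of a single bucket can be offline at once without any block dropping below 2 replicas.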
