Re: Multi node maintenance for HDFS?

Olivier Renault Thu, 16 Jun 2016 23:51:06 -0700

Sorry, it's not suitable, as I take the cluster down.

Thanks,
Olivier




On Fri, Jun 17, 2016 at 3:00 AM +0100, "Stephan Hoermann" <[email protected]> wrote:

Hi Olivier,

Do your instructions apply to an online cluster? I couldn't see where they deal 
with keeping the data online/available while the upgrade is happening.

I'm sorry I wasn't very precise. We don't just want to avoid data loss but also 
do the upgrade while the cluster is running and avoid unavailability of data. 
We only want 1 copy of each block offline while we are upgrading.

The ticket Anu linked to seems to address our use case.

Thank you,

Stephan

On Fri, 17 Jun 2016 at 03:21 Olivier Renault <[email protected]> wrote:
Hi Stephan,

As it happens, I've been working on this for the last two days. It was much 
easier than I was expecting.

https://community.hortonworks.com/articles/40126/hdp-upgrade-using-reinstallation.html

Let me know if you've got any questions.

Kind regards,
Olivier

From: Stephan Hoermann <[email protected]>
Date: Thursday, 16 June 2016 at 04:29
To: "[email protected]" <[email protected]>
Subject: Multi node maintenance for HDFS?

Hi,

How do people do multi node maintenance for HDFS without data loss?

We want to apply the ideas of immutable infrastructure to how we manage our 
machines. We prebuild an OS image with the configuration and roll it out to our 
nodes. When we have a patch we build a new image and roll that out again. It 
takes us about 10 to 15 minutes to do that.

For our data nodes we want to keep the data on a separate partition/disks so 
that when we rebuild, we rejoin HDFS with the data and don't start a 
replication storm.

Now, in order to scale this and roll out upgrades quickly, we can't really do a 
one-node-at-a-time upgrade, so we need to be able to take out a percentage of 
the nodes at a time. Ideally we would like to do this while keeping the 
replication count of each block at 2 (so we can still handle failure while we 
are doing an upgrade) and without starting a replication storm.
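
To make the constraint concrete, here is a minimal sketch of the check we 
would want before taking a batch of nodes offline: every block must keep at 
least 2 live replicas. The block map, node names, and the batch_is_safe helper 
are hypothetical illustrations, not an HDFS API; in practice the block-to-node 
mapping would have to come from the NameNode.

```python
from typing import Dict, Iterable, Set


def batch_is_safe(
    block_replicas: Dict[str, Set[str]],
    offline: Iterable[str],
    min_live: int = 2,
) -> bool:
    """Return True if every block keeps at least min_live replicas
    on nodes that are NOT in the offline batch."""
    offline_set = set(offline)
    return all(
        len(nodes - offline_set) >= min_live
        for nodes in block_replicas.values()
    )


# Hypothetical block map: block ID -> data nodes holding a replica.
blocks = {
    "blk_1": {"dn1", "dn2", "dn3"},
    "blk_2": {"dn2", "dn4", "dn5"},
    "blk_3": {"dn1", "dn4", "dn6"},
}

# No block has replicas on both dn1 and dn5, so this batch is safe.
print(batch_is_safe(blocks, ["dn1", "dn5"]))  # True

# blk_1 lives on both dn1 and dn2, so it would drop to one live replica.
print(batch_is_safe(blocks, ["dn1", "dn2"]))  # False
```

With arbitrary block placement, finding large batches that pass this check is 
the hard part, which is why we are looking at constraining placement instead.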

Right now it doesn't look like that is really supported. Is anyone else doing 
multi node upgrades and how do you solve these problems?

We are considering changing the replication strategy so that we divide all our 
nodes into 3 evenly sized buckets and at maintenance remove a subset from one 
bucket at a time. Does anyone have experience with doing something similar?
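
Roughly, the bucket idea could be sketched as below. This assumes a placement 
policy that puts exactly one replica of every block in each of the 3 buckets, 
which is the change we are considering and not something we have in place. 
The function names, node names, and batch size are made up for illustration.

```python
from typing import Dict, List


def assign_buckets(nodes: List[str], n_buckets: int = 3) -> Dict[int, List[str]]:
    """Spread nodes round-robin over n_buckets evenly sized buckets."""
    buckets: Dict[int, List[str]] = {i: [] for i in range(n_buckets)}
    for idx, node in enumerate(sorted(nodes)):
        buckets[idx % n_buckets].append(node)
    return buckets


def maintenance_batches(
    buckets: Dict[int, List[str]], batch_size: int
) -> List[List[str]]:
    """Build rebuild batches that never span more than one bucket.

    Under the one-replica-per-bucket assumption, a batch drawn from a
    single bucket takes at most one replica of any block offline.
    """
    batches: List[List[str]] = []
    for bucket_id in sorted(buckets):
        members = buckets[bucket_id]
        for start in range(0, len(members), batch_size):
            batches.append(members[start:start + batch_size])
    return batches


nodes = [f"dn{i}" for i in range(1, 10)]  # hypothetical nodes dn1..dn9
buckets = assign_buckets(nodes)
batches = maintenance_batches(buckets, batch_size=2)

for batch in batches:
    print(batch)
```

Each batch would be rebuilt and rejoined before the next one starts, so the 
other two buckets always hold two live replicas of every block.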

Regards,

Stephan
