Keep a backup of the NameNode metadata before the upgrade, preferably on a 
local machine. Do not finalize the upgrade until you are OK with data 
availability.
Musty 
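
A minimal sketch of such a metadata backup, assuming the hdfs CLI is on the 
PATH and the commands are run as the HDFS superuser. The function name and the 
backup directory are hypothetical. Safe mode makes HDFS read-only while it is 
on, so this is best done in a maintenance window.

```shell
# Sketch: save a local copy of the NameNode fsimage before upgrading.
# backup_nn_metadata and its directory argument are illustrative names.
backup_nn_metadata() {
    backup_dir="$1"
    mkdir -p "$backup_dir" || return 1

    # saveNamespace only works in safe mode; it writes a fresh checkpoint
    # so the fsimage includes the latest edits.
    hdfs dfsadmin -safemode enter || return 1
    hdfs dfsadmin -saveNamespace || return 1

    # Download the most recent fsimage from the NameNode to the local dir.
    hdfs dfsadmin -fetchImage "$backup_dir" || return 1

    # If an earlier step failed, the cluster is still in safe mode and
    # has to be taken out of it by hand.
    hdfs dfsadmin -safemode leave
}
```

It could be called as: backup_nn_metadata /path/to/local/backup (placeholder 
path). Copying the dfs.namenode.name.dir directories while the NameNode is 
stopped is another common way to get the same safety net.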

  On Wed, Mar 23, 2016 at 7:09 PM, Ravi Prakash<[email protected]> wrote:   
Hi Chathuri!

Technically there is a rollback option during upgrade. I don't know how well it 
has been tested, but the idea is that the old metadata is not deleted until the 
cluster administrator runs "hdfs dfsadmin -finalizeUpgrade". I'm fairly 
confident that the HDFS upgrade will work smoothly. We have upgraded quite a 
few Hadoop-2.4.1 clusters to Hadoop-2.7.1 successfully (never having to roll 
back). It's your applications that work on top of HDFS and YARN that I'd be 
concerned about.
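
For reference, the three phases of a regular (non-rolling) upgrade can be 
sketched as below. This is only an outline, assuming HADOOP_HOME points at the 
installed release and the cluster is stopped before the start and rollback 
steps. The function name is hypothetical.

```shell
# Sketch of the upgrade / rollback / finalize lifecycle.
# hdfs_upgrade_step is an illustrative wrapper, not a Hadoop command.
hdfs_upgrade_step() {
    case "$1" in
        start)
            # With the NEW release installed: start HDFS in upgrade mode.
            # The previous metadata and block layout are kept on disk.
            "$HADOOP_HOME/sbin/start-dfs.sh" -upgrade
            ;;
        rollback)
            # With the OLD release reinstalled: return to the pre-upgrade
            # state. Only possible while the upgrade is not finalized.
            "$HADOOP_HOME/sbin/start-dfs.sh" -rollback
            ;;
        finalize)
            # Deletes the previous state; there is no rollback afterwards.
            hdfs dfsadmin -finalizeUpgrade
            ;;
        *)
            echo "usage: hdfs_upgrade_step start|rollback|finalize" >&2
            return 2
            ;;
    esac
}
```

The point of keeping finalize as a separate, explicit step is that the 
cluster can run on the new version for a while before the old state is 
discarded.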

HTH
Ravi

On Wed, Mar 23, 2016 at 2:22 PM, Chathuri Wimalasena <[email protected]> 
wrote:

Thanks for the information, Ravi. Is there a way that I can back up the data 
before the update? I was thinking about this approach:
   - Copy the current Hadoop directories to a new set of directories.
   - Point Hadoop to this new set.
   - Start the migration with the backup set.
Please let me know if people have done this upgrade successfully. I believe 
many things can go wrong in a lengthy upgrade like this. The data in the 
cluster is very important.

Thanks,
Chathuri

On Wed, Mar 23, 2016 at 4:37 PM, Ravi Prakash <[email protected]> wrote:

Hi Chathuri!
   
   - When we upgrade, does it change the namenode data structures and data 
nodes? I assume it only changes the name node...

It changes the NN as well as DN layout. As a matter of fact, this upgrade will 
take a long time on Datanodes as well because of 
https://issues.apache.org/jira/browse/HDFS-6482

   
   - What are the risks with this upgrade ?    


What Hadoop applications do you run on top of your cluster? The hope is that 
everything continues working smoothly for the most part, but inevitably some 
backward incompatible changes creep in. 

   
   - Is there a place where I can review the changes made to file system from 
2.5.1 to 2.7.2?

See the release notes at http://hadoop.apache.org/releases.html. You'd have to 
accumulate the changes across all the versions in between.


Practically, I'd try running your applications on an upgraded test cluster.
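
Alongside the application tests, a block-level check before finalizing might 
look like the sketch below. It assumes the hdfs CLI is on the PATH. The 
function name and the report file are hypothetical, and it relies on fsck 
ending its report with a line saying the filesystem "is HEALTHY".

```shell
# Sketch: run fsck, keep the full report, and succeed only if HDFS
# reports the path as healthy. check_hdfs_health is an illustrative name.
check_hdfs_health() {
    report="$1"        # local file that receives the fsck report
    path="${2:-/}"     # HDFS path to check, defaults to the root

    hdfs fsck "$path" > "$report" 2>&1

    # A healthy filesystem ends the report with "... is HEALTHY".
    grep -q "is HEALTHY" "$report"
}
```

Running it once before the upgrade and once after gives two reports that can 
be compared by hand for missing or under-replicated blocks.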

HTH


Ravi


On Wed, Mar 23, 2016 at 12:17 PM, Chathuri Wimalasena <[email protected]> 
wrote:

Hi,
We have a Hadoop production deployment with 1 name node and 10 data nodes, 
which holds more than 20TB of data in HDFS. We are currently using Hadoop 2.5.1 
and we want to update it to the latest Hadoop version, 2.7.2.
I followed this guide 
(https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html)
and updated a single-node system running in pseudo-distributed mode, and it 
went without any issues. But that system did not have as much data as the 
production system.
Since this is a production system, I'm reluctant to do this update. I would 
like to hear what other people have done in these cases and what their 
experiences were.
Here are a few questions I have:
   - When we upgrade, does it change the namenode data structures and data 
nodes? I assume it only changes the name node...
   - What are the risks with this upgrade?
   - Is there a place where I can review the changes made to the file system 
from 2.5.1 to 2.7.2?
I would really appreciate it if you could share your experiences.

Thanks in advance,
Chathuri





  
