Hi Chathuri!

> When we upgrade, does it change the namenode data structures and data nodes? I assume it only changes the name node...

It changes the NN layout as well as the DN layout. In fact, this upgrade will also take a long time on the Datanodes because of https://issues.apache.org/jira/browse/HDFS-6482 (the move to a block ID-based block layout on Datanodes).

> What are the risks with this upgrade?

What Hadoop applications do you run on top of your cluster? The hope is that everything continues working smoothly for the most part, but inevitably some backward-incompatible changes creep in.

> Is there a place where I can review the changes made to file system from 2.5.1 to 2.7.2?

The release notes: http://hadoop.apache.org/releases.html. You'd have to accumulate the changes across all the versions in between. In practice, I'd try running your applications on an upgraded test cluster first.

HTH
Ravi

On Wed, Mar 23, 2016 at 12:17 PM, Chathuri Wimalasena <[email protected]> wrote:

> Hi,
>
> We have a Hadoop production deployment with 1 name node and 10 data nodes
> which has more than 20TB of data in HDFS. We are currently using Hadoop
> 2.5.1 and we want to update it to the latest Hadoop version, 2.7.2.
>
> I followed this link (
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html)
> and updated a single-node system running in pseudo-distributed mode, and it
> went without any issues. But this system did not have as much data as the
> production system.
>
> Since this is a production system, I'm reluctant to do this update. I
> would like to see what other people have done in these cases and their
> experiences... Here are a few questions I have:
>
> - When we upgrade, does it change the namenode data structures and
> data nodes? I assume it only changes the name node...
> - What are the risks with this upgrade?
> - Is there a place where I can review the changes made to file system
> from 2.5.1 to 2.7.2?
>
> I would really appreciate it if you could share your experiences.
>
> Thanks in advance,
> Chathuri
