Hi All,
I am doing a test migration from Apache Hadoop-1.2.0 to Apache
Hadoop-2.0.6-alpha in a single-node environment.
I did the following:
* Installed Apache Hadoop-1.2.0
* Ran the word-count sample MR jobs; they executed successfully.
* Stopped all the services in Apache Hadoop-1.2.0 and was then able to
start all the services again.
* The previously submitted jobs were still visible in the JobTracker URL
after the stop/start.
Next, I installed Apache Hadoop-2.0.6-alpha alongside it.
In its configuration files I used the SAME data directory locations that were
in Apache Hadoop-1.2.0, namely:
core-site.xml
----------------
hadoop.tmp.dir = /home/cloud/hadoop_migration/hadoop-data/tempdir
hdfs-site.xml
-----------------
dfs.data.dir = /home/cloud/hadoop_migration/hadoop-data/data
dfs.name.dir = /home/cloud/hadoop_migration/hadoop-data/name
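For reference, these settings correspond to entries like the following in the
two files (a sketch; the Hadoop 1.x property names shown here are still
accepted by 2.x as deprecated aliases of dfs.namenode.name.dir and
dfs.datanode.data.dir):

```xml
<!-- core-site.xml -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/cloud/hadoop_migration/hadoop-data/tempdir</value>
</property>

<!-- hdfs-site.xml -->
<property>
  <name>dfs.data.dir</name>
  <value>/home/cloud/hadoop_migration/hadoop-data/data</value>
</property>
<property>
  <name>dfs.name.dir</name>
  <value>/home/cloud/hadoop_migration/hadoop-data/name</value>
</property>
```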
I am UNABLE to start the NameNode from the Apache Hadoop-2.0.6-alpha
installation; I am getting this error:
2013-12-03 18:28:23,941 INFO org.apache.hadoop.metrics2.impl.MetricsConfig:
loaded properties from hadoop-metrics2.properties
2013-12-03 18:28:24,080 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
Scheduled snapshot period at 10 second(s).
2013-12-03 18:28:24,081 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
NameNode metrics system started
2013-12-03 18:28:24,576 WARN
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage
directory (dfs.namenode.name.dir) configured. Beware of dataloss due to lack of
redundant storage directories!
2013-12-03 18:28:24,576 WARN
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one namespace edits
storage directory (dfs.namenode.edits.dir) configured. Beware of dataloss due
to lack of redundant storage directories!
2013-12-03 18:28:24,744 INFO org.apache.hadoop.util.HostsFileReader: Refreshing
hosts (include/exclude) list
2013-12-03 18:28:24,749 INFO
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager:
dfs.block.invalidate.limit=1000
2013-12-03 18:28:24,762 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
dfs.block.access.token.enable=false
2013-12-03 18:28:24,762 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: defaultReplication
= 1
2013-12-03 18:28:24,762 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplication
= 512
2013-12-03 18:28:24,762 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: minReplication
= 1
2013-12-03 18:28:24,763 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
maxReplicationStreams = 2
2013-12-03 18:28:24,763 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
shouldCheckForEnoughRacks = false
2013-12-03 18:28:24,763 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
replicationRecheckInterval = 3000
2013-12-03 18:28:24,763 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: encryptDataTransfer
= false
2013-12-03 18:28:24,771 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner =
cloud (auth:SIMPLE)
2013-12-03 18:28:24,771 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup =
supergroup
2013-12-03 18:28:24,771 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled = true
2013-12-03 18:28:24,771 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: false
2013-12-03 18:28:24,776 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Append Enabled: true
2013-12-03 18:28:25,230 INFO org.apache.hadoop.hdfs.server.namenode.NameNode:
Caching file names occuring more than 10 times
2013-12-03 18:28:25,243 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2013-12-03 18:28:25,244 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
dfs.namenode.safemode.min.datanodes = 0
2013-12-03 18:28:25,244 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
dfs.namenode.safemode.extension = 30000
2013-12-03 18:28:25,288 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock
on /home/cloud/hadoop_migration/hadoop-data/name/in_use.lock acquired by
nodename [email protected]
2013-12-03 18:28:25,462 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
Stopping NameNode metrics system...
2013-12-03 18:28:25,462 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
NameNode metrics system stopped.
2013-12-03 18:28:25,473 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
NameNode metrics system shutdown complete.
2013-12-03 18:28:25,474 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode:
Exception in namenode join
org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected
version of storage directory /home/cloud/hadoop_migration/hadoop-data/name.
Reported: -41. Expecting = -40.
at
org.apache.hadoop.hdfs.server.common.Storage.setLayoutVersion(Storage.java:1079)
at
org.apache.hadoop.hdfs.server.common.Storage.setFieldsFromProperties(Storage.java:887)
at
org.apache.hadoop.hdfs.server.namenode.NNStorage.setFieldsFromProperties(NNStorage.java:583)
at
org.apache.hadoop.hdfs.server.common.Storage.readProperties(Storage.java:918)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:304)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:200)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:627)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:469)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:403)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:437)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:609)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:594)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1169)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1235)
2013-12-03 18:28:25,479 INFO org.apache.hadoop.util.ExitUtil: Exiting with
status 1
2013-12-03 18:28:25,481 INFO org.apache.hadoop.hdfs.server.namenode.NameNode:
SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at Impetus-942.impetus.co.in/192.168.41.106
************************************************************/
Independently, both installations (Apache Hadoop-1.2.0 and Apache
Hadoop-2.0.6-alpha) are working for me; I am able to run MR jobs on each of
them on its own.
But my aim is to migrate the data and the submitted jobs from Apache
Hadoop-1.2.0 to Apache Hadoop-2.0.6-alpha.
Are there any HDFS compatibility issues between Apache Hadoop-1.2.0 and
Apache Hadoop-2.0.6-alpha?
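For what it's worth, the standard HDFS upgrade path is driven by the -upgrade
flag rather than a plain start. A sketch for a single-node setup like the one
above (paths relative to each installation; this assumes the target release
recognizes the source on-disk layout version, which the
IncorrectVersionException above suggests should be checked for 2.0.6-alpha):

```shell
# Stop all Hadoop-1.2.0 daemons first (from the 1.2.0 installation).
bin/stop-all.sh

# From the Hadoop-2.0.6-alpha installation, start the NameNode with
# -upgrade so it converts the existing storage layout instead of
# refusing to start on a layout-version mismatch.
sbin/hadoop-daemon.sh start namenode -upgrade

# After verifying the upgraded cluster, make the upgrade permanent.
# Until then HDFS keeps the previous layout, so the upgrade can still be
# undone with 'hdfs namenode -rollback'.
bin/hdfs dfsadmin -finalizeUpgrade
```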
Thanks,
-Nirmal
From: Nirmal Kumar
Sent: Wednesday, November 27, 2013 2:56 PM
To: [email protected]; [email protected]
Subject: RE: Any reference for upgrade hadoop from 1.x to 2.2
Hello Sandy,
The post was useful and gave insight into the migration.
I am doing a test migration from Apache Hadoop-1.2.0 to Apache
Hadoop-2.0.6-alpha on a single node environment.
I have Apache Hadoop-1.2.0 up and running.
Can you please let me know the steps that one should follow for the migration?
I am thinking of doing something like:
* Install Apache Hadoop-2.0.6-alpha alongside the existing Apache
Hadoop-1.2.0
* Use the same HDFS locations
* Change the various required configuration files
* Stop Apache Hadoop-1.2.0 and start Apache Hadoop-2.0.6-alpha
* Verify all the services are running
* Test via mapreduce (test MRv1 and MRv2 examples)
* Check Web UI Console and verify the MRv1 and MRv2 jobs
The above steps need to be performed on all the nodes in a cluster
environment.
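The verification steps above can be sketched as follows (a rough outline; the
examples jar path follows the standard 2.x layout, and the input/output paths
are illustrative):

```shell
# Verify that the expected daemons are up.
jps

# MRv2: run the bundled word-count example on YARN.
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
    wordcount /input /output

# Check HDFS health after the switch-over.
bin/hdfs fsck /
```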
A translation table mapping the old configuration to the new would definitely
be *very* useful.
The existing Hadoop ecosystem components also need to be considered:
* Hive Scripts
* Pig Scripts
* Oozie Workflows
Their compatibility and version support would need to be checked.
I am also thinking about risks, such as data loss, that one should keep in
mind.
Also I found: http://strataconf.com/strata2014/public/schedule/detail/32247
Thanks,
-Nirmal
From: Robert Dyer [mailto:[email protected]]
Sent: Friday, November 22, 2013 9:08 PM
To: [email protected]
Subject: Re: Any reference for upgrade hadoop from 1.x to 2.2
Thanks Sandy! These seem helpful!
"MapReduce cluster configuration options have been split into YARN
configuration options, which go in yarn-site.xml; and MapReduce configuration
options, which go in mapred-site.xml. Many have been given new names to reflect
the shift. ... We'll follow up with a full translation table in a future post."
This type of translation table mapping old configuration to new would be *very*
useful!
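As an illustration of the kind of mapping such a table would contain, here is
a sketch based on commonly renamed options (the names and the default
ResourceManager port below should be verified against the target release):

```xml
<!-- mapred-site.xml: select the MR2/YARN framework -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

<!-- mapred-site.xml: the old name mapred.reduce.tasks became: -->
<property>
  <name>mapreduce.job.reduces</name>
  <value>1</value>
</property>

<!-- yarn-site.xml: mapred.job.tracker has no direct equivalent;
     the ResourceManager is configured instead, e.g.: -->
<property>
  <name>yarn.resourcemanager.address</name>
  <value>localhost:8032</value>
</property>
```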
- Robert
On Fri, Nov 22, 2013 at 2:15 AM, Sandy Ryza
<[email protected]> wrote:
For MapReduce and YARN, we recently published a couple blog posts on migrating:
http://blog.cloudera.com/blog/2013/11/migrating-to-mapreduce-2-on-yarn-for-users/
http://blog.cloudera.com/blog/2013/11/migrating-to-mapreduce-2-on-yarn-for-operators/
hope that helps,
Sandy
On Fri, Nov 22, 2013 at 3:03 AM, Nirmal Kumar
<[email protected]> wrote:
Hi All,
I am also looking into migrating/upgrading from Apache Hadoop 1.x to Apache
Hadoop 2.x.
I didn't find any docs/guides/blogs for this.
There are guides/docs for the CDH and HDP migration/upgrade from Hadoop 1.x
to Hadoop 2.x, though.
Would referring to those be of some use?
I am looking for similar guides/docs for Apache Hadoop 1.x to Apache Hadoop 2.x.
I found something on SlideShare, though I am not sure how useful it is going
to be; I still need to verify that.
http://www.slideshare.net/mikejf12/an-example-apache-hadoop-yarn-upgrade
Any suggestions/comments will be of great help.
Thanks,
-Nirmal
From: Jilal Oussama
[mailto:[email protected]]
Sent: Friday, November 08, 2013 9:13 PM
To: [email protected]
Subject: Re: Any reference for upgrade hadoop from 1.x to 2.2
I am looking for the same thing; if anyone can point us in a good direction,
please do.
Thank you.
(Currently running Hadoop 1.2.1)
2013/11/1 YouPeng Yang
<[email protected]>
Hi users,
Are there any reference docs that describe how to upgrade Hadoop from 1.x to
2.2?
Regards
________________________________
NOTE: This message may contain information that is confidential, proprietary,
privileged or otherwise protected by law. The message is intended solely for
the named addressee. If received in error, please destroy and notify the
sender. Any use of this email is prohibited when received in error. Impetus
does not represent, warrant and/or guarantee, that the integrity of this
communication has been maintained nor that the communication is free of errors,
virus, interception or interference.