Hi All,
I am doing a test migration from Apache Hadoop-1.2.0 to Apache
Hadoop-2.0.6-alpha in a single-node environment.
I did the following:
* Installed Apache Hadoop-1.2.0
* Ran the word-count sample MR jobs; they executed successfully.
* Stopped all the services in Apache Hadoop-1.2.0 and was then able to
start all the services again.
* The previously submitted jobs were still visible in the JobTracker URL
after the stop/start.
Next, I installed Apache Hadoop-2.0.6-alpha alongside it.
In its configuration files I used the SAME data directory locations that were
in Apache Hadoop-1.2.0, namely:
core-site.xml
----------------
hadoop.tmp.dir = /home/cloud/hadoop_migration/hadoop-data/tempdir
hdfs-site.xml
-----------------
dfs.data.dir = /home/cloud/hadoop_migration/hadoop-data/data
dfs.name.dir = /home/cloud/hadoop_migration/hadoop-data/name
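For reference, these settings correspond to entries like the following in the
two files (a sketch; the Hadoop 1.x property names shown here are still
accepted by 2.x as deprecated aliases of dfs.namenode.name.dir and
dfs.datanode.data.dir):

```xml
<!-- core-site.xml -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/cloud/hadoop_migration/hadoop-data/tempdir</value>
</property>

<!-- hdfs-site.xml -->
<property>
  <name>dfs.data.dir</name>
  <value>/home/cloud/hadoop_migration/hadoop-data/data</value>
</property>
<property>
  <name>dfs.name.dir</name>
  <value>/home/cloud/hadoop_migration/hadoop-data/name</value>
</property>
```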
I am UNABLE to start the NameNode from the Apache Hadoop-2.0.6-alpha
installation; I am getting this error:
2013-12-03 18:28:23,941 INFO org.apache.hadoop.metrics2.impl.MetricsConfig:
loaded properties from hadoop-metrics2.properties
2013-12-03 18:28:24,080 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
Scheduled snapshot period at 10 second(s).
2013-12-03 18:28:24,081 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
NameNode metrics system started
2013-12-03 18:28:24,576 WARN
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage
directory (dfs.namenode.name.dir) configured. Beware of dataloss due to lack of
redundant storage directories!
2013-12-03 18:28:24,576 WARN
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one namespace edits
storage directory (dfs.namenode.edits.dir) configured. Beware of dataloss due
to lack of redundant storage directories!
2013-12-03 18:28:24,744 INFO org.apache.hadoop.util.HostsFileReader: Refreshing
hosts (include/exclude) list
2013-12-03 18:28:24,749 INFO
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager:
dfs.block.invalidate.limit=1000
2013-12-03 18:28:24,762 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
dfs.block.access.token.enable=false
2013-12-03 18:28:24,762 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: defaultReplication
= 1
2013-12-03 18:28:24,762 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplication
= 512
2013-12-03 18:28:24,762 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: minReplication
= 1
2013-12-03 18:28:24,763 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
maxReplicationStreams = 2
2013-12-03 18:28:24,763 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
shouldCheckForEnoughRacks = false
2013-12-03 18:28:24,763 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
replicationRecheckInterval = 3000
2013-12-03 18:28:24,763 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: encryptDataTransfer
= false
2013-12-03 18:28:24,771 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner =
cloud (auth:SIMPLE)
2013-12-03 18:28:24,771 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup =
supergroup
2013-12-03 18:28:24,771 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled = true
2013-12-03 18:28:24,771 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: false
2013-12-03 18:28:24,776 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Append Enabled: true
2013-12-03 18:28:25,230 INFO org.apache.hadoop.hdfs.server.namenode.NameNode:
Caching file names occuring more than 10 times
2013-12-03 18:28:25,243 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2013-12-03 18:28:25,244 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
dfs.namenode.safemode.min.datanodes = 0
2013-12-03 18:28:25,244 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
dfs.namenode.safemode.extension = 30000
2013-12-03 18:28:25,288 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock
on /home/cloud/hadoop_migration/hadoop-data/name/in_use.lock acquired by
nodename [email protected]
2013-12-03 18:28:25,462 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
Stopping NameNode metrics system...
2013-12-03 18:28:25,462 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
NameNode metrics system stopped.
2013-12-03 18:28:25,473 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
NameNode metrics system shutdown complete.
2013-12-03 18:28:25,474 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode:
Exception in namenode join
org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected
version of storage directory /home/cloud/hadoop_migration/hadoop-data/name.
Reported: -41. Expecting = -40.
at
org.apache.hadoop.hdfs.server.common.Storage.setLayoutVersion(Storage.java:1079)
at
org.apache.hadoop.hdfs.server.common.Storage.setFieldsFromProperties(Storage.java:887)
at
org.apache.hadoop.hdfs.server.namenode.NNStorage.setFieldsFromProperties(NNStorage.java:583)
at
org.apache.hadoop.hdfs.server.common.Storage.readProperties(Storage.java:918)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:304)
at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:200)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:627)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:469)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:403)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:437)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:609)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:594)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1169)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1235)
2013-12-03 18:28:25,479 INFO org.apache.hadoop.util.ExitUtil: Exiting with
status 1
2013-12-03 18:28:25,481 INFO org.apache.hadoop.hdfs.server.namenode.NameNode:
SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at Impetus-942.impetus.co.in/192.168.41.106
************************************************************/
Independently, both installations (Apache Hadoop-1.2.0 and Apache
Hadoop-2.0.6-alpha) are working for me; I am able to run MR jobs on each of
them on its own.
But my aim is to migrate the data and the submitted jobs from Apache
Hadoop-1.2.0 to Apache Hadoop-2.0.6-alpha.
Are there any HDFS compatibility issues between Apache Hadoop-1.2.0 and
Apache Hadoop-2.0.6-alpha?
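For what it's worth, the standard HDFS upgrade path is driven by the -upgrade
flag rather than a plain start. A sketch for a single-node setup like the one
above (paths relative to each installation; this assumes the target release
recognizes the source on-disk layout version, which the
IncorrectVersionException above suggests should be checked for 2.0.6-alpha):

```shell
# Stop all Hadoop-1.2.0 daemons first (from the 1.2.0 installation).
bin/stop-all.sh

# From the Hadoop-2.0.6-alpha installation, start the NameNode with
# -upgrade so it converts the existing storage layout instead of
# refusing to start on a layout-version mismatch.
sbin/hadoop-daemon.sh start namenode -upgrade

# After verifying the upgraded cluster, make the upgrade permanent.
# Until then HDFS keeps the previous layout, so the upgrade can still be
# undone with 'hdfs namenode -rollback'.
bin/hdfs dfsadmin -finalizeUpgrade
```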
Thanks,
-Nirmal
From: Nirmal Kumar
Sent: Wednesday, November 27, 2013 2:56 PM
To: [email protected]; [email protected]
Subject: RE: Any reference for upgrade hadoop from 1.x to 2.2
Hello Sandy,
The post was useful and gave insight into the migration.
I am doing a test migration from Apache Hadoop-1.2.0 to Apache
Hadoop-2.0.6-alpha on a single node environment.
I have Apache Hadoop-1.2.0 up and running.
Can you please let me know the steps that one should follow for the migration?
I am thinking of doing something like:
* Install Apache Hadoop-2.0.6-alpha alongside the existing Apache
Hadoop-1.2.0
* Use the same HDFS locations
* Change the various required configuration files
* Stop Apache Hadoop-1.2.0 and start Apache Hadoop-2.0.6-alpha
* Verify all the services are running
* Test via mapreduce (test MRv1 and MRv2 examples)
* Check Web UI Console and verify the MRv1 and MRv2 jobs
The above steps need to be performed on all the nodes in a cluster
environment.
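The verification steps above can be sketched as follows (a rough outline; the
examples jar path follows the standard 2.x layout, and the input/output paths
are illustrative):

```shell
# Verify that the expected daemons are up.
jps

# MRv2: run the bundled word-count example on YARN.
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
    wordcount /input /output

# Check HDFS health after the switch-over.
bin/hdfs fsck /
```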
A translation table mapping the old configuration to the new would definitely
be *very* useful.
The existing Hadoop ecosystem components also need to be considered:
* Hive Scripts
* Pig Scripts
* Oozie Workflows
Their compatibility and version support would need to be checked.
I am also thinking about risks, such as data loss, that one should keep in
mind.
Also I found: http://strataconf.com/strata2014/public/schedule/detail/32247
Thanks,
-Nirmal
From: Robert Dyer [mailto:[email protected]]
Sent: Friday, November 22, 2013 9:08 PM
To: [email protected]
Subject: Re: Any reference for upgrade hadoop from 1.x to 2.2
Thanks Sandy! These seem helpful!
"MapReduce cluster configuration options have been split into YARN
configuration options, which go in yarn-site.xml; and MapReduce configuration
options, which go in mapred-site.xml. Many have been given new names to reflect
the shift. ... We'll follow up with a full translation table in a future post."
This type of translation table mapping old configuration to new would be *very*
useful!
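As an illustration of the kind of mapping such a table would contain, here is
a sketch based on commonly renamed options (the names and the default
ResourceManager port below should be verified against the target release):

```xml
<!-- mapred-site.xml: select the MR2/YARN framework -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

<!-- mapred-site.xml: the old name mapred.reduce.tasks became: -->
<property>
  <name>mapreduce.job.reduces</name>
  <value>1</value>
</property>

<!-- yarn-site.xml: mapred.job.tracker has no direct equivalent;
     the ResourceManager is configured instead, e.g.: -->
<property>
  <name>yarn.resourcemanager.address</name>
  <value>localhost:8032</value>
</property>
```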
- Robert
On Fri, Nov 22, 2013 at 2:15 AM, Sandy Ryza
<[email protected]> wrote:
For MapReduce and YARN, we recently published a couple blog posts on migrating:
http://blog.cloudera.com/blog/2013/11/migrating-to-mapreduce-2-on-yarn-for-users/
http://blog.cloudera.com/blog/2013/11/migrating-to-mapreduce-2-on-yarn-for-operators/
hope that helps,
Sandy
On Fri, Nov 22, 2013 at 3:03 AM, Nirmal Kumar
<[email protected]> wrote:
Hi All,
I am also looking into migrating/upgrading from Apache Hadoop 1.x to Apache
Hadoop 2.x.
I didn't find any docs/guides/blogs for this.
There are guides/docs for the CDH and HDP migration/upgrade from Hadoop 1.x
to Hadoop 2.x, though.
Would referring to those be of some use?
I am looking for similar guides/docs for Apache Hadoop 1.x to Apache Hadoop 2.x.
I found something on SlideShare, though I am not sure how useful it is going
to be; I still need to verify that.
http://www.slideshare.net/mikejf12/an-example-apache-hadoop-yarn-upgrade
Any suggestions/comments will be of great help.
Thanks,
-Nirmal
From: Jilal Oussama
[mailto:[email protected]]
Sent: Friday, November 08, 2013 9:13 PM
To: [email protected]
Subject: Re: Any reference for upgrade hadoop from 1.x to 2.2
I am looking for the same thing; if anyone can point us in a good direction,
please do.
Thank you.
(Currently running Hadoop 1.2.1)
2013/11/1 YouPeng Yang
<[email protected]>
Hi users,
Are there any reference docs that describe how to upgrade Hadoop from 1.x to
2.2?
Regards
________________________________
NOTE: This message may contain information that is confidential, proprietary,
privileged or otherwise protected by law. The message is intended solely for
the named addressee. If received in error, please destroy and notify the
sender. Any use of this email is prohibited when received in error. Impetus
does not represent, warrant and/or guarantee, that the integrity of this
communication has been maintained nor that the communication is free of errors,
virus, interception or interference.