I figured out why the job hung in the single-node case: the node
manager was failing to start because the resource manager was already
listening on the same port. I had followed Cloudera's example YARN
setup, which incorrectly (or at least unwisely) uses port 8040 for
yarn.resourcemanager.address:

https://ccp.cloudera.com/display/CDH4B2/Deploying+MapReduce+v2+%28YARN%29+on+a+Cluster
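
A port conflict like this can be confirmed by probing the port before
starting the second daemon. The following is only a minimal sketch: the
helper name port_in_use is hypothetical, and it assumes the daemons are
reachable on the loopback address of the machine where it runs.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8040 is the port from the example setup, 8032 the default one.
    for port in (8032, 8040):
        print(port, "in use" if port_in_use(port) else "free")
```

If 8040 shows as in use while only the resource manager is running, a
second service configured for that port will not be able to bind to it.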

After switching to the default port (8032), both the single-node and
clustered configurations now fail with this error:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/v2/app/MRAppMaster
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.v2.app.MRAppMaster
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.apache.hadoop.mapreduce.v2.app.MRAppMaster.  Program will exit.

How can I fix the classpath to include the appropriate JAR
(hadoop-mapreduce-client-app-0.23.1-cdh4.0.0b2.jar)? It does appear to
be on the nodemanager classpath.
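
As a diagnostic aid, the sketch below lists which JARs under a directory
actually contain a given class. It is not a fix: it only confirms where
the class lives and says nothing about the classpath the application
master is launched with. The helper name jars_containing is
hypothetical, and the share/hadoop directory in the usage line is an
assumption based on the tarball layout seen in the command quoted below.

```python
import glob
import os
import zipfile


def jars_containing(class_name, jar_dir):
    """Return sorted paths of JARs under jar_dir that contain class_name."""
    # A class a.b.C is stored in the archive as a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    pattern = os.path.join(jar_dir, "**", "*.jar")
    for path in glob.glob(pattern, recursive=True):
        try:
            with zipfile.ZipFile(path) as jar:
                if entry in jar.namelist():
                    matches.append(path)
        except zipfile.BadZipFile:
            continue  # skip files that are not valid archives
    return sorted(matches)


if __name__ == "__main__":
    # Assumed location; adjust to the actual install directory.
    for jar in jars_containing(
            "org.apache.hadoop.mapreduce.v2.app.MRAppMaster", "share/hadoop"):
        print(jar)
```

Comparing the printed paths with the classpath entries in the failing
container's launch environment shows whether the JAR is reachable there.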

Thanks,
Trevor

On Mon, May 14, 2012 at 3:46 PM, Trevor Robinson <tre...@scurrilous.com> wrote:
> Would someone please give me some troubleshooting tips for TestDFSIO
> hanging on a new 0.23.1-cdh4b2 cluster? I've tried both a 5-machine
> cluster and just running everything on a single node. It's my first
> time configuring YARN, so maybe I've misconfigured something. I don't
> see anything suspicious in the logs for namenode, datanode,
> resourcemanager, or nodemanager.
>
> $ sudo su hdfs -c 'bin/hadoop --config etc/hadoop jar
> ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-0.23.1-cdh4.0.0b2-tests.jar
> TestDFSIO -write -nrFiles 10 -fileSize 1GB'
> 12/05/14 15:02:11 INFO fs.TestDFSIO: TestDFSIO.0.0.6
> 12/05/14 15:02:11 INFO fs.TestDFSIO: nrFiles = 10
> 12/05/14 15:02:11 INFO fs.TestDFSIO: fileSize (MB) = 1024.0
> 12/05/14 15:02:11 INFO fs.TestDFSIO: bufferSize = 1000000
> 12/05/14 15:02:11 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
> 12/05/14 15:02:11 INFO fs.TestDFSIO: creating control file: 1073741824
> bytes, 10 files
> 12/05/14 15:02:12 INFO fs.TestDFSIO: created control files for: 10 files
> 12/05/14 15:02:13 INFO mapred.FileInputFormat: Total input paths to process : 
> 10
> 12/05/14 15:02:13 INFO mapreduce.JobSubmitter: number of splits:10
> 12/05/14 15:02:13 WARN conf.Configuration: mapred.jar is deprecated.
> Instead, use mapreduce.job.jar
> 12/05/14 15:02:13 WARN conf.Configuration: mapred.reduce.tasks is
> deprecated. Instead, use mapreduce.job.reduces
> 12/05/14 15:02:13 WARN conf.Configuration: mapred.output.value.class
> is deprecated. Instead, use mapreduce.job.output.value.class
> 12/05/14 15:02:13 WARN conf.Configuration:
> mapred.used.genericoptionsparser is deprecated. Instead, use
> mapreduce.client.genericoptionsparser.used
> 12/05/14 15:02:13 WARN conf.Configuration: mapred.job.name is
> deprecated. Instead, use mapreduce.job.name
> 12/05/14 15:02:13 WARN conf.Configuration: mapred.input.dir is
> deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
> 12/05/14 15:02:13 WARN conf.Configuration: mapred.output.dir is
> deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
> 12/05/14 15:02:13 WARN conf.Configuration: mapred.map.tasks is
> deprecated. Instead, use mapreduce.job.maps
> 12/05/14 15:02:13 WARN conf.Configuration: mapred.output.key.class is
> deprecated. Instead, use mapreduce.job.output.key.class
> 12/05/14 15:02:13 WARN conf.Configuration: io.bytes.per.checksum is
> deprecated. Instead, use dfs.bytes-per-checksum
> 12/05/14 15:02:13 WARN conf.Configuration: mapred.working.dir is
> deprecated. Instead, use mapreduce.job.working.dir
> 12/05/14 15:02:13 INFO mapred.ResourceMgrDelegate: Submitted
> application application_1337025701572_0001 to ResourceManager at
> server1/10.10.130.30:8040
> 12/05/14 15:02:13 INFO mapreduce.Job: The url to track the job:
> http://server1:8088/proxy/application_1337025701572_0001/
> 12/05/14 15:02:13 INFO mapreduce.Job: Running job: job_1337025701572_0001
> <30 minutes pass - no significant CPU or disk activity>
>
> Thanks,
> Trevor
