Hi Raymond,
In Hadoop 2.0.5, the new FileInputFormat API doesn't support reading files
recursively from the input dir; it only supports an input dir that directly
contains files. If the input dir has any child dirs, it throws the error below.
Recursive reading has been added in trunk with this JIRA:
https://issues.apache.org/jira/browse/MAPREDUCE-3193.
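For reference, once you are on a build that carries that fix, the new API gains
a recursive switch. To the best of my knowledge it looks like this (it is NOT
available in 2.0.5-alpha):

// Added by MAPREDUCE-3193 in trunk; not present in 2.0.5-alpha.
FileInputFormat.setInputDirRecursive(job, true);
// or equivalently via configuration:
conf.setBoolean("mapreduce.input.fileinputformat.input.dir.recursive", true);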
As a workaround, you can give the Job an input dir that has no nested dirs, or
you can use the old FileInputFormat API to read files recursively from the sub
dirs; a driver-side sketch follows below.
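If the nested layout can't be avoided, another option on 2.0.5 is to walk the
input tree in the job driver and add only plain files as input paths, so
FileInputFormat never sees a directory. A minimal sketch against the new API
(the driver class name and the walk helper are mine, just for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class RecursiveWordCountDriver {

    // Walk the tree and register every plain file as an input path,
    // so the input format never sees a directory and never throws
    // "Path is not a file".
    static void addInputsRecursively(Job job, FileSystem fs, Path dir)
            throws java.io.IOException {
        for (FileStatus stat : fs.listStatus(dir)) {
            if (stat.isDirectory()) {
                addInputsRecursively(job, fs, stat.getPath());
            } else {
                FileInputFormat.addInputPath(job, stat.getPath());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "wordcount");
        job.setJarByClass(RecursiveWordCountDriver.class);
        FileSystem fs = FileSystem.get(conf);
        addInputsRecursively(job, fs, new Path(args[0]));
        // ... set mapper, reducer and output path as usual ...
    }
}

With the old org.apache.hadoop.mapred API, I believe the equivalent is to set
mapred.input.dir.recursive to true in the JobConf.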
Thanks
Devaraj k
-----Original Message-----
From: Liu, Raymond [mailto:[email protected]]
Sent: 12 July 2013 12:57
To: [email protected]
Subject: Failed to run wordcount on YARN
Hi
I just started trying out Hadoop 2.0 with the 2.0.5-alpha package, and followed
http://hadoop.apache.org/docs/r2.0.5-alpha/hadoop-project-dist/hadoop-common/ClusterSetup.html
to set up a cluster in non-secure mode. HDFS works fine with the client tools,
but when I run the wordcount example I get these errors:
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar wordcount /tmp /out
13/07/12 15:05:53 INFO mapreduce.Job: Task Id : attempt_1373609123233_0004_m_000004_0, Status : FAILED
Error: java.io.FileNotFoundException: Path is not a file: /tmp/hadoop-yarn
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:42)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1317)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1276)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1252)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1225)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:403)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:239)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:40728)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1741)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1737)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1735)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
    at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:986)
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:974)
    at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:157)
    at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:124)
    at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:117)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1131)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:244)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:77)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:713)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:89)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:519)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
I checked HDFS and /tmp/hadoop-yarn is there; the dir's owner is the same as
the job user. To be safe, I also created /tmp/hadoop-yarn on the local fs.
Neither helps.
Any idea what the problem might be? Thanks!
Best Regards,
Raymond Liu