It works now if the number of reducers is limited (4 in my case).
However, I am not sure why it sometimes fails when the number of
reducers is increased. I tried raising the open-files limit for the
user, as suggested by a blog post, but it still fails with a large
number of reducers. Any insights?

Thanks,
Sherif

On Wed, Sep 5, 2012 at 3:33 PM, Sherif Akoush <sherif.ako...@gmail.com> wrote:
> Hi,
>
> I am getting frequent job failures when I am running the wordcount
> example on a large dataset (40GB) using the trunk version (v2). Has
> anyone experienced this before? Maybe I need to set something related
> to timers?
>
> Here is the error I get:
>
> 2012-09-05 14:02:50,914 ERROR [Thread-2] org.apache.hadoop.hdfs.DFSClient: Failed to close file /output_wiki_prov/_temporary/1/_temporary/attempt_1346843520083_0002_r_000001_1/part-r-00001
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /output_wiki_prov/_temporary/1/_temporary/attempt_1346843520083_0002_r_000001_1/part-r-00001 File does not exist. Holder DFSClient_attempt_1346843520083_0002_r_000001_1_-1916714176_1 does not have any open files.
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2330)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2321)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:2388)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2365)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:526)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:330)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42450)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:474)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1732)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1728)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1726)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:1164)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>         at $Proxy10.complete(Unknown Source)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:163)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:82)
>         at $Proxy10.complete(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:322)
>         at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:1761)
>         at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1748)
>         at org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:672)
>         at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:689)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:539)
>         at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2386)
>         at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2403)
>         at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
>
> Thanks,
> Sherif
