Hello,

I'm running Nutch 1.4 on a 3-node Hadoop (1.0.1) cluster, and from time to
time I get an "alert" that one TaskTracker has been blacklisted.

The reducer's log then contains 3-6 exceptions like this:

org.apache.hadoop.ipc.RemoteException: 
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create 
file /user/test/crawl/segments/20120316065507/parse_text/part-00001/data for 
DFSClient_attempt_201203151054_0028_r_000001_1 on client xx.x.xx.xx.10, because 
this file is already being created by 
DFSClient_attempt_201203151054_0028_r_000001_0 on xx.xx.xx.9
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1404)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1244)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1186)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:628)
        at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

        at org.apache.hadoop.ipc.Client.call(Client.java:1066)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
        at $Proxy2.create(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy2.create(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3245)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:713)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:182)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:555)
        at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.<init>(SequenceFile.java:1132)
        at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:397)
        at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:354)
        at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:476)
        at org.apache.hadoop.io.MapFile$Writer.<init>(MapFile.java:157)
        at org.apache.hadoop.io.MapFile$Writer.<init>(MapFile.java:134)
        at org.apache.hadoop.io.MapFile$Writer.<init>(MapFile.java:92)
        at org.apache.nutch.parse.ParseOutputFormat.getRecordWriter(ParseOutputFormat.java:110)
        at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.<init>(ReduceTask.java:448)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:490)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)

I have no special Nutch plugins; it's a "default" system, and the Hadoop config
is "default" as well.
Any ideas?

Thanks in advance,
Rafael.
