Hi,

Using the nutch tools I see fairly frequent crashes in SequenceFileOutputFormat.getReaders() with stack traces like the one below. What appears to be happening is that there's a temporary file inside the generate-temp-1220879127849 directory which exists when getReaders() lists the contents of the directory, but has been deleted by the time getReaders() goes to open it.
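To make the suspected race concrete, here is a minimal sketch of the list-then-open pattern and one defensive way around it. It uses plain java.nio rather than the Hadoop FileSystem API, so it is not the actual getReaders() code. The class SafeListing, the helpers isTemporary() and readableParts(), and the rule that names starting with "_" or "." are temporary are all assumptions for illustration only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hadoop or Nutch.
final class SafeListing {

    // Assumed convention: names starting with '_' or '.' are temporary files.
    static boolean isTemporary(Path p) {
        String name = p.getFileName().toString();
        return name.startsWith("_") || name.startsWith(".");
    }

    // Lists the directory, then tries to open each candidate. A file that
    // disappears between the listing and the open is skipped, not fatal.
    static List<Path> readableParts(Path dir) throws IOException {
        List<Path> candidates = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (!isTemporary(p)) {
                    candidates.add(p);
                }
            }
        }
        Collections.sort(candidates);

        List<Path> readable = new ArrayList<>();
        for (Path p : candidates) {
            try (InputStream in = Files.newInputStream(p)) {
                readable.add(p);
            } catch (NoSuchFileException e) {
                // Deleted after the listing was taken: ignore and carry on.
            }
        }
        return readable;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("generate-temp-demo");
        Files.createFile(dir.resolve("part-00000"));
        Files.createFile(dir.resolve("_task_tmp"));
        System.out.println(readableParts(dir));
    }
}
```

The name filter avoids most of the problem up front, and catching NoSuchFileException covers the remaining window between the listing and the open. Whether either approach is appropriate inside getReaders() itself is a separate question.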
Since I'm using nutch, this is in Hadoop version 0.15, but the code for getReaders() doesn't seem to have changed in 0.18. Is this a known problem?

regards
Barry

2008-09-08 14:07:04,429 FATAL crawl.Generator - Generator: org.apache.hadoop.ipc.RemoteException: java.io.IOException: Cannot open filename /tmp/hadoop-bhaddow/mapred/temp/generate-temp-1220879127849/_task_200809081337_0019_r_000004_1
    at org.apache.hadoop.dfs.NameNode.open(NameNode.java:238)
    at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)

    at org.apache.hadoop.ipc.Client.call(Client.java:482)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
    at org.apache.hadoop.dfs.$Proxy0.open(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at org.apache.hadoop.dfs.$Proxy0.open(Unknown Source)
    at org.apache.hadoop.dfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:848)
    at org.apache.hadoop.dfs.DFSClient$DFSInputStream.<init>(DFSClient.java:840)
    at org.apache.hadoop.dfs.DFSClient.open(DFSClient.java:285)
    at org.apache.hadoop.dfs.DistributedFileSystem.open(DistributedFileSystem.java:114)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1356)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1349)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1344)
    at org.apache.hadoop.mapred.SequenceFileOutputFormat.getReaders(SequenceFileOutputFormat.java:87)
    at org.apache.nutch.crawl.Generator.generate(Generator.java:443)
    at org.apache.nutch.crawl.Generator.run(Generator.java:580)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:54)
    at org.apache.nutch.crawl.Generator.main(Generator.java:543)

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.