Madhu,

Are you multi-threading in your Reducer code by any chance? MultipleTextOutputFormat (MTOF) may not be thread-safe in the release you're using. If that turns out to be the cause, using MultipleOutputs instead is recommended right now.
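For illustration, a single-threaded reducer writing through the old-API MultipleOutputs could look roughly like the sketch below. The class name, the "conv" named output, and the Text key/value types are assumptions for the example, not your actual job; the named output also has to be registered in the driver, as shown in the comment at the top.

    // Driver-side setup (assumed name/types, shown for completeness):
    //   MultipleOutputs.addNamedOutput(conf, "conv",
    //       TextOutputFormat.class, Text.class, Text.class);

    import java.io.IOException;
    import java.util.Iterator;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;
    import org.apache.hadoop.mapred.lib.MultipleOutputs;

    public class ConvReducer extends MapReduceBase
        implements Reducer<Text, Text, Text, Text> {

      private MultipleOutputs mos;

      @Override
      public void configure(JobConf job) {
        // One MultipleOutputs instance per task attempt.
        mos = new MultipleOutputs(job);
      }

      public void reduce(Text key, Iterator<Text> values,
          OutputCollector<Text, Text> output, Reporter reporter)
          throws IOException {
        while (values.hasNext()) {
          // Write to the "conv" named output rather than the default collector.
          mos.getCollector("conv", reporter).collect(key, values.next());
        }
      }

      @Override
      public void close() throws IOException {
        // Closes all record writers opened via MultipleOutputs.
        mos.close();
      }
    }

If I remember right, named output names are plain alphanumeric tokens, so this won't give you the key-derived subdirectories that generateFileNameForKeyValue() does; it just sidesteps the MTOF path while you confirm whether threading is the trigger.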
On Wed, Jun 8, 2011 at 7:58 PM, Madhu Ramanna <[email protected]> wrote:
> Hello,
>
> We're using CDH3b3 0.20.2 hadoop. In our map reduce jobs we've extended
> MultipleTextOutputFormat to override checkOutputSpecs() and
> generateFileNameForKeyValue(), returning a relative path based on the key.
> I don't have multiple jobs running with the same output directory. When I
> rerun it succeeds; a couple of runs later it fails. Here are logs from the
> failed attempts:
>
> attempt 0
>
> java.io.IOException: The temporary job-output directory
> hdfs://nn:8000/griffin/prod/_temporary doesn't exist!
>     at org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250)
>     at org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:240)
>     at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:129)
>     at org.apache.hadoop.mapred.lib.MultipleTextOutputFormat.getBaseRecordWriter(MultipleTextOutputFormat.java:47)
>     at org.apache.hadoop.mapred.lib.MultipleOutputFormat$1.write(MultipleOutputFormat.java:102)
>     at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:444)
>     at com.buysight.griffin.UserHistory$MyReducer.reduce(UserHistory.java:672)
>     at com.buysight.griffin.UserHistory$MyReducer.reduce(UserHistory.java:148)
>     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:467)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:415)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
>     at org.apache.hadoop.mapred.Child.main(Child.java:211)
>
> attempt 1
>
> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException:
> No lease on /griffin/prod/_temporary/_attempt_201106061713_0045_r_000000_1/s2_conv/20110530/part-00000.bz2
> File does not exist. Holder DFSClient_attempt_201106061713_0045_r_000000_1 does not have any open files.
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1488)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1479)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1395)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:588)
>     at sun.reflect.GeneratedMethodAccessor249.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:528)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1319)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1315)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1313)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1054)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
>     at $Proxy1.addBlock(Unknown Source)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>     at $Proxy1.addBlock(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3166)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3036)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1900(DFSClient.java:2288)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2483)
>
> I've tried turning off speculative execution. Still no luck. What gives?
>
> Thanks,
> Madhu

--
Harsh J
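(For reference, the kind of MultipleTextOutputFormat subclass described above might look roughly like the sketch below; the class name and the key-to-path mapping are illustrative assumptions, not the actual job code.)

    import java.io.IOException;

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

    public class KeyPathTextOutputFormat extends MultipleTextOutputFormat<Text, Text> {

      @Override
      public void checkOutputSpecs(FileSystem ignored, JobConf job) throws IOException {
        // Relaxed on purpose so re-runs into an existing output directory
        // are not rejected up front.
      }

      @Override
      protected String generateFileNameForKeyValue(Text key, Text value, String name) {
        // Route each record into a subdirectory derived from the key,
        // e.g. "s2_conv/20110530/part-00000".
        return key.toString() + "/" + name;
      }
    }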
