Hi Harsh, I have pasted the NameNode's log below. The problem occurs only occasionally. Thanks a lot!
2011-08-22 14:41:05,939 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=tpic,users,dialout ip=/172.28.1.29 cmd=delete src=/walter/send_albums/110822_143455/_temporary dst=null perm=null
2011-08-22 14:41:06,025 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=tpic,users,dialout ip=/172.24.31.148 cmd=create src=/walter/send_albums/110822_143455/_temporary/_attempt_201106021431_1838_r_000509_1/part-r-00509 dst=null perm=tpic:supergroup:rw-r--r--
2011-08-22 14:41:06,227 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000, call addBlock(/walter/send_albums/110822_143455/_temporary/_attempt_201106021431_1838_r_000506_1/part-r-00506, DFSClient_attempt_201106021431_1838_r_000506_1) from 10.163.138.122:42866: error: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /walter/send_albums/110822_143455/_temporary/_attempt_201106021431_1838_r_000506_1/part-r-00506 File does not exist. Holder DFSClient_attempt_201106021431_1838_r_000506_1 does not have any open files.
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /walter/send_albums/110822_143455/_temporary/_attempt_201106021431_1838_r_000506_1/part-r-00506 File does not exist. Holder DFSClient_attempt_201106021431_1838_r_000506_1 does not have any open files.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1332)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1323)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1251)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
2011-08-22 14:41:06,297 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: addStoredBlock request received for blk_521784094891641529_825020 on 172.24.32.27:50010 size 627045 But it does not belong to any file.
2011-08-22 14:41:06,299 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: addStoredBlock request received for blk_521784094891641529_825020 on 10.163.128.50:50010 size 627045 But it does not belong to any file.
2011-08-22 14:41:06,299 WARN org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: failed to complete /walter/send_albums/110822_143455/_temporary/_attempt_201106021431_1838_r_000501_1/part-r-00501 because dir.getFileBlocks() is null and pendingFile is null
2011-08-22 14:41:06,299 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9000, call complete(/walter/send_albums/110822_143455/_temporary/_attempt_201106021431_1838_r_000501_1/part-r-00501, DFSClient_attempt_201106021431_1838_r_000501_1) from 10.163.128.50:38026: error: java.io.IOException: Could not complete write to file /walter/send_albums/110822_143455/_temporary/_attempt_201106021431_1838_r_000501_1/part-r-00501 by DFSClient_attempt_201106021431_1838_r_000501_1
java.io.IOException: Could not complete write to file /walter/send_albums/110822_143455/_temporary/_attempt_201106021431_1838_r_000501_1/part-r-00501 by DFSClient_attempt_201106021431_1838_r_000501_1
        at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:449)
        at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

2011/8/17, Jianxin Wang <[email protected]>:
> Thanks Harsh :)
> The Hadoop cluster was started 3 months ago, so I think it is not in safe mode.
> I found some old tasks that started 10 days ago; they seem to have been blocked
> for some unknown reason. I have killed these tasks now, but I don't know why a
> task can become blocked and survive for so long.
>
> I also found another type of exception in the NameNode's log:
> 2011-08-16 11:25:44,352 INFO org.apache.hadoop.mapred.TaskInProgress:
> Error from attempt_201106021431_1704_r_000506_1:
> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException:
> No lease on /walter/send_albums/110816_111850/_temporary/_attempt_201106021431_1704_r_000506_1/part-r-00506
> File does not exist.
> Holder DFSClient_attempt_201106021431_1704_r_000506_1 does not have any open files.
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1332)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1323)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1251)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
>         at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:740)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>         at $Proxy1.addBlock(Unknown Source)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>
> 2011/8/17, Harsh J <[email protected]>:
>> Do you notice anything related in the NameNode logs? One reason for
>> this is that the NameNode may be in safe mode for some reason, but
>> there are many other reasons, so the NameNode's log would be the best
>> place to look for exactly why the complete() op fails.
>>
>> On Wed, Aug 17, 2011 at 8:20 AM, Jianxin Wang <[email protected]> wrote:
>>> Hi,
>>> My job runs once every day, but it fails sometimes.
>>> I checked the log in the JobTracker; it seems to be an HDFS error?
>>> Thanks a lot!
>>>
>>> 2011-08-16 21:07:13,247 INFO org.apache.hadoop.mapred.TaskInProgress:
>>> Error from attempt_201106021431_1719_r_000498_1:
>>> org.apache.hadoop.ipc.RemoteException: java.io.IOException: Could not
>>> complete write to file /xxx/yyy/110816_210016/_temporary/_attempt_201106021431_1719_r_000498_1/part-r-00498
>>> by DFSClient_attempt_201106021431_1719_r_000498_1
>>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:449)
>>>         at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>>>
>>>         at org.apache.hadoop.ipc.Client.call(Client.java:740)
>>>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>>>         at $Proxy1.complete(Unknown Source)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>>>         at $Proxy1.complete(Unknown Source)
>>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3264)
>>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3188)
>>>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
>>>         at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
>>>         at org.apache.hadoop.io.SequenceFile$Writer.close(SequenceFile.java:959)
>>>         at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.close(SequenceFile.java:1290)
>>>         at org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat$1.close(SequenceFileOutputFormat.java:78)
>>>         at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:567)
>>>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>>>         at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>>
>>
>>
>>
>> --
>> Harsh J
>>
>
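In case it helps anyone else who lands on this thread: the safe-mode check Harsh suggests and the stuck-job cleanup Jianxin did can both be done from the 0.20-era Hadoop CLI. A rough sketch is below; the job ID is a placeholder, and the speculative-execution note at the end is a commonly suggested mitigation for LeaseExpiredException on `_temporary` output files, not a confirmed diagnosis of this cluster.

```shell
# Check whether the NameNode is in safe mode
# (prints "Safe mode is ON" or "Safe mode is OFF")
hadoop dfsadmin -safemode get

# List running jobs, to spot ones that have been stuck for days
hadoop job -list

# Kill a stuck job (the job ID here is a placeholder)
hadoop job -kill job_201106021431_1838

# One common cause of "No lease on .../_temporary/..." is two task
# attempts (e.g. a speculative attempt plus the original) racing on the
# same output file; disabling reduce-side speculation in mapred-site.xml,
# or per job, is a frequently suggested workaround:
#   mapred.reduce.tasks.speculative.execution = false
```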
