Hi All,

I am seeing the following: a file is being closed but first has to replicate. Before the replication finishes, the lease times out and is removed, which then causes the write to fail. Here is the log, in sequence.

2006-06-12 01:58:07,999 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.pendingTransfer: ask andromeda01.visvo.com:50010 to replicate blk_-8050014683592461050 to datanode(s) andromeda07.visvo.com:50010
2006-06-12 02:02:16,915 INFO org.apache.hadoop.fs.FSNamesystem: Removing lease [Lease. Holder: DFSClient_1073590514, heldlocks: 0, pendingcreates: 0], leases remaining: 11
2006-06-12 02:02:20,175 INFO org.apache.hadoop.ipc.Server: Server connection on port 9000 from 192.168.1.240: exiting
2006-06-12 02:02:27,293 WARN org.apache.hadoop.dfs.StateChange: DIR* NameSystem.completeFile: failed to complete /user/phoenix/crawl/newsegs/20060612002155/parse_data/part-00008/data because dir.getFile()==null and null
2006-06-12 02:02:27,324 INFO org.apache.hadoop.ipc.Server: Server handler 0 on 9000 call error: java.io.IOException: Could not complete write to file /user/phoenix/crawl/newsegs/20060612002155/parse_data/part-00008/data by DFSClient_1073590514
java.io.IOException: Could not complete write to file /user/phoenix/crawl/newsegs/20060612002155/parse_data/part-00008/data by DFSClient_1073590514
  at org.apache.hadoop.dfs.NameNode.complete(NameNode.java:240)
  at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:585)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:243)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:231)
2006-06-12 02:02:27,962 INFO org.apache.hadoop.ipc.Server: Server connection on port 9000 from 192.168.1.237: exiting
2006-06-12 02:02:28,791 INFO org.apache.hadoop.ipc.Server: Server connection on port 9000 from 192.168.1.243: exiting

I think the following line should be added to DFSOutputStream, in the DFSClient file at line 1073:

namenode.renewLease(clientName.toString());

This would renew the lease while the client is waiting on file completion (most likely on replication). I don't know the core of Hadoop well enough yet to tell whether this would cause other problems, so I wanted to get some feedback before I submit a patch. Please let me know whether this is a valid change or whether it would cause other problems.
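To make the idea concrete, here is a rough, standalone sketch of the behaviour I have in mind. It is not the actual DFSClient code. The NameNodeLike interface, the completeWithRenewal method, and the pollMillis and maxAttempts parameters are all made up for illustration, and it assumes the client keeps asking the namenode to complete the file until the namenode reports success.

```java
import java.io.IOException;

class LeaseRenewSketch {

    /** Hypothetical stand-in for the namenode RPC interface; not the real Hadoop type. */
    interface NameNodeLike {
        /** Returns true once the file has been completed on the namenode side. */
        boolean complete(String src, String clientName) throws IOException;

        /** Renews the lease held by the given client. */
        void renewLease(String clientName) throws IOException;
    }

    /**
     * Polls complete() until it succeeds, renewing the lease after every
     * unsuccessful attempt so the lease cannot expire while we are waiting.
     * Returns false if the file is still not complete after maxAttempts tries.
     */
    static boolean completeWithRenewal(NameNodeLike namenode,
                                       String src,
                                       String clientName,
                                       long pollMillis,
                                       int maxAttempts)
            throws IOException, InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (namenode.complete(src, clientName)) {
                return true;
            }
            // Still waiting (most likely on replication): keep the lease alive.
            namenode.renewLease(clientName);
            Thread.sleep(pollMillis);
        }
        return false;
    }
}
```

The attempt limit is only there so the sketch cannot spin forever. Whether the real client should give up after some number of tries is a separate question from the lease renewal itself.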

Dennis
