Hi,
You do not need to renew leases manually. There is a LeaseChecker, which
renews them every 30 seconds in a separate thread. I don't know what
causes your problem, but renewing the lease at DFSClient line 1073
does not look right.
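Roughly, the background-renewal pattern the LeaseChecker uses looks like the
sketch below. This is only an illustration, not the actual DFSClient code:
FakeNamenode, startLeaseChecker, and the short 50 ms interval are all
stand-ins (the real checker talks to the NameNode over RPC and runs on the
order of every 30 seconds).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the NameNode RPC interface; only
// renewLease() matters for this illustration.
class FakeNamenode {
    final AtomicInteger renewals = new AtomicInteger();
    void renewLease(String clientName) {
        renewals.incrementAndGet();
    }
}

public class LeaseCheckerSketch {
    // Background renewer in the spirit of DFSClient's LeaseChecker:
    // wake up periodically and renew this client's lease so the
    // NameNode does not expire it mid-write. The real interval is
    // about 30 seconds; 50 ms here just keeps the demo short.
    static Thread startLeaseChecker(final FakeNamenode namenode,
                                    final String clientName,
                                    final long intervalMillis) {
        Thread t = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                namenode.renewLease(clientName);
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    return;  // shut down along with the client
                }
            }
        });
        t.setDaemon(true);  // must not keep the JVM alive
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        FakeNamenode namenode = new FakeNamenode();
        Thread checker =
            startLeaseChecker(namenode, "DFSClient_1073590514", 50);
        Thread.sleep(300);  // simulate a long-running write
        checker.interrupt();
        checker.join();
        int n = namenode.renewals.get();
        if (n < 2) {
            throw new AssertionError("lease was not renewed periodically");
        }
        System.out.println("lease renewed " + n + " times in the background");
    }
}
```

The point is that renewal happens continuously in the background for the
lifetime of the client, so individual code paths should not need to renew
on their own.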
--Konstantin
Dennis Kubes wrote:
Hi All,
I am seeing the following: a file attempts to close but must first
replicate its blocks. Before replication finishes, the lease times out
and is removed, which then causes the write to fail. Here is the log,
in sequence.
2006-06-12 01:58:07,999 INFO org.apache.hadoop.dfs.StateChange: BLOCK*
NameSystem.pendingTransfer: ask andromeda01.visvo.com:50010 to
replicate blk_-8050014683592461050 to datanode(s) andromeda07.visvo.com:50010
2006-06-12 02:02:16,915 INFO org.apache.hadoop.fs.FSNamesystem:
Removing lease [Lease. Holder: DFSClient_1073590514, heldlocks: 0,
pendingcreates: 0], leases remaining: 11
2006-06-12 02:02:20,175 INFO org.apache.hadoop.ipc.Server: Server
connection on port 9000 from 192.168.1.240: exiting
2006-06-12 02:02:27,293 WARN org.apache.hadoop.dfs.StateChange: DIR*
NameSystem.completeFile: failed to complete
/user/phoenix/crawl/newsegs/20060612002155/parse_data/part-00008/data
because dir.getFile()==null and null
2006-06-12 02:02:27,324 INFO org.apache.hadoop.ipc.Server: Server
handler 0 on 9000 call error: java.io.IOException: Could not complete
write to file
/user/phoenix/crawl/newsegs/20060612002155/parse_data/part-00008/data
by DFSClient_1073590514
java.io.IOException: Could not complete write to file
/user/phoenix/crawl/newsegs/20060612002155/parse_data/part-00008/data
by DFSClient_1073590514
at org.apache.hadoop.dfs.NameNode.complete(NameNode.java:240)
at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:243)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:231)
2006-06-12 02:02:27,962 INFO org.apache.hadoop.ipc.Server: Server
connection on port 9000 from 192.168.1.237: exiting
2006-06-12 02:02:28,791 INFO org.apache.hadoop.ipc.Server: Server
connection on port 9000 from 192.168.1.243: exiting
I think that in DFSOutputStream, in the DFSClient file at line 1073,
the following should be added:
namenode.renewLease(clientName.toString());
This would renew the lease while the client is waiting on file
completion (most likely replication). The problem is that I don't yet
know the core of Hadoop well enough to tell whether this would cause
other problems, so I wanted to get some feedback before I submit a patch.
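To make the intent concrete, here is a rough, self-contained sketch of the
wait-for-complete loop with the proposed renewal added. NamenodeStub,
waitForCompletion, and the polling interval are invented for illustration;
only the complete() and renewLease() calls mirror the NameNode methods
that appear in the stack trace above.

```java
// Hypothetical stand-in for the ClientProtocol calls involved here.
interface NamenodeStub {
    boolean complete(String src, String clientName);  // true once replicated
    void renewLease(String clientName);
}

public class CloseWithRenewSketch {
    // Poll until the NameNode reports the file complete; renew the
    // lease on every iteration so it cannot expire while the client
    // waits on replication (the proposed change).
    static void waitForCompletion(NamenodeStub namenode,
                                  String src, String clientName)
            throws InterruptedException {
        while (!namenode.complete(src, clientName)) {
            namenode.renewLease(clientName);
            Thread.sleep(10);  // back off briefly between polls
        }
    }

    public static void main(String[] args) throws Exception {
        // Stub NameNode that needs three polls before the file
        // counts as complete, standing in for slow replication.
        final int[] polls = {0};
        final int[] renewals = {0};
        NamenodeStub stub = new NamenodeStub() {
            public boolean complete(String src, String client) {
                return ++polls[0] >= 3;
            }
            public void renewLease(String client) {
                renewals[0]++;
            }
        };
        waitForCompletion(stub, "/tmp/somefile", "DFSClient_1073590514");
        // Two incomplete polls, so the lease was renewed twice while waiting.
        if (renewals[0] != 2) {
            throw new AssertionError("expected 2 renewals, got " + renewals[0]);
        }
        System.out.println("file completed after " + polls[0] + " polls");
    }
}
```

If the LeaseChecker thread is already renewing in the background, the
explicit renewal in this loop would be redundant, which is the feedback
I am looking for.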
Please let me know if this is a valid change or if it causes other
problems.
Dennis