Why not significantly extend the lease period as well, to say 5 minutes, and have well-behaved clients release the lease explicitly as soon as they can?
Clients could then start trying to renew at, say, 2.5 minutes and retry every 30 seconds until 4.5 minutes have expired...
Seems like this would reduce overhead at essentially zero cost, since in general there is no contention for these leases, right?
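For concreteness, a rough sketch of the schedule I have in mind (the class, constants, and loop here are hypothetical, not actual DFSClient code; the only thing assumed from the real code is ClientProtocol.renewLease(String), which the comment below refers to):

    import java.io.IOException;
    import org.apache.hadoop.dfs.ClientProtocol;

    class ProposedLeaseRenewer {                  // hypothetical sketch only
      static final long LEASE_PERIOD   = 5 * 60 * 1000;      // 5 min lease
      static final long RENEW_START    = LEASE_PERIOD / 2;   // renew at 2.5 min
      static final long RETRY_INTERVAL = 30 * 1000;          // retry every 30 sec
      static final long GIVE_UP = LEASE_PERIOD - 30 * 1000;  // stop at 4.5 min

      // One renewal cycle: sleep to the halfway point, then keep trying until
      // either a renew succeeds or 4.5 minutes have passed since issue.
      void renewOnce(ClientProtocol namenode, String clientName)
          throws InterruptedException {
        long issued = System.currentTimeMillis();
        Thread.sleep(RENEW_START);
        while (System.currentTimeMillis() - issued < GIVE_UP) {
          try {
            namenode.renewLease(clientName);
            return;                       // renewed; the 5-minute clock restarts
          } catch (IOException e) {
            Thread.sleep(RETRY_INTERVAL); // namenode busy; retry in 30 sec
          }
        }
      }
    }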
On Jul 18, 2006, at 6:23 PM, Konstantin Shvachko (JIRA) wrote:
[ http://issues.apache.org/jira/browse/HADOOP-286?page=comments#action_12422012 ]
Konstantin Shvachko commented on HADOOP-286:
--------------------------------------------
It looks like the following scenario leads to this exception.
LEASE_PERIOD = 60 sec is a global constant defining how long a lease is issued for.
DFSClient.LeaseChecker renews this client's leases every 30 sec = LEASE_PERIOD/2.
If renewLease() fails, the client retries every second.
One of the most common reasons renewLease() fails is that it times out with a SocketTimeoutException.
This happens when the namenode is busy, which is not unusual since we lock it for each operation.
The socket timeout is defined by the config parameter "ipc.client.timeout", which is set to 60 sec in hadoop-default.xml. That means a single renewLease() call can last up to 60 seconds, so the lease may have expired by the time the client's renewal goes through, which could be up to 90 seconds after the lease was created or last renewed.
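To spell out the worst-case arithmetic under the numbers above:

    t =  0 sec    lease granted; it expires at t = 60 sec (LEASE_PERIOD)
    t = 30 sec    LeaseChecker calls renewLease()
    t = 30..90    the call blocks; ipc.client.timeout allows up to 60 sec
    t = 60 sec    the lease expires on the namenode while the call is pending
    t = 90 sec    the call finally times out; the lease is already gone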
So there are two simple solutions to the problem:
1) to increase LEASE_PERIOD
2) to decrease ipc.client.timeout (an example override is sketched below)
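For example, 2) would just be an override in hadoop-site.xml. The value is in milliseconds as in hadoop-default.xml, and 20000 here is only an illustration; anything comfortably below LEASE_PERIOD minus the 30 sec renewal interval would do:

    <property>
      <name>ipc.client.timeout</name>
      <!-- illustrative value: 20 sec, so a renew attempt started at the
           30 sec mark fails or succeeds well before the 60 sec lease
           expires -->
      <value>20000</value>
    </property>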
A related problem is that DFSClient sends lease renew requests every 30 seconds or less, no matter what. It looks like the DFSClient has enough information to send renew messages only when it really holds a lease. A simple solution would be to avoid calling renewLease() when DFSClient.pendingCreates is empty; a sketch of that guard follows below.
This could substantially decrease overall network traffic for map/reduce.
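For instance, something like this inside the LeaseChecker loop (a sketch only; pendingCreates is the existing DFSClient field, but the exact shape of the guard is hypothetical):

    // Skip the RPC entirely when this client holds no leases, i.e. has no
    // files open for create. Saves one round trip per client per 30 sec.
    if (!pendingCreates.isEmpty()) {
      namenode.renewLease(clientName);
    }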
copyFromLocal throws LeaseExpiredException
------------------------------------------
Key: HADOOP-286
URL: http://issues.apache.org/jira/browse/HADOOP-286
Project: Hadoop
Issue Type: Bug
Components: dfs
Affects Versions: 0.3.0
Environment: redhat linux
Reporter: Runping Qi
Loading local files to dfs through hadoop dfs -copyFromLocal failed due to the following exception:
copyFromLocal: org.apache.hadoop.dfs.LeaseExpiredException: No lease on output_crawled.1.txt
        at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:414)
        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:190)
        at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:243)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:231)