Each FileSystem instance, if in DFS mode, uses a DistributedFileSystem
implementation class, which has one non-singleton instance of
DFSClient. That DFSClient has a LeaseChecker thread that monitors only
its own lease and communicates only with the single NameNode. Because
of this I don't see how the DFSClient would be renewing leases from
multiple servers.
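Roughly the relationship I am describing, simplified from my reading of
the code (the field names here are my shorthand and may not match the
source exactly):

  // one per FileSystem instance in DFS mode
  public class DistributedFileSystem extends FileSystem {
    DFSClient dfs;              // non-singleton: one DFSClient per instance
  }

  class DFSClient {
    ClientProtocol namenode;    // RPC proxy to the single NameNode
    LeaseChecker leaseChecker;  // one thread, renews only this client's lease
  }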
I think the renew thread just runs into some type of temporary network
problem, like a dropped socket or too much network traffic, and then has
to wait too long, so the lease expires on the NameNode. I also don't
think it is the NameNode getting overloaded, because we run our NameNode
on its own server and the load average on the box is only 0.03 to 0.10.
I think the HADOOP-563 patches will fix this though. I will try them and see.
Dennis
Dhruba Borthakur wrote:
Hi Dennis,
I agree with your scenario, in which the client may not get a chance
to renew a lease within the 1-minute interval. However, this will occur only
if an RPC goes "bad". You said: "Now let's suppose that for some reason
the RPC call goes bad and blocks. It is going to wait 60 seconds to go
again". So the main reason for these timeouts is that an RPC is stuck at
the server for more than a minute. Any ideas why this is so?
One cause could be that there is only one thread in the DFSClient that
renews leases. Suppose the client has to renew leases from many servers. If
one of the servers is down, then the RPC to that server might get blocked,
causing the leases to all the other servers to time out as well. Maybe the
DFSClient could have a lease-renewal thread per server.
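Something along these lines is what I have in mind. This is only a
sketch of the idea, not current DFSClient code, and it assumes the
ClientProtocol.renewLease(String) RPC:

  import java.io.IOException;
  import java.net.InetSocketAddress;
  import java.util.HashMap;
  import java.util.Map;

  import org.apache.hadoop.dfs.ClientProtocol;

  class PerServerLeaseRenewer {
    private final Map<InetSocketAddress, Thread> renewers =
        new HashMap<InetSocketAddress, Thread>();
    private volatile boolean running = true;

    // One daemon thread per NameNode: if the RPC to one server hangs,
    // only that server's renewals are delayed, not everyone else's.
    synchronized void addServer(InetSocketAddress server,
                                final ClientProtocol namenode,
                                final String clientName) {
      Thread t = new Thread(new Runnable() {
        public void run() {
          while (running) {
            try {
              namenode.renewLease(clientName);  // blocks only this thread
            } catch (IOException e) {
              // log and retry on the next pass
            }
            try { Thread.sleep(1000); } catch (InterruptedException e) { return; }
          }
        }
      });
      t.setDaemon(true);
      t.start();
      renewers.put(server, t);
    }

    void close() { running = false; }
  }

That way a down server costs one stuck thread instead of every lease the
client holds.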
Thanks,
dhruba
-----Original Message-----
From: Dennis Kubes [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 01, 2006 12:02 PM
To: [email protected]
Subject: Race Condition with Renewing Leases and RPC Calls
This is going to be a long email. I am seeing a race condition between
the renewing of leases on the DFS and RPC client timeout values and I
wanted to get some feedback on this.
We keep seeing leases failing to be renewed, after which a task has to start
over. Usually the task will complete fine, but it still bugs me that the
leases are failing to be renewed. After doing some research, here is
what I found. When the lease is not renewed we see something like this
in the log:
06/10/31 13:23:57 INFO fs.FSNamesystem: Removing lease [Lease. Holder:
DFSClient_task_0007_r_000001_0, heldlocks: 0, pendingcreates: 0], leases
remaining: 10
06/10/31 13:24:43 INFO ipc.Server: Server handler 5 on 9000 call error:
org.apache.hadoop.dfs.LeaseExpiredException: No lease on
/user/phoenix/crawl/engine/unit2cat/part-00001
06/10/31 13:25:13 INFO ipc.Server: Server handler 2 on 9000 call error:
org.apache.hadoop.dfs.LeaseExpiredException: No lease on
/user/phoenix/crawl/engine/unit2cat/part-00001
06/10/31 13:25:13 INFO ipc.Server: Server handler 7 on 9000 call error:
org.apache.hadoop.dfs.LeaseExpiredException: No lease on
/user/phoenix/crawl/engine/unit2cat/part-00001
06/10/31 13:25:13 INFO ipc.Server: Server handler 5 on 9000 call error:
org.apache.hadoop.dfs.LeaseExpiredException: No lease on
/user/phoenix/crawl/engine/unit2cat/part-00001
06/10/31 13:25:13 INFO ipc.Server: Server handler 6 on 9000 call error:
org.apache.hadoop.dfs.LeaseExpiredException: No lease on
/user/phoenix/crawl/engine/unit2cat/.part-00001.crc
06/10/31 13:25:13 INFO fs.FSNamesystem: Attempt to release a lock from
an unknown lease holder DFSClient_task_0007_r_000001_0 for
/user/phoenix/crawl/engine/unit2cat/.part-00001.crc
Tracking this back through the code, the only way it can get to this
position is if the lease is expired in the FSNamesystem.LeaseMonitor
thread. As far as I can tell, the only ways for the lease to become
expired are: one, the system shuts down and the shutdown hook fires the
DFSClient.ClientFinalizer thread to release the locks; or two, the
DFSClient.LeaseChecker thread doesn't renew the lease in time. The
system is not shutting down, so it has to be something in the LeaseChecker.
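For reference, this is roughly what the LeaseMonitor does as I read
FSNamesystem. It is a paraphrase, not the actual code, and the Lease
method names are my own shorthand:

  // Once a second, expire any lease that has not been renewed within
  // LEASE_PERIOD (60 seconds); sortedLeases is assumed oldest-first.
  while (fsRunning) {
    synchronized (sortedLeases) {
      long now = System.currentTimeMillis();
      while (!sortedLeases.isEmpty()) {
        Lease oldest = sortedLeases.first();
        if (now - oldest.lastRenewed() <= LEASE_PERIOD) {
          break;                     // everything after this one is newer
        }
        oldest.releaseLocks();       // produces the "Removing lease ..." line
        sortedLeases.remove(oldest); // later writes hit LeaseExpiredException
      }
    }
    try { Thread.sleep(1000); } catch (InterruptedException e) { break; }
  }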
The LeaseChecker runs continuously, once a second, and makes RPC calls
to renew the leases. The leases have to be renewed every 60 seconds
(LEASE_PERIOD) or they become expired and are removed by the
LeaseMonitor. The LeaseChecker renews leases when their renewal time is
half over, so 30 seconds and beyond. So how is it that the leases
aren't getting renewed? Well, the LeaseChecker actually is making an RPC
call to the NameNode to renew the lease. This RPC call uses an ipc
Client object, which has a default socket timeout value of 10 seconds but
a configured value of 60 seconds in hadoop-default.xml.
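Paraphrasing the client side of that, again in my own shorthand rather
than the actual LeaseChecker code (I believe the 60-second value comes
from the ipc.client.timeout property):

  // Check once a second; renew after more than half of LEASE_PERIOD has
  // passed since the last successful renewal, i.e. 30 seconds and beyond.
  long lastRenewed = System.currentTimeMillis();
  while (clientRunning) {
    if (System.currentTimeMillis() - lastRenewed > LEASE_PERIOD / 2) {
      try {
        namenode.renewLease(clientName);  // blocking RPC, 60-second timeout
        lastRenewed = System.currentTimeMillis();
      } catch (IOException e) {
        // nothing to do but try again on the next pass
      }
    }
    try { Thread.sleep(1000); } catch (InterruptedException e) { break; }
  }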
The Client class uses a Connection object that has a thread which waits
for input from the SocketInputStream. So on the RPC method invocation
we write the output through the Connection's sendParam and we wait for the
return Writable value through the input stream in the Connection
thread. This input stream will continually wait for input and the RPC
call will block while it is waiting. Now let's suppose that for some
reason the RPC call goes bad and blocks. It is going to wait 60 seconds
to go again. If the lease hasn't just been renewed then a race
condition can occur where the lease will expire, and potentially be
removed by the LeaseMonitor, before the RPC call times out and can be
retried.
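To put numbers on it: say the lease was last renewed at time T. The
LeaseChecker won't try again until about T+30s. If that RPC hangs, the
60-second ipc timeout means it cannot be retried until around T+90s, but
the LeaseMonitor will expire the lease at T+60s. That leaves roughly a
30-second window in which the lease is guaranteed to die even though the
client did nothing wrong.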
I think a fix to this would be either to change the default timeout
value in hadoop-default.xml, which could have other unintended
consequences, or to increase LEASE_PERIOD to something around 2
minutes. I am looking for some feedback on what everyone thinks.
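For illustration, the second option would just be a one-line change
wherever LEASE_PERIOD is defined (FSConstants, if I am remembering
correctly):

  // currently 60 * 1000 (one minute)
  public static long LEASE_PERIOD = 2 * 60 * 1000L;  // proposed: two minutes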
Dennis Kubes