Hi Dennis,

I agree with your scenario, in which the client may not get a chance to
renew a lease within the 1-minute interval. However, this will occur only
if an RPC goes "bad". You said: "Now let's suppose that for some reason
the RPC call goes bad and blocks.  It is going to wait 60 seconds to go
again". So the main reason for these timeouts is that an RPC is stuck at
the server for more than a minute. Any ideas why this is so?

One cause could be that there is only one thread in the DFSClient that
renews leases. Suppose the client has to renew leases from many servers. If
one of the servers is down, then the RPC to that server might get blocked,
causing the leases to all the other servers to time out as well. Maybe the
DFSClient could have a lease-renewal thread per server.
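A minimal sketch of that idea in Java, assuming a hypothetical LeaseProxy RPC stub (the names here are illustrative, not the actual DFSClient API):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: one renewal thread per server, so an RPC that
// blocks on a dead server cannot stall renewals to the healthy ones.
public class PerServerLeaseRenewer {

    // Hypothetical stand-in for the real renewal RPC; may block if the
    // server is down.
    interface LeaseProxy {
        void renewLease(String clientName);
    }

    // One renewal pass for a single server; each per-server thread calls
    // this every half lease period.
    static void renewOnce(LeaseProxy server, String clientName) {
        server.renewLease(clientName);
    }

    private final List<Thread> renewers = new ArrayList<>();

    public void start(List<LeaseProxy> servers, String clientName) {
        for (LeaseProxy server : servers) {
            Thread t = new Thread(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    try {
                        renewOnce(server, clientName); // blocks only this server's thread
                        Thread.sleep(30_000);          // renew at half the 60s lease period
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            t.setDaemon(true);
            t.start();
            renewers.add(t);
        }
    }

    public void stop() {
        for (Thread t : renewers) t.interrupt();
    }
}
```

With this layout a blocked renewal only delays the one lease held on the unreachable server; the other threads keep renewing on schedule.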

Thanks,
dhruba

-----Original Message-----
From: Dennis Kubes [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, November 01, 2006 12:02 PM
To: [email protected]
Subject: Race Condition with Renewing Leases and RPC Calls

This is going to be a long email.  I am seeing a race condition between 
the renewing of leases on the DFS and RPC client timeout values and I 
wanted to get some feedback on this.

We keep seeing leases fail to be renewed, forcing a task to start over. 
Usually the task will then complete fine, but it still bugs me that the 
leases were failing to be renewed in the first place.  After doing some 
research, here is what I found.  When the lease is not renewed we see 
something like this in the log:

06/10/31 13:23:57 INFO fs.FSNamesystem: Removing lease [Lease.  Holder: 
DFSClient_task_0007_r_000001_0, heldlocks: 0, pendingcreates: 0], leases 
remaining: 10
06/10/31 13:24:43 INFO ipc.Server: Server handler 5 on 9000 call error: 
org.apache.hadoop.dfs.LeaseExpiredException: No lease on 
/user/phoenix/crawl/engine/unit2cat/part-00001
06/10/31 13:25:13 INFO ipc.Server: Server handler 2 on 9000 call error: 
org.apache.hadoop.dfs.LeaseExpiredException: No lease on 
/user/phoenix/crawl/engine/unit2cat/part-00001
06/10/31 13:25:13 INFO ipc.Server: Server handler 7 on 9000 call error: 
org.apache.hadoop.dfs.LeaseExpiredException: No lease on 
/user/phoenix/crawl/engine/unit2cat/part-00001
06/10/31 13:25:13 INFO ipc.Server: Server handler 5 on 9000 call error: 
org.apache.hadoop.dfs.LeaseExpiredException: No lease on 
/user/phoenix/crawl/engine/unit2cat/part-00001
06/10/31 13:25:13 INFO ipc.Server: Server handler 6 on 9000 call error: 
org.apache.hadoop.dfs.LeaseExpiredException: No lease on 
/user/phoenix/crawl/engine/unit2cat/.part-00001.crc
06/10/31 13:25:13 INFO fs.FSNamesystem: Attempt to release a lock from 
an unknown lease holder DFSClient_task_0007_r_000001_0 for 
/user/phoenix/crawl/engine/unit2cat/.part-00001.crc

Tracking this back through the code, the only way it can get into this 
position is if the lease is expired in the FSNamesystem.LeaseMonitor 
thread.  As far as I can tell, the only ways for the lease to become 
expired are: (1) the system shuts down and the shutdown hook fires the 
DFSClient.ClientFinalizer thread to release locks, or (2) the 
DFSClient.LeaseChecker thread doesn't renew the lease in time.  The 
system is not shutting down, so it has to be something in the LeaseChecker.

The LeaseChecker runs continuously, once a second, and makes RPC calls 
to renew the leases.  The leases have to be renewed every 60 seconds 
(LEASE_PERIOD) or they become expired and are removed by the 
LeaseMonitor.  The LeaseChecker renews a lease when its renewal period is 
half over, so at 30 seconds and beyond.  So how is it that the leases 
aren't getting renewed?  Well, the LeaseChecker actually makes an RPC 
call to the NameNode to renew the lease.  This RPC call uses an ipc 
Client object which has a default socket timeout value of 10 seconds, but 
a configured value of 60 seconds in hadoop-default.xml.
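For reference, the relevant property in hadoop-default.xml looks something like this (the exact description text may vary by version):

```xml
<property>
  <name>ipc.client.timeout</name>
  <value>60000</value>
  <description>Defines the timeout for IPC calls in milliseconds.</description>
</property>
```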

The Client class uses a Connection object that has a thread which waits 
for input from the SocketInputStream.  So on the RPC method invocation 
we write the output through the Connection's sendParam and wait for the 
return Writable value through the input stream in the Connection 
thread.  This input stream will continually wait for input, and the RPC 
call will block while it is waiting.  Now let's suppose that for some 
reason the RPC call goes bad and blocks.  It is going to wait 60 seconds 
before it can go again.  If the lease hasn't just been renewed, then a race 
condition can occur where the lease expires and is potentially 
removed by the LeaseMonitor before the RPC call times out and can be 
retried.
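The timing works out roughly as follows (a back-of-the-envelope sketch, taking t = 0 as the moment of the last successful renewal):

```java
// Back-of-the-envelope timeline for the race, in seconds.
// t = 0 is the moment of the last successful renewal.
public class LeaseRaceTimeline {

    // Earliest time (seconds after the last renewal) at which the client
    // can retry, if the renewal RPC issued at the lease's half-life blocks
    // for the full socket timeout.
    static int earliestRetry(int leasePeriodSec, int rpcTimeoutSec) {
        int renewAt = leasePeriodSec / 2;  // LeaseChecker renews at half-life
        return renewAt + rpcTimeoutSec;
    }

    public static void main(String[] args) {
        int leasePeriod = 60;  // LEASE_PERIOD
        int rpcTimeout  = 60;  // ipc client timeout from hadoop-default.xml
        int retryAt = earliestRetry(leasePeriod, rpcTimeout);  // 30 + 60 = 90

        // The lease expires at t=60s, 30 seconds before the earliest retry
        // at t=90s, so the LeaseMonitor can remove it in that window.
        System.out.println("lease expires at t=" + leasePeriod
            + "s, earliest retry at t=" + retryAt + "s");
    }
}
```

In the worst case there is a 30-second window in which the lease is gone but the client still thinks its renewal RPC is in flight.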

I think a fix for this would be either to lower the default timeout 
value in hadoop-default.xml, which could have other unintended 
consequences, or to increase the LEASE_PERIOD to something around 2 
minutes.  I am looking for feedback on what everyone thinks.

Dennis Kubes

