Rainer: Thank you for the patch. The second block of your patch is indeed a deadlock that must be fixed. The first block of your patch affects code that has been changed on the 1.4.x branch. Please examine the attached patch.
Jeffrey Altman Rainer Toebbicke wrote: > There is a possible deadlock in host.c in the fileserver code: > > around line 1882, where the lock for client is dropped in order to lock > oldClient, it is not impossible (e.g. through prfail == 2, or just if > nothing exciting happened through pr_GetCPS) that client == oldClient. > > In itself "switching" at that point is already fishy. > > Worse however, the H_LOCK is held. When the lock for client is dropped, > another thread can grab it and immediately after that queue for the > H_LOCK. Since we hold H_LOCK we'll deadlock if we now wait for the > oldClient's lock since the other thread's got it. > > The attached patch ... > > 1. correctly handles client == oldClient > > 2. drops the H_LOCK when switching to oldClient, with the reference > count updated before, > > 3. does all this in a re-check loop since both locks are dropped at one > point. > > (Bcc'ed to openafs-bugs)
? AFS_component_version_number.h
? rs_state.ini
? viced-rt34073-diff-2.txt
Index: host.c
===================================================================
RCS file: /cvs/openafs/src/viced/host.c,v
retrieving revision 1.57.2.36
diff -u -r1.57.2.36 host.c
--- host.c 7 Jun 2006 04:55:25 -0000 1.57.2.36
+++ host.c 20 Jun 2006 12:30:48 -0000
@@ -1759,7 +1759,7 @@
* the RPC from the other client structure's rock.
*/
oldClient = (struct client *)rx_GetSpecific(tcon, rxcon_client_key);
- if (oldClient && oldClient->sid == rxr_CidOf(tcon)
+ if (oldClient && oldClient != client && oldClient->sid == rxr_CidOf(tcon)
&& oldClient->VenusEpoch == rxr_GetEpoch(tcon)) {
char hoststr[16];
if (!oldClient->deleted) {
@@ -1782,8 +1782,10 @@
FreeCE(client);
created = 0;
}
- ObtainWriteLock(&oldClient->lock);
oldClient->refCount++;
+ H_UNLOCK;
+ ObtainWriteLock(&oldClient->lock);
+ H_LOCK;
client = oldClient;
} else {
ViceLog(0, ("FindClient: deleted client %x(%x) already had conn %x
(host %s:%d), stolen by client %x(%x)\n",
smime.p7s
Description: S/MIME Cryptographic Signature
