There is a possible deadlock in host.c in the fileserver code:

around line 1882, where the lock for client is dropped in order to lock oldClient, it is not impossible (e.g. through prfail == 2, or just if nothing exciting happened through pr_GetCPS) that client == oldClient.

In itself "switching" at that point is already fishy.

Worse however, the H_LOCK is held. When the lock for client is dropped, another thread can grab it and immediately after that queue for the H_LOCK. Since we hold H_LOCK we'll deadlock if we now wait for the oldClient's lock since the other thread's got it.

The attached patch  ...

1. correctly handles client == oldClient

2. drops the H_LOCK when switching to oldClient, with the reference
count updated before,

3. does all this in a re-check loop since both locks are dropped at one point.

(Bcc'ed to openafs-bugs)

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Rainer Toebbicke
European Laboratory for Particle Physics(CERN) - Geneva, Switzerland
Phone: +41 22 767 8985       Fax: +41 22 767 7155
--- openafs/src/viced/host.c.o141       2006-04-26 14:55:10.000000000 +0200
+++ openafs/src/viced/host.c    2006-06-19 10:42:20.000000000 +0200
@@ -1853,8 +1853,11 @@
      * required).  So, before setting the RPC's rock, we should disconnect
      * the RPC from the other client structure's rock.
      */
-    oldClient = (struct client *)rx_GetSpecific(tcon, rxcon_client_key);
-    if (oldClient && oldClient->tcon == tcon) {
+    while (
+       (oldClient = (struct client *)rx_GetSpecific(tcon, rxcon_client_key))
+        && oldClient != client
+       && oldClient->tcon == tcon
+    ) {
        char hoststr[16];
        if (!oldClient->deleted) {
            /* if we didn't create it, it's not ours to put back */
@@ -1879,9 +1882,12 @@
                FreeCE(client);
                created = 0;
            } 
-           ObtainWriteLock(&oldClient->lock);
            oldClient->refCount++;
+           H_UNLOCK;
+           ObtainWriteLock(&oldClient->lock);
+           H_LOCK;
            client = oldClient;
+           continue;                           /* all locks dropped, re-check 
*/
        } else {
            rx_PutConnection(oldClient->tcon);
            oldClient->tcon = (struct rx_connection *)0;

Reply via email to