Ian Kent wrote:

My thoughts:

The cause of this issue is user space programs using autofs4 need to call services that must be able to take the inode semaphore. Notably sys_mkdir and sys_symlink in order to complete their task.

I believe that, in this case, releasing the semaphore is ok since the entry is part of the autofs filesystem and so autofs is responsible for taking care of it, provided that it is done carefully. The semaphore is meant to serialize changes being to the directory and these changes are done in autofs by asking the user space process to do it. Which are themselves serialized by the same semaphore.

The only tricky thing I can think of here is that care must be taken to ensure that the semaphore is not released before the DCACHE_AUTOFS_PENDING flag is set to make sure that other incoming requests are sent to the wait queue.

The attached patch does this and opts for a conservative approach by broadening the critical region instead of narrowing it.

It may also be necessary to review the return codes from revaliate but I'm only part way through that.

Please review and test this patch and offer further comment.
Sorry guys but I haven't been able to test this at all save verifying that it compiles.

Hopefully I haven't missed anything completely obvious ... DOH!

Ian

--- linux-2.6.15-rc1/fs/autofs4/root.c.lookup-deadlock  2005-11-17 
18:58:38.000000000 +0800
+++ linux-2.6.15-rc1/fs/autofs4/root.c  2005-11-27 17:00:40.000000000 +0800
@@ -487,11 +487,8 @@ static struct dentry *autofs4_lookup(str
        dentry->d_fsdata = NULL;
        d_add(dentry, NULL);
- if (dentry->d_op && dentry->d_op->d_revalidate) {
-               up(&dir->i_sem);
+       if (dentry->d_op && dentry->d_op->d_revalidate)
                (dentry->d_op->d_revalidate)(dentry, nd);
-               down(&dir->i_sem);
-       }
/*
         * If we are still pending, check if we had to handle
--- linux-2.6.15-rc1/fs/autofs4/waitq.c.lookup-deadlock 2005-11-27 
17:09:42.000000000 +0800
+++ linux-2.6.15-rc1/fs/autofs4/waitq.c 2005-11-27 17:17:34.000000000 +0800
@@ -161,6 +161,8 @@ int autofs4_wait(struct autofs_sb_info *
                enum autofs_notify notify)
 {
        struct autofs_wait_queue *wq;
+       struct inode *dir = dentry->d_parent->d_inode;
+       int i_sem_held;
        char *name;
        int len, status;
@@ -227,6 +229,14 @@ int autofs4_wait(struct autofs_sb_info *
                        (unsigned long) wq->wait_queue_token, wq->len, 
wq->name, notify);
        }
+ /*
+        * If we are called from lookup or lookup_hash the
+        * the inode semaphore needs to be released for
+        * userspace to do its thing.
+        */
+       i_sem_held = down_trylock(&dir->i_sem);
+       up(&dir->i_sem);
+
        if (notify != NFY_NONE && atomic_dec_and_test(&wq->notified)) {
                int type = (notify == NFY_MOUNT ?
                        autofs_ptype_missing : autofs_ptype_expire_multi);
@@ -268,6 +278,10 @@ int autofs4_wait(struct autofs_sb_info *
                DPRINTK("skipped sleeping");
        }
+ /* Re-take the inode semaphore if it was held */
+       if (i_sem_held)
+               down(&dir->i_sem);
+
        status = wq->status;
/* Are we the last process to need status? */
-
Ian,
I have not tested this patch but it seems to have a serious flaw. Given that do_lookup does not get the parent i_sem lock before calling revalidate, you have the possibility that you are being called without having gotten the lock but the lock may be held by another process. In that case you do not want to be releasing their lock while they are relying on it.

Will

_______________________________________________
autofs mailing list
[email protected]
http://linux.kernel.org/mailman/listinfo/autofs

Reply via email to