On Fri, 22 Apr 2005, Jeff Moyer wrote:
==> Regarding Re: [PATCH 1/3] autofs4 - expiring filesystem from under process; Ian Kent <[EMAIL PROTECTED]> adds:
raven> On Wed, 20 Apr 2005, Jeff Moyer wrote:raven> On Mon, 11 Apr 2005, Jeff Moyer wrote:==> Regarding Re: [PATCH 1/3] autofs4 - expiring filesystem from under process; [EMAIL PROTECTED] adds:
process; [EMAIL PROTECTED] adds:==> Regarding [PATCH 1/3] autofs4 - expiring filesystem from under >>AUTOFS_INF_EXPIRING in flags. While the expire is in progress, another
Could also please explain how the following is handled:
expire process runs and issues AUTOFS_EXPIRE_MULTI, which sets >>process access the directory in question, causing a call to >>try_to_fill_dentry. try_to_fill_dentry sees the AUTOFS_INF_EXPIRING >> flag is set, and so calls autofs4_wait with notify set to NFY_NONE. >> However, when we take the wq sem, we find that the expire has finished,tell >> the daemon.and thus create a new wq entry. Because NFY_NONE is set, we don't
So how will this process ever get woken up?
raven> I've thought about this for quite a while and I think all that's raven> needed is to recognise that we're about to expire a dentry that's raven> not mounted anymore.raven> Can you think of a case for which this patch fails?
I don't see how the patch you provided addresses the race between an expire event and a lookup. The real issue here is that the checking of AUTOFS_INF_EXPIRING and the list operations associated therewith need to happen atomically. Until this happens, we will have races.
The attached patch is more in line with how I think the problem should be fixed. I have not yet tested or even compiled this. Could you please look this over and comment?
raven> Sorry about this but the test should be for a notify event of raven> NFY_NONE not NFY_EXPIRE.
Oh, that makes a bit more sense. So the patch I am commenting on is this:
That's right.
@@ -191,6 +191,13 @@ int autofs4_wait(struct autofs_sb_info * }
if ( !wq ) { + /* Can't expire if we are not mounted */ + if (notify == NFY_NONE && !d_mountpoint(dentry)) { + kfree(name); + up(&sbi->wq_sem); + return -ENOENT; + } + /* Create a new wait queue */ wq = kmalloc(sizeof(struct autofs_wait_queue),GFP_KERNEL); if ( !wq ) {
The original case you mentioned was where the expire is racing with an access at the beginning of the expiry. So, they both enter autofs4_wait, one with NFY_EXPIRE and the other with NFY_NONE. The caller with NFY_NONE gets the semaphore first, and so doesn't notify the daemon. The NFY_EXPIRE caller then gets added to the wait list, and no one wakes them up. I don't belive your patch addresses that race, since d_mountpoint(dentry) will still return true.
Now my little brain is spinning. I think that's not quite what we are discussing here.
The original patch from this thread is meant to address the race at expire initiation and I still believe it does. On lead in to the expire d_mountpoint should always be true so this test will always be false.
Following that you gave an example of a race at the tail end of the expire (good catch) which I believe the above patch resolves.
I thought about the possibilities for each of these cases for several days and I'm fairly confident that the two patches together address both issues.
Please, if you can think of a counter example for either case I'll be happy to reconsider, as always.
raven> I started something similar to what you propose myself but I don't raven> think it's needed to resolve the issue.
raven> I'll reconsider, but ...
raven> One possible issue with your proposal is that the dentry info may raven> not yet exist for mount requests in directories that are not raven> browsable (ie. via lookup) as at this point we have a newly created raven> negative dentry that we are attempting to fill in. I'll have to raven> check further to see if this is actually the case.
Okay, I'll look into this further, as well.
raven> The point of the notify none is to "wait" until the expire operation raven> is done then return a fail status so that the following lookup can raven> perform the mount.
Right.
raven> To explain.
raven> In this case, during the path walk the cached lookup is called as raven> the dentry concerned is present (it's dgot) until the very end of raven> the expire. The cached lookup fails as the revalidate (NFY_NONE) raven> returns fail and the dput allows d_invalidate to succeed. The path raven> walk then calls real lookup which triggers a mount following the raven> expire.
raven> There may still be an oppertunity for a race between the time the raven> autofs4_release function is called and before the dput is raven> executed. I'll think more about that.
raven> But, given the correct test, as far as the original issue you raven> describe is concerned I think my recommended patch covers what's raven> needed.
raven> A single counter example is all that's needed.
raven> Thanks for the help Jeff.
No problem. In case you are actually considering the patch I sent earlier, I missed a hunk which changes fs/autofs4/root.c. Let me know if you want me to send that to you. It basically changed try_to_fill_dentry to call autofs4_wait unconditionally with NFY_NONE.
I'll give that some thought but I suspect it will produce incorrect behaviour. I haven't checked but when the lookup is called it may prevent mounts from happening when they should.
We are focusing on interactions with expire here but I agree we probably need to do a similar analysis for the mount case. OTOH that hasn't been problematic as far a I can remember.
Ian
- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html