Peter Tribble wrote:
> I know that locally mounting using nfs is problematic. I'm trying
> to understand why (and what symptoms you're likely to see).
>
> I think the relevant bug is 4498652, but the content is somewhat
> non-existent.
>
> Could someone explain the problem and likely result, or actually
> open up the comments that presumably contains the useful details
> in this case?
>   
Hi Peter,

Below is my explanation of the problem.

On systems under extreme stress you'll run into a deadlock
between NFS/UFS/VM threads which, depending on the scenario you hit,
can either make the particular client mount/server share inaccessible or
deadlock the entire system.

The root cause really boils down to the interaction between the UFS and
NFS modules through the segmap driver, combined with the NFS client doing
synchronous commits.

Several different scenarios can trigger this. One goes something like
this: all the NFS server threads are stuck in rfs3_write() trying to
obtain the rwlock for the vnode. The one thread which owns this lock is
itself stuck trying to free pages which happen to be NFS pages. To free
those pages, the NFS client needs a commit, which means talking back to
the server, where you deadlock trying to get the rwlock.

i.e. a UFS write tries to acquire a segmap slot previously used by the
NFS client mount:

ufs_write() -> segmap_getmapflt() -> segmap_pagefree() -> page_release()
    -> VN_DISPOSE()
 
and if the page was written but not committed, then the client needs to
send a commit to the server.

VN_DISPOSE() -> nfs3_dispose() -> nfs3_commit() -> synchronous commit to
    the NFS server

Now if the server happens to be the same system and the data that needs
to be flushed happens to be associated with the same file, then all
commits/writes will stall.

nfs-server -> rfs3_commit -> VOP_PUTPAGE() -> ufs_putpage() -> stall.
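
To make the stall above a bit more concrete, here is a toy user-space
sketch in Python. It is not kernel code: the function names are borrowed
from the call chains above purely as labels, the single lock stands in
for whichever lock is actually involved, and the timeout is an assumption
added only so the demo terminates (the real commit simply never returns).

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Toy stand-ins only: none of these are the real kernel routines.
vnode_rwlock = threading.Lock()                  # models the lock on the file's vnode
nfs_server = ThreadPoolExecutor(max_workers=1)   # models the pool of NFS server threads

def rfs3_commit(timeout):
    """Server side: needs the vnode lock before it can flush the pages."""
    got = vnode_rwlock.acquire(timeout=timeout)
    if got:
        vnode_rwlock.release()
    return got

def nfs3_commit(timeout):
    """Client side: synchronous commit, so block until the server answers."""
    return nfs_server.submit(rfs3_commit, timeout).result()

def ufs_write(timeout=0.2):
    """Local write that holds the vnode lock while disposing of an NFS page."""
    with vnode_rwlock:
        # Stands in for segmap_getmapflt() -> ... -> VN_DISPOSE() on a page
        # that was written but not yet committed.
        return nfs3_commit(timeout)

committed = ufs_write()
print("commit completed:", committed)   # prints "commit completed: False"
nfs_server.shutdown()
```

The writer cannot drop the lock until the commit returns, and the commit
cannot make progress until the lock is dropped, so only the artificial
timeout breaks the loop here.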

There are several variations of this, involving different locks in the
NFS/UFS/VM modules (so you do not necessarily deadlock on the same
client).
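
Whatever the variation, the common shape is a cycle of "waits for"
relationships. As a rough illustration, the first scenario described
above can be written down as a small wait-for graph and checked for a
cycle. The node labels and the find_cycle() helper are made up for this
sketch and do not correspond to anything in the kernel.

```python
# Hypothetical wait-for graph: an edge A -> B means "A cannot proceed until B does".
WAITS_FOR = {
    "nfsd threads in rfs3_write()": ["vnode rwlock"],
    "vnode rwlock": ["lock owner thread"],
    "lock owner thread": ["free NFS pages"],
    "free NFS pages": ["NFS client commit"],
    "NFS client commit": ["nfsd threads in rfs3_write()"],
}

def find_cycle(graph, start):
    """Follow wait-for edges depth-first; return the first cycle found, or None."""
    path = []        # nodes on the current chain, in order
    on_path = set()  # same nodes, for quick membership checks

    def visit(node):
        if node in on_path:
            # We came back to a node already on the chain: that is the cycle.
            return path[path.index(node):] + [node]
        path.append(node)
        on_path.add(node)
        for nxt in graph.get(node, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        on_path.remove(node)
        return None

    return visit(start)

cycle = find_cycle(WAITS_FOR, "nfsd threads in rfs3_write()")
print(" -> ".join(cycle))
```

The printed chain starts and ends at the NFS server threads, which is the
deadlock: every participant is waiting on the next one around the loop.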

This deadlock issue appears to be specific to segmap based filesystems
like UFS and should not be an issue with ZFS.

In the future we may change our supported configuration to enable
support for loopback mounting of ZFS-based filesystems. Until then,
however, loopback mounting of NFS filesystems is not a supported
configuration.

Thanks,
Mahesh
> Thanks,
>
>   
