On 28/03/2016 19:23, Konstantin Belousov wrote:
> On Mon, Mar 28, 2016 at 08:52:03AM -0700, Maxim Sobolev wrote:
>> Done some head scratching, it looks like it's got page fault in the
>> copyin() (cp(1) AFAIK mmaps source file). There might be some interlock
>> issue between competing write to the same ZFS, the md0 device is locked
>> forever waiting for the write operation to complete at the very same time.
>> I am curious as to whether we are allowed to sleep in the
>> dmu_write_uio_dbuf(),
>> AFAIK dmu is ZFS's transaction layer, so maybe copyin() should be done
>> earlier to avoid possible page fault in there?

Maxim, is this a copy from UFS to ZFS? It looks that way because the copyin()
fault goes to vnode_pager_generic_getpages() -> bwait()...

> No idea about ZFS, but if the issue is due to copyin(9) recursing into
> VM and then VFS while owning file system locks, it is well-known and
> long-standing issue. I sometimes call it 'ups deadlock', for some
> reasons, see tools/test/upsdl/ for the distilled test case.
>
> It is handled for UFS and NFS, read the long comment starting with 'The
> vn_io_fault() is a wrapper' in sys/kern/vfs_vnops.c, which describes the
> deadlock in detail and explains the mechanism which is used to prevent
> it. Filesystems must opt in to it by specifying MNTK_NO_IOPF flag,
> and then being ready to get an array of pages for io instead of the buffer
> KVA.

I don't have any idea why the thread would be stuck in bwait(), or which locks
and threads are involved here. But, as Kostik said, there is a general problem,
and I have a patch for ZFS:
https://reviews.freebsd.org/D2790

-- 
Andriy Gapon
_______________________________________________
freebsd-stable mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
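
For anyone following along, the user-space half of the pattern discussed above
(write(2) being handed a buffer that is itself an mmap of another file, so the
kernel's copyin() may have to fault pages in from the source file system while
the destination write is in progress) can be sketched as below. This is only an
illustration and is not taken from the thread: copy_via_mmap() and demo() are
made-up names, it is not the tools/test/upsdl test case, and on a healthy
system it just copies the file without deadlocking.

```python
import mmap
import os
import tempfile


def copy_via_mmap(src_path, dst_path):
    """Copy src to dst by passing write(2) a pointer into an mmap of src.

    Hypothetical helper for illustration: the source pages are not read
    into a private buffer first, so the kernel may have to page them in
    while it copies the user buffer during the write.
    """
    src_fd = os.open(src_path, os.O_RDONLY)
    dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        size = os.fstat(src_fd).st_size
        if size == 0:
            return 0  # an empty file cannot be mmapped; nothing to copy
        with mmap.mmap(src_fd, size, access=mmap.ACCESS_READ) as mm:
            with memoryview(mm) as view:
                written = 0
                while written < size:
                    # The slice still points into the mapping (no copy),
                    # so write(2) reads straight from the mapped pages.
                    with view[written:] as chunk:
                        written += os.write(dst_fd, chunk)
        return written
    finally:
        os.close(dst_fd)
        os.close(src_fd)


def demo():
    """Copy a scratch file and report (bytes written, contents match)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        payload = os.urandom(256 * 1024)
        with open(src, "wb") as f:
            f.write(payload)
        n = copy_via_mmap(src, dst)
        with open(dst, "rb") as f:
            return n, f.read() == payload


if __name__ == "__main__":
    print(demo())
```

To approximate the scenario in the report, src_path and dst_path would need to
live on the file systems in question (for example UFS and ZFS); whether that
triggers the hang depends on the kernel and is an assumption, not something
this sketch demonstrates.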
