I'm having similar issues after upgrading to 9.1-RC2 and RC3. I'm not using either NFS or a ZIL.
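For anyone else trying to reproduce or narrow this down, a sketch of how to capture the same evidence as in the report quoted below (these are standard FreeBSD 9.x tools; the pool name `tank` matches the report, and the nfsd PID is found with pgrep):

```shell
# Kernel stacks of all nfsd threads (the "procstat -kk" output quoted below)
procstat -kk $(pgrep -x nfsd)

# Per-device I/O latency; a vdev pinned at 100% busy with no completing
# writes points at the FC/mpt path rather than ZFS itself
gstat -a

# Per-vdev throughput for the pool, refreshed every second
zpool iostat -v tank 1

# mpt(4) timeouts or bus resets, if any, should show up here
dmesg | tail -n 50
```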
On Tue, Dec 4, 2012 at 7:26 AM, Rick Macklem <[email protected]> wrote:
> Olivier wrote:
>> Hi all,
>>
>> After upgrading from 9.0-RELEASE to 9.1-PRERELEASE #0 r243679 I'm
>> having severe problems with NFS sharing of a ZFS volume. nfsd appears
>> to hang at random times (between once every couple of hours and once
>> every two days) while accessing a ZFS volume, and the only way I have
>> found of resolving the problem is to reboot. The server console is
>> sometimes still responsive during the nfsd hang, and I can read and
>> write files to the same ZFS volume while nfsd is hung. I am pasting
>> below the output of procstat -kk on nfsd, and details of my pool
>> (nfsstat on the server gets hung when the problem has started
>> occurring, and does not produce any output). The pool is v28 and was
>> created from a bunch of volumes attached over Fibre Channel using the
>> mpt driver. My system has a Supermicro board and 4 AMD Opteron 6274
>> CPUs.
>>
>> I did not experience any nfsd hangs with 9.0-RELEASE (same machine,
>> essentially same configuration, same usage pattern).
>>
>> I would greatly appreciate any help to resolve this problem!
>> Thank you,
>> Olivier
>>
>>  PID    TID COMM  TDNAME        KSTACK
>> 1511 102751 nfsd  nfsd: master
>>   mi_switch+0x186 sleepq_wait+0x42 __lockmgr_args+0x5ae
>>   vop_stdlock+0x39 VOP_LOCK1_APV+0x46 _vn_lock+0x47 zfs_fhtovp+0x338
>>   nfsvno_fhtovp+0x87 nfsd_fhtovp+0x7a nfsrvd_dorpc+0x9cf
>>   nfssvc_program+0x447 svc_run_internal+0x687 svc_run+0x8f
>>   nfsrvd_nfsd+0x193 nfssvc_nfsd+0x9b sys_nfssvc+0x90
>>   amd64_syscall+0x540 Xfast_syscall+0xf7
>> 1511 102752 nfsd  nfsd: service
>>   mi_switch+0x186 sleepq_wait+0x42 __lockmgr_args+0x5ae
>>   vop_stdlock+0x39 VOP_LOCK1_APV+0x46 _vn_lock+0x47 zfs_fhtovp+0x338
>>   nfsvno_fhtovp+0x87 nfsd_fhtovp+0x7a nfsrvd_dorpc+0x9cf
>>   nfssvc_program+0x447 svc_run_internal+0x687 svc_thread_start+0xb
>>   fork_exit+0x11f fork_trampoline+0xe
>> 1511 102753 nfsd  nfsd: service
>>   mi_switch+0x186 sleepq_wait+0x42 _cv_wait+0x112 zio_wait+0x61
>>   zil_commit+0x764 zfs_freebsd_write+0xba0 VOP_WRITE_APV+0xb2
>>   nfsvno_write+0x14d nfsrvd_write+0x362 nfsrvd_dorpc+0x3c0
>>   nfssvc_program+0x447 svc_run_internal+0x687 svc_thread_start+0xb
>>   fork_exit+0x11f fork_trampoline+0xe
>> 1511 102754 nfsd  nfsd: service
>>   mi_switch+0x186 sleepq_wait+0x42 _cv_wait+0x112 zio_wait+0x61
>>   zil_commit+0x3cf zfs_freebsd_fsync+0xdc nfsvno_fsync+0x2f2
>>   nfsrvd_commit+0xe7 nfsrvd_dorpc+0x3c0 nfssvc_program+0x447
>>   svc_run_internal+0x687 svc_thread_start+0xb fork_exit+0x11f
>>   fork_trampoline+0xe
>> 1511 102755 nfsd  nfsd: service
>>   mi_switch+0x186 sleepq_wait+0x42 __lockmgr_args+0x5ae
>>   vop_stdlock+0x39 VOP_LOCK1_APV+0x46 _vn_lock+0x47 zfs_fhtovp+0x338
>>   nfsvno_fhtovp+0x87 nfsd_fhtovp+0x7a nfsrvd_dorpc+0x9cf
>>   nfssvc_program+0x447 svc_run_internal+0x687 svc_thread_start+0xb
>>   fork_exit+0x11f fork_trampoline+0xe
>> 1511 102756 nfsd  nfsd: service
>>   mi_switch+0x186 sleepq_wait+0x42 _cv_wait+0x112 zil_commit+0x6d
>>   zfs_freebsd_write+0xba0 VOP_WRITE_APV+0xb2 nfsvno_write+0x14d
>>   nfsrvd_write+0x362 nfsrvd_dorpc+0x3c0 nfssvc_program+0x447
>>   svc_run_internal+0x687 svc_thread_start+0xb fork_exit+0x11f
>>   fork_trampoline+0xe
>>
> These threads are either waiting for a vnode lock or waiting inside
> zil_commit() { at 3 different locations in zil_commit() }. A guess
> would be that the ZIL hasn't completed a write for some reason, so
> 3 threads are waiting for it, while one of them holds a lock on the
> vnode being written and the remaining threads are waiting for that
> vnode lock.
>
> I am not a ZFS guy, so I cannot help further, except to suggest
> that you try and determine what might cause a write to the ZIL to
> stall. (Different device, different device driver...)
>
> Good luck with it, rick
>
>>  PID    TID COMM  TDNAME        KSTACK
>> 1507 102750 nfsd  -
>>   mi_switch+0x186 sleepq_catch_signals+0x2e1 sleepq_wait_sig+0x16
>>   _cv_wait_sig+0x12a seltdwait+0xf6 kern_select+0x6ef sys_select+0x5d
>>   amd64_syscall+0x540 Xfast_syscall+0xf7
>>
>>   pool: tank
>>  state: ONLINE
>> status: The pool is formatted using a legacy on-disk format. The pool
>>         can still be used, but some features are unavailable.
>> action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
>>         pool will no longer be accessible on software that does not
>>         support feature flags.
>>   scan: scrub repaired 0 in 45h37m with 0 errors on Mon Dec 3 03:07:11 2012
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         tank        ONLINE       0     0     0
>>           raidz1-0  ONLINE       0     0     0
>>             da19    ONLINE       0     0     0
>>             da31    ONLINE       0     0     0
>>             da32    ONLINE       0     0     0
>>             da33    ONLINE       0     0     0
>>             da34    ONLINE       0     0     0
>>           raidz1-1  ONLINE       0     0     0
>>             da20    ONLINE       0     0     0
>>             da36    ONLINE       0     0     0
>>             da37    ONLINE       0     0     0
>>             da38    ONLINE       0     0     0
>>             da39    ONLINE       0     0     0
>>
>> _______________________________________________
>> [email protected] mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to "[email protected]"

--
Reed A. Cartwright, PhD
Assistant Professor of Genomics, Evolution, and Bioinformatics
School of Life Sciences
Center for Evolutionary Medicine and Informatics
The Biodesign Institute
Arizona State University
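One way to test Rick's theory that the stall is in the synchronous write path: temporarily bypass the ZIL with the `sync` dataset property and see whether the hangs stop. This is a diagnostic sketch only, not advice from the thread; `sync=disabled` can lose the last few seconds of acknowledged writes on a crash, so it should not be left on for an NFS export in production. The dataset name `tank` matches the pool in the report.

```shell
# Record the current setting, then bypass the ZIL for the exported dataset
zfs get sync tank
zfs set sync=disabled tank

# ...run the workload that normally triggers the hang; if nfsd no longer
# wedges in zil_commit(), the stall is in the ZIL/log write path...

# Restore the default behavior afterwards
zfs set sync=standard tank
```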
