Sorry if this isn't the best forum, but the symptoms of a problem I'm seeing
with snv_b91 seem similar, at least with respect to unkillable processes, to
one described here.
Context:
Periodically (typically every two hours), we use zfs send -i ... | zfs
receive ... to do incremental replication of 100 ZFS file systems.
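For readers unfamiliar with the setup, the shape of one replication step can be sketched roughly as follows. This is only an illustration: the function name, the snapshot names, and the destination argument are hypothetical placeholders, and the function just prints the pipeline rather than running it.

```shell
#!/bin/sh
# Print (do not run) the incremental replication pipeline for one file system.
# Arguments are hypothetical placeholders:
#   $1 source host, $2 source file system, $3 previous snapshot,
#   $4 current snapshot, $5 destination passed to "zfs receive -d"
build_replication_cmd() {
    host=$1
    fs=$2
    prev=$3
    curr=$4
    dest=$5
    # "zfs send -i old new" emits only the changes between the two snapshots;
    # "zfs receive -F" rolls the target back to its latest snapshot first.
    printf 'ssh -n %s zfs send -i %s@%s %s@%s | zfs receive -F -d %s\n' \
        "$host" "$fs" "$prev" "$fs" "$curr" "$dest"
}
```

Printing the command first makes it easy to confirm that no two rounds name the same target before anything is actually executed.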
Problems:
Occasionally, depending on load, a new round of replications begins
before the previous round finishes. Although more rigorous checking for this
condition might allow us to avoid this circumstance, the result is perhaps of
interest. In earlier Nevada builds, having two processes "receiving" the same
zfs caused a panic. This was addressed in snv_b89(?); however, it appears that
the fix may have introduced conditions that give rise to "unkillable" processes.
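As for the "more rigorous checking", one possible guard is a lock that makes a new round skip itself while the previous one is still running. The sketch below is not what we run today; the function name, the exit code 75, and the lock path in the usage comment are assumptions made for illustration.

```shell
#!/bin/sh
# Run a command only if no earlier invocation still holds the lock.
# mkdir is atomic: it fails if the lock directory already exists.
run_exclusive() {
    lockdir=$1
    shift
    mkdir "$lockdir" 2>/dev/null || {
        echo "previous round still running ($lockdir exists); skipping" >&2
        return 75
    }
    "$@"
    status=$?
    rmdir "$lockdir"
    return $status
}

# Hypothetical usage (the script name and lock path are placeholders):
#   run_exclusive /var/run/zfs-repl.lock ./replicate_round.sh
```

If a round hangs or dies without releasing the lock, the lock directory stays behind and later rounds keep skipping until it is removed by hand. With an unkillable receive that is arguably the desired behaviour, since it stops a second receive from being started against the same target.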
Example:
Below, we see two processes [10448,11428] that will both attempt to
update zfs 'ACS/acsnfs5/acsnfs5/mail/56'. This appears to have resulted in
pid 9260 having become unkillable.
I'd appreciate any insight others can offer on the problem.
Data:
% ptree | grep [a]csnfs5
9260 zfs receive -F -d ACS/acsnfs5/acsnfs5/mail/56
10448 ssh -2xac Blowfish -n acsnfs5 zfs send -i acsnfs5/mail/[EMAIL PROTECTED]
11428 ssh -2xac Blowfish -n acsnfs5 zfs send -i acsnfs5/mail/[EMAIL PROTECTED]
# pargs 9260
9260: zfs receive -F -d ACS/acsnfs5/acsnfs5/mail/56
argv[0]: zfs
argv[1]: receive
argv[2]: -F
argv[3]: -d
argv[4]: ACS/acsnfs5/acsnfs5/mail/56
# pargs 10448
10448: ssh -2xac Blowfish -n acsnfs5 zfs send -i acsnfs5/mail/[EMAIL PROTECTED]
argv[0]: ssh
argv[1]: -2xac
argv[2]: Blowfish
argv[3]: -n
argv[4]: acsnfs5
argv[5]: zfs
argv[6]: send
argv[7]: -i
argv[8]: acsnfs5/mail/[EMAIL PROTECTED]
argv[9]: acsnfs5/mail/[EMAIL PROTECTED]
# pargs 11428
11428: ssh -2xac Blowfish -n acsnfs5 zfs send -i acsnfs5/mail/[EMAIL PROTECTED]
argv[0]: ssh
argv[1]: -2xac
argv[2]: Blowfish
argv[3]: -n
argv[4]: acsnfs5
argv[5]: zfs
argv[6]: send
argv[7]: -i
argv[8]: acsnfs5/mail/[EMAIL PROTECTED]
argv[9]: acsnfs5/mail/[EMAIL PROTECTED]
# kill 10448 11428
# ptree|grep [a]csnfs5
9260 zfs receive -F -d ACS/acsnfs5/acsnfs5/mail/56
# kill -9 9260
# ptree 9260
9260 zfs receive -F -d ACS/acsnfs5/acsnfs5/mail/56
# zfs destroy -f 'ACS/acsnfs5/acsnfs5/mail/56/%08.06.12.2000.hourly'
cannot destroy 'ACS/acsnfs5/acsnfs5/mail/56/%08.06.12.2000.hourly': dataset is busy
# truss -fa -p 9260
truss: unanticipated system error: 9260
# mdb -p 9260
mdb: cannot debug 9260: unanticipated system error
mdb: failed to initialize target: No such file or directory
# dtrace -n profile-1234hz'/pid == 9260/{@[stack()] = count()}'
dtrace: description 'profile-1234hz' matched 1 probe
^C
unix`disp_getwork+0xba
unix`disp+0x1bb
unix`swtch+0xb5
genunix`cv_wait+0x61
genunix`delay+0xba
zfs`dnode_special_close+0x28
zfs`dmu_objset_evict+0xab
zfs`dsl_dataset_destroy+0x161
zfs`dmu_recv_abort_cleanup+0x4a
zfs`dmu_recv_stream+0x8f2
zfs`zfs_ioc_recv+0x28b
zfs`zfsdev_ioctl+0x10d
genunix`cdev_ioctl+0x48
specfs`spec_ioctl+0x86
genunix`fop_ioctl+0x7b
genunix`ioctl+0x174
unix`sys_syscall32+0x101
1
unix`do_splx+0x82
genunix`disp_lock_exit+0x56
unix`disp+0x1b3
unix`swtch+0xb5
genunix`cv_wait+0x61
genunix`delay+0xba
zfs`dnode_special_close+0x28
zfs`dmu_objset_evict+0xab
zfs`dsl_dataset_destroy+0x161
zfs`dmu_recv_abort_cleanup+0x4a
zfs`dmu_recv_stream+0x8f2
zfs`zfs_ioc_recv+0x28b
zfs`zfsdev_ioctl+0x10d
genunix`cdev_ioctl+0x48
specfs`spec_ioctl+0x86
genunix`fop_ioctl+0x7b
genunix`ioctl+0x174
unix`sys_syscall32+0x101
1
genunix`cpu_update_pct+0x97
genunix`new_mstate+0x5a
genunix`cv_block+0x8d
genunix`cv_wait+0x3f
genunix`delay+0xba
zfs`dnode_special_close+0x28
zfs`dmu_objset_evict+0xab
zfs`dsl_dataset_destroy+0x161
zfs`dmu_recv_abort_cleanup+0x4a
zfs`dmu_recv_stream+0x8f2
zfs`zfs_ioc_recv+0x28b
zfs`zfsdev_ioctl+0x10d
genunix`cdev_ioctl+0x48
specfs`spec_ioctl+0x86
genunix`fop_ioctl+0x7b
genunix`ioctl+0x174
unix`sys_syscall32+0x101
1
unix`atomic_cas_64+0x8
genunix`cv_block+0x8d
genunix`cv_wait+0x3f
genunix`delay+0xba
zfs`dnode_special_close+0x28
zfs`dmu_objset_evict+0xab
zfs`dsl_dataset_destroy+0x161
zfs`dmu_recv_abort_cleanup+0x4a
zfs`dmu_recv_stream+0x8f2
zfs`zfs_ioc_recv+0x28b
zfs`zfsdev_ioctl+0x10d
genunix`cdev_ioctl+0x48
specfs`spec_ioctl+0x86
genunix`fop_ioctl+0x7b
genunix`ioctl+0x174
unix`sys_syscall32+0x101
1
unix`disp_getwork+0x190
unix`disp+0x1bb
unix`swtch+0xb5
genunix`cv_wait+0x61
genunix`delay+0xba
zfs`dnode_special_close+0x28
zfs`dmu_objset_evict+0xab
zfs`dsl_dataset_destroy+0x161
zfs`dmu_recv_abort_cleanup+0x4a
zfs`dmu_recv_stream+0x8f2
zfs`zfs_ioc_recv+0x28b
zfs`zfsdev_ioctl+0x10d
genunix`cdev_ioctl+0x48
specfs`spec_ioctl+0x86
genunix`fop_ioctl+0x7b
genunix`ioctl+0x174
unix`sys_syscall32+0x101
1
unix`disp_getwork+0xae
unix`disp+0x1bb
unix`swtch+0xb5
genunix`cv_wait+0x61
genunix`delay+0xba
zfs`dnode_special_close+0x28
zfs`dmu_objset_evict+0xab
zfs`dsl_dataset_destroy+0x161
zfs`dmu_recv_abort_cleanup+0x4a
zfs`dmu_recv_stream+0x8f2
zfs`zfs_ioc_recv+0x28b
zfs`zfsdev_ioctl+0x10d
genunix`cdev_ioctl+0x48
specfs`spec_ioctl+0x86
genunix`fop_ioctl+0x7b
genunix`ioctl+0x174
unix`sys_syscall32+0x101
1
# dtrace -n profile-1234hz'/pid == 9260/{@[stack()] = count()}'
dtrace: description 'profile-1234hz' matched 1 probe
^C
unix`do_splx+0x82
unix`hr_clock_unlock+0x1e
unix`gethrestime_lasttick+0x48
genunix`timeout_common+0x37
genunix`timeout+0x4e
genunix`delay+0xac
zfs`dnode_special_close+0x28
zfs`dmu_objset_evict+0xab
zfs`dsl_dataset_destroy+0x161
zfs`dmu_recv_abort_cleanup+0x4a
zfs`dmu_recv_stream+0x8f2
zfs`zfs_ioc_recv+0x28b
zfs`zfsdev_ioctl+0x10d
genunix`cdev_ioctl+0x48
specfs`spec_ioctl+0x86
genunix`fop_ioctl+0x7b
genunix`ioctl+0x174
unix`sys_syscall32+0x101
1
unix`do_splx+0x82
genunix`disp_lock_exit+0x56
unix`disp+0x1b3
unix`swtch+0xb5
genunix`cv_wait+0x61
genunix`delay+0xba
zfs`dnode_special_close+0x28
zfs`dmu_objset_evict+0xab
zfs`dsl_dataset_destroy+0x161
zfs`dmu_recv_abort_cleanup+0x4a
zfs`dmu_recv_stream+0x8f2
zfs`zfs_ioc_recv+0x28b
zfs`zfsdev_ioctl+0x10d
genunix`cdev_ioctl+0x48
specfs`spec_ioctl+0x86
genunix`fop_ioctl+0x7b
genunix`ioctl+0x174
unix`sys_syscall32+0x101
1
genunix`savectx+0x18
unix`resume+0x5b
unix`swtch+0x17f
genunix`cv_wait+0x61
genunix`delay+0xba
zfs`dnode_special_close+0x28
zfs`dmu_objset_evict+0xab
zfs`dsl_dataset_destroy+0x161
zfs`dmu_recv_abort_cleanup+0x4a
zfs`dmu_recv_stream+0x8f2
zfs`zfs_ioc_recv+0x28b
zfs`zfsdev_ioctl+0x10d
genunix`cdev_ioctl+0x48
specfs`spec_ioctl+0x86
genunix`fop_ioctl+0x7b
genunix`ioctl+0x174
unix`sys_syscall32+0x101
1
unix`tsc_scalehrtime+0xa
genunix`scalehrtime+0x15
genunix`cpu_update_pct+0xcb
genunix`new_mstate+0x5a
genunix`cv_block+0x8d
genunix`cv_wait+0x3f
genunix`delay+0xba
zfs`dnode_special_close+0x28
zfs`dmu_objset_evict+0xab
zfs`dsl_dataset_destroy+0x161
zfs`dmu_recv_abort_cleanup+0x4a
zfs`dmu_recv_stream+0x8f2
zfs`zfs_ioc_recv+0x28b
zfs`zfsdev_ioctl+0x10d
genunix`cdev_ioctl+0x48
specfs`spec_ioctl+0x86
genunix`fop_ioctl+0x7b
genunix`ioctl+0x174
unix`sys_syscall32+0x101
1
unix`lock_set+0x4
genunix`cv_block+0xbf
genunix`cv_wait+0x3f
genunix`delay+0xba
zfs`dnode_special_close+0x28
zfs`dmu_objset_evict+0xab
zfs`dsl_dataset_destroy+0x161
zfs`dmu_recv_abort_cleanup+0x4a
zfs`dmu_recv_stream+0x8f2
zfs`zfs_ioc_recv+0x28b
zfs`zfsdev_ioctl+0x10d
genunix`cdev_ioctl+0x48
specfs`spec_ioctl+0x86
genunix`fop_ioctl+0x7b
genunix`ioctl+0x174
unix`sys_syscall32+0x101
1
unix`disp_getwork+0xae
unix`disp+0x1bb
unix`swtch+0xb5
genunix`cv_wait+0x61
genunix`delay+0xba
zfs`dnode_special_close+0x28
zfs`dmu_objset_evict+0xab
zfs`dsl_dataset_destroy+0x161
zfs`dmu_recv_abort_cleanup+0x4a
zfs`dmu_recv_stream+0x8f2
zfs`zfs_ioc_recv+0x28b
zfs`zfsdev_ioctl+0x10d
genunix`cdev_ioctl+0x48
specfs`spec_ioctl+0x86
genunix`fop_ioctl+0x7b
genunix`ioctl+0x174
unix`sys_syscall32+0x101
1
This message posted from opensolaris.org
opensolaris-discuss mailing list