A ddi_periodic would probably be simpler and cleaner than the taskq/cyclic combo. And I believe you can specify the interrupt level (or none at all) at which the callback runs.
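For what it's worth, the ddi_periodic approach might look roughly like the sketch below. This is not standalone-compilable (it assumes an illumos kernel-module context); `inotify_state_t`, the one-second interval, and the `inotify_attach_periodic()`/`inotify_detach_periodic()` helper names are illustrative assumptions, while `ddi_periodic_add()`, `ddi_periodic_delete()`, and `DDI_IPL_0` are the real DDI interfaces.

```c
#include <sys/sunddi.h>
#include <sys/time.h>

static ddi_periodic_t inotify_periodic;

static void
inotify_clean_periodic(void *arg)
{
	/*
	 * Registered at DDI_IPL_0 (below), this callback runs in
	 * kernel context -- a normal thread, not a soft interrupt --
	 * so vn_rele() and blocking in dmu_tx_wait() are both safe.
	 */
	inotify_clean(arg);
}

static void
inotify_attach_periodic(inotify_state_t *state)
{
	/* Fire once per second; the interval is in nanoseconds. */
	inotify_periodic = ddi_periodic_add(inotify_clean_periodic,
	    state, NANOSEC, DDI_IPL_0);
}

static void
inotify_detach_periodic(void)
{
	/* Cancels the callback and waits for any in-flight run. */
	ddi_periodic_delete(inotify_periodic);
}
```

The key point is the final `level` argument: `DDI_IPL_0` requests kernel context, while `DDI_IPL_1` through `DDI_IPL_10` would run the callback as a soft interrupt at that level, which would reintroduce exactly the vn_rele()-from-interrupt problem described below.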
Sent from my iPhone

> On May 21, 2015, at 2:59 PM, Bryan Cantrill <[email protected]> wrote:
>
> Youzhong,
>
> Well, today I learned that there's an implicit dependency that I imagine
> would be news for most: ZFS doesn't allow for vn_rele()'s from low-level
> interrupt context. :| There are a couple of ways to fix this, but the
> simplest is to have inotify_clean() be in a task queue that's kicked off
> by a cyclic, which I can prototype relatively easily. Meanwhile, might it
> be possible for you to upload the dump to Manta and/or otherwise make it
> available so I can verify that this is indeed what's happening? And thank
> you (as always!) for your diligence in reproducing and reporting the
> issue!
>
> - Bryan
>
>> On Thu, May 21, 2015 at 11:34 AM, Youzhong Yang <[email protected]> wrote:
>> Hi all,
>>
>> We're trying to use inotify for change notification in Samba 4. Most of
>> the time it works very well, but under certain conditions the server can
>> become unresponsive (to client NFS requests), due to some kind of
>> deadlock possibly caused by inotify_clean() running as a cyclic.
>>
>> Here are the steps to reproduce:
>>
>> 1. Suppose you have a ZFS file system with plenty of space, say
>> /zpool/temp, created in the zpool named 'zpool'; download the attached
>> two C programs and compile them.
>>
>> 2. In terminal window #1, run the following dtrace command to monitor
>> dp_dirty_total:
>>
>> dtrace -n 'txg-syncing
>>     /((dsl_pool_t *)arg0)->dp_spa->spa_name == "zpool"/
>>     {
>>         printf("%4dMB (%2d) of %4dMB used",
>>             ((dsl_pool_t *)arg0)->dp_dirty_total / 1024 / 1024,
>>             ((dsl_pool_t *)arg0)->dp_dirty_total * 100 / `zfs_dirty_data_max,
>>             `zfs_dirty_data_max / 1024 / 1024);
>>     }'
>>
>> Replace the string "zpool" with your actual zpool name.
>>
>> 3. In terminal window #2, run the following command:
>>
>> inotify_stress_new -f /zpool/temp -n 4
>>
>> It forks 4 processes, each of which uses the inotify_* functions to
>> monitor the folder /zpool/temp, and creates and removes some temporary
>> files in that folder.
>>
>> 4. In terminal window #3, run the following command:
>>
>> write-files -f /zpool/temp -n 50
>>
>> It forks 50 processes, each of which creates a subfolder under
>> /zpool/temp and keeps creating files of 1GB size.
>>
>> Depending on your machine's horsepower, you may want to adjust the
>> number of processes for the write-files program. Its purpose is
>> basically to push the zpool's dp_dirty_total beyond 60% of
>> zfs_dirty_data_max.
>>
>> Normally, within a few minutes, terminal window #1 will stop printing
>> anything, which means the issue is reproduced.
>>
>> Once the issue is reproduced, I run 'echo ::threadlist -v 10 | mdb -k'
>> to dump the kernel threads. So far there are two patterns I have
>> observed:
>>
>> Case No. 1: inotify_clean() stuck at cv_timedwait_hires()
>> ---------------------------
>> swtch+0x82()
>> cv_timedwait_hires+0xec(ffffff01e852ae2e, ffffff01e852ae30, 1047be67f9694, 186a0, 3)
>> dmu_tx_delay+0x147(ffffff8114229c00, d80448bb)
>> dmu_tx_wait+0x8e(ffffff8114229c00)
>> dmu_tx_assign+0x55(ffffff8114229c00, 1)
>> dmu_free_long_range_impl+0x9f(ffffff443a1943c0, ffffff4be51507e8, 0, ffffffffffffffff)
>> dmu_free_long_range+0x62(ffffff443a1943c0, 179294d, 0, ffffffffffffffff)
>> zfs_rmnode+0x52(ffffff44d20891f0)
>> zfs_zinactive+0xe8(ffffff44d20891f0)
>> zfs_inactive+0x75(ffffff450b8e8500, 0, 0)
>> fop_inactive+0x76(ffffff450b8e8500, 0, 0)
>> vn_rele+0x82(ffffff450b8e8500)
>> inotify_clean+0xfc(ffffff43abf54810)
>> cyclic_softint+0xf3(ffffff42eb589a80, 0)
>> cbe_low_level+0x14()
>> av_dispatch_softvect+0x78(2)
>> apix_dispatch_softint+0x35(0, 0)
>> switch_sp_and_call+0x13()
>> 0xffffff42eb589a80()
>> apix_do_interrupt+0xfe(ffffff01e84faad0, 0)
>> _interrupt+0xba()
>> i86_mwait+0xd()
>> cpu_idle_mwait+0x109()
>> idle+0xa7()
>> thread_start+8()
>> ---------------------------
>>
>> Case No. 2: inotify_clean() waits for a mutex which is held by another
>> thread stuck at cv_timedwait_hires()
>> ---------------------------
>> > ffffff003d29cc40::findstack -v
>> stack pointer for thread ffffff003d29cc40: ffffff003d29c960
>> [ ffffff003d29c960 resume_from_intr+0xb7() ]
>>   ffffff003d29c990 swtch+0x82()
>>   ffffff003d29ca30 turnstile_block+0x21a(ffffff090bf5bd58, 0, ffffff096b1d9808, fffffffffbc07aa0, 0, 0)
>>   ffffff003d29caa0 mutex_vector_enter+0x3a3(ffffff096b1d9808)
>>   ffffff003d29caf0 inotify_clean+0x2f(ffffff096b1d9808)
>>   ffffff003d29cb90 cyclic_softint+0xfd(ffffff090b55bb00, 0)
>>   ffffff003d29cba0 cbe_low_level+0x14()
>>   ffffff003d29cbf0 av_dispatch_softvect+0x78(2)
>>   ffffff003d29cc20 dispatch_softint+0x39(0, 0)
>>   ffffff003d5a8e60 switch_sp_and_call+0x13()
>>   ffffff003d5a8e90 0xffffff0900000001()
>>   4136b0 [no mapping for address]
>>
>> > ffffff096b1d9808::mutex
>>             ADDR  TYPE             HELD MINSPL OLDSPL WAITERS
>> ffffff096b1d9808 adapt ffffff0912213000      -      -     yes
>>
>> > ffffff0912213000::findstack -v
>> stack pointer for thread ffffff0912213000: ffffff003d8372f0
>> [ ffffff003d8372f0 _resume_from_idle+0xf4() ]
>>   ffffff003d837320 swtch+0x141()
>>   ffffff003d8373b0 cv_timedwait_hires+0xec(ffffff09122131ee, ffffff09122131f0, 7d07b0bbc4, 186a0, 3)
>>   ffffff003d837410 dmu_tx_delay+0x147(ffffff096f22a900, 87f76900)
>>   ffffff003d837460 dmu_tx_wait+0x8e(ffffff096f22a900)
>>   ffffff003d8374a0 dmu_tx_assign+0x55(ffffff096f22a900, 1)
>>   ffffff003d837520 dmu_free_long_range_impl+0xa7(ffffff0914365780, ffffff096cb94d28, 0, ffffffffffffffff)
>>   ffffff003d837590 dmu_free_long_range+0x62(ffffff0914365780, 89e, 0, ffffffffffffffff)
>>   ffffff003d8375e0 zfs_rmnode+0x52(ffffff0978376220)
>>   ffffff003d837620 zfs_zinactive+0xe8(ffffff0978376220)
>>   ffffff003d837680 zfs_inactive+0x75(ffffff09783d0e00, ffffff096ed6abb8, 0)
>>   ffffff003d8376e0 fop_inactive+0x76(ffffff09783d0e00, ffffff096ed6abb8, 0)
>>   ffffff003d837710 vn_rele+0x82(ffffff09783d0e00)
>>   ffffff003d837750 inotify_watch_remove+0x93(ffffff096b1d9808, ffffff097a002540)
>>   ffffff003d837830 inotify_rm_watch+0x5b(ffffff096b1d9808, 1)
>>   ffffff003d837cc0 inotify_ioctl+0x113(12000000004, 696e7902, 1, 202003, ffffff096ed6abb8, ffffff003d837ea8)
>>   ffffff003d837d00 cdev_ioctl+0x39(12000000004, 696e7902, 1, 202003, ffffff096ed6abb8, ffffff003d837ea8)
>>   ffffff003d837d50 spec_ioctl+0x60(ffffff0968d52d40, 696e7902, 1, 202003, ffffff096ed6abb8, ffffff003d837ea8, 0)
>>   ffffff003d837de0 fop_ioctl+0x55(ffffff0968d52d40, 696e7902, 1, 202003, ffffff096ed6abb8, ffffff003d837ea8, 0)
>>   ffffff003d837f00 ioctl+0x9b(3, 696e7902, 1)
>>   ffffff003d837f10 sys_syscall+0x196()
>> ---------------------------
>>
>> I've reproduced this issue using the latest image
>> joyent_20150514T133314Z, by the way.
>>
>> Thanks,
>>
>> -Youzhong

smartos-discuss Archives: https://www.listbox.com/member/archive/184463/=now
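The fix Bryan describes above, keeping the cyclic but having it only dispatch inotify_clean() to a task queue, might look roughly like the following. This is a sketch, not standalone-compilable code (it assumes an illumos kernel-module context); the `inotify_cleanq` variable and comments are illustrative, while `taskq_dispatch()` and `TQ_NOSLEEP` are the real kernel interfaces.

```c
#include <sys/taskq.h>

static taskq_t *inotify_cleanq;		/* created at module init time */

/*
 * Cyclic handler: fires in soft-interrupt context, where vn_rele()
 * (and the dmu_tx_wait() it can lead to via zfs_inactive()) is not
 * safe.  So it does no work itself; it only queues the work.
 */
static void
inotify_clean_cyclic(void *arg)
{
	/*
	 * TQ_NOSLEEP because we must not block in interrupt context.
	 * If the dispatch fails (queue exhausted), we simply retry on
	 * the next cyclic firing rather than lose correctness.
	 */
	(void) taskq_dispatch(inotify_cleanq, inotify_clean, arg,
	    TQ_NOSLEEP);
}
```

This directly addresses both observed hangs: inotify_clean() now runs in a taskq thread, so it can block in cv_timedwait_hires()/dmu_tx_wait() without pinning a soft interrupt (Case No. 1), and the mutex it takes is no longer acquired from interrupt context while another thread holds it across a ZFS write throttle (Case No. 2).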
