It seems that there are a couple of code paths where spa_config_enter is called while spa_namespace_lock is held [*]. I am curious whether this is considered normal or whether it is something that should be and, more importantly, can be avoided. Perhaps it is possible to find a spa, take a reference on it, drop the namespace lock, and only then enter the configuration locks.

The reason for my inquiry is that from time to time we see cases where ZFS gets into very bad shape because of a problem with a disk, especially if it is a network disk. The scenario unfolds like this:
- there is an outstanding zio that does not complete for a long time, which means that SCL_ZIO is held in read mode
- a vdev event (SPA_ASYNC_REMOVE) occurs and spa_async_thread_vd gets kicked; it acquires a number of config locks in write mode before getting blocked on SCL_ZIO
- this is already quite bad by itself, as SCL_STATE is held by spa_async_thread_vd and, thus, spa_sync gets blocked
- if one of the paths involving spa_namespace_lock and spa_config_enter gets executed, then spa_namespace_lock ends up held indefinitely by a thread blocked in spa_config_enter
- that means that practically all ioctls will get blocked, because they need to look up a spa first

In the end, it is not possible to examine the state of ZFS or to attempt any corrective action. Also, the problem affects all pools, not only the one with the disk problems.

I understand that there is not much, if anything at all, that ZFS can do about SCL_ZIO (it is really an issue with the underlying storage stack), but I think that ZFS should try to avoid having a dependency between SCL_ZIO in one pool and the global spa_namespace_lock.

I will appreciate any thoughts and suggestions on this problem.
Thanks!

P.S.
I also recall a discussion about calling dsl_sync_task() while holding spa_namespace_lock. That can lead to a similar issue.

[*] Call stacks, innermost frame first:

1.
spa_config_enter
spa_config_generate
spa_open_common
spa_get_stats
zfs_ioc_pool_stats
zfsdev_ioctl

2.
spa_config_enter
l2arc_dev_get_next
l2arc_feed_thread

-- 
Andriy Gapon

------------------------------------------
openzfs-developer Archives: https://openzfs.topicbox.com/groups/developer/discussions/T33dc6c344651ecdc-Ma22fb5d54d38e7ceeda5fb29