It seems that there are a couple of code paths where spa_config_enter is called
while spa_namespace_lock is held [*].  I am curious whether this is considered
normal or whether it's something that should be and, more importantly, can be
avoided.  Perhaps,
it's possible to find a spa, reference it, drop the namespace lock and then
enter configuration locks.

The reason for my inquiry is that from time to time we see cases where ZFS gets
into a very bad shape because of a problem with a disk, especially if it's a
network disk.  The scenario unfolds like this:
- there is an outstanding zio that does not complete for a long time,
  that means that SCL_ZIO is held in the read mode
- a vdev event (SPA_ASYNC_REMOVE) occurs and spa_async_thread_vd gets kicked,
  it acquires a number of config locks in the write mode before
  getting blocked on SCL_ZIO
- this is already quite bad by itself as SCL_STATE is held by
  spa_async_thread_vd and, thus, spa_sync gets blocked
- if one of the paths involving spa_namespace_lock and spa_config_enter gets
  executed, then spa_namespace_lock ends up locked
- that means that practically all ioctl-s will get blocked, because they
  need to look up a spa first

In the end, it is not possible to examine the state of ZFS or to attempt any
corrective action.  Also, the problem affects all pools, not only the one with
the disk problems.
I understand that there is not much, if anything at all, that ZFS can do about
SCL_ZIO (it's really an issue with the underlying storage stack), but I think
that ZFS should try to avoid having a dependency between SCL_ZIO in one pool and
the global spa_namespace_lock.

I would appreciate any thoughts and suggestions on this problem.
Thanks!

P.S.
I also recall a discussion about calling dsl_sync_task() while holding
spa_namespace_lock.  That can lead to a similar issue.

[*]
1. spa_config_enter spa_config_generate spa_open_common spa_get_stats
zfs_ioc_pool_stats zfsdev_ioctl
2. spa_config_enter l2arc_dev_get_next l2arc_feed_thread

-- 
Andriy Gapon

------------------------------------------
openzfs-developer
Archives: 
https://openzfs.topicbox.com/groups/developer/discussions/T33dc6c344651ecdc-Ma22fb5d54d38e7ceeda5fb29