On 8/16/19 6:55 AM, Ming Lei wrote:
The kernfs built-in lock on 'kn->count' is held in the sysfs .show/.store
path. Meanwhile, inside the block layer's .show/.store callbacks,
q->sysfs_lock is acquired.

However, when the mq & iosched kobjects are removed via
blk_mq_unregister_dev() & elv_unregister_queue(), q->sysfs_lock is held
as well. This causes an AB-BA deadlock, because the kernfs built-in lock on
'kn->count' is also required inside kobject_del(); see the lockdep warning [1].

On the other hand, it isn't necessary to acquire q->sysfs_lock for
either blk_mq_unregister_dev() or elv_unregister_queue(), because
clearing the REGISTERED flag prevents stores to 'queue/scheduler'
from happening. Also, sysfs writes (store) are exclusive, so it is not
necessary to hold the lock for elv_unregister_queue() when it is
called on the elevator switching path.

Fix the issue by not holding q->sysfs_lock for blk_mq_unregister_dev() &
elv_unregister_queue().

Have you considered splitting sysfs_lock into multiple mutexes? Today it is very hard to verify the correctness of block layer code that uses sysfs_lock, because it has not been documented anywhere what that mutex protects. I think that mutex should be split into at least two mutexes: one that protects switching I/O schedulers and another that protects hctx->tags and hctx->sched_tags.

Thanks,

Bart.
