On Wed, 2018-01-17 at 21:04 +0800, Ming Lei wrote:
> On Wed, Jan 17, 2018 at 10:17 AM, Bart Van Assche
> <[email protected]> wrote:
> > Sorry but I think what you wrote is wrong. kobject_del(&q->kobj) waits
> > until all
> > ongoing sysfs callback methods, including elv_iosched_store(), have
> > finished and
> > prevents that any new elv_iosched_store() calls start. That is why I think
> > the
> > above changes do not reintroduce the race fixed by commit e9a823fb34a8
> > ("block:
> > fix warning when I/O elevator is changed as request_queue is being
> > removed").
>
> But your patch basically reverts commit e9a823fb34a8, and I just saw the
> warning
> again after applying your patch in my stress test of switching elelvato:
>
> [ 225.999505] kobject_add_internal failed for iosched (error: -2 parent:
> queue)
> [ 225.999521] WARNING: CPU: 0 PID: 12707 at lib/kobject.c:244
> kobject_add_internal+0x236/0x24c
> [ 225.999566] Call Trace:
> [ 225.999570] kobject_add+0x9e/0xc5
> [ 225.999573] elv_register_queue+0x35/0xa2
> [ 225.999575] elevator_switch+0x7a/0x1a4
> [ 225.999577] elv_iosched_store+0xd2/0x103
> [ 225.999579] queue_attr_store+0x66/0x82
> [ 225.999581] kernfs_fop_write+0xf3/0x135
> [ 225.999583] __vfs_write+0x31/0x142
> [ 225.999591] vfs_write+0xcb/0x16e
> [ 225.999593] SyS_write+0x5d/0xab
> [ 225.999595] entry_SYSCALL_64_fastpath+0x1a/0x7d
The above call stack means that sysfs_create_dir_ns() returned -ENOENT because
kobj->parent->sd was NULL (where kobj is the elevator kernel object). That's
probably becase sysfs_remove_dir() clears the sd pointer before performing the
actual removal. From fs/sysfs/dir.c:
spin_lock(&sysfs_symlink_target_lock);
kobj->sd = NULL;
spin_unlock(&sysfs_symlink_target_lock);
if (kn) {
WARN_ON_ONCE(kernfs_type(kn) != KERNFS_DIR);
kernfs_remove(kn);
}
In the past these two actions were performed in the opposite order. See also
commit aecdcedaab49 ("sysfs: implement kobj_sysfs_assoc_lock"). Anyway, I will
rework this patch.
Bart.