Hello Sergey,

On Wed, Apr 29, 2015 at 09:16:24AM +0900, Sergey Senozhatsky wrote:
> Hello,
> 
> Minchan, a quick question. just to avoid resends and to make sure that I'm
> not missing any better solution.
> 
> 
> lockdep is unhappy here:
> 
> > -static void zram_remove(struct zram *zram)
> > +static int zram_remove(struct zram *zram)
> >  {
> > -	pr_info("Removed device: %s\n", zram->disk->disk_name);
> > +	struct block_device *bdev;
> > +
> > +	bdev = bdget_disk(zram->disk, 0);
> > +	if (!bdev)
> > +		return -ENOMEM;
> > +
> > +	mutex_lock(&bdev->bd_mutex);
> > +	if (bdev->bd_openers) {
> > +		mutex_unlock(&bdev->bd_mutex);
> > +		return -EBUSY;
> > +	}
> > +
> >  	/*
> >  	 * Remove sysfs first, so no one will perform a disksize
> > -	 * store while we destroy the devices
> > +	 * store while we destroy the devices. This also helps during
> > +	 * zram_remove() -- device_reset() is the last holder of
> > +	 * ->init_lock.
> >  	 */
> >  	sysfs_remove_group(&disk_to_dev(zram->disk)->kobj,
> >  			&zram_disk_attr_group);
> >
> > +	/* Make sure all pending I/O is finished */
> > +	fsync_bdev(bdev);
> >  	zram_reset_device(zram);
> > +	mutex_unlock(&bdev->bd_mutex);
> > +
> > +	pr_info("Removed device: %s\n", zram->disk->disk_name);
> > +
> >  	idr_remove(&zram_index_idr, zram->disk->first_minor);
> >  	blk_cleanup_queue(zram->disk->queue);
> >  	del_gendisk(zram->disk);
> >  	put_disk(zram->disk);
> >  	kfree(zram);
> > +
> > +	return 0;
> >  }
> 
> lock ->bd_mutex
> if ->bd_openers {
> 	unlock ->bd_mutex
> 	return -EBUSY;
> }
> 
> sysfs_remove_group()
> ^^^^^^^^^ lockdep splat
> zram_reset_device()
> 	lock ->init_lock
> 	reset device
> 	unlock ->init_lock
> unlock ->bd_mutex
> ...
> kfree zram
> 
> 
> why did I do this:
> 
> sysfs_remove_group() turns zram_reset_device() into the last possible
> ->init_lock owner: there are no sysfs nodes before the final
> zram_reset_device(), so no concurrent nor later store()/show() sysfs
> handler can be called.
> it closes a number
> of race conditions, like:
> 
> 	CPU0			CPU1
> 	umount
> 	zram_remove()
> 	zram_reset_device()	disksize_store()
> 				mount
> 	kfree zram
> 
> or
> 
> 	CPU0			CPU1
> 	umount
> 	zram_remove()
> 	zram_reset_device()
> 				cat /sys/block/zram0/_any_sysfs_node_
> 	sysfs_remove_group()
> 	kfree zram		_any_sysfs_node_read()
> 
> 
> and so on. so removing sysfs group before zram_reset_device() makes sense.
> 
> at the same time we need to prevent `umount-zram_remove vs. mount' race
> and forbid zram_remove() on active device. so we check ->bd_openers and
> perform device reset under ->bd_mutex.
> 
Could you explain in detail the `umount-zram_remove vs. mount' race?
I guess it should be handled in the upper layer (e.g., VFS).
Anyway, I want to be more clear about that.

> 
> a quick solution I can think of is to do something like this:
> 
> sysfs_remove_group();
> lock ->bd_mutex
> if ->bd_openers {
> 	unlock ->bd_mutex
> 	create_sysfs_group()
> 	^^^^^^^^ return attrs back
> 	return -EBUSY
> }
> 
> zram_reset_device()
> 	lock ->init_lock
> 	reset device
> 	unlock ->init_lock
> unlock ->bd_mutex
> ...
> kfree zram
> 
> 
> iow, move sysfs_remove_group() out of the ->bd_mutex lock, but keep it
> before the ->bd_openers check & zram_reset_device().
> 
> I don't think that we can handle it with only ->init_lock. we need to
> unlock it at some point and kfree the zram that contains that lock, and
> we have no idea whether there are any lock owners or waiters when we
> kfree zram. removing the sysfs group and acquiring ->init_lock for write
> in zram_reset_device() guarantee that there will be no further
> ->init_lock owners and we can safely kfree after `unlock ->init_lock`.
> hence, sysfs_remove_group() must happen before zram_reset_device().
> 
> 
> create_sysfs_group() can potentially fail, which is a bit ugly, but the
> user will be able to umount the device and remove it anyway.
> 
> 
> what do you think?
> 
> 	-ss

-- 
Kind regards,
Minchan Kim
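
[Editor's note: put together as C, the reordering Sergey sketches above might look roughly like the following. This is an illustrative, compile-untested sketch against the zram driver, not a tested patch: create_sysfs_group() is a hypothetical helper that re-adds zram_disk_attr_group, and the bdput() calls (absent from the quoted diff) are assumed here to balance the bdget_disk() reference.]

```c
/* Sketch only -- not a compile-tested patch. */
static int zram_remove(struct zram *zram)
{
	struct block_device *bdev;

	bdev = bdget_disk(zram->disk, 0);
	if (!bdev)
		return -ENOMEM;

	/*
	 * Remove sysfs first, outside ->bd_mutex, so no store()/show()
	 * handler can grab ->init_lock after the final reset below and
	 * so that the ->bd_mutex vs. sysfs lockdep report goes away.
	 */
	sysfs_remove_group(&disk_to_dev(zram->disk)->kobj,
			&zram_disk_attr_group);

	mutex_lock(&bdev->bd_mutex);
	if (bdev->bd_openers) {
		mutex_unlock(&bdev->bd_mutex);
		/* device is busy: return the attrs back and bail out */
		create_sysfs_group(zram);	/* hypothetical helper */
		bdput(bdev);
		return -EBUSY;
	}

	/* Make sure all pending I/O is finished */
	fsync_bdev(bdev);
	zram_reset_device(zram);	/* last possible ->init_lock owner */
	mutex_unlock(&bdev->bd_mutex);
	bdput(bdev);

	pr_info("Removed device: %s\n", zram->disk->disk_name);

	idr_remove(&zram_index_idr, zram->disk->first_minor);
	blk_cleanup_queue(zram->disk->queue);
	del_gendisk(zram->disk);
	put_disk(zram->disk);
	kfree(zram);

	return 0;
}
```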