[PATCH] sd: Fix crash due to race when removing scsi disk

Howard Cochran Fri, 01 Jul 2016 09:17:37 -0700

This crash occurred while writing 1 to /sys/block/sda/device/delete at
the same instant that another process was closing the block device:


 BUG: unable to handle kernel NULL pointer dereference at 00000230
 IP: [<c138fa9c>] blk_get_backing_dev_info+0xc/0x20
 Oops: 0000 [#1] PREEMPT SMP
 Call Trace:
  [<c112da2a>] ? __filemap_fdatawrite_range+0x15a/0x180
  [<c112d9b5>] ? __filemap_fdatawrite_range+0xe5/0x180
  [<c112dae8>] filemap_write_and_wait+0x38/0x70
  [<c11b79b1>] fsync_bdev+0x41/0x50
  [<c13a4f7c>] invalidate_partition+0x1c/0x40
  [<c13a5d0f>] del_gendisk+0xcf/0x1c0
  [<c15c7143>] sd_remove+0x53/0xb0
  [<c157eaf0>] __device_release_driver+0x80/0x120
  [<c157ebad>] device_release_driver+0x1d/0x30
  [<c157e392>] bus_remove_device+0xb2/0xf0
  [<c157b45c>] device_del+0xec/0x1e0
  [<c13b6d88>] ? kobject_put+0x58/0xc0
  [<c15c12af>] __scsi_remove_device+0xaf/0xc0
  [<c15c12df>] scsi_remove_device+0x1f/0x30
  [<c15c131b>] sdev_store_delete+0x2b/0x40
  [<c15c12f0>] ? scsi_remove_device+0x30/0x30
  [<c157a87f>] dev_attr_store+0x1f/0x40
               ...
  [<c11829bc>] SyS_write+0x4c/0xb0
 EIP: [<c138fa9c>] blk_get_backing_dev_info+0xc/0x20 SS:ESP 0068:f5eb9d18

It is caused by this race: between the time Thread B's instance of
filemap_write_and_wait() has checked whether there are any pages to flush
and the time it dereferences bdev->bd_disk, Thread A can clear that pointer
in __blkdev_put().

Thread A:                             Thread B:
blkdev_close()                        sdev_store_delete()
  blkdev_put()                          sd_remove()
    __blkdev_put()                        del_gendisk()
      mutex_lock(bd_mutex);                 invalidate_partition()
        sync_blockdev()                       fsync_bdev()
          filemap_write_and_wait()              filemap_write_and_wait()
            if (mapping has pages)                if (mapping has pages)
              deref bdev->bd_disk (OK)
        Set bdev->bd_disk = NULL;
      mutex_unlock(bd_mutex);                       deref. bdev->bd_disk (BOOM!)

The "dereference bdev->bd_disk" occurs on this sub-chain:
filemap_write_and_wait()
  __filemap_fdatawrite_range()
    mapping_cap_writeback_dirty()
      inode_to_bdi()
        bdev_get_queue()
          return bdev->bd_disk->queue;
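
To make the window concrete, here is a minimal single-threaded userspace
sketch that replays the interleaving step by step. It is not kernel code:
the model_* types, model_bdev_get_queue() and replay_race() are
hypothetical stand-ins, and the model reports the NULL pointer instead of
faulting on it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for request_queue, gendisk and block_device. */
struct model_queue { int id; };
struct model_disk  { struct model_queue *queue; };
struct model_bdev  { struct model_disk *bd_disk; unsigned long nrpages; };

/* Models bdev_get_queue(), but returns NULL where the kernel would oops. */
static struct model_queue *model_bdev_get_queue(const struct model_bdev *bdev)
{
    if (bdev->bd_disk == NULL)
        return NULL;
    return bdev->bd_disk->queue;
}

/*
 * Replays Thread B's check-then-dereference sequence. If
 * thread_a_runs_in_window is nonzero, Thread A's teardown is replayed
 * between B's check and B's dereference.
 * Returns 1 if B would have dereferenced a NULL bd_disk, 0 otherwise.
 */
int replay_race(int thread_a_runs_in_window)
{
    struct model_queue q = { .id = 1 };
    struct model_disk disk = { .queue = &q };
    struct model_bdev bdev = { .bd_disk = &disk, .nrpages = 8 };

    /* Thread B: "if (mapping has pages)". */
    int b_saw_pages = bdev.nrpages > 0;
    assert(b_saw_pages); /* the model always starts with cached pages */

    /* Thread A: drops the pages and clears the pointer. */
    if (thread_a_runs_in_window) {
        bdev.nrpages = 0;
        bdev.bd_disk = NULL;
    }

    /* Thread B: acts on its (possibly stale) check. */
    if (b_saw_pages && model_bdev_get_queue(&bdev) == NULL)
        return 1;
    return 0;
}
```

Calling replay_race(1) models the crashing schedule and replay_race(0)
models the harmless one where Thread A does not run inside the window.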

The problem was introduced by de1414a654e6 ("fs: export inode_to_bdi and use
it in favor of mapping->backing_dev_info"). Before that change,
mapping_cap_writeback_dirty() directly retrieved the backing_dev_info from
the mapping rather than looking it up through
mapping->host->inode_dev->bdev->bd_disk->queue.

This was found while running a stress test on an ARM-based embedded system
which involved repeatedly shutting down many services simultaneously via
systemd isolate (thereby making it likely that "Thread B" was preempted for
a while just before it dereferenced bdev->bd_disk). I subsequently reproduced
this on vanilla Linux 4.6 in QEMU/x86.

This patch fixes the race by making sd_remove() hold the bd_mutex of the
whole-disk block device during the call to del_gendisk().
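
The locking idea can be sketched in userspace with pthreads. This is only
a model under stated assumptions, not the kernel implementation: the
model_* types and the closer(), remover() and run_locked_model() names are
hypothetical, and the model assumes the closing thread drops the cached
pages before it clears the pointer, both under the same mutex. With both
sides holding the mutex, the "has pages" check and the dereference can no
longer be separated by the teardown.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for gendisk and block_device. */
struct model_disk { int queue_id; };
struct model_bdev {
    pthread_mutex_t bd_mutex;
    struct model_disk *bd_disk;
    unsigned long nrpages;
    int null_derefs;            /* times B found pages but no disk */
};

static struct model_disk the_disk = { .queue_id = 1 };

/* Thread A: models __blkdev_put() tearing down under bd_mutex. */
static void *closer(void *arg)
{
    struct model_bdev *bdev = arg;

    pthread_mutex_lock(&bdev->bd_mutex);
    bdev->nrpages = 0;          /* assumption: pages dropped first */
    bdev->bd_disk = NULL;
    pthread_mutex_unlock(&bdev->bd_mutex);
    return NULL;
}

/* Thread B: models del_gendisk() called with bd_mutex held. */
static void *remover(void *arg)
{
    struct model_bdev *bdev = arg;

    pthread_mutex_lock(&bdev->bd_mutex);
    if (bdev->nrpages > 0) {
        if (bdev->bd_disk == NULL) {
            bdev->null_derefs++;        /* would be the oops */
        } else {
            volatile int id = bdev->bd_disk->queue_id;
            (void)id;
        }
    }
    pthread_mutex_unlock(&bdev->bd_mutex);
    return NULL;
}

/* Runs the two threads `rounds` times; returns the number of would-be
 * NULL dereferences, or -1 if a thread could not be created. */
int run_locked_model(int rounds)
{
    int bad = 0;

    assert(rounds > 0);
    for (int i = 0; i < rounds; i++) {
        struct model_bdev bdev = { .bd_disk = &the_disk, .nrpages = 8 };
        pthread_t a, b;

        pthread_mutex_init(&bdev.bd_mutex, NULL);
        if (pthread_create(&a, NULL, closer, &bdev) != 0)
            return -1;
        if (pthread_create(&b, NULL, remover, &bdev) != 0) {
            pthread_join(a, NULL);
            return -1;
        }
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        pthread_mutex_destroy(&bdev.bd_mutex);
        bad += bdev.null_derefs;
    }
    return bad;
}
```

Whichever thread takes the mutex first, the remover either sees pages
together with a valid disk pointer, or sees no pages and skips the
dereference, so run_locked_model() should report zero would-be faults.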

Fixes: de1414a654e6 ("fs: export inode_to_bdi and use it in favor of mapping->backing_dev_info")
Signed-off-by: Howard Cochran <[email protected]>
Cc: Howard Cochran <[email protected]>
Cc: [email protected]
Cc: Christoph Hellwig <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Martin K. Petersen <[email protected]>
---
 drivers/scsi/sd.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index f52b74c..0f53925 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3126,6 +3126,7 @@ static int sd_remove(struct device *dev)
 {
        struct scsi_disk *sdkp;
        dev_t devt;
+       struct block_device *bdev;
 
        sdkp = dev_get_drvdata(dev);
        devt = disk_devt(sdkp->disk);
@@ -3134,7 +3135,13 @@ static int sd_remove(struct device *dev)
        async_synchronize_full_domain(&scsi_sd_pm_domain);
        async_synchronize_full_domain(&scsi_sd_probe_domain);
        device_del(&sdkp->dev);
+
+       bdev = bdget_disk(sdkp->disk, 0);
+       mutex_lock(&bdev->bd_mutex);
        del_gendisk(sdkp->disk);
+       mutex_unlock(&bdev->bd_mutex);
+       bdput(bdev);
+
        sd_shutdown(dev);
 
        blk_register_region(devt, SD_MINORS, NULL,
-- 
1.9.1

