[PATCH v2] scsi: ufshcd: fix possible unclocked register access
A vendor-specific setup_clocks callback may require the clocks managed by
the ufshcd driver to be ON. If the vendor-specific setup_clocks callback is
called while the required clocks are turned off, it could result in
unclocked register access.

To prevent possible unclocked register access, this change makes sure that
the required clocks are enabled whenever the vendor-specific setup_clocks
callback is called.

Signed-off-by: Subhash Jadavani
---
Changes from v1:
* Don't call ufshcd_vops_setup_clocks() again for clock off
---
 drivers/scsi/ufs/ufshcd.c | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 05c7456..c1a77d3 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5389,6 +5389,17 @@ static int __ufshcd_setup_clocks(struct ufs_hba *hba, bool on,
 	if (!head || list_empty(head))
 		goto out;
 
+	/*
+	 * vendor specific setup_clocks ops may depend on clocks managed by
+	 * this standard driver hence call the vendor specific setup_clocks
+	 * before disabling the clocks managed here.
+	 */
+	if (!on) {
+		ret = ufshcd_vops_setup_clocks(hba, on);
+		if (ret)
+			return ret;
+	}
+
 	list_for_each_entry(clki, head, list) {
 		if (!IS_ERR_OR_NULL(clki->clk)) {
 			if (skip_ref_clk && !strcmp(clki->name, "ref_clk"))
@@ -5410,7 +5421,16 @@ static int __ufshcd_setup_clocks(struct ufs_hba *hba, bool on,
 		}
 	}
 
-	ret = ufshcd_vops_setup_clocks(hba, on);
+	/*
+	 * vendor specific setup_clocks ops may depend on clocks managed by
+	 * this standard driver hence call the vendor specific setup_clocks
+	 * after enabling the clocks managed here.
+	 */
+	if (on) {
+		ret = ufshcd_vops_setup_clocks(hba, on);
+		if (ret)
+			return ret;
+	}
 out:
 	if (ret) {
 		list_for_each_entry(clki, head, list) {
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
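
The ordering rule in this patch can be illustrated outside the kernel. The following is only a minimal user-space sketch, not the ufshcd code: struct demo_hba, demo_vendor_setup_clocks() and demo_setup_clocks() are hypothetical names, and the "clock" is a plain boolean. It shows the vendor hook being called before the core clocks go off and after they come on, so the hook never runs unclocked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for driver state; not the real struct ufs_hba. */
struct demo_hba {
	bool core_clocks_on;	/* clocks managed by the "standard" driver */
	int unclocked_access;	/* vendor calls made while clocks were off */
};

/* Hypothetical vendor hook: it touches registers, so it needs the clocks. */
static int demo_vendor_setup_clocks(struct demo_hba *hba, bool on)
{
	(void)on;
	if (!hba->core_clocks_on)
		hba->unclocked_access++;
	return 0;
}

static int demo_setup_clocks(struct demo_hba *hba, bool on)
{
	int ret;

	assert(hba != NULL);

	/* Turning off: call the vendor hook while the clocks still run. */
	if (!on) {
		ret = demo_vendor_setup_clocks(hba, on);
		if (ret)
			return ret;
	}

	hba->core_clocks_on = on;

	/* Turning on: call the vendor hook once the clocks are running. */
	if (on) {
		ret = demo_vendor_setup_clocks(hba, on);
		if (ret)
			return ret;
	}

	return 0;
}
```

With this ordering, an on/off cycle leaves unclocked_access at zero; calling the hook unconditionally after the clock change would count one unclocked call on the off path.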
[PATCH v7 1/7] block: Add 'zoned' queue limit
Add the zoned queue limit to indicate the zoning model of a block device.
Defined values are 0 (BLK_ZONED_NONE) for regular block devices,
1 (BLK_ZONED_HA) for host-aware zone block devices and 2 (BLK_ZONED_HM)
for host-managed zone block devices. The standards defined drive managed
model is not defined here since these block devices do not provide any
command for accessing zone information. Drive managed model devices will
be reported as BLK_ZONED_NONE.

The helper functions blk_queue_zoned_model and bdev_zoned_model return
the zoned limit and the functions blk_queue_is_zoned and bdev_is_zoned
return a boolean for callers to test if a block device is zoned.

The zoned attribute is also exported as a string to applications via
sysfs. BLK_ZONED_NONE shows as "none", BLK_ZONED_HA as "host-aware" and
BLK_ZONED_HM as "host-managed".

Signed-off-by: Damien Le Moal
Reviewed-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Shaun Tancheff
Tested-by: Shaun Tancheff
---
 Documentation/ABI/testing/sysfs-block | 16
 block/blk-settings.c                  |  1 +
 block/blk-sysfs.c                     | 18 ++
 include/linux/blkdev.h                | 47 +++
 4 files changed, 82 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block
index 71d184d..75a5055 100644
--- a/Documentation/ABI/testing/sysfs-block
+++ b/Documentation/ABI/testing/sysfs-block
@@ -235,3 +235,19 @@ Description:
 		write_same_max_bytes is 0, write same is not supported
 		by the device.
 
+What:		/sys/block//queue/zoned
+Date:		September 2016
+Contact:	Damien Le Moal
+Description:
+		zoned indicates if the device is a zoned block device
+		and the zone model of the device if it is indeed zoned.
+		The possible values indicated by zoned are "none" for
+		regular block devices and "host-aware" or "host-managed"
+		for zoned block devices. The characteristics of
+		host-aware and host-managed zoned block devices are
+		described in the ZBC (Zoned Block Commands) and ZAC
+		(Zoned Device ATA Command Set) standards. These standards
+		also define the "drive-managed" zone model. However,
+		since drive-managed zoned block devices do not support
+		zone commands, they will be treated as regular block
+		devices and zoned will report "none".
diff --git a/block/blk-settings.c b/block/blk-settings.c
index f679ae1..b1d5b7f 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -107,6 +107,7 @@ void blk_set_default_limits(struct queue_limits *lim)
 	lim->io_opt = 0;
 	lim->misaligned = 0;
 	lim->cluster = 1;
+	lim->zoned = BLK_ZONED_NONE;
 }
 EXPORT_SYMBOL(blk_set_default_limits);
 
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 9cc8d7c..ff9cd9c 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -257,6 +257,18 @@ QUEUE_SYSFS_BIT_FNS(random, ADD_RANDOM, 0);
 QUEUE_SYSFS_BIT_FNS(iostats, IO_STAT, 0);
 #undef QUEUE_SYSFS_BIT_FNS
 
+static ssize_t queue_zoned_show(struct request_queue *q, char *page)
+{
+	switch (blk_queue_zoned_model(q)) {
+	case BLK_ZONED_HA:
+		return sprintf(page, "host-aware\n");
+	case BLK_ZONED_HM:
+		return sprintf(page, "host-managed\n");
+	default:
+		return sprintf(page, "none\n");
+	}
+}
+
 static ssize_t queue_nomerges_show(struct request_queue *q, char *page)
 {
 	return queue_var_show((blk_queue_nomerges(q) << 1) |
@@ -485,6 +497,11 @@ static struct queue_sysfs_entry queue_nonrot_entry = {
 	.store = queue_store_nonrot,
 };
 
+static struct queue_sysfs_entry queue_zoned_entry = {
+	.attr = {.name = "zoned", .mode = S_IRUGO },
+	.show = queue_zoned_show,
+};
+
 static struct queue_sysfs_entry queue_nomerges_entry = {
 	.attr = {.name = "nomerges", .mode = S_IRUGO | S_IWUSR },
 	.show = queue_nomerges_show,
@@ -546,6 +563,7 @@ static struct attribute *default_attrs[] = {
 	&queue_discard_zeroes_data_entry.attr,
 	&queue_write_same_max_entry.attr,
 	&queue_nonrot_entry.attr,
+	&queue_zoned_entry.attr,
 	&queue_nomerges_entry.attr,
 	&queue_rq_affinity_entry.attr,
 	&queue_iostats_entry.attr,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index c47c358..f19e16b 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -261,6 +261,15 @@ struct blk_queue_tag {
 #define BLK_SCSI_MAX_CMDS	(256)
 #define BLK_SCSI_CMD_PER_LONG	(BLK_SCSI_MAX_CMDS / (sizeof(long) * 8))
 
+/*
+ * Zoned block device models (zoned limit).
+ */
+enum blk_zoned_model {
+	BLK_ZONED_NONE,	/* Regular block device
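
The mapping from zone model to sysfs string described in the commit message can be sketched in plain C. This is an illustration only: enum demo_zoned_model and the demo_* helpers are hypothetical user-space names that mirror the values stated above (0 none, 1 host-aware, 2 host-managed); they are not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical mirror of the three models named in the commit message. */
enum demo_zoned_model {
	DEMO_ZONED_NONE,	/* regular (or drive-managed) device: 0 */
	DEMO_ZONED_HA,		/* host-aware: 1 */
	DEMO_ZONED_HM,		/* host-managed: 2 */
};

/* String that a sysfs-style "zoned" attribute would show for a model. */
static const char *demo_zoned_model_name(enum demo_zoned_model model)
{
	switch (model) {
	case DEMO_ZONED_HA:
		return "host-aware";
	case DEMO_ZONED_HM:
		return "host-managed";
	default:
		return "none";
	}
}

/* Boolean test in the spirit of blk_queue_is_zoned()/bdev_is_zoned(). */
static bool demo_is_zoned(enum demo_zoned_model model)
{
	return model == DEMO_ZONED_HA || model == DEMO_ZONED_HM;
}
```

Falling back to "none" in the default case matches the stated intent that anything without zone commands is treated as a regular block device.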
[PATCH v7 5/7] block: Implement support for zoned block devices
From: Hannes Reinecke

Implement zoned block device zone information reporting and reset. Zone
information is reported as struct blk_zone. This implementation does not
differentiate between host-aware and host-managed device models and is
valid for both. Two functions are provided: blkdev_report_zones for
discovering the zone configuration of a zoned block device, and
blkdev_reset_zones for resetting the write pointer of sequential zones.

The helper functions blk_queue_zone_size and bdev_zone_size are also
provided for, as the names suggest, obtaining the zone size (in 512B
sectors) of the zones of the device.

Signed-off-by: Hannes Reinecke

[Damien:
 * Removed the zone cache
 * Implement report zones operation based on earlier proposal by
   Shaun Tancheff ]
Signed-off-by: Damien Le Moal
Reviewed-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Shaun Tancheff
Tested-by: Shaun Tancheff
---
 block/Kconfig                 |   8 ++
 block/Makefile                |   2 +-
 block/blk-zoned.c             | 257 ++
 include/linux/blkdev.h        |  31 +
 include/uapi/linux/Kbuild     |   1 +
 include/uapi/linux/blkzoned.h | 103 +
 6 files changed, 401 insertions(+), 1 deletion(-)
 create mode 100644 block/blk-zoned.c
 create mode 100644 include/uapi/linux/blkzoned.h

diff --git a/block/Kconfig b/block/Kconfig
index 5136ad4..7bb9bf8 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -89,6 +89,14 @@ config BLK_DEV_INTEGRITY
 	T10/SCSI Data Integrity Field or the T13/ATA External Path
 	Protection. If in doubt, say N.
 
+config BLK_DEV_ZONED
+	bool "Zoned block device support"
+	---help---
+	Block layer zoned block device support. This option enables
+	support for ZAC/ZBC host-managed and host-aware zoned block devices.
+
+	Say yes here if you have a ZAC or ZBC storage device.
+
 config BLK_DEV_THROTTLING
 	bool "Block layer bio throttling support"
 	depends on BLK_CGROUP=y
diff --git a/block/Makefile b/block/Makefile
index 9eda232..4676969 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -22,4 +22,4 @@ obj-$(CONFIG_IOSCHED_CFQ)	+= cfq-iosched.o
 obj-$(CONFIG_BLOCK_COMPAT)	+= compat_ioctl.o
 obj-$(CONFIG_BLK_CMDLINE_PARSER)	+= cmdline-parser.o
 obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o t10-pi.o
-
+obj-$(CONFIG_BLK_DEV_ZONED)	+= blk-zoned.o
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
new file mode 100644
index 000..1603573
--- /dev/null
+++ b/block/blk-zoned.c
@@ -0,0 +1,257 @@
+/*
+ * Zoned block device handling
+ *
+ * Copyright (c) 2015, Hannes Reinecke
+ * Copyright (c) 2015, SUSE Linux GmbH
+ *
+ * Copyright (c) 2016, Damien Le Moal
+ * Copyright (c) 2016, Western Digital
+ */
+
+#include
+#include
+#include
+#include
+
+static inline sector_t blk_zone_start(struct request_queue *q,
+				      sector_t sector)
+{
+	sector_t zone_mask = blk_queue_zone_size(q) - 1;
+
+	return sector & ~zone_mask;
+}
+
+/*
+ * Check that a zone report belongs to the partition.
+ * If yes, fix its start sector and write pointer, copy it in the
+ * zone information array and return true. Return false otherwise.
+ */
+static bool blkdev_report_zone(struct block_device *bdev,
+			       struct blk_zone *rep,
+			       struct blk_zone *zone)
+{
+	sector_t offset = get_start_sect(bdev);
+
+	if (rep->start < offset)
+		return false;
+
+	rep->start -= offset;
+	if (rep->start + rep->len > bdev->bd_part->nr_sects)
+		return false;
+
+	if (rep->type == BLK_ZONE_TYPE_CONVENTIONAL)
+		rep->wp = rep->start + rep->len;
+	else
+		rep->wp -= offset;
+	memcpy(zone, rep, sizeof(struct blk_zone));
+
+	return true;
+}
+
+/**
+ * blkdev_report_zones - Get zones information
+ * @bdev:	Target block device
+ * @sector:	Sector from which to report zones
+ * @zones:	Array of zone structures where to return the zones information
+ * @nr_zones:	Number of zone structures in the zone array
+ * @gfp_mask:	Memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *    Get zone information starting from the zone containing @sector.
+ *    The number of zone information reported may be less than the number
+ *    requested by @nr_zones. The number of zones actually reported is
+ *    returned in @nr_zones.
+ */
+int blkdev_report_zones(struct block_device *bdev,
+			sector_t sector,
+			struct blk_zone *zones,
+			unsigned int *nr_zones,
+			gfp_t gfp_mask)
+{
+	struct request_queue *q =
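
The mask arithmetic in blk_zone_start() rounds a sector down to the start of its zone, which only works when the zone size is a power of two. The sketch below is a user-space illustration under that assumption; demo_zone_start() is a hypothetical name, and the example zone size of 524288 sectors (256 MiB in 512B sectors) is chosen for illustration, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round a sector down to the first sector of its zone.
 * Assumes zone_sectors is a non-zero power of two, so that
 * (zone_sectors - 1) is a mask of the in-zone offset bits.
 */
static uint64_t demo_zone_start(uint64_t sector, uint64_t zone_sectors)
{
	uint64_t zone_mask;

	assert(zone_sectors != 0 && (zone_sectors & (zone_sectors - 1)) == 0);

	zone_mask = zone_sectors - 1;
	return sector & ~zone_mask;
}
```

For example, with 524288-sector zones, sector 524388 lies 100 sectors into the second zone, so the result is 524288; any sector below 524288 maps to 0.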
[PATCH v7 2/7] blk-sysfs: Add 'chunk_sectors' to sysfs attributes
From: Hannes Reinecke

The queue limits already have a 'chunk_sectors' setting, so we should be
presenting it via sysfs.

Signed-off-by: Hannes Reinecke

[Damien: Updated Documentation/ABI/testing/sysfs-block]
Signed-off-by: Damien Le Moal
Reviewed-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Shaun Tancheff
Tested-by: Shaun Tancheff
---
 Documentation/ABI/testing/sysfs-block | 13 +
 block/blk-sysfs.c                     | 11 +++
 2 files changed, 24 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block
index 75a5055..ee2d5cd 100644
--- a/Documentation/ABI/testing/sysfs-block
+++ b/Documentation/ABI/testing/sysfs-block
@@ -251,3 +251,16 @@ Description:
 		since drive-managed zoned block devices do not support
 		zone commands, they will be treated as regular block
 		devices and zoned will report "none".
+
+What:		/sys/block//queue/chunk_sectors
+Date:		September 2016
+Contact:	Hannes Reinecke
+Description:
+		chunk_sectors has different meaning depending on the type
+		of the disk. For a RAID device (dm-raid), chunk_sectors
+		indicates the size in 512B sectors of the RAID volume
+		stripe segment. For a zoned block device, either
+		host-aware or host-managed, chunk_sectors indicates the
+		size of 512B sectors of the zones of the device, with
+		the eventual exception of the last zone of the device
+		which may be smaller.
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index ff9cd9c..488c2e2 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -130,6 +130,11 @@ static ssize_t queue_physical_block_size_show(struct request_queue *q, char *pag
 	return queue_var_show(queue_physical_block_size(q), page);
 }
 
+static ssize_t queue_chunk_sectors_show(struct request_queue *q, char *page)
+{
+	return queue_var_show(q->limits.chunk_sectors, page);
+}
+
 static ssize_t queue_io_min_show(struct request_queue *q, char *page)
 {
 	return queue_var_show(queue_io_min(q), page);
@@ -455,6 +460,11 @@ static struct queue_sysfs_entry queue_physical_block_size_entry = {
 	.show = queue_physical_block_size_show,
 };
 
+static struct queue_sysfs_entry queue_chunk_sectors_entry = {
+	.attr = {.name = "chunk_sectors", .mode = S_IRUGO },
+	.show = queue_chunk_sectors_show,
+};
+
 static struct queue_sysfs_entry queue_io_min_entry = {
 	.attr = {.name = "minimum_io_size", .mode = S_IRUGO },
 	.show = queue_io_min_show,
@@ -555,6 +565,7 @@ static struct attribute *default_attrs[] = {
 	&queue_hw_sector_size_entry.attr,
 	&queue_logical_block_size_entry.attr,
 	&queue_physical_block_size_entry.attr,
+	&queue_chunk_sectors_entry.attr,
 	&queue_io_min_entry.attr,
 	&queue_io_opt_entry.attr,
 	&queue_discard_granularity_entry.attr,
-- 
2.7.4
[PATCH v7 3/7] block: update chunk_sectors in blk_stack_limits()
From: Hannes Reinecke

Signed-off-by: Hannes Reinecke
Signed-off-by: Damien Le Moal
Reviewed-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Shaun Tancheff
Tested-by: Shaun Tancheff
---
 block/blk-settings.c | 4
 1 file changed, 4 insertions(+)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index b1d5b7f..55369a6 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -631,6 +631,10 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
 			t->discard_granularity;
 	}
 
+	if (b->chunk_sectors)
+		t->chunk_sectors = min_not_zero(t->chunk_sectors,
+						b->chunk_sectors);
+
 	return ret;
 }
 EXPORT_SYMBOL(blk_stack_limits);
-- 
2.7.4
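
This patch has no commit message, so the effect of the stacking rule is easiest to see with numbers. The sketch below is a user-space approximation: demo_min_not_zero() and demo_stack_chunk_sectors() are hypothetical helpers written to behave like the min_not_zero() usage in the hunk above (zero means "no limit set"), not the kernel macro itself.

```c
#include <assert.h>

/* Smaller of two limits, where 0 means "unset" and never wins. */
static unsigned int demo_min_not_zero(unsigned int a, unsigned int b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/* Fold the bottom device's chunk_sectors into the stacked (top) limit. */
static void demo_stack_chunk_sectors(unsigned int *top, unsigned int bottom)
{
	if (bottom)
		*top = demo_min_not_zero(*top, bottom);
}
```

A top limit of 0 stacked on a bottom device with 524288 becomes 524288; a top limit of 1024 stays 1024; and a bottom value of 0 leaves the top limit unchanged.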
[PATCH v7 6/7] sd: Implement support for ZBC devices
From: Hannes Reinecke

Implement ZBC support functions to setup zoned disks, both
host-managed and host-aware models. Only zoned disks that satisfy
the following conditions are supported:
1) All zones are the same size, with the exception of an eventual
   last smaller runt zone.
2) For host-managed disks, reads are unrestricted (reads are not
   failed due to zone or write pointer alignment constraints).
Zoned disks that do not satisfy these 2 conditions are setup with
a capacity of 0 to prevent their use.

The function sd_zbc_read_zones, called from sd_revalidate_disk,
checks that the device satisfies the above two constraints. This
function may also change the disk capacity previously set by
sd_read_capacity for devices reporting only the capacity of
conventional zones at the beginning of the LBA range (i.e. devices
reporting rc_basis set to 0).

The capacity message output was moved out of sd_read_capacity into
a new function sd_print_capacity to include this eventual capacity
change by sd_zbc_read_zones. This new function also includes a call
to sd_zbc_print_zones to display the number of zones and zone size
of the device.

Signed-off-by: Hannes Reinecke

[Damien:
 * Removed zone cache support
 * Removed mapping of discard to reset write pointer command
 * Modified sd_zbc_read_zones to include checks that the device
   satisfies the kernel constraints
 * Implemented REPORT ZONES setup and post-processing based on code
   from Shaun Tancheff
 * Removed confusing use of 512B sector units in functions interface]
Signed-off-by: Damien Le Moal
Reviewed-by: Christoph Hellwig
Reviewed-by: Shaun Tancheff
Tested-by: Shaun Tancheff
---
 drivers/scsi/Makefile     |   1 +
 drivers/scsi/sd.c         | 148 ---
 drivers/scsi/sd.h         |  70 +
 drivers/scsi/sd_zbc.c     | 638 ++
 include/scsi/scsi_proto.h |  17 ++
 5 files changed, 839 insertions(+), 35 deletions(-)
 create mode 100644 drivers/scsi/sd_zbc.c

diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index d539798..fabcb6d 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -179,6 +179,7 @@ hv_storvsc-y	:= storvsc_drv.o
 
 sd_mod-objs	:= sd.o
 sd_mod-$(CONFIG_BLK_DEV_INTEGRITY) += sd_dif.o
+sd_mod-$(CONFIG_BLK_DEV_ZONED) += sd_zbc.o
 
 sr_mod-objs	:= sr.o sr_ioctl.o sr_vendor.o
 ncr53c8xx-flags-$(CONFIG_SCSI_ZALON) \
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d3e852a..e53d958 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -92,6 +92,7 @@ MODULE_ALIAS_BLOCKDEV_MAJOR(SCSI_DISK15_MAJOR);
 MODULE_ALIAS_SCSI_DEVICE(TYPE_DISK);
 MODULE_ALIAS_SCSI_DEVICE(TYPE_MOD);
 MODULE_ALIAS_SCSI_DEVICE(TYPE_RBC);
+MODULE_ALIAS_SCSI_DEVICE(TYPE_ZBC);
 
 #if !defined(CONFIG_DEBUG_BLOCK_EXT_DEVT)
 #define SD_MINORS	16
@@ -162,7 +163,7 @@ cache_type_store(struct device *dev, struct device_attribute *attr,
 	static const char temp[] = "temporary ";
 	int len;
 
-	if (sdp->type != TYPE_DISK)
+	if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
 		/* no cache control on RBC devices; theoretically they
 		 * can do it, but there's probably so many exceptions
 		 * it's not worth the risk */
@@ -261,7 +262,7 @@ allow_restart_store(struct device *dev, struct device_attribute *attr,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
-	if (sdp->type != TYPE_DISK)
+	if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
 		return -EINVAL;
 
 	sdp->allow_restart = simple_strtoul(buf, NULL, 10);
@@ -391,6 +392,11 @@ provisioning_mode_store(struct device *dev, struct device_attribute *attr,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
+	if (sd_is_zoned(sdkp)) {
+		sd_config_discard(sdkp, SD_LBP_DISABLE);
+		return count;
+	}
+
 	if (sdp->type != TYPE_DISK)
 		return -EINVAL;
 
@@ -458,7 +464,7 @@ max_write_same_blocks_store(struct device *dev, struct device_attribute *attr,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
-	if (sdp->type != TYPE_DISK)
+	if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
 		return -EINVAL;
 
 	err = kstrtoul(buf, 10, );
@@ -843,6 +849,12 @@ static int sd_setup_write_same_cmnd(struct scsi_cmnd *cmd)
 
 	BUG_ON(bio_offset(bio) || bio_iovec(bio).bv_len != sdp->sector_size);
 
+	if (sd_is_zoned(sdkp)) {
+		ret = sd_zbc_setup_write_cmnd(cmd);
+		if (ret != BLKPREP_OK)
+			return ret;
+	}
+
 	sector >>= ilog2(sdp->sector_size) - 9;
 	nr_sectors >>= ilog2(sdp->sector_size) - 9;
 
@@ -900,19
[PATCH v7 7/7] blk-zoned: implement ioctls
From: Shaun Tancheff

Adds the new BLKREPORTZONE and BLKRESETZONE ioctls for respectively
obtaining the zone configuration of a zoned block device and resetting
the write pointer of sequential zones of a zoned block device.

The BLKREPORTZONE ioctl maps directly to a single call of the function
blkdev_report_zones. The zone information result is passed as an array
of struct blk_zone identical to the structure used internally for
processing the REQ_OP_ZONE_REPORT operation. The BLKRESETZONE ioctl
maps to a single call of the blkdev_reset_zones function.

Signed-off-by: Shaun Tancheff
Signed-off-by: Damien Le Moal
Reviewed-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
---
 block/blk-zoned.c             | 93 +++
 block/ioctl.c                 |  4 ++
 include/linux/blkdev.h        | 21 ++
 include/uapi/linux/blkzoned.h | 40 +++
 include/uapi/linux/fs.h       |  4 ++
 5 files changed, 162 insertions(+)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 1603573..667f95d 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -255,3 +255,96 @@ int blkdev_reset_zones(struct block_device *bdev,
 	return 0;
 }
 EXPORT_SYMBOL_GPL(blkdev_reset_zones);
+
+/**
+ * BLKREPORTZONE ioctl processing.
+ * Called from blkdev_ioctl.
+ */
+int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
+			      unsigned int cmd, unsigned long arg)
+{
+	void __user *argp = (void __user *)arg;
+	struct request_queue *q;
+	struct blk_zone_report rep;
+	struct blk_zone *zones;
+	int ret;
+
+	if (!argp)
+		return -EINVAL;
+
+	q = bdev_get_queue(bdev);
+	if (!q)
+		return -ENXIO;
+
+	if (!blk_queue_is_zoned(q))
+		return -ENOTTY;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
+	if (copy_from_user(&rep, argp, sizeof(struct blk_zone_report)))
+		return -EFAULT;
+
+	if (!rep.nr_zones)
+		return -EINVAL;
+
+	zones = kcalloc(rep.nr_zones, sizeof(struct blk_zone), GFP_KERNEL);
+	if (!zones)
+		return -ENOMEM;
+
+	ret = blkdev_report_zones(bdev, rep.sector,
+				  zones, &rep.nr_zones,
+				  GFP_KERNEL);
+	if (ret)
+		goto out;
+
+	if (copy_to_user(argp, &rep, sizeof(struct blk_zone_report))) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	if (rep.nr_zones) {
+		if (copy_to_user(argp + sizeof(struct blk_zone_report), zones,
+				 sizeof(struct blk_zone) * rep.nr_zones))
+			ret = -EFAULT;
+	}
+
+ out:
+	kfree(zones);
+
+	return ret;
+}
+
+/**
+ * BLKRESETZONE ioctl processing.
+ * Called from blkdev_ioctl.
+ */
+int blkdev_reset_zones_ioctl(struct block_device *bdev, fmode_t mode,
+			     unsigned int cmd, unsigned long arg)
+{
+	void __user *argp = (void __user *)arg;
+	struct request_queue *q;
+	struct blk_zone_range zrange;
+
+	if (!argp)
+		return -EINVAL;
+
+	q = bdev_get_queue(bdev);
+	if (!q)
+		return -ENXIO;
+
+	if (!blk_queue_is_zoned(q))
+		return -ENOTTY;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
+	if (!(mode & FMODE_WRITE))
+		return -EBADF;
+
+	if (copy_from_user(&zrange, argp, sizeof(struct blk_zone_range)))
+		return -EFAULT;
+
+	return blkdev_reset_zones(bdev, zrange.sector, zrange.nr_sectors,
+				  GFP_KERNEL);
+}
diff --git a/block/ioctl.c b/block/ioctl.c
index ed2397f..448f78a 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -513,6 +513,10 @@ int blkdev_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
 				BLKDEV_DISCARD_SECURE);
 	case BLKZEROOUT:
 		return blk_ioctl_zeroout(bdev, mode, arg);
+	case BLKREPORTZONE:
+		return blkdev_report_zones_ioctl(bdev, mode, cmd, arg);
+	case BLKRESETZONE:
+		return blkdev_reset_zones_ioctl(bdev, mode, cmd, arg);
 	case HDIO_GETGEO:
 		return blkdev_getgeo(bdev, argp);
 	case BLKRAGET:
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 252043f..90097dd 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -316,6 +316,27 @@ extern int blkdev_report_zones(struct block_device *bdev,
 extern int blkdev_reset_zones(struct block_device *bdev, sector_t sectors,
 			      sector_t nr_sectors, gfp_t gfp_mask);
 
+extern int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
+
[PATCH v7 4/7] block: Define zoned block device operations
From: Shaun Tancheff

Define REQ_OP_ZONE_REPORT and REQ_OP_ZONE_RESET for handling zones of
host-managed and host-aware zoned block devices. With these two
new operations, the total number of operations defined reaches 8 and
still fits with the 3 bits definition of REQ_OP_BITS.

Signed-off-by: Shaun Tancheff
Signed-off-by: Damien Le Moal
Reviewed-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
---
 block/blk-core.c          | 4
 include/linux/blk_types.h | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 14d7c07..e4eda5d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1941,6 +1941,10 @@ generic_make_request_checks(struct bio *bio)
 	case REQ_OP_WRITE_SAME:
 		if (!bdev_write_same(bio->bi_bdev))
 			goto not_supported;
+	case REQ_OP_ZONE_REPORT:
+	case REQ_OP_ZONE_RESET:
+		if (!bdev_is_zoned(bio->bi_bdev))
+			goto not_supported;
 		break;
 	default:
 		break;
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index cd395ec..dd50dce 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -243,6 +243,8 @@ enum req_op {
 	REQ_OP_SECURE_ERASE,	/* request to securely erase sectors */
 	REQ_OP_WRITE_SAME,	/* write same block many times */
 	REQ_OP_FLUSH,		/* request for cache flush */
+	REQ_OP_ZONE_REPORT,	/* Get zone information */
+	REQ_OP_ZONE_RESET,	/* Reset a zone write pointer */
 };
 
 #define REQ_OP_BITS 3
-- 
2.7.4
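
The "8 operations fit in 3 bits" claim and the per-operation support check can be sketched in stand-alone C. Everything here is a hypothetical mirror for illustration: the demo_* names are not kernel symbols, and the first three enumerators (read, write, discard) are an assumption, since the hunk above only shows the last five. In this sketch each case ends in its own break, so a write-same request is only subjected to the write-same check and a zone request only to the zoned check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of an 8-entry operation enum. */
enum demo_req_op {
	DEMO_OP_READ,		/* assumed, not shown in the hunk */
	DEMO_OP_WRITE,		/* assumed, not shown in the hunk */
	DEMO_OP_DISCARD,	/* assumed, not shown in the hunk */
	DEMO_OP_SECURE_ERASE,
	DEMO_OP_WRITE_SAME,
	DEMO_OP_FLUSH,
	DEMO_OP_ZONE_REPORT,
	DEMO_OP_ZONE_RESET,
	DEMO_OP_COUNT		/* number of operations, not an operation */
};

#define DEMO_REQ_OP_BITS 3

/* 3 bits encode values 0..7, so at most 8 operations fit. */
_Static_assert(DEMO_OP_COUNT <= (1 << DEMO_REQ_OP_BITS),
	       "operation codes must fit in DEMO_REQ_OP_BITS");

/* Return true when the device can service the given operation. */
static bool demo_op_supported(enum demo_req_op op, bool has_write_same,
			      bool is_zoned)
{
	switch (op) {
	case DEMO_OP_WRITE_SAME:
		if (!has_write_same)
			return false;
		break;
	case DEMO_OP_ZONE_REPORT:
	case DEMO_OP_ZONE_RESET:
		if (!is_zoned)
			return false;
		break;
	default:
		break;
	}
	return true;
}
```

Adding a ninth operation to this enum would make the static assertion fail at compile time, which is the constraint the commit message refers to.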
[PATCH v7 0/7] ZBC / Zoned block device support
This series introduces support for zoned block devices. It integrates
earlier submissions by Hannes Reinecke and Shaun Tancheff. Compared to the
previous series version, the code was significantly simplified by limiting
support to zoned devices satisfying the following conditions:
1) All zones of the device are the same size, with the exception of an
   eventual last smaller runt zone.
2) For host-managed disks, reads must be unrestricted (read commands do not
   fail due to zone or write pointer alignment constraints).
Zoned disks that do not satisfy these 2 conditions are ignored.

These 2 conditions allowed dropping the zone information cache implemented
in the previous version. This simplifies the code and also reduces the
memory consumption at run time. Support for zoned devices now only requires
one bit per zone (less than 8KB in total). This bit field is used to
write-lock zones and prevent the concurrent execution of multiple write
commands in the same zone. This avoids write ordering problems at dispatch
time, for both the simple queue and scsi-mq settings.

The new operations introduced to support zone manipulation were reduced to
only the two main ZBC/ZAC defined commands: REPORT ZONES (REQ_OP_ZONE_REPORT)
and RESET WRITE POINTER (REQ_OP_ZONE_RESET). This brings the total number of
operations defined to 8, which fits in the 3 bits (REQ_OP_BITS) reserved for
operation code in bio->bi_opf and req->cmd_flags.

Most of the ZBC specific code is kept out of sd.c and implemented in the new
file sd_zbc.c. Similarly, at the block layer, most of the zoned block device
code is implemented in the new blk-zoned.c.

For host-managed zoned block devices, the sequential write constraint of
write pointer zones is exposed to the user. Users of the disk (applications,
file systems or device mappers) must sequentially write to zones. This means
that for raw block device accesses from applications, buffered writes are
unreliable and direct I/Os must be used (or buffered writes with O_SYNC).

Access to zone manipulation operations is also provided to applications
through a set of new ioctls. This allows applications operating on raw block
devices (e.g. mkfs.xxx) to discover a device zone layout and manipulate zone
state.

Changes from v6:
* Fixed problems with zone write locking:
  - Wrong sdkp->zone_wlock bitmap allocation size
  - Incorrect (reversed condition) test of lock state with test_and_set_bit
  - Potential error in sd_setup_read_write_cmnd could leave a zone locked
    without the locking write command being executed

Changes from v5:
* Rebased on Jens' for-4.9/block branch (v5 is based on next-20160928)

Changes from v4:
* Changed interface of sd_zbc_setup_read_write

Changes from v3:
* Fixed several typos and tabs/spaces
* Added description of zoned and chunk_sectors queue attributes in
  Documentation/ABI/testing/sysfs-block
* Fixed sd_read_capacity call in sd.c to avoid missing information on
  the first pass of a disk scan
* Fixed scsi_disk zone related field to use logical block size unit instead
  of 512B sector unit.

Changes from v2:
* Use kcalloc to allocate zone information array for ioctl
* Export GPL the functions blkdev_report_zones and blkdev_reset_zones
* Shuffled uapi definitions from patch 7 into patch 5

Damien Le Moal (1):
  block: Add 'zoned' queue limit

Hannes Reinecke (4):
  blk-sysfs: Add 'chunk_sectors' to sysfs attributes
  block: update chunk_sectors in blk_stack_limits()
  block: Implement support for zoned block devices
  sd: Implement support for ZBC devices

Shaun Tancheff (2):
  block: Define zoned block device operations
  blk-zoned: implement ioctls

 Documentation/ABI/testing/sysfs-block |  29 ++
 block/Kconfig                         |   8 +
 block/Makefile                        |   2 +-
 block/blk-core.c                      |   4 +
 block/blk-settings.c                  |   5 +
 block/blk-sysfs.c                     |  29 ++
 block/blk-zoned.c                     | 350 +++
 block/ioctl.c                         |   4 +
 drivers/scsi/Makefile                 |   1 +
 drivers/scsi/sd.c                     | 148 ++--
 drivers/scsi/sd.h                     |  70
 drivers/scsi/sd_zbc.c                 | 638 ++
 include/linux/blk_types.h             |   2 +
 include/linux/blkdev.h                |  99 ++
 include/scsi/scsi_proto.h             |  17 +
 include/uapi/linux/Kbuild             |   1 +
 include/uapi/linux/blkzoned.h         | 143
 include/uapi/linux/fs.h               |   4 +
 18 files changed, 1518 insertions(+), 36 deletions(-)
 create mode 100644 block/blk-zoned.c
 create mode 100644 drivers/scsi/sd_zbc.c
 create mode 100644 include/uapi/linux/blkzoned.h

-- 
2.7.4
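
The one-bit-per-zone write lock described in this cover letter, and the direction of the test-and-set check mentioned in the v6 changelog, can be sketched in user space with C11 atomics. This is an illustration under stated assumptions, not the sd_zbc.c code: the demo_* names are hypothetical, atomic_fetch_or() stands in for the kernel's test_and_set_bit(), and the sizing example (an 8 TiB disk with 256 MiB zones, hence 32768 zones) is an assumed figure chosen to show why the bitmap stays well under 8KB.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed for a bitmap holding one bit per zone, rounded up to words. */
static size_t demo_bitmap_bytes(size_t nr_zones)
{
	size_t words = (nr_zones + DEMO_BITS_PER_WORD - 1) / DEMO_BITS_PER_WORD;

	return words * sizeof(unsigned long);
}

/*
 * Try to write-lock a zone. Returns true only if the bit was clear before
 * the atomic OR, i.e. this caller is the one that set it. A bit that was
 * already set means another write owns the zone, so the caller must back off.
 */
static bool demo_zone_trylock(atomic_ulong *bitmap, size_t zone)
{
	unsigned long mask = 1UL << (zone % DEMO_BITS_PER_WORD);
	unsigned long old;

	old = atomic_fetch_or(&bitmap[zone / DEMO_BITS_PER_WORD], mask);
	return (old & mask) == 0;
}

/* Release the zone so that the next write command can be dispatched. */
static void demo_zone_unlock(atomic_ulong *bitmap, size_t zone)
{
	unsigned long mask = 1UL << (zone % DEMO_BITS_PER_WORD);

	atomic_fetch_and(&bitmap[zone / DEMO_BITS_PER_WORD], ~mask);
}
```

For the assumed 32768-zone disk, demo_bitmap_bytes() gives 4096 bytes. Reversing the return condition of demo_zone_trylock() would let a second writer into an already locked zone, which is the kind of inverted test the v6 changelog says was fixed.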
[PATCH] scsi: ufshcd: fix possible unclocked register access
A vendor-specific setup_clocks callback may require the clocks managed by
the ufshcd driver to be ON. If the vendor-specific setup_clocks callback is
called while the required clocks are turned off, it could result in
unclocked register access.

To prevent possible unclocked register access, this change makes sure that
the required clocks are enabled whenever the vendor-specific setup_clocks
callback is called.

Signed-off-by: Subhash Jadavani
---
 drivers/scsi/ufs/ufshcd.c | 16
 1 file changed, 16 insertions(+)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 05c7456..acee5a3 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5389,6 +5389,17 @@ static int __ufshcd_setup_clocks(struct ufs_hba *hba, bool on,
 	if (!head || list_empty(head))
 		goto out;
 
+	/*
+	 * vendor specific setup_clocks ops may depend on clocks managed by
+	 * this standard driver hence call the vendor specific setup_clocks
+	 * before disabling the clocks managed here.
+	 */
+	if (!on) {
+		ret = ufshcd_vops_setup_clocks(hba, on);
+		if (ret)
+			return ret;
+	}
+
 	list_for_each_entry(clki, head, list) {
 		if (!IS_ERR_OR_NULL(clki->clk)) {
 			if (skip_ref_clk && !strcmp(clki->name, "ref_clk"))
@@ -5410,6 +5421,11 @@ static int __ufshcd_setup_clocks(struct ufs_hba *hba, bool on,
 		}
 	}
 
+	/*
+	 * vendor specific setup_clocks ops may depend on clocks managed by
+	 * this standard driver hence call the vendor specific setup_clocks
+	 * after enabling the clocks managed here.
+	 */
 	ret = ufshcd_vops_setup_clocks(hba, on);
 out:
 	if (ret) {
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Re: [PATCH v2 4/7] blk-mq: Introduce blk_quiesce_queue() and blk_resume_queue()
On 10/05/2016 03:49 PM, Ming Lei wrote:
> We can use srcu read lock for BLOCKING and rcu read lock for non-BLOCKING,
> by putting *_read_lock() and *_read_unlock() into two wrappers, which
> should minimize the cost of srcu read lock & unlock and the code is still
> easy to read & verify.

Hello Ming,

The lock checking algorithms in the sparse and smatch static checkers
are unable to deal with code of the type "if (condition) (un)lock()". So
unless someone has a better proposal my preference is to use the
approach from the patch at the start of this e-mail thread.

Thanks,

Bart.
Re: [PATCH v2 4/7] blk-mq: Introduce blk_quiesce_queue() and blk_resume_queue()
On Thu, Oct 6, 2016 at 5:08 AM, Bart Van Assche wrote:
> On 10/05/2016 12:11 PM, Sagi Grimberg wrote:
>>
>> I was referring to whether we can take srcu in the submission path
>> conditional of the hctx being STOPPED?
>
>
> Hello Sagi,
>
> Regarding run-time overhead:
> * rcu_read_lock() is a no-op on CONFIG_PREEMPT_NONE kernels and is
>   translated into preempt_disable() with preemption enabled. The latter
>   function modifies a per-cpu variable.
> * Checking BLK_MQ_S_STOPPED before taking an rcu or srcu lock is only
>   safe if the BLK_MQ_S_STOPPED flag is tested in such a way that the
>   compiler is told to reread the hctx flags (READ_ONCE()) and if the
>   compiler and CPU are told not to reorder test_bit() with the
>   memory accesses in (s)rcu_read_lock(). To avoid races
>   BLK_MQ_S_STOPPED will have to be tested a second time after the lock
>   has been obtained, similar to the double-checked-locking pattern.
> * srcu_read_lock() reads a word from the srcu structure, disables
>   preemption, calls __srcu_read_lock() and re-enables preemption. The
>   latter function increments two CPU-local variables and triggers a
>   memory barrier (smp_mb()).

We can use srcu read lock for BLOCKING and rcu read lock for non-BLOCKING,
by putting *_read_lock() and *_read_unlock() into two wrappers, which
should minimize the cost of srcu read lock & unlock and the code is still
easy to read & verify.

>
> Swapping srcu_read_lock() and the BLK_MQ_S_STOPPED flag test will make the
> code more complicated. Going back to the implementation that calls
> rcu_read_lock() if .queue_rq() won't sleep will result in an implementation
> that is easier to read and to verify.

Yeah, I agree.

Thanks,
Ming Lei
Re: [PATCH v2 6/7] SRP transport: Port srp_wait_for_queuecommand() to scsi-mq
On 10/05/2016 10:38 AM, Sagi Grimberg wrote:
>> +static void srp_mq_wait_for_queuecommand(struct Scsi_Host *shost)
>> +{
>> +	struct scsi_device *sdev;
>> +	struct request_queue *q;
>> +
>> +	shost_for_each_device(sdev, shost) {
>> +		q = sdev->request_queue;
>> +
>> +		blk_mq_quiesce_queue(q);
>> +		blk_mq_resume_queue(q);
>> +	}
>> +}
>> +
>
> This *should* live in scsi_lib.c. I suspect that various drivers would
> really want this functionality.

Hello Sagi,

There are multiple direct blk_*() calls in other SCSI transport drivers. So my proposal is to wait with moving this code into scsi_lib.c until there is a second user of this code.

Bart.
Re: [PATCH v2 4/7] blk-mq: Introduce blk_quiesce_queue() and blk_resume_queue()
On 10/05/2016 12:11 PM, Sagi Grimberg wrote:
> I was referring to whether we can take srcu in the submission path
> conditional on the hctx being STOPPED?

Hello Sagi,

Regarding run-time overhead:
* rcu_read_lock() is a no-op on CONFIG_PREEMPT_NONE kernels and is
  translated into preempt_disable() with preemption enabled. The latter
  function modifies a per-cpu variable.
* Checking BLK_MQ_S_STOPPED before taking an rcu or srcu lock is only
  safe if the BLK_MQ_S_STOPPED flag is tested in such a way that the
  compiler is told to reread the hctx flags (READ_ONCE()) and if the
  compiler and CPU are told not to reorder test_bit() with the
  memory accesses in (s)rcu_read_lock(). To avoid races
  BLK_MQ_S_STOPPED will have to be tested a second time after the lock
  has been obtained, similar to the double-checked-locking pattern.
* srcu_read_lock() reads a word from the srcu structure, disables
  preemption, calls __srcu_read_lock() and re-enables preemption. The
  latter function increments two CPU-local variables and triggers a
  memory barrier (smp_mb()).

Swapping srcu_read_lock() and the BLK_MQ_S_STOPPED flag test will make the code more complicated. Going back to the implementation that calls rcu_read_lock() if .queue_rq() won't sleep will result in an implementation that is easier to read and to verify.

If I overlooked something, please let me know.

Bart.
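The double-checked test described in the second bullet above can be sketched in userspace C. This is only an illustration under loud assumptions: C11 sequentially consistent atomics stand in for READ_ONCE() and the required barriers, a plain reader counter stands in for the (s)rcu read-side critical section, and every demo_* name is hypothetical. It is not the blk-mq implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a hardware queue context. */
struct demo_hctx {
	atomic_bool stopped;    /* plays the role of BLK_MQ_S_STOPPED */
	atomic_int readers;     /* plays the role of the (s)rcu read side */
	atomic_int dispatched;  /* counts simulated .queue_rq() calls */
};

/* Submission side: test the flag, "lock", then test the flag again. */
bool demo_run_hw_queue(struct demo_hctx *h)
{
	bool ran = false;

	/* First test: cheap early exit, no read-side section entered. */
	if (atomic_load(&h->stopped))
		return false;

	atomic_fetch_add(&h->readers, 1);      /* enter read side */

	/* Second test: the flag may have been set before we entered. */
	if (!atomic_load(&h->stopped)) {
		atomic_fetch_add(&h->dispatched, 1);
		ran = true;
	}

	atomic_fetch_sub(&h->readers, 1);      /* leave read side */
	return ran;
}

/* Quiesce side: set the flag first, then wait for readers to drain. */
void demo_quiesce(struct demo_hctx *h)
{
	atomic_store(&h->stopped, true);
	while (atomic_load(&h->readers) != 0)
		;  /* busy-wait; a crude substitute for synchronize_srcu() */
}
```

Because the submitter re-tests the flag after entering the read side, and the quiescer sets the flag before waiting, a submitter that raced past the first test either sees the flag on the second test or is waited for by demo_quiesce(). The second test is exactly the extra complexity the message above argues against.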
Re: [PATCH 00/12] Fixes, cleanup and g_NCR5380_mmio/g_NCR5380 merger
On Tuesday 04 October 2016 07:40:50 Finn Thain wrote:
> This patch series has fixes for compatibility, reliability and
> performance issues and some cleanup. It also includes a new version
> of Ondrej Zary's patch that merges g_NCR5380_mmio into g_NCR5380.
>
> I've tested this patch series on a Powerbook 180. If someone would
> test some of the other platforms that would be very helpful. All
> drivers were compile-tested.
>
> (Apologies for any duplicate messages.)

The patches won't apply against:
  4.9/scsi-queue
  Linus' master with my patches applied

What tree should I try?

> Finn Thain (12):
>   scsi/g_NCR5380: Merge g_NCR5380 and g_NCR5380_mmio drivers
>   scsi/cumana_1: Remove unused cumanascsi_setup() function
>   scsi/atari_scsi: Make device register accessors re-enterant
>   scsi/ncr5380: Simplify register polling limit
>   scsi/ncr5380: Increase register polling limit
>   scsi/ncr5380: Improve hostdata struct member alignment and cache-ability
>   scsi/ncr5380: Store IO ports and addresses in host private data
>   scsi/ncr5380: Use correct types for device register accessors
>   scsi/ncr5380: Pass hostdata pointer to register polling routines
>   scsi/ncr5380: Expedite register polling
>   scsi/ncr5380: Use correct types for DMA routines
>   scsi/ncr5380: Suppress unhelpful "interrupt without IRQ bit" message
>
>  MAINTAINERS                   |   1 -
>  drivers/scsi/Kconfig          |  32 +
>  drivers/scsi/Makefile         |   1 -
>  drivers/scsi/NCR5380.c        | 137 +++-
>  drivers/scsi/NCR5380.h        |  87 +
>  drivers/scsi/arm/cumana_1.c   |  98 +++--
>  drivers/scsi/arm/oak.c        |  34 +++--
>  drivers/scsi/atari_scsi.c     |  77 ++-
>  drivers/scsi/dmx3191d.c       |  20 +--
>  drivers/scsi/g_NCR5380.c      | 290 --
>  drivers/scsi/g_NCR5380.h      |  32 +
>  drivers/scsi/g_NCR5380_mmio.c |  10 --
>  drivers/scsi/mac_scsi.c       |  83 +---
>  drivers/scsi/sun3_scsi.c      |  80 ++--
>  14 files changed, 495 insertions(+), 487 deletions(-)
>  delete mode 100644 drivers/scsi/g_NCR5380_mmio.c

-- 
Ondrej Zary
Re: [PATCH v2 4/7] blk-mq: Introduce blk_quiesce_queue() and blk_resume_queue()
>>> Hello Ming,
>>>
>>> Can you have a look at the attached patch? That patch uses an srcu read
>>> lock for all queue types, whether or not the BLK_MQ_F_BLOCKING flag has
>>> been set. Additionally, I have dropped the QUEUE_FLAG_QUIESCING flag.
>>> Just like previous versions, this patch has been tested.
>>
>> Hey Bart,
>>
>> Do we care about the synchronization of queue_rq and/or
>> blk_mq_run_hw_queue if the hctx is not stopped?
>>
>> I'm wondering if we can avoid introducing new barriers in the
>> submission path if it's not absolutely needed.
>
> Hello Sagi,
>
> I'm not sure whether the new blk_quiesce_queue() function is useful
> without stopping all hardware contexts first. In other words, in my view
> setting BLK_MQ_F_BLOCKING flag before calling blk_quiesce_queue() is
> sufficient and I don't think that a new QUEUE_FLAG_QUIESCING flag is
> necessary.

Hey Bart,

I was referring to whether we can take srcu in the submission path conditional on the hctx being STOPPED?
Re: [PATCH v2 4/7] blk-mq: Introduce blk_quiesce_queue() and blk_resume_queue()
On 10/05/2016 11:14 AM, Sagi Grimberg wrote:
>> Hello Ming,
>>
>> Can you have a look at the attached patch? That patch uses an srcu read
>> lock for all queue types, whether or not the BLK_MQ_F_BLOCKING flag has
>> been set. Additionally, I have dropped the QUEUE_FLAG_QUIESCING flag.
>> Just like previous versions, this patch has been tested.
>
> Hey Bart,
>
> Do we care about the synchronization of queue_rq and/or
> blk_mq_run_hw_queue if the hctx is not stopped?
>
> I'm wondering if we can avoid introducing new barriers in the
> submission path if it's not absolutely needed.

Hello Sagi,

I'm not sure whether the new blk_quiesce_queue() function is useful without stopping all hardware contexts first. In other words, in my view setting BLK_MQ_F_BLOCKING flag before calling blk_quiesce_queue() is sufficient and I don't think that a new QUEUE_FLAG_QUIESCING flag is necessary.

Bart.
Re: iscsi_trx going into D state
Thanks, we will apply that too. We'd like to get this stable. We'll report back on what we find with these patches.

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

On Wed, Oct 5, 2016 at 12:03 PM, Christoph Hellwig wrote:
> Hi Robert,
>
> I actually got the name wrong, the patch wasn't from Lee, but from Zhu,
> another SuSE engineer. This is the one:
>
> http://www.spinics.net/lists/target-devel/msg13463.html
Re: [PATCH v2 4/7] blk-mq: Introduce blk_quiesce_queue() and blk_resume_queue()
> Hello Ming,
>
> Can you have a look at the attached patch? That patch uses an srcu read
> lock for all queue types, whether or not the BLK_MQ_F_BLOCKING flag has
> been set. Additionally, I have dropped the QUEUE_FLAG_QUIESCING flag.
> Just like previous versions, this patch has been tested.

Hey Bart,

Do we care about the synchronization of queue_rq and/or blk_mq_run_hw_queue if the hctx is not stopped?

I'm wondering if we can avoid introducing new barriers in the submission path if it's not absolutely needed.
Re: iscsi_trx going into D state
Hi Robert,

I actually got the name wrong, the patch wasn't from Lee, but from Zhu, another SuSE engineer. This is the one:

http://www.spinics.net/lists/target-devel/msg13463.html
Re: [PATCH v2 1/7] blk-mq: Introduce blk_mq_queue_stopped()
Looks good,

Reviewed-by: Sagi Grimberg
Re: [PATCH v2 3/7] [RFC] nvme: Use BLK_MQ_S_STOPPED instead of QUEUE_FLAG_STOPPED in blk-mq code
> Make nvme_requeue_req() check BLK_MQ_S_STOPPED instead of
> QUEUE_FLAG_STOPPED. Remove the QUEUE_FLAG_STOPPED manipulations
> that became superfluous because of this change. This patch fixes
> a race condition: using queue_flag_clear_unlocked() is not safe
> if any other function that manipulates the queue flags can be
> called concurrently, e.g. blk_cleanup_queue(). Untested.

This looks good to me, but I know Keith had all sorts of creative ways to challenge this area, so I'd wait for his input...
Re: [PATCH] scsi: ufs: add support for BLKSECDISCARD
Hi SzymonX,

On 2016-10-04 05:55, Mielczarek, SzymonX wrote:
> Hi Jadavani,
>
>> Did you mean sending purge when bProvisioningType is set to 02h
>> (TPRZ = 0)? why do we want to send the purge if TPRZ is 1?
>
> By doing Purge we want to protect from die level attacks (JESD220B
> 12.2.3.3). Once Erase is enabled on partition, the Read will return
> zeros, however data can still reside in unmapped memory on flash (behind
> mapping/translation table) (12.2.2.2). We expose BLKSECDISCARD on "Erase
> enabled" partitions just to remove the possibility of die level attacks.

Now it makes sense, I wasn't expecting that this patch is to prevent die level attacks. Do you want to make that explicit in the commit text?

> Are you suggesting that this check is not required, and in any TPRZ
> (thus 02h and 03h) BLKSECDISCARD (this Purge) shall be enabled? That's
> also possible.

Yes, for BLKSECDISCARD, isn't it good to issue purge for TPRZ=0 (bProvisioningType = 3) to make sure we can't read back data?

>> We had seen purge taking few mins to complete with some of the UFS
>> device vendors.
>> Did you run any experiments to measure the time taken for purge to
>> complete?
>
> Yes, we did several experiments around Dec 2015, and the time of Purge
> operation with software overhead was varying between 100-500 seconds (!),
> with typical time approx. 350 seconds! We also consulted one vendor on
> this observation, and got response that Purge times over 1 min are
> possible, depending on flash state.

That's true. Purge time depends on flash state and it also varies a lot from vendor to vendor. Anything over a minute may not be good for user experience (especially on mobile), and the user may simply abort (phone restart) thinking that the device is stuck.

> BR,
> Szymon
>
> -----Original Message-----
> From: Pielaszkiewicz, Tomasz
> Sent: Tuesday, October 4, 2016 1:41 PM
> To: subha...@codeaurora.org; Wodkowski, PawelX; Mielczarek, SzymonX
> Cc: linux-scsi@vger.kernel.org; hun...@vger.kernel.org; Hunter, Adrian; pielaszkiew...@vger.kernel.org; ja...@vger.kernel.org; Janca, Grzegorz; linux-scsi-ow...@vger.kernel.org
> Subject: RE: [PATCH] scsi: ufs: add support for BLKSECDISCARD
>
> Hi,
>
> Adding Szymon, who took over Pawel's work.
>
> Tomek
>
> -----Original Message-----
> From: subha...@codeaurora.org [mailto:subha...@codeaurora.org]
> Sent: Tuesday, September 27, 2016 10:18 PM
> To: Wodkowski, PawelX
> Cc: linux-scsi@vger.kernel.org; hun...@vger.kernel.org; Hunter, Adrian; pielaszkiew...@vger.kernel.org; Pielaszkiewicz, Tomasz; ja...@vger.kernel.org; Janca, Grzegorz; linux-scsi-ow...@vger.kernel.org
> Subject: Re: [PATCH] scsi: ufs: add support for BLKSECDISCARD
>
> Hi Pawel,
>
> Please find some comments inline.
>
> On 2016-07-26 04:56, Pawel Wodkowski wrote:
>> Add BLKSECDISCARD feature support if LU is provisioned for TPRZ
>> (bProvisioningType = 3).
>
> Did you mean sending purge when bProvisioningType is set to 02h
> (TPRZ = 0)? why do we want to send the purge if TPRZ is 1?
>
>> To perform BLKSECDISCARD the driver issues a purge operation after
>> each discard SCSI command with the REQ_SECURE flag set, and delays
>> calling scsi_done() till the purge finishes. This operation might
>> take long, so block requests from the SCSI layer in
>> ufshcd_queuecommand() and then unblock them after the purge finishes.
>
> We had seen purge taking few mins to complete with some of the UFS
> device vendors. Did you run any experiments to measure the time taken
> for purge to complete?
>> Signed-off-by: Pawel Wodkowski
>> ---
>>  drivers/scsi/ufs/ufs.h    |  19 +
>>  drivers/scsi/ufs/ufshcd.c | 187 +-
>>  drivers/scsi/ufs/ufshcd.h |   6 ++
>>  3 files changed, 208 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h
>> index b291fa6ed2ad..2f769974fda1 100644
>> --- a/drivers/scsi/ufs/ufs.h
>> +++ b/drivers/scsi/ufs/ufs.h
>> @@ -132,12 +132,14 @@ enum flag_idn {
>>  	QUERY_FLAG_IDN_FDEVICEINIT = 0x01,
>>  	QUERY_FLAG_IDN_PWR_ON_WPE = 0x03,
>>  	QUERY_FLAG_IDN_BKOPS_EN = 0x04,
>> +	QUERY_FLAG_IDN_PURGE_EN = 0x06,
>>  };
>>
>>  /* Attribute idn for Query requests */
>>  enum attr_idn {
>>  	QUERY_ATTR_IDN_ACTIVE_ICC_LVL = 0x03,
>>  	QUERY_ATTR_IDN_BKOPS_STATUS = 0x05,
>> +	QUERY_ATTR_IDN_PURGE_STATUS = 0x06,
>>  	QUERY_ATTR_IDN_EE_CONTROL = 0x0D,
>>  	QUERY_ATTR_IDN_EE_STATUS = 0x0E,
>>  };
>> @@ -247,6 +249,13 @@ enum {
>>  	UFSHCD_AMP = 3,
>>  };
>>
>> +/* Provisioning type */
>> +enum unit_desc_param_provisioning_type {
>> +	THIN_PROVISIONING_DISABLED = 0x00,
>> +	THIN_PROVISIONING_ENABLED_TPRZ_0 = 0x02,
>> +	THIN_PROVISIONING_ENABLED_TPRZ_1 = 0x03,
>> +};
>> +
Re: iscsi_trx going into D state
We are not able to identify the patch that you mentioned from Lee. Can you give us a commit or a link to the patch?

Thanks,

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

On Tue, Oct 4, 2016 at 5:46 AM, Christoph Hellwig wrote:
> On Tue, Oct 04, 2016 at 11:11:18AM +0200, Hannes Reinecke wrote:
>> Hmm. Looking at the code it looks as we might miss some calls to
>> 'complete'. Can you try with the attached patch?
>
> That only looks slightly better than the original. What this really
> needs is a waitqueue and a wait_event on sess->nconn. Although
> that will need a bit more refactoring around that code. There also
> are a few more obvious issues around it, e.g. iscsit_close_connection
> needs to use atomic_dec_and_test on sess->nconn instead of having
> separate atomic_dec and atomic_read calls, and a lot of the 0 or 1
> atomic_ts in this code should be replaced with atomic bitops.
>
> Btw, there also was a fix from Lee in this area that added a missing
> wakeup, make sure your tree already has that.
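To make the atomic_dec_and_test remark in the quoted text concrete, here is a minimal userspace sketch, assuming C11 atomics as a stand-in for the kernel's atomic_t helpers. The struct and the demo_* functions are hypothetical and are not taken from the iSCSI target code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a session with a connection counter. */
struct demo_session {
	atomic_int nconn;
};

/*
 * Racy variant: decrement and read are two separate operations.
 * If two threads drop the last two connections at the same time,
 * both decrements can land before either read, so both callers
 * see zero and both believe they closed the last connection.
 */
bool demo_put_conn_racy(struct demo_session *s)
{
	atomic_fetch_sub(&s->nconn, 1);
	return atomic_load(&s->nconn) == 0;
}

/*
 * Single read-modify-write, analogous to atomic_dec_and_test():
 * atomic_fetch_sub() returns the previous value, so exactly one
 * caller observes the 1 -> 0 transition.
 */
bool demo_put_conn(struct demo_session *s)
{
	int prev = atomic_fetch_sub(&s->nconn, 1);

	assert(prev > 0);  /* dropping a connection that was never counted */
	return prev == 1;
}
```

With demo_put_conn() only the caller that returns true should run the "last connection gone" path, for example waking a waiter on the session.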
Re: [PATCH v2 7/7] [RFC] nvme: Fix a race condition
> Avoid that nvme_queue_rq() is still running when nvme_stop_queues()
> returns. Untested.
>
> Signed-off-by: Bart Van Assche
> Cc: Keith Busch
> Cc: Christoph Hellwig
> Cc: Sagi Grimberg

Bart, this looks really good! It possibly fixes an issue I've been chasing with fabrics a while ago. I'll take it for testing but you can add my:

Reviewed-by: Sagi Grimberg
Re: [PATCH v2 6/7] SRP transport: Port srp_wait_for_queuecommand() to scsi-mq
> +static void srp_mq_wait_for_queuecommand(struct Scsi_Host *shost)
> +{
> +	struct scsi_device *sdev;
> +	struct request_queue *q;
> +
> +	shost_for_each_device(sdev, shost) {
> +		q = sdev->request_queue;
> +
> +		blk_mq_quiesce_queue(q);
> +		blk_mq_resume_queue(q);
> +	}
> +}
> +

This *should* live in scsi_lib.c. I suspect that various drivers would really want this functionality.

Thoughts?
Re: [PATCH v2 4/7] blk-mq: Introduce blk_quiesce_queue() and blk_resume_queue()
On Wed, Oct 5, 2016 at 10:46 PM, Bart Van Assche wrote:
> On 10/04/16 21:32, Ming Lei wrote:
>> On Wed, Oct 5, 2016 at 12:16 PM, Bart Van Assche wrote:
>>> On 10/01/16 15:56, Ming Lei wrote:
>>>> If we just call the rcu/srcu read lock (or the mutex) around
>>>> .queue_rq(), the above code needn't be duplicated any more.
>>>
>>> Can you have a look at the attached patch? That patch uses an srcu read
>>> lock for all queue types, whether or not the BLK_MQ_F_BLOCKING flag has
>>> been set.
>>
>> That is much cleaner now.
>>
>>> Additionally, I have dropped the QUEUE_FLAG_QUIESCING flag. Just like
>>> previous versions, this patch has been tested.
>>
>> I think the QUEUE_FLAG_QUIESCING flag is still needed because we
>> have to set this flag to prevent newly arriving .queue_rq() calls from
>> being run, and synchronize_srcu() won't wait for completion of those at
>> all (see section 'Update-Side Primitives' in [1]).
>>
>> [1] https://lwn.net/Articles/202847/
>
> Hello Ming,
>
> How about using the existing flag BLK_MQ_S_STOPPED instead of introducing a
> new QUEUE_FLAG_QUIESCING flag? From the comment above blk_mq_quiesce_queue()

That looks fine, and we need to stop direct issue first after hw queue becomes BLK_MQ_S_STOPPED.

> in the patch that was attached to my previous e-mail: "Additionally, it is
> not prevented that new queue_rq() calls occur unless the queue has been
> stopped first."
>
> Thanks,
>
> Bart.

-- 
Ming Lei
Re: BUG and Oops while trying to issue a discard to LVM on RAID1 md
On 5 October 2016 at 16:04, Sitsofe Wheelerwrote: > On 4 October 2016 at 07:20, Sitsofe Wheeler wrote: >> On 4 October 2016 at 07:17, Sitsofe Wheeler wrote: >>> While trying to do a discard inside an ESXi 6 VM to an LVM device atop >>> an md RAID1 device composed of two SATA SSDs passed up as a raw disk >>> mappings through a PVSCSI controller, this BUG followed by an Oops was >>> hit: >>> >>> [ 86.902888] [ cut here ] >>> [ 86.904600] kernel BUG at arch/x86/kernel/pci-nommu.c:66! (sent that a bit too soon) On a 4.8.0 kernel the problem seems to have shifted a bit but still results in a lock up: [ 26.208152] [ cut here ] [ 26.208935] kernel BUG at ./include/linux/scatterlist.h:90! [ 26.209799] invalid opcode: [#1] SMP [ 26.210454] Modules linked in: vmw_vsock_vmci_transport vsock sb_edac edac_core intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel raid1 intel_rapl_perf ppdev vmw_balloon pcspkr joydev vmxnet3 acpi_cpufreq tpm_tis tpm_tis_core tpm vmw_vmci fjes shpchp parport_pc parport i2c_piix4 dm_multipath vmwgfx drm_kms_helper ttm drm crc32c_intel serio_raw vmw_pvscsi ata_generic pata_acpi [ 26.216797] CPU: 0 PID: 220 Comm: kworker/0:1H Not tainted 4.8.0-1.vanilla.knurd.1.fc24.x86_64 #1 [ 26.218191] Hardware name: VMware, Inc. 
VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014 [ 26.219861] Workqueue: kblockd blk_delay_work [ 26.220570] task: 9608bf30 task.stack: 9608b9d9 [ 26.221505] RIP: 0010:[] [] blk_rq_map_sg+0x317/0x560 [ 26.222812] RSP: 0018:9608b9d93b78 EFLAGS: 00010002 [ 26.223650] RAX: 002e RBX: 0200 RCX: 9608bb71bd00 [ 26.224766] RDX: 0007fc01 RSI: 0002 RDI: 0400 [ 26.225867] RBP: 9608b9d93c00 R08: 9608bec1ca00 R09: [ 26.226992] R10: 9608bb71bd00 R11: 9608bb74d900 R12: 0200 [ 26.228085] R13: 0400 R14: R15: 9608bb71b800 [ 26.229195] FS: () GS:9608bec0() knlGS: [ 26.230509] CS: 0010 DS: ES: CR0: 80050033 [ 26.231442] CR2: 7fe4bc4ea000 CR3: 39cab000 CR4: 001406f0 [ 26.232620] Stack: [ 26.232967] 9608b9d93bd0 9d3f2f1d 9608bb71bd00 01080020 [ 26.234269] 9608bfaade60 9608bf162380 002e [ 26.235558] 04000200 80a6fe96 [ 26.236854] Call Trace: [ 26.237263] [] ? __sg_alloc_table+0x7d/0x160 [ 26.238217] [] scsi_init_sgtable+0x3d/0x70 [ 26.239148] [] scsi_init_io+0x44/0x1c0 [ 26.240013] [] sd_init_command+0x2b2/0xde0 [ 26.240970] [] ? scsi_host_alloc_command+0x4b/0xc0 [ 26.242015] [] scsi_setup_cmnd+0x101/0x160 [ 26.242962] [] scsi_prep_fn+0xf4/0x180 [ 26.243869] [] blk_peek_request+0x16e/0x2b0 [ 26.244836] [] scsi_request_fn+0x3f/0x5f0 [ 26.245756] [] __blk_run_queue+0x33/0x40 [ 26.246636] [] blk_delay_work+0x25/0x40 [ 26.247506] [] process_one_work+0x184/0x430 [ 26.248433] [] worker_thread+0x4e/0x480 [ 26.249311] [] ? process_one_work+0x430/0x430 [ 26.250265] [] ? process_one_work+0x430/0x430 [ 26.251210] [] kthread+0xd8/0xf0 [ 26.251993] [] ret_from_fork+0x1f/0x40 [ 26.252845] [] ? 
kthread_worker_fn+0x180/0x180 [ 26.253801] Code: c6 41 01 c5 41 29 c0 41 29 c4 44 39 ea 75 c9 41 83 c6 01 45 31 ed eb c0 48 8b 4c 24 10 48 8b 31 83 e6 03 a8 03 0f 84 38 ff ff ff <0f> 0b 48 8b 5c 24 20 4c 89 54 24 30 48 89 df ff 90 c0 00 00 00 [ 26.258363] RIP [] blk_rq_map_sg+0x317/0x560 [ 26.259345] RSP [ 26.259890] ---[ end trace bb376bf807673a6f ]--- [ 26.260678] BUG: unable to handle kernel paging request at 80a6fe96 [ 26.261828] IP: [] __wake_up_common+0x2b/0x80 [ 26.262785] PGD 0 [ 26.263141] Oops: [#2] SMP [ 26.263644] Modules linked in: vmw_vsock_vmci_transport vsock sb_edac edac_core intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel raid1 intel_rapl_perf ppdev vmw_balloon pcspkr joydev vmxnet3 acpi_cpufreq tpm_tis tpm_tis_core tpm vmw_vmci fjes shpchp parport_pc parport i2c_piix4 dm_multipath vmwgfx drm_kms_helper ttm drm crc32c_intel serio_raw vmw_pvscsi ata_generic pata_acpi [ 26.270080] CPU: 0 PID: 220 Comm: kworker/0:1H Tainted: G D 4.8.0-1.vanilla.knurd.1.fc24.x86_64 #1 [ 26.271661] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014 [ 26.273349] task: 9608bf30 task.stack: 9608b9d9 [ 26.274273] RIP: 0010:[] [] __wake_up_common+0x2b/0x80 [ 26.275621] RSP: 0018:9608b9d93e38 EFLAGS: 00010086 [ 26.276454] RAX: 0282 RBX: 9608b9d93f10 RCX: [ 26.277593] RDX: 80a6fe96 RSI: 0003 RDI:
Re: BUG and Oops while trying to issue a discard to LVM on RAID1 md
On 4 October 2016 at 07:20, Sitsofe Wheeler wrote:
> On 4 October 2016 at 07:17, Sitsofe Wheeler wrote:
>> While trying to do a discard inside an ESXi 6 VM to an LVM device atop
>> an md RAID1 device composed of two SATA SSDs passed up as raw disk
>> mappings through a PVSCSI controller, this BUG followed by an Oops was
>> hit:
>>
>> [ 86.902888] ------------[ cut here ]------------
>> [ 86.904600] kernel BUG at arch/x86/kernel/pci-nommu.c:66!

On a 4.8.0 kernel the problem seems to have shifted a bit:

-- 
Sitsofe | http://sucs.org/~sits/
Re: [PATCH v2 4/7] blk-mq: Introduce blk_quiesce_queue() and blk_resume_queue()
On 10/04/16 21:32, Ming Lei wrote:
> On Wed, Oct 5, 2016 at 12:16 PM, Bart Van Assche wrote:
>> On 10/01/16 15:56, Ming Lei wrote:
>>> If we just call the rcu/srcu read lock (or the mutex) around
>>> .queue_rq(), the above code needn't be duplicated any more.
>>
>> Can you have a look at the attached patch? That patch uses an srcu read
>> lock for all queue types, whether or not the BLK_MQ_F_BLOCKING flag has
>> been set.
>
> That is much cleaner now.
>
>> Additionally, I have dropped the QUEUE_FLAG_QUIESCING flag. Just like
>> previous versions, this patch has been tested.
>
> I think the QUEUE_FLAG_QUIESCING flag is still needed because we
> have to set this flag to prevent newly arriving .queue_rq() calls from
> being run, and synchronize_srcu() won't wait for completion of those at
> all (see section 'Update-Side Primitives' in [1]).
>
> [1] https://lwn.net/Articles/202847/

Hello Ming,

How about using the existing flag BLK_MQ_S_STOPPED instead of introducing a new QUEUE_FLAG_QUIESCING flag? From the comment above blk_mq_quiesce_queue() in the patch that was attached to my previous e-mail: "Additionally, it is not prevented that new queue_rq() calls occur unless the queue has been stopped first."

Thanks,

Bart.