Re: [PATCHv2 5/5] scsi: move scsi/sg.h to uapi/linux/sg.h
On 15-01-29 07:56 PM, Andy Grover wrote: This will enable user programs to have access to the most current definitions. The above description is a bit too generic. At the moment many user space programs in Linux access the SG_IO ioctl related declarations with: #include scsi/sg.h I suspect that file is maintained by the glib folks (because it does not match the kernel file but is logically equivalent). Now if this change encourages those glib folks to break the above include (same applies to '#include scsi/sg.h') then that is a serious regression. So is this delayed regression likely to happen?? Do we need to speak to the glib folks? What is the appropriate include for a user space program to fetch sg.h in its new (proposed) location? For example my utilities currently use: #include linux/bsg.h to fetch the bsg header but need some autotools magic to not break in the absence of that header (e.g. prior to around lk 2.6.30). Finally isn't the semi-flat nature of the uapi/linux directory asking for trouble? IOW why isn't there a uapi/linux/scsi directory? # cd /usr/src/linux-3.19/include/uapi/linux/ # ls | wc 440 4404350 # ls -p | grep / | wc 24 24 204 The last command is counting directories (and really should be simpler than that). Doug Gilbert -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] sg: Remove an unused variable
The 'data_dir' variable is not used in sg_common_write(), hence remove this variable. Signed-off-by: Bart Van Assche bart.vanass...@sandisk.com Acked-by: Douglas Gilbert dgilb...@interlog.com --- drivers/scsi/sg.c | 17 + 1 file changed, 1 insertion(+), 16 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index b14f64c..a668c88 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -763,7 +763,7 @@ static int sg_common_write(Sg_fd * sfp, Sg_request * srp, unsigned char *cmnd, int timeout, int blocking) { - int k, data_dir, at_head; + int k, at_head; Sg_device *sdp = sfp-parentdp; sg_io_hdr_t *hp = srp-header; @@ -793,21 +793,6 @@ sg_common_write(Sg_fd * sfp, Sg_request * srp, return -ENODEV; } - switch (hp-dxfer_direction) { - case SG_DXFER_TO_FROM_DEV: - case SG_DXFER_FROM_DEV: - data_dir = DMA_FROM_DEVICE; - break; - case SG_DXFER_TO_DEV: - data_dir = DMA_TO_DEVICE; - break; - case SG_DXFER_UNKNOWN: - data_dir = DMA_BIDIRECTIONAL; - break; - default: - data_dir = DMA_NONE; - break; - } hp-duration = jiffies_to_msecs(jiffies); if (hp-interface_id != '\0' /* v3 (or later) interface */ (SG_FLAG_Q_AT_TAIL hp-flags)) -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] ipr: Driver version 2.6.1
Bump driver version. Signed-off-by: Brian King brk...@linux.vnet.ibm.com --- drivers/scsi/ipr.h |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff -puN drivers/scsi/ipr.h~ipr_version_2_6_1 drivers/scsi/ipr.h --- scsi-queue/drivers/scsi/ipr.h~ipr_version_2_6_1 2015-01-14 10:28:14.227130049 -0600 +++ scsi-queue-bjking1/drivers/scsi/ipr.h 2015-01-30 16:25:07.005173771 -0600 @@ -39,8 +39,8 @@ /* * Literals */ -#define IPR_DRIVER_VERSION 2.6.0 -#define IPR_DRIVER_DATE (November 16, 2012) +#define IPR_DRIVER_VERSION 2.6.1 +#define IPR_DRIVER_DATE (January 30, 2015) /* * IPR_MAX_CMD_PER_LUN: This defines the maximum number of outstanding _ -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: General protection fault in iscsi_rx_thread_pre_handler
On Fri, 2015-01-23 at 09:30 +0800, Gavin Guo wrote: Hi Nicholas, On Fri, Jan 23, 2015 at 1:35 AM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: On Thu, 2015-01-22 at 23:56 +0800, Gavin Guo wrote: Hi Nicolas, On Thu, Jan 22, 2015 at 5:50 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: Hi Gavin, On Thu, 2015-01-22 at 06:38 +0800, Gavin Guo wrote: Hi all, The general protection fault screenshot is attached. Summary: The kernel is Ubuntu-3.13.0-39.66. I've done basic analysis and found the fault is in list_del of iscsi_del_ts_from_active_list. And it looks like deleting the iscsi_thread_set *ts two times. The point to delete including iscsi_get_ts_from_inactive_list, was also checked but still can't find the clue. Really appreciate if anyone can provide any idea on the bug. SNIP Thanks for your detailed analysis. A similar bug was reported off-list some months back by a person using iser-target + RoCE export on v3.12.y code. Just to confirm, your environment is using traditional iscsi-target + TCP export, right..? I am sorry that I'm not an expert of the field and already google RoCE on the internet but still don't really know what RoCE is. However, I can provide the informations. We used iscsiadm on the initiator side and lio_node and tcm_node commands to create the targets for connection. I think it should be normal iscsi-target using TCP export. Yep, that would be traditional iscsi-target + TCP export. At the time, a different set of iser-target related changes ended up avoiding this issue on his particular setup, so we thought it was likely a race triggered by login failures specific to iser-target code. There was a untested patch (included inline below) to drop the legacy active_ts_list usage all-together, but IIRC he was not able to reproduce further so the patch didn't get picked up for mainline. If your able to reliability reproduce, please try with the following patch and let us know your progress. Thanks for your time reading the mail. I'll let you know the result. Just curious, are you able to reliability reproduce this bug in a VM..? Thanks for your caring, the machine is on the customer side, I've asked and now waiting for their response. Hi Gavin, Just curious if there has been any update on this yet..? --nab -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] ipr: Reboot speed improvements
Currently when performing a reboot with an ipr adapter, the adapter gets shutdown completely, flushing all write cache, as well as performing a full hardware reset of the card during the shutdown phase of the old kernel. This ensures the adapter is in a fully quiesced state across the reboot. There are scenarios, however, such as when performing kexec, where this full adapter shutdown is not required and not desired, since it can make the reboot process take noticeably longer. This patch adds a module parameter to allow for skipping the full shutdown during reboot. Rather than performing a full adapter shutdown and reset, we simply cancel any outstanding error buffers, place the adapter into a state where it has no memory of any DMA addresses from the old kernel, then disable the device. This significantly speeds up kexec boot, particularly in configurations with multiple ipr adapters. Signed-off-by: Brian King brk...@linux.vnet.ibm.com --- drivers/scsi/ipr.c | 160 ++--- drivers/scsi/ipr.h |6 + 2 files changed, 157 insertions(+), 9 deletions(-) diff -puN drivers/scsi/ipr.c~ipr_cancel_hcams3 drivers/scsi/ipr.c --- linux/drivers/scsi/ipr.c~ipr_cancel_hcams3 2015-01-30 16:28:13.534983070 -0600 +++ linux-bjking1/drivers/scsi/ipr.c2015-01-30 16:28:13.549982974 -0600 @@ -99,6 +99,7 @@ static unsigned int ipr_debug = 0; static unsigned int ipr_max_devs = IPR_DEFAULT_SIS64_DEVS; static unsigned int ipr_dual_ioa_raid = 1; static unsigned int ipr_number_of_msix = 2; +static unsigned int ipr_fast_reboot; static DEFINE_SPINLOCK(ipr_driver_lock); /* This table describes the differences between DMA controller chips */ @@ -221,6 +222,8 @@ MODULE_PARM_DESC(max_devs, Specify the [Default= __stringify(IPR_DEFAULT_SIS64_DEVS) ]); module_param_named(number_of_msix, ipr_number_of_msix, int, 0); MODULE_PARM_DESC(number_of_msix, Specify the number of MSIX interrupts to use on capable adapters (1 - 16). (default:2)); +module_param_named(fast_reboot, ipr_fast_reboot, int, S_IRUGO | S_IWUSR); +MODULE_PARM_DESC(fast_reboot, Skip adapter shutdown during reboot. Set to 1 to enable. (default: 0)); MODULE_LICENSE(GPL); MODULE_VERSION(IPR_DRIVER_VERSION); @@ -1462,7 +1465,8 @@ static void ipr_process_ccn(struct ipr_c list_add_tail(ipr_cmd-queue, ipr_cmd-hrrq-hrrq_free_q); if (ioasc) { - if (ioasc != IPR_IOASC_IOA_WAS_RESET) + if (ioasc != IPR_IOASC_IOA_WAS_RESET + ioasc != IPR_IOASC_ABORTED_CMD_TERM_BY_HOST) dev_err(ioa_cfg-pdev-dev, Host RCB failed with IOASC: 0x%08X\n, ioasc); @@ -2566,7 +2570,8 @@ static void ipr_process_error(struct ipr ipr_handle_log_data(ioa_cfg, hostrcb); if (fd_ioasc == IPR_IOASC_NR_IOA_RESET_REQUIRED) ipr_initiate_ioa_reset(ioa_cfg, IPR_SHUTDOWN_ABBREV); - } else if (ioasc != IPR_IOASC_IOA_WAS_RESET) { + } else if (ioasc != IPR_IOASC_IOA_WAS_RESET + ioasc != IPR_IOASC_ABORTED_CMD_TERM_BY_HOST) { dev_err(ioa_cfg-pdev-dev, Host RCB failed with IOASC: 0x%08X\n, ioasc); } @@ -5379,9 +5384,6 @@ static irqreturn_t ipr_handle_other_inte if (int_reg IPR_PCII_IOA_TRANS_TO_OPER) { /* Mask the interrupt */ writel(IPR_PCII_IOA_TRANS_TO_OPER, ioa_cfg-regs.set_interrupt_mask_reg); - - /* Clear the interrupt */ - writel(IPR_PCII_IOA_TRANS_TO_OPER, ioa_cfg-regs.clr_interrupt_reg); int_reg = readl(ioa_cfg-regs.sense_interrupt_reg); list_del(ioa_cfg-reset_cmd-queue); @@ -8479,6 +8481,122 @@ static int ipr_reset_alert(struct ipr_cm } /** + * ipr_reset_quiesce_done - Complete IOA disconnect + * @ipr_cmd: ipr command struct + * + * Description: Freeze the adapter to complete quiesce processing + * + * Return value: + * IPR_RC_JOB_CONTINUE + **/ +static int ipr_reset_quiesce_done(struct ipr_cmnd *ipr_cmd) +{ + struct ipr_ioa_cfg *ioa_cfg = ipr_cmd-ioa_cfg; + + ENTER; + ipr_cmd-job_step = ipr_ioa_bringdown_done; + ipr_mask_and_clear_interrupts(ioa_cfg, ~IPR_PCII_IOA_TRANS_TO_OPER); + LEAVE; + return IPR_RC_JOB_CONTINUE; +} + +/** + * ipr_reset_cancel_hcam_done - Check for outstanding commands + * @ipr_cmd: ipr command struct + * + * Description: Ensure nothing is outstanding to the IOA and + * proceed with IOA disconnect. Otherwise reset the IOA. + * + * Return value: + * IPR_RC_JOB_RETURN / IPR_RC_JOB_CONTINUE + **/ +static int ipr_reset_cancel_hcam_done(struct ipr_cmnd *ipr_cmd) +{ + struct ipr_ioa_cfg *ioa_cfg = ipr_cmd-ioa_cfg; + struct ipr_cmnd *loop_cmd; + struct ipr_hrr_queue *hrrq; + int rc = IPR_RC_JOB_CONTINUE; + int count = 0; + + ENTER; + ipr_cmd-job_step
Re: [PATCHv2 5/5] scsi: move scsi/sg.h to uapi/linux/sg.h
On 01/30/2015 09:28 AM, Douglas Gilbert wrote: At the moment many user space programs in Linux access the SG_IO ioctl related declarations with: #include scsi/sg.h I suspect that file is maintained by the glib folks (because it does not match the kernel file but is logically equivalent). Now if this change encourages those glib folks to break the above include (same applies to '#include scsi/sg.h') then that is a serious regression. This is avoided by putting the new kernel-exported headers in linux/* instead of scsi/*. Then user programs could move to the new include over time, and the old glibc-provided one could be deprecated and eventually removed. So is this delayed regression likely to happen?? It shouldn't, given the different include path. Do we need to speak to the glib folks? Yes for phase 2 (but CC'd) to get the glibc scsi/* headers deprecated and eventually removed, but phase 1 is the kernel exporting proper uapi headers, which have just gotten feedback for v2 and I'll be cranking out a v3 shortly. What is the appropriate include for a user space program to fetch sg.h in its new (proposed) location? #include linux/sg.h For example my utilities currently use: #include linux/bsg.h to fetch the bsg header but need some autotools magic to not break in the absence of that header (e.g. prior to around lk 2.6.30). CCing dhowells; something similar would be needed for sg.h I imagine. To me it looks like scsi/{scsi, scsi_ioctl, sg}.h should maybe have been included in the scripted uapi changes (607ca46). Finally isn't the semi-flat nature of the uapi/linux directory asking for trouble? IOW why isn't there a uapi/linux/scsi directory? # cd /usr/src/linux-3.19/include/uapi/linux/ # ls | wc 440 4404350 # ls -p | grep / | wc 24 24 204 The last command is counting directories (and really should be simpler than that). I could see #include linux/scsi/sg.h as reasonable but I am just a worker bee on a quest to not have to define modern scsi opcodes in my userspace code. Regards -- Andy -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Process REQ_PREEMPT requests correctly
On 01/30/2015 08:34 AM, Bart Van Assche wrote: On 01/30/15 03:06, Mike Christie wrote: I think I figured this out. I think we want to change the scsi_prep_state_check check instead of each driver/class. It looks like for the SDEV_QUIESCE state we want to allow REQ_PREEMPT commands. James would know best, but I think SPI needs that ability. For SDEV_BLOCK/SDEV_CREATED_BLOCK, it looks like drivers/classes that use that state do not want any commands to be queued at that time, because the transport is normally down. Hello Mike, Is something like the patch below what you had in mind ? Bart. [PATCH] Defer processing of REQ_PREEMPT requests for blocked devices SCSI transport drivers and SCSI LLDs block a SCSI device if the transport layer is not operational. This means that in this state no requests should be processed, even if the REQ_PREEMPT flag has been set. This patch avoids that a rescan shortly after a cable pull sporadically triggers the following kernel oops: BUG: unable to handle kernel paging request at c9001a6bc084 IP: [a04e08f2] mlx4_ib_post_send+0xd2/0xb30 [mlx4_ib] Process rescan-scsi-bus (pid: 9241, threadinfo 88053484a000, task 880534aae100) Call Trace: [a0718135] srp_post_send+0x65/0x70 [ib_srp] [a071b9df] srp_queuecommand+0x1cf/0x3e0 [ib_srp] [a0001ff1] scsi_dispatch_cmd+0x101/0x280 [scsi_mod] [a0009ad1] scsi_request_fn+0x411/0x4d0 [scsi_mod] [81223b37] __blk_run_queue+0x27/0x30 [8122a8d2] blk_execute_rq_nowait+0x82/0x110 [8122a9c2] blk_execute_rq+0x62/0xf0 [a000b0e8] scsi_execute+0xe8/0x190 [scsi_mod] [a000b2f3] scsi_execute_req+0xa3/0x130 [scsi_mod] [a000c1aa] scsi_probe_lun+0x17a/0x450 [scsi_mod] [a000ce86] scsi_probe_and_add_lun+0x156/0x480 [scsi_mod] [a000dc2f] __scsi_scan_target+0xdf/0x1f0 [scsi_mod] [a000dfa3] scsi_scan_host_selected+0x183/0x1c0 [scsi_mod] [a000edfb] scsi_scan+0xdb/0xe0 [scsi_mod] [a000ee13] store_scan+0x13/0x20 [scsi_mod] [811c8d9b] sysfs_write_file+0xcb/0x160 [811589de] vfs_write+0xce/0x140 [81158b53] sys_write+0x53/0xa0 [81464592] system_call_fastpath+0x16/0x1b [7f611c9d9300] 0x7f611c9d92ff --- drivers/scsi/scsi_lib.c | 4 +++- include/linux/blk_types.h | 4 +++- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 17bb541..7129701 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1311,9 +1311,11 @@ scsi_prep_state_check(struct scsi_device *sdev, struct request *req) rejecting I/O to dead device\n); ret = BLKPREP_KILL; break; - case SDEV_QUIESCE: case SDEV_BLOCK: case SDEV_CREATED_BLOCK: + ret = BLKPREP_DEFER; + break; + case SDEV_QUIESCE: /* * If the devices is blocked we defer normal commands. */ diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index c294e3e..a1b25e3 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -181,7 +181,9 @@ enum rq_flag_bits { __REQ_ELVPRIV, /* elevator private data attached */ __REQ_FAILED, /* set if the request failed */ __REQ_QUIET,/* don't worry about errors */ - __REQ_PREEMPT, /* set for ide_preempt requests */ + __REQ_PREEMPT, /* set for ide_preempt requests and also +for requests for which the SCSI quiesce +state must be ignored. */ __REQ_ALLOCED, /* request came from our alloc pool */ __REQ_COPY_USER,/* contains copies of user pages */ __REQ_FLUSH_SEQ,/* request for flush sequence */ Yeah. Did some basic tests and it works for me with iscsi and fc. Will try to do some real testing over the weekend. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: block layer copying user io vectors
This function does something that seems rather strange. On line 859, a for loop determines the number of pages needed for the copying of the user data to kernel space. Then the memory is allocated (line 886 bio_kmalloc()). Then, strangely, on line 895, there is this conditional: This is because the function can also be used with preallocated pages, a feature only used by the sg and tape drivers. Make sure your user memory is 4k aligned, and you should be able to avoid the copy entirely (1). Where is this 4k alignment being enforced? When sg_start_req calls to blk_rq_map_user_iov, the only check for alignment is that the data buffers are 4-byte aligned (q-dma_alignment == 3). I have verified that they are. (1) except that the sg driver disables the direct mapping of user pages when using readv/writev. I can't really see why and it should be fixable by just removign that condition from the if in sg_start_req. I am investigating this now. Alternatively use the SG_IO ioctl directly on the disk device node, which neither has the read/writev limitation, nor does it use a fixed upper bound preallocated page pool. Unfortunately, the write/read interface of sg is what I need to use for asynchronous reasons. -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] IB/srp: Process REQ_PREEMPT requests correctly
On 01/30/15 03:06, Mike Christie wrote: I think I figured this out. I think we want to change the scsi_prep_state_check check instead of each driver/class. It looks like for the SDEV_QUIESCE state we want to allow REQ_PREEMPT commands. James would know best, but I think SPI needs that ability. For SDEV_BLOCK/SDEV_CREATED_BLOCK, it looks like drivers/classes that use that state do not want any commands to be queued at that time, because the transport is normally down. Hello Mike, Is something like the patch below what you had in mind ? Bart. [PATCH] Defer processing of REQ_PREEMPT requests for blocked devices SCSI transport drivers and SCSI LLDs block a SCSI device if the transport layer is not operational. This means that in this state no requests should be processed, even if the REQ_PREEMPT flag has been set. This patch avoids that a rescan shortly after a cable pull sporadically triggers the following kernel oops: BUG: unable to handle kernel paging request at c9001a6bc084 IP: [a04e08f2] mlx4_ib_post_send+0xd2/0xb30 [mlx4_ib] Process rescan-scsi-bus (pid: 9241, threadinfo 88053484a000, task 880534aae100) Call Trace: [a0718135] srp_post_send+0x65/0x70 [ib_srp] [a071b9df] srp_queuecommand+0x1cf/0x3e0 [ib_srp] [a0001ff1] scsi_dispatch_cmd+0x101/0x280 [scsi_mod] [a0009ad1] scsi_request_fn+0x411/0x4d0 [scsi_mod] [81223b37] __blk_run_queue+0x27/0x30 [8122a8d2] blk_execute_rq_nowait+0x82/0x110 [8122a9c2] blk_execute_rq+0x62/0xf0 [a000b0e8] scsi_execute+0xe8/0x190 [scsi_mod] [a000b2f3] scsi_execute_req+0xa3/0x130 [scsi_mod] [a000c1aa] scsi_probe_lun+0x17a/0x450 [scsi_mod] [a000ce86] scsi_probe_and_add_lun+0x156/0x480 [scsi_mod] [a000dc2f] __scsi_scan_target+0xdf/0x1f0 [scsi_mod] [a000dfa3] scsi_scan_host_selected+0x183/0x1c0 [scsi_mod] [a000edfb] scsi_scan+0xdb/0xe0 [scsi_mod] [a000ee13] store_scan+0x13/0x20 [scsi_mod] [811c8d9b] sysfs_write_file+0xcb/0x160 [811589de] vfs_write+0xce/0x140 [81158b53] sys_write+0x53/0xa0 [81464592] system_call_fastpath+0x16/0x1b [7f611c9d9300] 0x7f611c9d92ff --- drivers/scsi/scsi_lib.c | 4 +++- include/linux/blk_types.h | 4 +++- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 17bb541..7129701 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1311,9 +1311,11 @@ scsi_prep_state_check(struct scsi_device *sdev, struct request *req) rejecting I/O to dead device\n); ret = BLKPREP_KILL; break; - case SDEV_QUIESCE: case SDEV_BLOCK: case SDEV_CREATED_BLOCK: + ret = BLKPREP_DEFER; + break; + case SDEV_QUIESCE: /* * If the devices is blocked we defer normal commands. */ diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index c294e3e..a1b25e3 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -181,7 +181,9 @@ enum rq_flag_bits { __REQ_ELVPRIV, /* elevator private data attached */ __REQ_FAILED, /* set if the request failed */ __REQ_QUIET,/* don't worry about errors */ - __REQ_PREEMPT, /* set for ide_preempt requests */ + __REQ_PREEMPT, /* set for ide_preempt requests and also + for requests for which the SCSI quiesce + state must be ignored. */ __REQ_ALLOCED, /* request came from our alloc pool */ __REQ_COPY_USER,/* contains copies of user pages */ __REQ_FLUSH_SEQ,/* request for flush sequence */ -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/8] lib/iovec: Add memcpy_fromiovec_out library function
On 30/01/2015 09:12, Nicholas A. Bellinger wrote: From: Nicholas Bellinger n...@linux-iscsi.org This patch adds a new memcpy_fromiovec_out() library function which modifies the passed *iov following memcpy_fromiovec(), but also returns the next current iovec pointer via **iov_out. This is useful for vhost ANY_LAYOUT support when guests are allowed to generate incoming virtio request headers combined with subsequent SGL payloads into a single iovec. Cc: Michael S. Tsirkin m...@redhat.com Cc: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- include/linux/uio.h | 2 ++ lib/iovec.c | 27 +++ 2 files changed, 29 insertions(+) diff --git a/include/linux/uio.h b/include/linux/uio.h index 1c5e453..3e4473d 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -136,6 +136,8 @@ size_t csum_and_copy_to_iter(void *addr, size_t bytes, __wsum *csum, struct iov_ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum, struct iov_iter *i); int memcpy_fromiovec(unsigned char *kdata, struct iovec *iov, int len); +int memcpy_fromiovec_out(unsigned char *kdata, struct iovec *iov, + struct iovec **iov_out, int len); int memcpy_fromiovecend(unsigned char *kdata, const struct iovec *iov, int offset, int len); int memcpy_toiovecend(const struct iovec *v, unsigned char *kdata, diff --git a/lib/iovec.c b/lib/iovec.c index 2d99cb4..6a813dd 100644 --- a/lib/iovec.c +++ b/lib/iovec.c @@ -28,6 +28,33 @@ int memcpy_fromiovec(unsigned char *kdata, struct iovec *iov, int len) EXPORT_SYMBOL(memcpy_fromiovec); /* + * Copy iovec to kernel, saving the current iov to *iov_out. + * Returns -EFAULT on error. Perhaps document that iov is modified, zeroing everything in [iov, *iov_out) and possibly removing the front of *iov_out? Paolo + */ + +int memcpy_fromiovec_out(unsigned char *kdata, struct iovec *iov, + struct iovec **iov_out, int len) +{ + while (len 0) { + if (iov-iov_len) { + int copy = min_t(unsigned int, len, iov-iov_len); + if (copy_from_user(kdata, iov-iov_base, copy)) + return -EFAULT; + len -= copy; + kdata += copy; + iov-iov_base += copy; + iov-iov_len -= copy; + } + if (!iov-iov_len) + iov++; + } + *iov_out = iov; + + return 0; +} +EXPORT_SYMBOL(memcpy_fromiovec_out); + +/* * Copy kernel to iovec. Returns -EFAULT on error. */ -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] scsi: serialize -rescan against -remove
On 30/01/2015 02:08, Fam Zheng wrote: On Fri, 01/30 00:11, Paolo Bonzini wrote: On 29/01/2015 00:00, Christoph Hellwig wrote: Lock the device embedded in the scsi_device to protect against concurrent calls to -remove. Signed-off-by: Christoph Hellwig h...@lst.de I wonder if this makes this problem: https://lkml.org/lkml/2015/1/5/9 go away. A quick test says yes. Great, we might want to revert that patch in 3.21. Paolo -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/8] vhost/scsi: Add common vhost_scsi_queue_desc code
From: Nicholas Bellinger n...@linux-iscsi.org Move logic for typical vhost_scsi_handle_vq() - tcm_vhost_workqueue - tcm_vhost_submission_work() dispatch into vhost_scsi_queue_desc(). Can be shared by vhost_scsi_handle_vqal() code. Cc: Michael S. Tsirkin m...@redhat.com Cc: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/vhost/scsi.c | 33 +++-- 1 file changed, 19 insertions(+), 14 deletions(-) diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c index 049e603..756a893 100644 --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -963,6 +963,24 @@ vhost_scsi_send_bad_target(struct vhost_scsi *vs, pr_err(Faulted on virtio_scsi_cmd_resp\n); } +static void vhost_scsi_queue_desc(struct tcm_vhost_cmd *cmd, int desc) +{ + /* +* Save the descriptor from vhost_get_vq_desc() to be used to +* complete the virtio-scsi request in TCM callback context via +* tcm_vhost_queue_data_in() and tcm_vhost_queue_status() +*/ + cmd-tvc_vq_desc = desc; + /* +* Dispatch cmd descriptor for cmwq execution in process +* context provided by tcm_vhost_workqueue. This also ensures +* cmd is executed on the same kworker CPU as this vhost +* thread to gain positive L2 cache locality effects. +*/ + INIT_WORK(cmd-work, tcm_vhost_submission_work); + queue_work(tcm_vhost_workqueue, cmd-work); +} + static void vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq) { @@ -1191,20 +1209,7 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq) continue; } } - /* -* Save the descriptor from vhost_get_vq_desc() to be used to -* complete the virtio-scsi request in TCM callback context via -* tcm_vhost_queue_data_in() and tcm_vhost_queue_status() -*/ - cmd-tvc_vq_desc = head; - /* -* Dispatch tv_cmd descriptor for cmwq execution in process -* context provided by tcm_vhost_workqueue. This also ensures -* tv_cmd is executed on the same kworker CPU as this vhost -* thread to gain positive L2 cache locality effects.. -*/ - INIT_WORK(cmd-work, tcm_vhost_submission_work); - queue_work(tcm_vhost_workqueue, cmd-work); + vhost_scsi_queue_desc(cmd, head); } out: mutex_unlock(vq-mutex); -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/8] vhost/scsi: Fix incorrect early vhost_scsi_handle_vq failures
From: Nicholas Bellinger n...@linux-iscsi.org This patch fixes vhost_scsi_handle_vq() failure cases that result in BUG_ON() getting triggered when vhost_scsi_free_cmd() is called, and -tvc_se_cmd has not been initialized by target_submit_cmd_map_sgls(). It changes tcm_vhost_release_cmd() to use tcm_vhost_cmd-tvc_nexus for obtaining se_session pointer reference. Also, avoid calling put_page() on NULL sg-page entries in vhost_scsi_map_to_sgl() failure path. Cc: Michael S. Tsirkin m...@redhat.com Cc: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/vhost/scsi.c | 52 +--- 1 file changed, 29 insertions(+), 23 deletions(-) diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c index a03ac41..9c5ac23 100644 --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -460,7 +460,7 @@ static void tcm_vhost_release_cmd(struct se_cmd *se_cmd) { struct tcm_vhost_cmd *tv_cmd = container_of(se_cmd, struct tcm_vhost_cmd, tvc_se_cmd); - struct se_session *se_sess = se_cmd-se_sess; + struct se_session *se_sess = tv_cmd-tvc_nexus-tvn_se_sess; int i; if (tv_cmd-tvc_sgl_count) { @@ -859,9 +859,11 @@ vhost_scsi_map_iov_to_sgl(struct tcm_vhost_cmd *cmd, ret = vhost_scsi_map_to_sgl(cmd, sg, sgl_count, iov[i], cmd-tvc_upages, write); if (ret 0) { - for (i = 0; i cmd-tvc_sgl_count; i++) - put_page(sg_page(cmd-tvc_sgl[i])); - + for (i = 0; i cmd-tvc_sgl_count; i++) { + struct page *page = sg_page(cmd-tvc_sgl[i]); + if (page) + put_page(page); + } cmd-tvc_sgl_count = 0; return ret; } @@ -900,9 +902,11 @@ vhost_scsi_map_iov_to_prot(struct tcm_vhost_cmd *cmd, ret = vhost_scsi_map_to_sgl(cmd, prot_sg, prot_sgl_count, iov[i], cmd-tvc_upages, write); if (ret 0) { - for (i = 0; i cmd-tvc_prot_sgl_count; i++) - put_page(sg_page(cmd-tvc_prot_sgl[i])); - + for (i = 0; i cmd-tvc_prot_sgl_count; i++) { + struct page *page = sg_page(cmd-tvc_prot_sgl[i]); + if (page) + put_page(page); + } cmd-tvc_prot_sgl_count = 0; return ret; } @@ -1060,12 +1064,14 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq) if (unlikely(vq-iov[0].iov_len req_size)) { pr_err(Expecting virtio-scsi header: %zu, got %zu\n, req_size, vq-iov[0].iov_len); - break; + vhost_scsi_send_bad_target(vs, vq, head, out); + continue; } ret = memcpy_fromiovecend(req, vq-iov[0], 0, req_size); if (unlikely(ret)) { vq_err(vq, Faulted on virtio_scsi_cmd_req\n); - break; + vhost_scsi_send_bad_target(vs, vq, head, out); + continue; } /* virtio-scsi spec requires byte 0 of the lun to be 1 */ @@ -1096,14 +1102,16 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq) if (data_direction != DMA_TO_DEVICE) { vq_err(vq, Received non zero do_pi_niov , but wrong data_direction\n); - goto err_cmd; + vhost_scsi_send_bad_target(vs, vq, head, out); + continue; } prot_bytes = vhost32_to_cpu(vq, v_req_pi.pi_bytesout); } else if (v_req_pi.pi_bytesin) { if (data_direction != DMA_FROM_DEVICE) { vq_err(vq, Received non zero di_pi_niov , but wrong data_direction\n); - goto err_cmd; + vhost_scsi_send_bad_target(vs, vq, head, out); + continue; } prot_bytes = vhost32_to_cpu(vq, v_req_pi.pi_bytesin); } @@ -1143,7 +1151,8 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue
[PATCH 4/8] vhost/scsi: Change vhost_scsi_map_to_sgl to accept iov ptr + len
From: Nicholas Bellinger n...@linux-iscsi.org This patch changes vhost_scsi_map_to_sgl() parameters to accept virtio iovec ptr + len when determing pages_nr. This is currently done with iov_num_pages() - PAGE_ALIGN, so allow the same parameters as well. Cc: Michael S. Tsirkin m...@redhat.com Cc: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/vhost/scsi.c | 37 +++-- 1 file changed, 15 insertions(+), 22 deletions(-) diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c index 9c5ac23..049e603 100644 --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -220,10 +220,10 @@ static struct workqueue_struct *tcm_vhost_workqueue; static DEFINE_MUTEX(tcm_vhost_mutex); static LIST_HEAD(tcm_vhost_list); -static int iov_num_pages(struct iovec *iov) +static int iov_num_pages(void __user *iov_base, size_t iov_len) { - return (PAGE_ALIGN((unsigned long)iov-iov_base + iov-iov_len) - - ((unsigned long)iov-iov_base PAGE_MASK)) PAGE_SHIFT; + return (PAGE_ALIGN((unsigned long)iov_base + iov_len) - + ((unsigned long)iov_base PAGE_MASK)) PAGE_SHIFT; } static void tcm_vhost_done_inflight(struct kref *kref) @@ -777,25 +777,18 @@ vhost_scsi_get_tag(struct vhost_virtqueue *vq, struct tcm_vhost_tpg *tpg, * Returns the number of scatterlist entries used or -errno on error. */ static int -vhost_scsi_map_to_sgl(struct tcm_vhost_cmd *tv_cmd, +vhost_scsi_map_to_sgl(struct tcm_vhost_cmd *cmd, + void __user *ptr, + size_t len, struct scatterlist *sgl, - unsigned int sgl_count, - struct iovec *iov, - struct page **pages, bool write) { - unsigned int npages = 0, pages_nr, offset, nbytes; + unsigned int npages = 0, offset, nbytes; + unsigned int pages_nr = iov_num_pages(ptr, len); struct scatterlist *sg = sgl; - void __user *ptr = iov-iov_base; - size_t len = iov-iov_len; + struct page **pages = cmd-tvc_upages; int ret, i; - pages_nr = iov_num_pages(iov); - if (pages_nr sgl_count) { - pr_err(vhost_scsi_map_to_sgl() pages_nr: %u greater than - sgl_count: %u\n, pages_nr, sgl_count); - return -ENOBUFS; - } if (pages_nr TCM_VHOST_PREALLOC_UPAGES) { pr_err(vhost_scsi_map_to_sgl() pages_nr: %u greater than preallocated TCM_VHOST_PREALLOC_UPAGES: %u\n, @@ -840,7 +833,7 @@ vhost_scsi_map_iov_to_sgl(struct tcm_vhost_cmd *cmd, int ret, i; for (i = 0; i niov; i++) - sgl_count += iov_num_pages(iov[i]); + sgl_count += iov_num_pages(iov[i].iov_base, iov[i].iov_len); if (sgl_count TCM_VHOST_PREALLOC_SGLS) { pr_err(vhost_scsi_map_iov_to_sgl() sgl_count: %u greater than @@ -856,8 +849,8 @@ vhost_scsi_map_iov_to_sgl(struct tcm_vhost_cmd *cmd, pr_debug(Mapping iovec %p for %u pages\n, iov[0], sgl_count); for (i = 0; i niov; i++) { - ret = vhost_scsi_map_to_sgl(cmd, sg, sgl_count, iov[i], - cmd-tvc_upages, write); + ret = vhost_scsi_map_to_sgl(cmd, iov[i].iov_base, iov[i].iov_len, + sg, write); if (ret 0) { for (i = 0; i cmd-tvc_sgl_count; i++) { struct page *page = sg_page(cmd-tvc_sgl[i]); @@ -884,7 +877,7 @@ vhost_scsi_map_iov_to_prot(struct tcm_vhost_cmd *cmd, int ret, i; for (i = 0; i niov; i++) - prot_sgl_count += iov_num_pages(iov[i]); + prot_sgl_count += iov_num_pages(iov[i].iov_base, iov[i].iov_len); if (prot_sgl_count TCM_VHOST_PREALLOC_PROT_SGLS) { pr_err(vhost_scsi_map_iov_to_prot() sgl_count: %u greater than @@ -899,8 +892,8 @@ vhost_scsi_map_iov_to_prot(struct tcm_vhost_cmd *cmd, cmd-tvc_prot_sgl_count = prot_sgl_count; for (i = 0; i niov; i++) { - ret = vhost_scsi_map_to_sgl(cmd, prot_sg, prot_sgl_count, iov[i], - cmd-tvc_upages, write); + ret = vhost_scsi_map_to_sgl(cmd, iov[i].iov_base, iov[i].iov_len, + prot_sg, write); if (ret 0) { for (i = 0; i cmd-tvc_prot_sgl_count; i++) { struct page *page = sg_page(cmd-tvc_prot_sgl[i]); -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/8] lib/iovec: Add memcpy_fromiovec_out library function
From: Nicholas Bellinger n...@linux-iscsi.org This patch adds a new memcpy_fromiovec_out() library function which modifies the passed *iov following memcpy_fromiovec(), but also returns the next current iovec pointer via **iov_out. This is useful for vhost ANY_LAYOUT support when guests are allowed to generate incoming virtio request headers combined with subsequent SGL payloads into a single iovec. Cc: Michael S. Tsirkin m...@redhat.com Cc: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- include/linux/uio.h | 2 ++ lib/iovec.c | 27 +++ 2 files changed, 29 insertions(+) diff --git a/include/linux/uio.h b/include/linux/uio.h index 1c5e453..3e4473d 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -136,6 +136,8 @@ size_t csum_and_copy_to_iter(void *addr, size_t bytes, __wsum *csum, struct iov_ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum, struct iov_iter *i); int memcpy_fromiovec(unsigned char *kdata, struct iovec *iov, int len); +int memcpy_fromiovec_out(unsigned char *kdata, struct iovec *iov, +struct iovec **iov_out, int len); int memcpy_fromiovecend(unsigned char *kdata, const struct iovec *iov, int offset, int len); int memcpy_toiovecend(const struct iovec *v, unsigned char *kdata, diff --git a/lib/iovec.c b/lib/iovec.c index 2d99cb4..6a813dd 100644 --- a/lib/iovec.c +++ b/lib/iovec.c @@ -28,6 +28,33 @@ int memcpy_fromiovec(unsigned char *kdata, struct iovec *iov, int len) EXPORT_SYMBOL(memcpy_fromiovec); /* + * Copy iovec to kernel, saving the current iov to *iov_out. + * Returns -EFAULT on error. + */ + +int memcpy_fromiovec_out(unsigned char *kdata, struct iovec *iov, +struct iovec **iov_out, int len) +{ + while (len 0) { + if (iov-iov_len) { + int copy = min_t(unsigned int, len, iov-iov_len); + if (copy_from_user(kdata, iov-iov_base, copy)) + return -EFAULT; + len -= copy; + kdata += copy; + iov-iov_base += copy; + iov-iov_len -= copy; + } + if (!iov-iov_len) + iov++; + } + *iov_out = iov; + + return 0; +} +EXPORT_SYMBOL(memcpy_fromiovec_out); + +/* * Copy kernel to iovec. Returns -EFAULT on error. */ -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/8] vhost/scsi: Add ANY_LAYOUT support
From: Nicholas Bellinger n...@linux-iscsi.org Hi MST Paolo, The series adds initial vhost/scsi ANY_LAYOUT layout support. It assumes request/CDB and response/sense_buffer headers may span more than a single iovec using lib/iovec.c logic, along with a new addition of memcpy_fromiovec_out() to return the current re-calcuated **iov_out. This is currently done using a new vhost_scsi_handle_vqal() callback, and doesn't modify behavior of existing !ANY_LAYOUT code. It also allows virtio-scsi headers + T10_PI + Data SGL payloads to span the same iovec when pinning user-space memory via get_user_pages_fast() code. (Not tested yet) So far it's functioning against v3.19-rc1 virtio-scsi LLD code, with a layout of header - T10_PI - Data SGL payload in seperate iovecs and VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 guest feature bits. Please review. Thank you, --nab Nicholas Bellinger (8): lib/iovec: Add memcpy_fromiovec_out library function vhost/scsi: Convert completion path to use memcpy_toiovecend vhost/scsi: Fix incorrect early vhost_scsi_handle_vq failures vhost/scsi: Change vhost_scsi_map_to_sgl to accept iov ptr + len vhost/scsi: Add common vhost_scsi_queue_desc code vhost/scsi: Add ANY_LAYOUT prerequisites vhost/scsi: Add ANY_LAYOUT support vhost/scsi: Set VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits drivers/vhost/scsi.c | 512 --- include/linux/uio.h | 2 + lib/iovec.c | 27 +++ 3 files changed, 476 insertions(+), 65 deletions(-) -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/8] vhost/scsi: Convert completion path to use memcpy_toiovecend
From: Nicholas Bellinger n...@linux-iscsi.org Required for ANY_LAYOUT support when the incoming virtio-scsi response header + fixed size sense buffer payload may span more than a single iovec entry. This changes existing code to save cmd-tvc_resp_iod instead of the first single iovec base pointer from vq-iov[out]. Cc: Michael S. Tsirkin m...@redhat.com Cc: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/vhost/scsi.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c index 01c01cb..a03ac41 100644 --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -87,8 +87,8 @@ struct tcm_vhost_cmd { struct scatterlist *tvc_sgl; struct scatterlist *tvc_prot_sgl; struct page **tvc_upages; - /* Pointer to response */ - struct virtio_scsi_cmd_resp __user *tvc_resp; + /* Pointer to response header iovec */ + struct iovec *tvc_resp_iov; /* Pointer to vhost_scsi for our device */ struct vhost_scsi *tvc_vhost; /* Pointer to vhost_virtqueue for the cmd */ @@ -703,7 +703,8 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work) se_cmd-scsi_sense_length); memcpy(v_rsp.sense, cmd-tvc_sense_buf, se_cmd-scsi_sense_length); - ret = copy_to_user(cmd-tvc_resp, v_rsp, sizeof(v_rsp)); + ret = memcpy_toiovecend(cmd-tvc_resp_iov, (unsigned char *)v_rsp, + 0, sizeof(v_rsp)); if (likely(ret == 0)) { struct vhost_scsi_virtqueue *q; vhost_add_used(cmd-tvc_vq, cmd-tvc_vq_desc, 0); @@ -1159,7 +1160,7 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq) cmd-tvc_vhost = vs; cmd-tvc_vq = vq; - cmd-tvc_resp = vq-iov[out].iov_base; + cmd-tvc_resp_iov = vq-iov[out]; pr_debug(vhost_scsi got command opcode: %#02x, lun: %d\n, cmd-tvc_cdb[0], cmd-tvc_lun); -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 7/8] vhost/scsi: Add ANY_LAYOUT support
From: Nicholas Bellinger n...@linux-iscsi.org This patch adds initial ANY_LAYOUT support with a new vhost_virtqueue callback in vhost_scsi_handle_vqal(). It calculates data_direction + exp_data_len for the new tcm_vhost_cmd descriptor by walking both outgoing + incoming iovecs, assuming the layout of outgoing request header + T10_PI + Data payload comes first. It also uses memcpy_fromiovec_out() to copy leading virtio-scsi request header that may or may not include SCSI CDB, that returns a re-calculated iovec to start of T10_PI or Data SGL memory. Cc: Michael S. Tsirkin m...@redhat.com Cc: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/vhost/scsi.c | 267 +++ 1 file changed, 267 insertions(+) diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c index d2208a41..206f4ef 100644 --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -1087,6 +1087,273 @@ static void vhost_scsi_queue_desc(struct tcm_vhost_cmd *cmd, int desc) } static void +vhost_scsi_handle_vqal(struct vhost_scsi *vs, struct vhost_virtqueue *vq) +{ + struct tcm_vhost_tpg **vs_tpg; + struct virtio_scsi_cmd_req v_req; + struct virtio_scsi_cmd_req_pi v_req_pi; + struct tcm_vhost_tpg *tpg; + struct tcm_vhost_cmd *cmd; + struct iovec *iov, *iov_out, *prot_iov, *data_iov; + u64 tag; + u32 exp_data_len, data_direction; + unsigned out, in, i; + int head, ret, prot_bytes, iov_off, data_off, prot_off; + size_t req_size, rsp_size = sizeof(struct virtio_scsi_cmd_resp); + size_t out_size, in_size; + u16 lun; + u8 *target, *lunp, task_attr; + bool t10_pi = vhost_has_feature(vq, VIRTIO_SCSI_F_T10_PI); + void *req, *cdb; + + mutex_lock(vq-mutex); + /* +* We can handle the vq only after the endpoint is setup by calling the +* VHOST_SCSI_SET_ENDPOINT ioctl. +*/ + vs_tpg = vq-private_data; + if (!vs_tpg) + goto out; + + vhost_disable_notify(vs-dev, vq); + + for (;;) { + head = vhost_get_vq_desc(vq, vq-iov, +ARRAY_SIZE(vq-iov), out, in, +NULL, NULL); + pr_debug(vhost_get_vq_desc: head: %d, out: %u in: %u\n, +head, out, in); + /* On error, stop handling until the next kick. */ + if (unlikely(head 0)) + break; + /* Nothing new? Wait for eventfd to tell us they refilled. */ + if (head == vq-num) { + if (unlikely(vhost_enable_notify(vs-dev, vq))) { + vhost_disable_notify(vs-dev, vq); + continue; + } + break; + } + /* +* Setup pointers and values based upon different virtio-scsi +* request header if T10_PI is enabled in KVM guest. +*/ + if (t10_pi) { + req = v_req_pi; + req_size = sizeof(v_req_pi); + lunp = v_req_pi.lun[0]; + target = v_req_pi.lun[1]; + } else { + req = v_req; + req_size = sizeof(v_req); + lunp = v_req.lun[0]; + target = v_req.lun[1]; + } + /* +* Determine data_direction for ANY_LAYOUT by calculating the +* total outgoing iovec sizes / incoming iovec sizes vs. +* virtio-scsi request / response headers respectively. +* +* FIXME: Not correct for BIDI operation +*/ + out_size = in_size = 0; + for (i = 0; i out; i++) + out_size += vq-iov[i].iov_len; + for (i = out; i out + in; i++) + in_size += vq-iov[i].iov_len; + /* +* Any associated T10_PI bytes for the outgoing / incoming +* payloads are included in calculation of exp_data_len here. +*/ + if (out_size req_size) { + data_direction = DMA_TO_DEVICE; + exp_data_len = out_size - req_size; + } else if (in_size rsp_size) { + data_direction = DMA_FROM_DEVICE; + exp_data_len = in_size - rsp_size; + } else { + data_direction = DMA_NONE; + exp_data_len = 0; + } + /* +* Copy over the virtio-scsi request header, which when +* ANY_LAYOUT is enabled may span multiple iovecs, or a +* single iovec may contain both
[PATCH 8/8] vhost/scsi: Set VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits
From: Nicholas Bellinger n...@linux-iscsi.org Signal support of VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits required for virtio-scsi 1.0 spec layout requirements. Cc: Michael S. Tsirkin m...@redhat.com Cc: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/vhost/scsi.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c index 206f4ef..a773af3 100644 --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -171,7 +171,9 @@ enum { /* Note: can't set VIRTIO_F_VERSION_1 yet, since that implies ANY_LAYOUT. */ enum { VHOST_SCSI_FEATURES = VHOST_FEATURES | (1ULL VIRTIO_SCSI_F_HOTPLUG) | - (1ULL VIRTIO_SCSI_F_T10_PI) + (1ULL VIRTIO_SCSI_F_T10_PI) | + (1ULL VIRTIO_F_ANY_LAYOUT) | + (1ULL VIRTIO_F_VERSION_1) }; #define VHOST_SCSI_MAX_TARGET 256 @@ -1644,7 +1646,10 @@ static void vhost_scsi_handle_kick(struct vhost_work *work) poll.work); struct vhost_scsi *vs = container_of(vq-dev, struct vhost_scsi, dev); - vhost_scsi_handle_vq(vs, vq); + if (vhost_has_feature(vq, VIRTIO_F_ANY_LAYOUT)) + vhost_scsi_handle_vqal(vs, vq); + else + vhost_scsi_handle_vq(vs, vq); } static void vhost_scsi_flush_vq(struct vhost_scsi *vs, int index) -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 6/8] vhost/scsi: Add ANY_LAYOUT prerequisites
From: Nicholas Bellinger n...@linux-iscsi.org This patch adds ANY_LAYOUT prerequisites logic for accepting a set of protection + data payloads via iovec + offset. Also includes helpers for calcuating SGLs + invoking vhost_scsi_map_to_sgl() with a known number of iovecs. Required by ANY_LAYOUT processing when struct iovec may be offset into the first outgoing virtio-scsi request header. Cc: Michael S. Tsirkin m...@redhat.com Cc: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/vhost/scsi.c | 105 +++ 1 file changed, 105 insertions(+) diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c index 756a893..d2208a41 100644 --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -909,6 +909,111 @@ vhost_scsi_map_iov_to_prot(struct tcm_vhost_cmd *cmd, return 0; } +static int +vhost_scsi_calc_sgls(struct iovec *iov, size_t off, size_t bytes, +int *niov, int max_sgls) +{ + size_t tmp = 0; + int sgl_count = 0; + + *niov = 0; + + while (tmp bytes) { + void __user *base = iov[*niov].iov_base + off; + size_t len = iov[(*niov)++].iov_len - off; + + sgl_count += iov_num_pages(base, len); + tmp += min(len, bytes); + off = 0; + } + if (sgl_count max_sgls) { + pr_err(%s: requested sgl_count: %d exceeds pre-allocated + max_sgls: %d\n, __func__, sgl_count, max_sgls); + return -ENOBUFS; + } + return sgl_count; +} + +static int +vhost_scsi_iov_to_sgl(struct tcm_vhost_cmd *cmd, bool write, + struct iovec *iov, size_t iov_off, int niov, + struct scatterlist *sg, int sg_count) +{ + int i, ret; + + for (i = 0; i niov; i++) { + void __user *base = iov[i].iov_base + iov_off; + size_t len = iov[i].iov_len - iov_off; + + ret = vhost_scsi_map_to_sgl(cmd, base, len, sg, write); + if (ret 0) { + for (i = 0; i sg_count; i++) { + struct page *page = sg_page(sg[i]); + if (page) + put_page(page); + } + return ret; + } + sg += ret; + iov_off = 0; + } + return 0; +} + +static int +vhost_scsi_mapal(struct tcm_vhost_cmd *cmd, +size_t prot_bytes, struct iovec *prot_iov, size_t prot_off, +size_t data_bytes, struct iovec *data_iov, size_t data_off) +{ + int data_sgl_count = 0, niov, ret; + bool write = (cmd-tvc_data_direction == DMA_FROM_DEVICE); + + if (prot_bytes) { + int prot_sgl_count; + + if (!prot_iov) { + pr_err(%s: prot_iov is NULL, but prot_bytes: %zu + present\n, __func__, prot_bytes); + return -EINVAL; + } + prot_sgl_count = vhost_scsi_calc_sgls(prot_iov, prot_off, + prot_bytes, niov, + TCM_VHOST_PREALLOC_PROT_SGLS); + if (prot_sgl_count 0) + return prot_sgl_count; + + sg_init_table(cmd-tvc_prot_sgl, prot_sgl_count); + cmd-tvc_prot_sgl_count = prot_sgl_count; + pr_debug(%s prot_sg %p prot_sgl_count %u\n, __func__, +cmd-tvc_prot_sgl, cmd-tvc_prot_sgl_count); + + ret = vhost_scsi_iov_to_sgl(cmd, write, prot_iov, prot_off, + niov, cmd-tvc_prot_sgl, + prot_sgl_count); + if (ret 0) { + cmd-tvc_prot_sgl_count = 0; + return ret; + } + } + if (!data_iov) { + pr_err(%s: data_iov is NULL, but data_bytes: %zu present\n, + __func__, data_bytes); + return -EINVAL; + } + data_sgl_count = vhost_scsi_calc_sgls(data_iov, data_off, data_bytes, + niov, TCM_VHOST_PREALLOC_SGLS); + if (data_sgl_count 0) + return data_sgl_count; + + sg_init_table(cmd-tvc_sgl, data_sgl_count); + cmd-tvc_sgl_count = data_sgl_count; + pr_debug(%s data_sg %p data_sgl_count %u\n, __func__, + cmd-tvc_sgl, cmd-tvc_sgl_count); + + return vhost_scsi_iov_to_sgl(cmd, write, data_iov, data_off, +niov, cmd-tvc_sgl, data_sgl_count); +} + static void tcm_vhost_submission_work(struct work_struct *work) { struct tcm_vhost_cmd *cmd = -- 1.9.1 -- To unsubscribe from this list: send the line
Re: usb-storage URB use-after-free
On Thu, 29 Jan 2015 11:42:18 -0500 Alan Stern st...@rowland.harvard.edu wrote: On Wed, 28 Jan 2015, Joe Lawrence wrote: This one should have gone over to linux-usb. -- Joe On 01/28/2015 05:04 PM, Joe Lawrence wrote: Hello linux-usb, We've hit a USB use-after-free on Stratus HW during device removal tests. We're running fio disk I/O to a scsi disk hanging off USB when the USB controller is hotplug removed. This crash is very consistent (usually the first device pull during testing). Without I/O, it may take days to encounter. general protection fault: [#1] SMP ... CPU: 35 PID: 19626 Comm: kworker/u97:0 Tainted: PF W O-- 3.10.0-210.el7.dev02.x86_64 #1 Hardware name: Stratus ftServer 6800/G7LYY, BIOS BIOS Version 8.0:30 11/12/2014 Workqueue: scsi_tmf_872 scmd_eh_abort_handler task: 88031bd91960 ti: 880981318000 task.ti: 880981318000 RIP: 0010:[812d5e2d] [812d5e2d] kobject_put+0xd/0x60 RSP: 0018:88098131bd28 EFLAGS: 00010002 RAX: RBX: 6b6b6b6b6b6b6c0b RCX: 0001002f0009 RDX: 002f RSI: ea0015719800 RDI: 6b6b6b6b6b6b6c0b RBP: 88098131bd30 R08: 88055c6622f0 R09: 0001002f0008 R10: 88085f407a80 R11: 81450503 R12: 8804bab6d248 R13: ff98 R14: 0086 R15: 0c20 FS: () GS:88085fce() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7f2ebb5d6008 CR3: 001038dc5000 CR4: 001407e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Stack: 88098131bd40 813cd327 88098131bd50 8140a65a 88098131bd78 81416795 880414791968 880414791978 88098131bd88 814175f6 Call Trace: [813cd327] put_device+0x17/0x20 [8140a65a] usb_put_dev+0x1a/0x20 [81416795] usb_hcd_unlink_urb+0x65/0xe0 [814175f6] usb_unlink_urb+0x26/0x50 [81418e92] usb_sg_cancel+0x82/0xe0 [a0074e71] usb_stor_stop_transport+0x31/0x50 [usb_storage] [a0073b8d] command_abort+0xcd/0xe0 [usb_storage] [813f51ef] scmd_eh_abort_handler+0xbf/0x480 [8108f06b] process_one_work+0x17b/0x470 [8108fe4b] worker_thread+0x11b/0x400 [8108fd30] ? rescuer_thread+0x400/0x400 [8109722f] kthread+0xcf/0xe0 [81097160] ? kthread_create_on_node+0x140/0x140 [8161387c] ret_from_fork+0x7c/0xb0 [81097160] ? kthread_create_on_node+0x140/0x140 from another crash, we know that the URB itself has been freed: but the underlying usb_device-device is still present: It looks like usb_hcd_unlink_urb takes a reference out on the underlying device but not the URB, which in our test case consistently gets freed just before usb_hcd_unlink_urb tries to return a reference on urb-device. The analysis looks correct. Maybe the URB is a reference count short when usb_hcd_unlink_urb calls unlink1? Somewhere usb-storage missed taking out a ref? No; it's a simple use-after-free error. A tourniquet hack-patch follows (probably inside out, as the URB should stay intact while usb_hcd_unlink_urb does its thing) and has been running a little over 30 surprise device removals under I/O without incident. Any better suggestions? Please try this instead. Alan Stern Index: usb-3.19/drivers/usb/core/hcd.c === --- usb-3.19.orig/drivers/usb/core/hcd.c +++ usb-3.19/drivers/usb/core/hcd.c @@ -1618,6 +1618,7 @@ static int unlink1(struct usb_hcd *hcd, int usb_hcd_unlink_urb (struct urb *urb, int status) { struct usb_hcd *hcd; + struct usb_device *udev = urb-dev; int retval = -EIDRM; unsigned long flags; @@ -1629,20 +1630,19 @@ int usb_hcd_unlink_urb (struct urb *urb, spin_lock_irqsave(hcd_urb_unlink_lock, flags); if (atomic_read(urb-use_count) 0) { retval = 0; - usb_get_dev(urb-dev); + usb_get_dev(udev); } spin_unlock_irqrestore(hcd_urb_unlink_lock, flags); if (retval == 0) { hcd = bus_to_hcd(urb-dev-bus); retval = unlink1(hcd, urb, status); - usb_put_dev(urb-dev); + if (retval == 0) + retval = -EINPROGRESS; + else if (retval != -EIDRM retval != -EBUSY) + dev_dbg(udev-dev, hcd_unlink_urb %p fail %d\n, + urb, retval); + usb_put_dev(udev); } - - if (retval == 0) - retval
Re: [PATCH 4/8] vhost/scsi: Change vhost_scsi_map_to_sgl to accept iov ptr + len
On Fri, Jan 30, 2015 at 08:12:28AM +, Nicholas A. Bellinger wrote: From: Nicholas Bellinger n...@linux-iscsi.org This patch changes vhost_scsi_map_to_sgl() parameters to accept virtio iovec ptr + len when determing pages_nr. This is currently done with iov_num_pages() - PAGE_ALIGN, so allow the same parameters as well. Cc: Michael S. Tsirkin m...@redhat.com Cc: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Nicholas Bellinger n...@linux-iscsi.org --- drivers/vhost/scsi.c | 37 +++-- 1 file changed, 15 insertions(+), 22 deletions(-) diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c index 9c5ac23..049e603 100644 --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -220,10 +220,10 @@ static struct workqueue_struct *tcm_vhost_workqueue; static DEFINE_MUTEX(tcm_vhost_mutex); static LIST_HEAD(tcm_vhost_list); -static int iov_num_pages(struct iovec *iov) +static int iov_num_pages(void __user *iov_base, size_t iov_len) { - return (PAGE_ALIGN((unsigned long)iov-iov_base + iov-iov_len) - -((unsigned long)iov-iov_base PAGE_MASK)) PAGE_SHIFT; + return (PAGE_ALIGN((unsigned long)iov_base + iov_len) - +((unsigned long)iov_base PAGE_MASK)) PAGE_SHIFT; } static void tcm_vhost_done_inflight(struct kref *kref) @@ -777,25 +777,18 @@ vhost_scsi_get_tag(struct vhost_virtqueue *vq, struct tcm_vhost_tpg *tpg, * Returns the number of scatterlist entries used or -errno on error. */ static int -vhost_scsi_map_to_sgl(struct tcm_vhost_cmd *tv_cmd, +vhost_scsi_map_to_sgl(struct tcm_vhost_cmd *cmd, + void __user *ptr, + size_t len, struct scatterlist *sgl, - unsigned int sgl_count, - struct iovec *iov, - struct page **pages, bool write) { - unsigned int npages = 0, pages_nr, offset, nbytes; + unsigned int npages = 0, offset, nbytes; + unsigned int pages_nr = iov_num_pages(ptr, len); struct scatterlist *sg = sgl; - void __user *ptr = iov-iov_base; - size_t len = iov-iov_len; + struct page **pages = cmd-tvc_upages; int ret, i; - pages_nr = iov_num_pages(iov); - if (pages_nr sgl_count) { - pr_err(vhost_scsi_map_to_sgl() pages_nr: %u greater than - sgl_count: %u\n, pages_nr, sgl_count); - return -ENOBUFS; - } if (pages_nr TCM_VHOST_PREALLOC_UPAGES) { pr_err(vhost_scsi_map_to_sgl() pages_nr: %u greater than preallocated TCM_VHOST_PREALLOC_UPAGES: %u\n, @@ -840,7 +833,7 @@ vhost_scsi_map_iov_to_sgl(struct tcm_vhost_cmd *cmd, int ret, i; for (i = 0; i niov; i++) - sgl_count += iov_num_pages(iov[i]); + sgl_count += iov_num_pages(iov[i].iov_base, iov[i].iov_len); A helper function for this loop seems in order as well? if (sgl_count TCM_VHOST_PREALLOC_SGLS) { pr_err(vhost_scsi_map_iov_to_sgl() sgl_count: %u greater than @@ -856,8 +849,8 @@ vhost_scsi_map_iov_to_sgl(struct tcm_vhost_cmd *cmd, pr_debug(Mapping iovec %p for %u pages\n, iov[0], sgl_count); for (i = 0; i niov; i++) { - ret = vhost_scsi_map_to_sgl(cmd, sg, sgl_count, iov[i], - cmd-tvc_upages, write); + ret = vhost_scsi_map_to_sgl(cmd, iov[i].iov_base, iov[i].iov_len, + sg, write); if (ret 0) { for (i = 0; i cmd-tvc_sgl_count; i++) { struct page *page = sg_page(cmd-tvc_sgl[i]); @@ -884,7 +877,7 @@ vhost_scsi_map_iov_to_prot(struct tcm_vhost_cmd *cmd, int ret, i; for (i = 0; i niov; i++) - prot_sgl_count += iov_num_pages(iov[i]); + prot_sgl_count += iov_num_pages(iov[i].iov_base, iov[i].iov_len); if (prot_sgl_count TCM_VHOST_PREALLOC_PROT_SGLS) { pr_err(vhost_scsi_map_iov_to_prot() sgl_count: %u greater than @@ -899,8 +892,8 @@ vhost_scsi_map_iov_to_prot(struct tcm_vhost_cmd *cmd, cmd-tvc_prot_sgl_count = prot_sgl_count; for (i = 0; i niov; i++) { - ret = vhost_scsi_map_to_sgl(cmd, prot_sg, prot_sgl_count, iov[i], - cmd-tvc_upages, write); + ret = vhost_scsi_map_to_sgl(cmd, iov[i].iov_base, iov[i].iov_len, + prot_sg, write); if (ret 0) { for (i = 0; i cmd-tvc_prot_sgl_count; i++) { struct page *page = sg_page(cmd-tvc_prot_sgl[i]); -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at
[patch] [SCSI] mvumi: shift wrapping bug in mvumi_delete_internal_cmd()
m_sg-baseaddr_h is a u32. We shift it 32 bits before casting it to dma_addr_t so the cast is too late. Fixes: f0c568a478f0 ('[SCSI] mvumi: Add Marvell UMI driver') Signed-off-by: Dan Carpenter dan.carpen...@oracle.com diff --git a/drivers/scsi/mvumi.c b/drivers/scsi/mvumi.c index 3e6b866..e7f5485 100644 --- a/drivers/scsi/mvumi.c +++ b/drivers/scsi/mvumi.c @@ -322,7 +322,7 @@ static void mvumi_delete_internal_cmd(struct mvumi_hba *mhba, sgd_getsz(mhba, m_sg, size); phy_addr = (dma_addr_t) m_sg-baseaddr_l | - (dma_addr_t) ((m_sg-baseaddr_h 16) 16); + (((dma_addr_t)m_sg-baseaddr_h 16) 16); pci_free_consistent(mhba-pdev, size, cmd-data_buf, phy_addr); -- To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html