Re: [PATCH] scsi: cxlflash: fix assignment of the backend operations
On Thu, Jul 05, 2018 at 07:26:23AM +0200, Cédric Le Goater wrote: > commit cd43c221bb5e ("scsi: cxlflash: Isolate external module > dependencies") introduced the use of ifdefs to avoid compilation > errors when one of the possible backend driver, CXL or OCXL, is not > compiled. > > Unfortunately, the wrong defines are used and the backend ops are > never assigned, leading to a kernel crash in any case when the > cxlflash module is loaded. > > Signed-off-by: Cédric Le Goater Odd, I know Uma tested this but perhaps not with a clean build... Thanks for catching this Cedric! Acked-by: Matthew R. Ochs
Re: [PATCH 2/6] scsi: cxlflash: Drop unused sense buffers
On Tue, May 22, 2018 at 11:15:08AM -0700, Kees Cook wrote: > This removes the unused sense buffer in read_cap16() and write_same16(). > > Signed-off-by: Kees Cook <keesc...@chromium.org> Looks good. Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 7/7] cxlflash: Isolate external module dependencies
On Fri, May 11, 2018 at 02:06:19PM -0500, Uma Krishnan wrote: > Depending on the underlying transport, cxlflash has a dependency on either > the CXL or OCXL drivers, which are enabled via their Kconfig option. > Instead of having a module wide dependency on these config options, it is > better to isolate the object modules that are dependent on the CXL and OCXL > drivers and adjust the module dependencies accordingly. > > This commit isolates the object files that are dependent on CXL and/or > OCXL. The cxl/ocxl fops used in the core driver are tucked under an ifdef > to avoid compilation errors. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 6/7] cxlflash: Abstract hardware dependent assignments
On Fri, May 11, 2018 at 02:06:05PM -0500, Uma Krishnan wrote: > As a staging cleanup to support transport specific builds of the cxlflash > module, relocate device dependent assignments to header files. This will > avoid littering the core driver with conditional compilation logic. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 5/7] cxlflash: Add include guards to backend.h
On Fri, May 11, 2018 at 02:05:51PM -0500, Uma Krishnan wrote: > The new header file, backend.h, that was recently added is missing > the include guards. This commit adds the guards. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 3/7] cxlflash: Acquire semaphore before invoking ioctl services
On Fri, May 11, 2018 at 02:05:22PM -0500, Uma Krishnan wrote: > When a superpipe process that makes use of virtual LUNs is terminated or > killed abruptly, there is a possibility that the cxlflash driver could > hang and deprive other operations on the adapter. > > The release fop registered to be invoked on a context close, detaches > every LUN associated with the context. The underlying service to detach > the LUN assumes it has been called with the read semaphore held, and > releases the semaphore before any operation that could be time consuming. > > When invoked without holding the read semaphore, an opportunity is created > for the semaphore's count to become negative when it is temporarily > released during one of these potential lengthy operations. This negative > count results in subsequent acquisition attempts taking forever, leading > to the hang. > > To support the current design point of holding the semaphore on the > ioctl() paths, the release fop should acquire it before invoking any > ioctl services. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 2/7] cxlflash: Limit the debug logs in the IO path
On Fri, May 11, 2018 at 02:05:08PM -0500, Uma Krishnan wrote: > The kernel log can get filled with debug messages from send_cmd_ioarrin() > when dynamic debug is enabled for the cxlflash module and there is a lot > of legacy I/O traffic. > > While these messages are necessary to debug issues that involve command > tracking, the abundance of data can overwrite other useful data in the > log. The best option available is to limit the messages that should > serve most of the common use cases. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 1/7] cxlflash: Yield to active send threads
On Fri, May 11, 2018 at 02:04:46PM -0500, Uma Krishnan wrote: > The following Oops may be encountered if the device is reset, i.e. EEH > recovery, while there is heavy I/O traffic: > > 59:mon> t > [c000200db64bb680] c00809264c40 cxlflash_queuecommand+0x3b8/0x500 > [cxlflash] > [c000200db64bb770] c090d3b0 scsi_dispatch_cmd+0x130/0x2f0 > [c000200db64bb7f0] c090fdd8 scsi_request_fn+0x3c8/0x8d0 > [c000200db64bb900] c067f528 __blk_run_queue+0x68/0xb0 > [c000200db64bb930] c067ab80 __elv_add_request+0x140/0x3c0 > [c000200db64bb9b0] c068daac blk_execute_rq_nowait+0xec/0x1a0 > [c000200db64bba00] c068dbb0 blk_execute_rq+0x50/0xe0 > [c000200db64bba50] c06b2040 sg_io+0x1f0/0x520 > [c000200db64bbaf0] c06b2e94 scsi_cmd_ioctl+0x534/0x610 > [c000200db64bbc20] c0926208 sd_ioctl+0x118/0x280 > [c000200db64bbcc0] c069f7ac blkdev_ioctl+0x7fc/0xe30 > [c000200db64bbd20] c0439204 block_ioctl+0x84/0xa0 > [c000200db64bbd40] c03f8514 do_vfs_ioctl+0xd4/0xa00 > [c000200db64bbde0] c03f8f04 SyS_ioctl+0xc4/0x130 > [c000200db64bbe30] c000b184 system_call+0x58/0x6c > > When there is no room to send the I/O request, the cached room is refreshed > by reading the memory mapped command room value from the AFU. The AFU > register mapping is refreshed during a reset, creating a race condition > that can lead to the Oops above. > > During a device reset, the AFU should not be unmapped until all the active > send threads quiesce. An atomic counter, cmds_active, is currently used to > track internal AFU commands and quiesce during reset. This same counter can > also be used for the active send threads. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH v3 41/41] cxlflash: Handle spurious interrupts
On Mon, Mar 26, 2018 at 11:35:42AM -0500, Uma Krishnan wrote: > The following Oops can occur when there is heavy I/O traffic and the host > is reset by a tool such as sg_reset. > > [c000200fff3fbc90] c0081690117c process_cmd_doneq+0x104/0x500 >[cxlflash] (unreliable) > [c000200fff3fbd80] c00816901648 cxlflash_rrq_irq+0xd0/0x150 [cxlflash] > [c000200fff3fbde0] c0193130 __handle_irq_event_percpu+0xa0/0x310 > [c000200fff3fbea0] c01933d8 handle_irq_event_percpu+0x38/0x90 > [c000200fff3fbee0] c0193494 handle_irq_event+0x64/0xb0 > [c000200fff3fbf10] c0198ea0 handle_fasteoi_irq+0xc0/0x230 > [c000200fff3fbf40] c019182c generic_handle_irq+0x4c/0x70 > [c000200fff3fbf60] c001794c __do_irq+0x7c/0x1c0 > [c000200fff3fbf90] c002a390 call_do_irq+0x14/0x24 > [c000200e5828fab0] c0017b2c do_IRQ+0x9c/0x130 > [c000200e5828fb00] c0009b04 h_virt_irq_common+0x114/0x120 > > When a context is reset, the pending commands are flushed and the AFU > is notified. Before the AFU handles this request there could be command > completion interrupts queued to PHB which are yet to be delivered to the > context. In this scenario, a context could receive an interrupt for a > command that has been flushed, leading to a possible crash when the memory > for the flushed command is accessed. > > To resolve this problem, a boolean will indicate if the hardware queue is > ready to process interrupts or not. This can be evaluated in the interrupt > handler before proessing an interrupt. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH v3 40/41] cxlflash: Remove commmands from pending list on timeout
On Mon, Mar 26, 2018 at 11:35:34AM -0500, Uma Krishnan wrote: > The following Oops can occur if an internal command sent to the AFU does > not complete within the timeout: > > [c00ff101b810] c00816020d94 term_mc+0xfc/0x1b0 [cxlflash] > [c00ff101b8a0] c00816020fb0 term_afu+0x168/0x280 [cxlflash] > [c00ff101b930] c008160232ec cxlflash_pci_error_detected+0x184/0x230 >[cxlflash] > [c00ff101b9e0] c0080d95d468 cxl_vphb_error_detected+0x90/0x150[cxl] > [c00ff101ba20] c0080d95f27c cxl_pci_error_detected+0xa4/0x240 [cxl] > [c00ff101bac0] c003eaf8 eeh_report_error+0xd8/0x1b0 > [c00ff101bb20] c003d0b8 eeh_pe_dev_traverse+0x98/0x170 > [c00ff101bbb0] c003f438 eeh_handle_normal_event+0x198/0x580 > [c00ff101bc60] c003fba4 eeh_handle_event+0x2a4/0x338 > [c00ff101bd10] c00400b8 eeh_event_handler+0x1f8/0x200 > [c00ff101bdc0] c013da48 kthread+0x1a8/0x1b0 > [c00ff101be30] c000b528 ret_from_kernel_thread+0x5c/0xb4 > > When an internal command times out, the command buffer is freed while it > is still in the pending commands list of the context. This corrupts the > list and when the context is cleaned up, a crash is encountered. > > To resolve this issue, when an AFU command or TMF command times out, the > command should be deleted from the hardware queue pending command list > before freeing the buffer. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH v3 39/41] cxlflash: Synchronize reset and remove ops
On Mon, Mar 26, 2018 at 11:35:27AM -0500, Uma Krishnan wrote: > The following Oops can be encountered if a device removal or system > shutdown is initiated while an EEH recovery is in process: > > [c00ff2f479c0] c00815256f18 cxlflash_pci_slot_reset+0xa0/0x100 > [cxlflash] > [c00ff2f47a30] c0080dae22e0 cxl_pci_slot_reset+0x168/0x290 [cxl] > [c00ff2f47ae0] c003ef1c eeh_report_reset+0xec/0x170 > [c00ff2f47b20] c003d0b8 eeh_pe_dev_traverse+0x98/0x170 > [c00ff2f47bb0] c003f80c eeh_handle_normal_event+0x56c/0x580 > [c00ff2f47c60] c003fba4 eeh_handle_event+0x2a4/0x338 > [c00ff2f47d10] c00400b8 eeh_event_handler+0x1f8/0x200 > [c00ff2f47dc0] c013da48 kthread+0x1a8/0x1b0 > [c00ff2f47e30] c000b528 ret_from_kernel_thread+0x5c/0xb4 > > The remove handler frees AFU memory while the EEH recovery is in progress, > leading to a race condition. This can result in a crash if the recovery > thread tries to access this memory. > > To resolve this issue, the cxlflash remove handler will evaluate the > device state and yield to any active reset or probing threads. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Looks good! Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 00/38] cxlflash: OpenCXL transport support
On Thu, Feb 22, 2018 at 04:20:10PM -0600, Uma Krishnan wrote: > This patch series adds OpenCXL support to the cxlflash driver. With > this support, new devices using the OpenCXL transport will be supported > by the cxlflash driver along with the existing CXL devices. An effort is > made to keep this transport specific function independent of the existing > core driver that communicates with the AFU. > > The first three patches contain a minor fix and staging improvements. > > This series is intended for 4.17 and is bisectable. I'm fine with the entire series and agree that the OpenCXL terminology should be consistent throughout the kernel. So...contingent upon a V2 with OpenCXL references being replaced by OCXL: Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> > > Matthew R. Ochs (1): > cxlflash: Avoid clobbering context control register value > > Uma Krishnan (37): > cxlflash: Preserve number of interrupts for master contexts > cxlflash: Add argument identifier names > cxlflash: Introduce OpenCXL backend > cxlflash: Hardware AFU for OpenCXL > cxlflash: Read host function configuration > cxlflash: Setup function acTag range > cxlflash: Read host AFU configuration > cxlflash: Setup AFU acTag range > cxlflash: Setup AFU PASID > cxlflash: Adapter context support for OpenCXL > cxlflash: Use IDR to manage adapter contexts > cxlflash: Support adapter file descriptors for OpenCXL > cxlflash: Support adapter context discovery > cxlflash: Support image reload policy modification > cxlflash: MMIO map the AFU > cxlflash: Support starting an adapter context > cxlflash: Support process specific mappings > cxlflash: Support AFU state toggling > cxlflash: Support reading adapter VPD data > cxlflash: Setup function OpenCXL link > cxlflash: Setup OpenCXL transaction layer > cxlflash: Support process element lifecycle > cxlflash: Support AFU interrupt management > cxlflash: Support AFU interrupt mapping and registration > cxlflash: Support starting user contexts > cxlflash: Support adapter context polling > cxlflash: Support adapter context reading > cxlflash: Support adapter context mmap and release > cxlflash: Support file descriptor mapping > cxlflash: Introduce object handle fop > cxlflash: Setup LISNs for user contexts > cxlflash: Setup LISNs for master contexts > cxlflash: Update synchronous interrupt status bits > cxlflash: Introduce OCXL context state machine > cxlflash: Register for translation errors > cxlflash: Support AFU reset > cxlflash: Enable OpenCXL operations > > drivers/scsi/cxlflash/Kconfig |2 +- > drivers/scsi/cxlflash/Makefile|2 +- > drivers/scsi/cxlflash/backend.h | 50 +- > drivers/scsi/cxlflash/common.h| 10 +- > drivers/scsi/cxlflash/cxl_hw.c| 13 + > drivers/scsi/cxlflash/main.c | 55 +- > drivers/scsi/cxlflash/main.h |1 + > drivers/scsi/cxlflash/ocxl_hw.c | 1428 > + > drivers/scsi/cxlflash/ocxl_hw.h | 76 ++ > drivers/scsi/cxlflash/sislite.h | 41 +- > drivers/scsi/cxlflash/superpipe.c | 14 + > 11 files changed, 1644 insertions(+), 48 deletions(-) > create mode 100644 drivers/scsi/cxlflash/ocxl_hw.c > create mode 100644 drivers/scsi/cxlflash/ocxl_hw.h > > -- > 2.1.0 >
Re: [PATCH] scsi: cxlflash: Select SCSI_SCAN_ASYNC
On Tue, Feb 20, 2018 at 07:56:35PM +1100, Michael Ellerman wrote: > Vaibhav Jainwrites: > > > The cxlflash driver uses "Asynchronous SCSI scanning" enabled by > > CONFIG_SCSI_SCAN_ASYNC. Without this enabled the modprobe of cxlflash > > module gets hung with following backtrace: > > > > Call Trace: > > __switch_to+0x2cc/0x470 > > __schedule+0x288/0xab0 > > schedule+0x40/0xc0 > > schedule_timeout+0x254/0x4f0 > > wait_for_common+0xdc/0x260 > > flush_work+0x140/0x2a0 > > work_on_cpu+0x88/0xb0 > > pci_device_probe+0x1d0/0x220 > > driver_probe_device+0x408/0x5b0 > > __driver_attach+0x16c/0x1a0 > > bus_for_each_dev+0xb8/0x110 > > driver_attach+0x3c/0x60 > > bus_add_driver+0x1d8/0x370 > > driver_register+0x9c/0x180 > > __pci_register_driver+0x74/0xa0 > > init_cxlflash+0x158/0x1cc > > do_one_initcall+0x68/0x1e0 > > do_init_module+0x90/0x254 > > load_module+0x2f8c/0x3720 > > SyS_finit_module+0xcc/0x140 > > system_call+0x58/0x6c > > Why does it "hang"? That's kind of bizarre, I would expect either a > build or runtime failure if a feature the driver requires is missing. > It hangs due to a bug in the driver. I briefly looked at it several months back before getting distracted with other items. IIRC there was an issue with the state machine. > > diff --git a/drivers/scsi/cxlflash/Kconfig b/drivers/scsi/cxlflash/Kconfig > > index a011c5dbf214..f054c1b0fff3 100644 > > --- a/drivers/scsi/cxlflash/Kconfig > > +++ b/drivers/scsi/cxlflash/Kconfig > > @@ -6,6 +6,7 @@ config CXLFLASH > > tristate "Support for IBM CAPI Flash" > > depends on PCI && SCSI && CXL && EEH > > select IRQ_POLL > > + select SCSI_SCAN_ASYNC > > It's user configurable, so it's rude to select it. It can also be > disabled on the kernel command line, so this seems like a fragile > solution. > I think Vaibhav's intention here was to avoid the hang while the bug is still present - I believe he has encountered it several times recently. The proper solution would be to fix the bug.
Re: [PATCH 5/6] cxlflash: Adapter context init can return error
On Wed, Jan 03, 2018 at 04:55:04PM -0600, Uma Krishnan wrote: > Adapter context creation can return either NULL or an error pointer. > Updating the check condition to reflect this. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 2/6] cxlflash: Update cxl-specific arguments to generic cookie
On Wed, Jan 03, 2018 at 04:54:25PM -0600, Uma Krishnan wrote: > Convert cxl-specific pointers to generic cookies to facilitate future > enhancements. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 1/6] cxlflash: Reset command ioasc
On Thu, Jan 04, 2018 at 05:33:48PM +1100, Andrew Donnellan wrote: > On 04/01/18 09:54, Uma Krishnan wrote: > >In the event of a command failure, cxlflash returns the failure to the > >upper layers to process. After processing the error, when the command is > >queued again, the private command structure will not be zeroed and the > >ioasc could be stale. Per the SISLite specification, the AFU only sets the > >ioasc in the presence of a failure. Thus, even though the original command > >succeeds the second time, the command is considered a failure due to stale > >ioasc. This cycle repeats indefinitely and can cause a hang or IO failure. > > > >To fix the issue, clear the ioasc before queuing any command. > > > >Fixes: 479ad8e9d48c ("scsi: cxlflash: Remove zeroing of private command > >data") > >Signed-off-by: Uma Krishnan> > Should this go to stable? Not a bad idea.
Re: [PATCH 1/6] cxlflash: Reset command ioasc
On Wed, Jan 03, 2018 at 04:54:02PM -0600, Uma Krishnan wrote: > In the event of a command failure, cxlflash returns the failure to the > upper layers to process. After processing the error, when the command is > queued again, the private command structure will not be zeroed and the > ioasc could be stale. Per the SISLite specification, the AFU only sets the > ioasc in the presence of a failure. Thus, even though the original command > succeeds the second time, the command is considered a failure due to stale > ioasc. This cycle repeats indefinitely and can cause a hang or IO failure. > > To fix the issue, clear the ioasc before queuing any command. > > Fixes: 479ad8e9d48c ("scsi: cxlflash: Remove zeroing of private command > data") > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH] scsi: cxlflash: Fix an error handling path in 'cxlflash_disk_attach()'
> On Aug 15, 2017, at 3:18 PM, Christophe JAILLET > <christophe.jail...@wanadoo.fr> wrote: > > 'rc' is known to be 0 at this point. > If 'create_context()' fails, returns -ENOMEM instead of 0 which means > success. > > Signed-off-by: Christophe JAILLET <christophe.jail...@wanadoo.fr> Yep, that's a bug. Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH] cxlflash: return -EFAULT if copy_from_user() fails
> On Jun 30, 2017, at 3:01 AM, Dan Carpenter <dan.carpen...@oracle.com> wrote: > > The copy_from/to_user() functions return the number of bytes remaining > to be copied but we had intended to return -EFAULT here. > > Fixes: bc88ac47d5cb ("scsi: cxlflash: Support AFU debug") > Signed-off-by: Dan Carpenter <dan.carpen...@oracle.com> Good catch Dan! Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
[PATCH 3/3] cxlflash: Update debug prints in reset handlers
The device and host reset handler contain debug prints to help identify the entities being reset. Today these reset handlers are based on a SCSI EH design that uses a SCSI command reference as a means of identifying the target entity. As such, the debug trace includes the SCSI command pointer and associated CDB. This is not necessary as the SCSI command is simply the messenger in these scenarios. Refactor the debug prints in the host and reset handlers to only present information that is applicable given the function scope. Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> --- drivers/scsi/cxlflash/main.c | 18 +++--- 1 file changed, 3 insertions(+), 15 deletions(-) diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c index 7054e11..077f62e 100644 --- a/drivers/scsi/cxlflash/main.c +++ b/drivers/scsi/cxlflash/main.c @@ -2435,14 +2435,8 @@ static int cxlflash_eh_device_reset_handler(struct scsi_cmnd *scp) struct device *dev = >dev->dev; int rcr = 0; - dev_dbg(dev, "%s: (scp=%p) %d/%d/%d/%llu " - "cdb=(%08x-%08x-%08x-%08x)\n", __func__, scp, host->host_no, - sdev->channel, sdev->id, sdev->lun, - get_unaligned_be32(&((u32 *)scp->cmnd)[0]), - get_unaligned_be32(&((u32 *)scp->cmnd)[1]), - get_unaligned_be32(&((u32 *)scp->cmnd)[2]), - get_unaligned_be32(&((u32 *)scp->cmnd)[3])); - + dev_dbg(dev, "%s: %d/%d/%d/%llu\n", __func__, + host->host_no, sdev->channel, sdev->id, sdev->lun); retry: switch (cfg->state) { case STATE_NORMAL: @@ -2483,13 +2477,7 @@ static int cxlflash_eh_host_reset_handler(struct scsi_cmnd *scp) struct cxlflash_cfg *cfg = shost_priv(host); struct device *dev = >dev->dev; - dev_dbg(dev, "%s: (scp=%p) %d/%d/%d/%llu " - "cdb=(%08x-%08x-%08x-%08x)\n", __func__, scp, host->host_no, - scp->device->channel, scp->device->id, scp->device->lun, - get_unaligned_be32(&((u32 *)scp->cmnd)[0]), - get_unaligned_be32(&((u32 *)scp->cmnd)[1]), - get_unaligned_be32(&((u32 *)scp->cmnd)[2]), - get_unaligned_be32(&((u32 *)scp->cmnd)[3])); + dev_dbg(dev, "%s: %d\n", __func__, host->host_no); switch (cfg->state) { case STATE_NORMAL: -- 2.1.0
[PATCH 2/3] cxlflash: Update send_tmf() parameters
The current send_tmf() implementation is based on the caller providing a SCSI command reference. In reality all that is needed is a SCSI device reference as the routine uses a private command. Refactor send_tmf() to pass the private adapter configuration reference and a SCSI device reference. As a nice side effect, this will ease the burden of converting caller routines to be based solely off of a SCSI device reference. Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> --- drivers/scsi/cxlflash/main.c | 27 +-- 1 file changed, 13 insertions(+), 14 deletions(-) diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c index 455564f..7054e11 100644 --- a/drivers/scsi/cxlflash/main.c +++ b/drivers/scsi/cxlflash/main.c @@ -459,21 +459,20 @@ static u32 cmd_to_target_hwq(struct Scsi_Host *host, struct scsi_cmnd *scp, /** * send_tmf() - sends a Task Management Function (TMF) - * @afu: AFU to checkout from. - * @scp: SCSI command from stack describing target. + * @cfg: Internal structure associated with the host. + * @sdev: SCSI device destined for TMF. * @tmfcmd:TMF command to send. * * Return: * 0 on success, SCSI_MLQUEUE_HOST_BUSY or -errno on failure */ -static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd) +static int send_tmf(struct cxlflash_cfg *cfg, struct scsi_device *sdev, + u64 tmfcmd) { - struct Scsi_Host *host = scp->device->host; - struct cxlflash_cfg *cfg = shost_priv(host); + struct afu *afu = cfg->afu; struct afu_cmd *cmd = NULL; struct device *dev = >dev->dev; - int hwq_index = cmd_to_target_hwq(host, scp, afu); - struct hwq *hwq = get_hwq(afu, hwq_index); + struct hwq *hwq = get_hwq(afu, PRIMARY_HWQ); char *buf = NULL; ulong lock_flags; int rc = 0; @@ -500,12 +499,12 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd) cmd->parent = afu; cmd->cmd_tmf = true; - cmd->hwq_index = hwq_index; + cmd->hwq_index = hwq->index; cmd->rcb.ctx_id = hwq->ctx_hndl; cmd->rcb.msi = SISL_MSI_RRQ_UPDATED; - cmd->rcb.port_sel = CHAN2PORTMASK(scp->device->channel); - cmd->rcb.lun_id = lun_to_lunid(scp->device->lun); + cmd->rcb.port_sel = CHAN2PORTMASK(sdev->channel); + cmd->rcb.lun_id = lun_to_lunid(sdev->lun); cmd->rcb.req_flags = (SISL_REQ_FLAGS_PORT_LUN_ID | SISL_REQ_FLAGS_SUP_UNDERRUN | SISL_REQ_FLAGS_TMF_CMD); @@ -2430,15 +2429,15 @@ static int cxlflash_eh_abort_handler(struct scsi_cmnd *scp) static int cxlflash_eh_device_reset_handler(struct scsi_cmnd *scp) { int rc = SUCCESS; - struct Scsi_Host *host = scp->device->host; + struct scsi_device *sdev = scp->device; + struct Scsi_Host *host = sdev->host; struct cxlflash_cfg *cfg = shost_priv(host); struct device *dev = >dev->dev; - struct afu *afu = cfg->afu; int rcr = 0; dev_dbg(dev, "%s: (scp=%p) %d/%d/%d/%llu " "cdb=(%08x-%08x-%08x-%08x)\n", __func__, scp, host->host_no, - scp->device->channel, scp->device->id, scp->device->lun, + sdev->channel, sdev->id, sdev->lun, get_unaligned_be32(&((u32 *)scp->cmnd)[0]), get_unaligned_be32(&((u32 *)scp->cmnd)[1]), get_unaligned_be32(&((u32 *)scp->cmnd)[2]), @@ -2447,7 +2446,7 @@ static int cxlflash_eh_device_reset_handler(struct scsi_cmnd *scp) retry: switch (cfg->state) { case STATE_NORMAL: - rcr = send_tmf(afu, scp, TMF_LUN_RESET); + rcr = send_tmf(cfg, sdev, TMF_LUN_RESET); if (unlikely(rcr)) rc = FAILED; break; -- 2.1.0
[PATCH 1/3] cxlflash: Avoid double free of character device
The device_unregister() service used when cleaning up the character device is already responsible for the internal state associated with the device upon successful creation. As the cxlflash driver does not obtain a second reference to the character device, the explicit call to put_device() is not required and can lead to an inconsistent sysfs among other issues as the reference is no longer valid after the first put_device() is performed. Remove the unnecessary put_device() to remedy this issue. Fixes: a834a36b57d9 ("scsi: cxlflash: Create character device to provide host management interface") Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> --- drivers/scsi/cxlflash/main.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c index 7a787b6..455564f 100644 --- a/drivers/scsi/cxlflash/main.c +++ b/drivers/scsi/cxlflash/main.c @@ -923,7 +923,6 @@ static void cxlflash_put_minor(int minor) */ static void cxlflash_release_chrdev(struct cxlflash_cfg *cfg) { - put_device(cfg->chardev); device_unregister(cfg->chardev); cfg->chardev = NULL; cdev_del(>cdev); -- 2.1.0
[PATCH 0/3] cxlflash: Minor fix and EH refactoring
This small series fixes a recently injected double free and also includes two patches to refactor the host and reset handlers to ease the burden of making this driver compatible with an updated SCSI EH framework. This series is intended for 4.13 and is bisectable. Matthew R. Ochs (3): cxlflash: Avoid double free of character device cxlflash: Update send_tmf() parameters cxlflash: Update debug prints in reset handlers drivers/scsi/cxlflash/main.c | 44 +++- 1 file changed, 15 insertions(+), 29 deletions(-) -- 2.1.0
Re: [PATCH 33/47] cxlflash: use dedicated reset command in send_tmf()
Hi Hannes, We actually just reworked these paths in a series that was pulled in on Monday. While testing, I came across a bug in that series and was planning on sending out a fix for it. I'll include some patches that will ease the burden of incorporating your EH updates for cxlflash. Many thanks for leading this effort, it really helps to clean up the SCSI EH paths! =) -matt > On Jun 28, 2017, at 3:32 AM, Hannes Reineckewrote: > > From: Hannes Reinecke > > Reduce the queue depth by 1, and use this command as a dedicated > reset command send_tmf(). > > Signed-off-by: Hannes Reinecke > --- > drivers/scsi/cxlflash/common.h | 3 ++- > drivers/scsi/cxlflash/main.c | 23 --- > 2 files changed, 18 insertions(+), 8 deletions(-) > > diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h > index 6d95e8e..b2b3bdc 100644 > --- a/drivers/scsi/cxlflash/common.h > +++ b/drivers/scsi/cxlflash/common.h > @@ -54,7 +54,8 @@ > > /* Command management definitions */ > #define CXLFLASH_MAX_CMDS 256 > -#define CXLFLASH_MAX_CMDS_PER_LUN CXLFLASH_MAX_CMDS > +#define CXLFLASH_MAX_CMDS_PER_LUN CXLFLASH_MAX_CMDS - 1 > +#define CXLFLASH_RESET_CMD 255 > > /* RRQ for master issued cmds */ > #define NUM_RRQ_ENTRY CXLFLASH_MAX_CMDS > diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c > index 462e8fc..b33e3e7 100644 > --- a/drivers/scsi/cxlflash/main.c > +++ b/drivers/scsi/cxlflash/main.c > @@ -23,6 +23,7 @@ > > #include > #include > +#include > #include > > #include "main.h" > @@ -460,15 +461,16 @@ static u32 cmd_to_target_hwq(struct Scsi_Host *host, > struct scsi_cmnd *scp, > /** > * send_tmf() - sends a Task Management Function (TMF) > * @afu: AFU to checkout from. > - * @scp: SCSI command from stack describing target. > + * @sdev:SCSI device to reset. > * @tmfcmd: TMF command to send. > * > * Return: > *0 on success, SCSI_MLQUEUE_HOST_BUSY or -errno on failure > */ > -static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd) > +static int send_tmf(struct afu *afu, struct scsi_device *sdev, u64 tmfcmd) > { > - struct Scsi_Host *host = scp->device->host; > + struct scsi_cmnd *scp; > + struct Scsi_Host *host = sdev->host; > struct cxlflash_cfg *cfg = shost_priv(host); > struct afu_cmd *cmd = NULL; > struct device *dev = >dev->dev; > @@ -498,14 +500,21 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd > *scp, u64 tmfcmd) > cfg->tmf_active = true; > spin_unlock_irqrestore(>tmf_slock, lock_flags); > > + scp = scsi_host_find_tag(host, CXLFLASH_RESET_CMD); > + scp->device = sdev; > + cmd = sc_to_afucz(scp); > + hwq_index = cmd_to_target_hwq(host, scp, afu); > + hwq = get_hwq(afu, hwq_index); > + > + cmd->scp = scp; > cmd->parent = afu; > cmd->cmd_tmf = true; > cmd->hwq_index = hwq_index; > > cmd->rcb.ctx_id = hwq->ctx_hndl; > cmd->rcb.msi = SISL_MSI_RRQ_UPDATED; > - cmd->rcb.port_sel = CHAN2PORTMASK(scp->device->channel); > - cmd->rcb.lun_id = lun_to_lunid(scp->device->lun); > + cmd->rcb.port_sel = CHAN2PORTMASK(sdev->channel); > + cmd->rcb.lun_id = lun_to_lunid(sdev->lun); > cmd->rcb.req_flags = (SISL_REQ_FLAGS_PORT_LUN_ID | > SISL_REQ_FLAGS_SUP_UNDERRUN | > SISL_REQ_FLAGS_TMF_CMD); > @@ -2448,7 +2457,7 @@ static int cxlflash_eh_device_reset_handler(struct > scsi_cmnd *scp) > retry: > switch (cfg->state) { > case STATE_NORMAL: > - rcr = send_tmf(afu, scp, TMF_LUN_RESET); > + rcr = send_tmf(afu, sdev, TMF_LUN_RESET); > if (unlikely(rcr)) > rc = FAILED; > break; > @@ -3139,7 +3148,7 @@ static ssize_t mode_show(struct device *dev, > .eh_host_reset_handler = cxlflash_eh_host_reset_handler, > .change_queue_depth = cxlflash_change_queue_depth, > .cmd_per_lun = CXLFLASH_MAX_CMDS_PER_LUN, > - .can_queue = CXLFLASH_MAX_CMDS, > + .can_queue = CXLFLASH_MAX_CMDS - 1, > .cmd_size = sizeof(struct afu_cmd) + __alignof__(struct afu_cmd) - 1, > .this_id = -1, > .sg_tablesize = 1, /* No scatter gather support */ > -- > 1.8.5.6 >
Re: [PATCH 09/17] cxlflash: Create character device to provide host management interface
> On Jun 21, 2017, at 9:15 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > The cxlflash driver currently lacks host management interface. Future > devices supported by cxlflash will provide a variety of host-wide > management functions. Examples include LUN provisioning, hardware debug > support, and firmware download. > > In order to provide a way to manage the device, a character device will > be created during probe of each adapter. This device will support a set of > ioctls defined in the SISLite specification from which administrators can > manage the adapter. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 08/17] cxlflash: Add scsi command abort handler
> On Jun 21, 2017, at 9:15 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > To date, CXL flash devices do not support a single command abort operation. > Instead, the SISLite specification provides a context reset operation to > cleanup all pending commands for a given context. > > When a context reset is successful, it is guaranteed that the AFU has > aborted all currently pending I/O. This sequence is less invasive than a > device or host reset and can be executed to support scsi command abort > requests. Add eh_abort_handler callback support to process command timeouts > and abort requests. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 06/17] cxlflash: Track pending scsi commands in each hardware queue
> On Jun 21, 2017, at 9:14 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > Currently, there is no book keeping of the pending scsi commands in the > cxlflash driver. This lack of tracking in-flight requests is too > restrictive and requires a heavy-hammer reset each time an adapter error is > encountered. Additionally, it does not allow for commands to be properly > retried. > > In order to avoid this problem and to better handle error path command > cleanup, introduce a linked list for each hardware queue that tracks > pending commands. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 07/17] cxlflash: Flush pending commands in cleanup path
> On Jun 21, 2017, at 9:14 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > When the AFU is reset in an error path, pending scsi commands can be > silently dropped without completion or a formal abort. This puts the onus > on the cxlflash driver to notify mid-layer and indicating that the command > can be retried. > > Once the card has been quiesced, the hardware send queue lock is acquired > to prevent any data movement while the pending commands are processed. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 05/17] cxlflash: Handle AFU sync failures
> On Jun 21, 2017, at 9:14 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > AFU sync operations are not currently evaluated for failure. This is > acceptable for paths where there is not a dependency on the AFU being > consistent with the host. Examples include link reset events and LUN > cleanup operations. On paths where there is a dependency, such as a LUN > open, a sync failure should be acted upon. > > In the event of AFU sync failures, either log or cleanup as appropriate for > operations that are dependent on a successful sync completion. > > Update documentation to reflect behavior in the event of an AFU sync > failure. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 03/17] cxlflash: Reset hardware queue context via specified register
> On Jun 21, 2017, at 9:14 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > Per the SISLite specification, context_reset() writes 0x1 to the LSB of the > reset register. When the AFU processes this reset request, it is expected > to clear the bit after reset is complete. The current implementation simply > checks that the entire value read back is not 1, instead of masking off the > LSB and evaluating it for a change to 0. Should the AFU manipulate other > bits during the reset (reading back a value of 0xF for example), successful > completion will be prematurely indicated given the existing logic. > > Additionally, in the event that the context reset operation fails, there > does not currently exist a way to provide feedback to the initiator of the > reset. This poses a problem for the rare case that a context reset fails as > the caller will proceed on the assumption that all is well. > > To remedy these issues, refactor the context reset routine to only mask off > the LSB when evaluating for success and return status to the caller. Also > update the context reset handler parameters to pass a hardware queue > reference instead of a single command to better reflect that the entire > queue associated with the context is impacted by the reset. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 04/17] cxlflash: Schedule asynchronous reset of the host
> On Jun 21, 2017, at 9:14 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > A context reset failure indicates the AFU is in a bad state. At present, > when such a situation occurs, no further action is taken. This leaves the > adapter in an unusable state with no recoverable actions. > > To avoid this situation, context reset failures will be escalated to a host > reset operation. This will be done asynchronously to allow the acting > thread to return to the user with a failure. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 02/17] cxlflash: Update cxlflash_afu_sync() to return errno
> On Jun 21, 2017, at 9:13 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > The cxlflash_afu_sync() routine returns a negative one to indicate any kind > of failure. This makes it impossible to establish why the error occurred. > > Update the return codes to clearly indicate the failure cause to the > caller. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 01/17] cxlflash: Combine the send queue locks
> On Jun 21, 2017, at 9:13 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > Currently there are separate spin locks for the two supported I/O queueing > models. This makes it difficult to serialize with paths outside the enqueue > path. > > As a design simplification and to support serialization with enqueue > operations, move to only a single lock that is used for enqueueing > regardless of the queueing model. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH] cxlflash: Select IRQ_POLL
> On May 5, 2017, at 6:31 PM, Guenter Roeck <li...@roeck-us.net> wrote: > > The driver now uses IRQ_POLL and needs to select it to avoid the following > build error. > > ERROR: ".irq_poll_complete" [drivers/scsi/cxlflash/cxlflash.ko] undefined! > ERROR: ".irq_poll_sched" [drivers/scsi/cxlflash/cxlflash.ko] undefined! > ERROR: ".irq_poll_disable" [drivers/scsi/cxlflash/cxlflash.ko] undefined! > ERROR: ".irq_poll_init" [drivers/scsi/cxlflash/cxlflash.ko] undefined! > > Fixes: cba06e6de403 ("scsi: cxlflash: Implement IRQ polling for RRQ > processing") > Signed-off-by: Guenter Roeck <li...@roeck-us.net> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 15/17] cxlflash: Support multiple hardware queues
> On Apr 12, 2017, at 2:15 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > Introduce multiple hardware queues to improve legacy I/O path performance. > Each hardware queue is comprised of a master context and associated I/O > resources. The hardware queues are initially implemented as a static array > embedded in the AFU. This will be transitioned to a dynamic allocation in a > later series to improve the memory footprint of the driver. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Looks good! Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Re: [PATCH 10/17] cxlflash: Fence EEH during probe
> On Apr 13, 2017, at 1:27 AM, Andrew Donnellan <andrew.donnel...@au1.ibm.com> > wrote: > > On 13/04/17 05:14, Uma Krishnan wrote: >> From: "Matthew R. Ochs" <mro...@linux.vnet.ibm.com> >> >> An EEH during probe can lead to a crash as the recovery thread races >> with the probe thread. To avoid this issue, introduce new states to >> fence out EEH recovery until probe has completed. Also ensure the reset >> wait queue is flushed during device removal to avoid orphaned threads. >> >> Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> >> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> > > The use of the term "fence" in the commit message might be a bit confusing > given how the term is already used in the context of EEH :) Hi Andrew, I agree that it could be a bit misleading but think the commit message is descriptive enough to avoid confusion. If you feel strongly about it I'm happy to reword the commit. -matt
Re: [PATCH 4/4] cxlflash: Cancel scheduled workers before stopping AFU
> On Jan 6, 2017, at 8:05 AM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > When processing an AFU asynchronous interrupt, if the action results in an > operation that requires off level processing (a link reset for example), > the worker thread is scheduled. In the meantime a reset event (i.e.: EEH) > could unmap the AFU to recover. This results in an Oops when the worker > thread tries to access the AFU mapping. > > [c00f17e03b90] d7cd5978 cxlflash_worker_thread+0x268/0x550 > [c00f17e03c40] c011883c process_one_work+0x1dc/0x680 > [c00f17e03ce0] c0118e80 worker_thread+0x1a0/0x520 > [c00f17e03d80] c0126174 kthread+0xf4/0x100 > [c00f17e03e30] c000a47c ret_from_kernel_thread+0x5c/0xe0 > > In an effort to avoid this, a mapcount was introduced in > commit b45cdbaf9f7f ("cxlflash: Resolve oops in wait_port_offline") > but due to the race condition described above, this solution is incomplete. > > In order to fully resolve this problem and to simplify things, this commit > removes the mapcount solution. Instead, the scheduled worker thread is > cancelled after interrupts have been disabled and prior to the mapping > being freed. > > Fixes: b45cdbaf9f7f ("cxlflash: Resolve oops in wait_port_offline") > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Looks good. Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 04/14] cxlflash: Avoid command room violation
Uma, This looks better, thanks for reworking. -matt > On Nov 28, 2016, at 6:41 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > During test, a command room violation interrupt is occasionally seen > for the master context when the CXL flash devices are stressed. > > After studying the code, there could be gaps in the way command room > value is being cached in cxlflash. When the cached command room is zero > the thread attempting to send becomes burdened with updating the cached > value with the actual value from the AFU. Today, this is handled with an > atomic set operation of the raw value read. Following the atomic update, > the thread proceeds to send. > > This behavior is incorrect on two counts: > > - The update fails to take into account the current thread and its > consumption of one of the hardware commands. > > - The update does not take into account other threads also atomically > updating. Per design, a worker thread updates the cached value when a > send thread times out. By not protecting the update with a lock, the > cached value can be incorrectly clobbered. > > To correct these issues, the update of the cached command room has been > simplified and also protected using a spin lock which is held until the > MMIO is complete. This ensures the command room is properly consumed by > the same thread. Update of cached value also takes into account the > current thread consuming a hardware command. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 04/14] cxlflash: Avoid command room violation
Hi Uma, I do see a potential hang issue with this patch. See my comments below. -matt > On Nov 15, 2016, at 5:14 PM, Uma Krishnanwrote: > > During test, a command room violation interrupt is occasionally seen > for the master context when the CXL flash devices are stressed. > > After studying the code, there could be gaps in the way command room > value is being cached in cxlflash. When the cached command room is zero > the thread attempting to send becomes burdened with updating the cached > value with the actual value from the AFU. Today, this is handled with > an atomic set operation of the raw value read. Following the atomic > update, the thread proceeds to send. > > This behavior is incorrect on two counts: > > - The update fails to take into account the current thread and its > consumption of one of the hardware commands. > > - The update does not take into account other threads also atomically > updating. Per design, a worker thread updates the cached value when > a send thread times out. By not performing an atomic compare/exchange, > the cached value can be incorrectly clobbered. > > To correct these issues, the runtime updates of the cached command room > are updated to use atomic64_cmpxchg() and the send routine is updated to > take into account the current thread consuming a hardware command. > > Signed-off-by: Uma Krishnan > --- > drivers/scsi/cxlflash/main.c | 16 ++-- > 1 file changed, 10 insertions(+), 6 deletions(-) > > diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c > index 6d33d8c..1a32e8b 100644 > --- a/drivers/scsi/cxlflash/main.c > +++ b/drivers/scsi/cxlflash/main.c > @@ -322,9 +322,10 @@ static int send_cmd(struct afu *afu, struct afu_cmd *cmd) > if (!newval) { When this path is invoked, the current thread is consuming the last entry available entry before the room must be read again. While the change below is fine for circumstances where the hardware queue has room for more than one command, consider a scenario where the queue has room for only 1 command (the command that you just consumed via the atomic but are not really consuming with a MMIO due to the revised goto). In such a scenario this code would loop endlessly, bypassing the timeout logic completely, until the read room reflected a value greater than 1. > do { > room = readq_be(>host_map->cmd_room); > - atomic64_set(>room, room); > - if (room) > - goto write_ioarrin; > + if (room) { > + atomic64_cmpxchg(>room, 0, room); > + goto retry; > + } If you instead fully consume the entry (goto write_ioarrin - similar as it was before) and take into account the consumption when you update the cached value (i.e.: cmpxchg(..., 0, room - 1) the scenario described above will not occur. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 03/14] cxlflash: Improve context_reset() logic
> On Nov 15, 2016, at 5:14 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > Currently, the context reset routine waits for command room to > be available before sending the reset request. Per review of the > SISLite specification and clarifications from the CXL Flash AFU > designers, this wait is unnecessary. The reset request can be > sent anytime regardless of command room, so long as only a single > reset request is active at any one point in time. > > This commit simplifies the reset routine by removing the wait for > command room. Additionally it adds a debug trace to help pinpoint > hardware errors when a context reset does not complete. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/14] cxlflash: Fix crash in cxlflash_restore_luntable()
> On Nov 15, 2016, at 5:14 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > During test, the following crash was observed: > > [34538.981505] Faulting instruction address: 0xd7c9c870 > cpu 0x9: Vector: 300 (Data Access) at [c007f1e8f590] >pc: d7c9c870: cxlflash_restore_luntable+0x70/0x1d0 [cxlflash] >lr: d7c9c84c: cxlflash_restore_luntable+0x4c/0x1d0 [cxlflash] >sp: c007f1e8f810 > msr: 90019033 > dar: c0171d637438 > dsisr: 4000 > current = 0xc007f1e43f90 > paca= 0xc7b25100 softe: 0irq_happened: 0x01 >pid = 493, comm = eehd > enter ? for help > [c007f1e8f8a0] d7c940b0 init_afu+0xd60/0x1200 [cxlflash] > [c007f1e8f9a0] d7c945a8 cxlflash_pci_slot_reset+0x58/0xe0 > [cxlflash] > [c007f1e8fa20] d715f790 cxl_pci_slot_reset+0x230/0x340 [cxl] > [c007f1e8fae0] c0040dd4 eeh_report_reset+0x144/0x180 > [c007f1e8fb20] c003f708 eeh_pe_dev_traverse+0x98/0x170 > [c007f1e8fbb0] c0041618 eeh_handle_normal_event+0x328/0x410 > [c007f1e8fc30] c0041db8 eeh_handle_event+0x178/0x330 > [c007f1e8fce0] c0042118 eeh_event_handler+0x1a8/0x1b0 > [c007f1e8fd80] c011420c kthread+0xec/0x100 > [c007f1e8fe30] c000a47c ret_from_kernel_thread+0x5c/0xe0 > > When superpipe mode is disabled for a LUN, the references for the > local lun are deleted but the LUN is still identified as being present > in the LUN table. This mismatched state can result in the above crash > when the LUN table is restored during an error recovery operation. > > To fix this issue, the local LUN information structure is updated to > reflect the LUN is no longer in the LUN table once all references to > the LUN are gone. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/14] cxlflash: Set sg_tablesize to 1 instead of SG_NONE
> On Nov 15, 2016, at 5:13 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > The following Oops is encountered when blk_mq is enabled with the > cxlflash driver: > > [ 2960.817172] Oops: Kernel access of bad area, sig: 11 [#5] > [ 2960.817309] NIP __blk_mq_run_hw_queue+0x278/0x4c0 > [ 2960.817313] LR __blk_mq_run_hw_queue+0x2bc/0x4c0 > [ 2960.817314] Call Trace: > [ 2960.817320] __blk_mq_run_hw_queue+0x2bc/0x4c0 (unreliable) > [ 2960.817324] blk_mq_run_hw_queue+0xd8/0x100 > [ 2960.817329] blk_mq_insert_requests+0x14c/0x1f0 > [ 2960.817333] blk_mq_flush_plug_list+0x150/0x190 > [ 2960.817338] blk_flush_plug_list+0x11c/0x2b0 > [ 2960.817344] blk_finish_plug+0x58/0x80 > [ 2960.817348] __do_page_cache_readahead+0x1c0/0x2e0 > [ 2960.817352] force_page_cache_readahead+0x68/0xd0 > [ 2960.817356] generic_file_read_iter+0x43c/0x6a0 > [ 2960.817359] blkdev_read_iter+0x68/0xa0 > [ 2960.817361] __vfs_read+0x11c/0x180 > [ 2960.817364] vfs_read+0xa4/0x1c0 > [ 2960.817366] SyS_read+0x6c/0x110 > [ 2960.817369] system_call+0x38/0xb4 > > The SCSI blk_mq stack assumes that sg_tablesize is always a non-zero > value with scsi_mq_setup_tags() allocating tags using sg_tablesize. > The cxlflash driver currently uses SG_NONE (0) for the sg_tablesize > as the devices it supports are not capable of scatter gather. This > mismatch of values results in the Oops above. > > To resolve this issue, sg_tablesize for cxlflash can simply be set > to 1, a value which satisfies the constraints in cxlflash and the > lack of support of SG_NONE in SCSI blk_mq. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/6] cxlflash: Scan host only after the port is ready for I/O
> On Sep 2, 2016, at 3:38 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > When a port link is established, the AFU sends a 'link up' interrupt. > After the link is up, corresponding initialization steps are performed > on the card. Following that, when the card is ready for I/O, the AFU > sends 'login succeeded' interrupt. Today, cxlflash invokes > scsi_scan_host() upon receipt of both interrupts. > > SCSI commands sent to the port prior to the 'login succeeded' interrupt > will fail with 'port not available' error. This is not desirable. > Moreover, when async_scan is active for the host, subsequent scan calls > are terminated with error. Due to this, the scsi_scan_host() call > performed after 'login succeeded' interrupt could portentially return > error and the devices may not be scanned properly. > > To avoid this problem, scsi_scan_host() should be called only after the > 'login succeeded' interrupt. > > Signed-off-by: Uma Krishna <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/6] cxlflash: Remove the device cleanly in the system shutdown path
> On Sep 2, 2016, at 3:39 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > Commit 704c4b0ddc03 ("cxlflash: Shutdown notify support for CXL Flash > cards") was recently introduced to notify the AFU when a system is going > down. Due to the position of the cxlflash driver in the device stack, > cxlflash devices are _always_ removed during a reboot/shutdown. This can > lead to a crash if the cxlflash shutdown hook is invoked _after_ the > shutdown hook for the owning virtual PHB. Furthermore, the current > implementation of shutdown/remove hooks for cxlflash are not tolerant to > being invoked when the device is not enabled. This can also lead to a > crash in situations where the remove hook is invoked after the device has > been removed via the vPHBs shutdown hook. An example of this scenario > would be an EEH reset failure while a reboot/shutdown is in progress. > > To solve both problems, the shutdown hook for cxlflash is updated to > simply remove the device. This path already includes the AFU notification > and thus this solution will continue to perform the original intent. At > the same time, the remove hook is updated to protect against being > called when the device is not enabled. > > Fixes: 704c4b0ddc03 ("cxlflash: Shutdown notify support for CXL Flash > cards") > Signed-off-by: Uma Krishna <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 5/6] cxlflash: Remove adapter file descriptor cache
The adapter file descriptor was previously cached within the kernel for a given context in order to support performing a close on behalf of an application. This is no longer needed as applications are now required to perform a close on the adapter file descriptor. Inspired-by: Al Viro <v...@zeniv.linux.org.uk> Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> --- drivers/scsi/cxlflash/superpipe.c | 26 +- drivers/scsi/cxlflash/superpipe.h | 1 - 2 files changed, 13 insertions(+), 14 deletions(-) diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c index b3bb90d..c91fe6f 100644 --- a/drivers/scsi/cxlflash/superpipe.c +++ b/drivers/scsi/cxlflash/superpipe.c @@ -788,20 +788,18 @@ err: * @cfg: Internal structure associated with the host. * @ctx: Previously obtained CXL context reference. * @ctxid: Previously obtained process element associated with CXL context. - * @adap_fd: Previously obtained adapter fd associated with CXL context. * @file: Previously obtained file associated with CXL context. * @perms: User-specified permissions. */ static void init_context(struct ctx_info *ctxi, struct cxlflash_cfg *cfg, -struct cxl_context *ctx, int ctxid, int adap_fd, -struct file *file, u32 perms) +struct cxl_context *ctx, int ctxid, struct file *file, +u32 perms) { struct afu *afu = cfg->afu; ctxi->rht_perms = perms; ctxi->ctrl_map = >afu_map->ctrls[ctxid].ctrl; ctxi->ctxid = ENCODE_CTXID(ctxi, ctxid); - ctxi->lfd = adap_fd; ctxi->pid = current->tgid; /* tgid = pid */ ctxi->ctx = ctx; ctxi->cfg = cfg; @@ -1086,8 +1084,7 @@ static int cxlflash_mmap_fault(struct vm_area_struct *vma, struct vm_fault *vmf) goto err; } - dev_dbg(dev, "%s: fault(%d) for context %d\n", - __func__, ctxi->lfd, ctxid); + dev_dbg(dev, "%s: fault for context %d\n", __func__, ctxid); if (likely(!ctxi->err_recovery_active)) { vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); @@ -1162,8 +1159,7 @@ static int cxlflash_cxl_mmap(struct file *file, struct vm_area_struct *vma) goto out; } - dev_dbg(dev, "%s: mmap(%d) for context %d\n", - __func__, ctxi->lfd, ctxid); + dev_dbg(dev, "%s: mmap for context %d\n", __func__, ctxid); rc = cxl_fd_mmap(file, vma); if (likely(!rc)) { @@ -1406,7 +1402,7 @@ static int cxlflash_disk_attach(struct scsi_device *sdev, perms = SISL_RHT_PERM(attach->hdr.flags + 1); /* Context mutex is locked upon return */ - init_context(ctxi, cfg, ctx, ctxid, fd, file, perms); + init_context(ctxi, cfg, ctx, ctxid, file, perms); rc = afu_attach(cfg, ctxi); if (unlikely(rc)) { @@ -1488,12 +1484,15 @@ err: * recover_context() - recovers a context in error * @cfg: Internal structure associated with the host. * @ctxi: Context to release. + * @adap_fd: Adapter file descriptor associated with new/recovered context. * * Restablishes the state for a context-in-error. * * Return: 0 on success, -errno on failure */ -static int recover_context(struct cxlflash_cfg *cfg, struct ctx_info *ctxi) +static int recover_context(struct cxlflash_cfg *cfg, + struct ctx_info *ctxi, + int *adap_fd) { struct device *dev = >dev->dev; int rc = 0; @@ -1546,7 +1545,6 @@ static int recover_context(struct cxlflash_cfg *cfg, struct ctx_info *ctxi) * visible to user space and can't be undone safely on this thread. */ ctxi->ctxid = ENCODE_CTXID(ctxi, ctxid); - ctxi->lfd = fd; ctxi->ctx = ctx; ctxi->file = file; @@ -1563,6 +1561,7 @@ static int recover_context(struct cxlflash_cfg *cfg, struct ctx_info *ctxi) cfg->ctx_tbl[ctxid] = ctxi; mutex_unlock(>ctx_tbl_list_mutex); fd_install(fd, file); + *adap_fd = fd; out: dev_dbg(dev, "%s: returning ctxid=%d fd=%d rc=%d\n", __func__, ctxid, fd, rc); @@ -1621,6 +1620,7 @@ static int cxlflash_afu_recover(struct scsi_device *sdev, rctxid = recover->context_id; long reg; int lretry = 20; /* up to 2 seconds */ + int new_adap_fd = -1; int rc = 0; atomic_inc(>recovery_threads); @@ -1650,7 +1650,7 @@ retry: if (ctxi->err_recovery_active) { retry_recover: - rc = recover_context(cfg, ctxi); + rc = recover_context(cfg, ctxi, _adap_fd); if (unlikely(rc)) { dev_err(dev, "%s: Recovery failed for context %
[PATCH 1/6] cxlflash: Avoid mutex when destroying context
Context information structures are protected by a mutex that is held when accessing/manipulating the context. When the code that manages these structures was authored, a decision was made to include taking the mutex as part of the allocation/initialization sequence and also handle the scenario where the mutex was already held when freeing the context. While not a problem outright, this design decision has been deemed as too flexible and the code should be made more rigid to avoid future bugs. In addition, further review of the code yields that the existing mutex manipulations in both of these context management paths are superfluous. This commit removes the obtaining of the context mutex in the context initialization routine and assumes the mutex is not held in the context free path. Inspired-by: Al Viro <v...@zeniv.linux.org.uk> Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> --- drivers/scsi/cxlflash/superpipe.c | 26 -- 1 file changed, 8 insertions(+), 18 deletions(-) diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c index ce15070..ab5c893 100644 --- a/drivers/scsi/cxlflash/superpipe.c +++ b/drivers/scsi/cxlflash/superpipe.c @@ -709,14 +709,13 @@ int cxlflash_disk_release(struct scsi_device *sdev, * @cfg: Internal structure associated with the host. * @ctxi: Context to release. * - * This routine is safe to be called with a a non-initialized context - * and is tolerant of being called with the context's mutex held (it - * will be unlocked if necessary before freeing). Also note that the - * routine conditionally checks for the existence of the context control - * map before clearing the RHT registers and context capabilities because - * it is possible to destroy a context while the context is in the error - * state (previous mapping was removed [so there is no need to worry about - * clearing] and context is waiting for a new mapping). + * This routine is safe to be called with a a non-initialized context. + * Also note that the routine conditionally checks for the existence + * of the context control map before clearing the RHT registers and + * context capabilities because it is possible to destroy a context + * while the context is in the error state (previous mapping was + * removed [so there is no need to worry about clearing] and context + * is waiting for a new mapping). */ static void destroy_context(struct cxlflash_cfg *cfg, struct ctx_info *ctxi) @@ -732,9 +731,6 @@ static void destroy_context(struct cxlflash_cfg *cfg, writeq_be(0, >ctrl_map->rht_cnt_id); writeq_be(0, >ctrl_map->ctx_cap); } - - if (mutex_is_locked(>mutex)) - mutex_unlock(>mutex); } /* Free memory associated with context */ @@ -795,9 +791,6 @@ err: * @adap_fd: Previously obtained adapter fd associated with CXL context. * @file: Previously obtained file associated with CXL context. * @perms: User-specified permissions. - * - * Upon return, the context is marked as initialized and the context's mutex - * is locked. */ static void init_context(struct ctx_info *ctxi, struct cxlflash_cfg *cfg, struct cxl_context *ctx, int ctxid, int adap_fd, @@ -816,8 +809,6 @@ static void init_context(struct ctx_info *ctxi, struct cxlflash_cfg *cfg, mutex_init(>mutex); INIT_LIST_HEAD(>luns); INIT_LIST_HEAD(>list); /* initialize for list_empty() */ - - mutex_lock(>mutex); } /** @@ -1445,7 +1436,6 @@ static int cxlflash_disk_attach(struct scsi_device *sdev, * knows about us yet; we can be the only one holding our mutex. */ list_add(_access->list, >luns); - mutex_unlock(>mutex); mutex_lock(>ctx_tbl_list_mutex); mutex_lock(>mutex); cfg->ctx_tbl[ctxid] = ctxi; @@ -1494,7 +1484,7 @@ err: file = NULL; } - /* Cleanup our context; safe to call even with mutex locked */ + /* Cleanup our context */ if (ctxi) { destroy_context(cfg, ctxi); ctxi = NULL; -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/6] cxlflash: Transition to application close model
Caching the adapter file descriptor and performing a close on behalf of an application is a poor design. This is due to the fact that once a file descriptor in installed, it is free to be altered without the knowledge of the cxlflash driver. This can lead to inconsistencies between the application and kernel. Furthermore, the nature of the former design is more exploitable and thus should be abandoned. To support applications performing a close on the adapter file that is associated with a context, a new flag is introduced to the user API to indicate to applications that they are responsible for the close following the cleanup (detach) of a context. The documentation is also updated to reflect this change in behavior. Inspired-by: Al Viro <v...@zeniv.linux.org.uk> Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> --- Documentation/powerpc/cxlflash.txt | 42 +++--- drivers/scsi/cxlflash/superpipe.c | 71 ++ drivers/scsi/cxlflash/vlun.c | 13 ++- include/uapi/scsi/cxlflash_ioctl.h | 19 +++--- 4 files changed, 73 insertions(+), 72 deletions(-) diff --git a/Documentation/powerpc/cxlflash.txt b/Documentation/powerpc/cxlflash.txt index 4202d1b..f4c1190 100644 --- a/Documentation/powerpc/cxlflash.txt +++ b/Documentation/powerpc/cxlflash.txt @@ -171,11 +171,30 @@ DK_CXLFLASH_ATTACH destroyed, the tokens are to be considered stale and subsequent usage will result in errors. + - A valid adapter file descriptor (fd2 >= 0) is only returned on + the initial attach for a context. Subsequent attaches to an + existing context (DK_CXLFLASH_ATTACH_REUSE_CONTEXT flag present) + do not provide the adapter file descriptor as it was previously + made known to the application. + - When a context is no longer needed, the user shall detach from - the context via the DK_CXLFLASH_DETACH ioctl. + the context via the DK_CXLFLASH_DETACH ioctl. When this ioctl + returns with a valid adapter file descriptor and the return flag + DK_CXLFLASH_APP_CLOSE_ADAP_FD is present, the application _must_ + close the adapter file descriptor following a successful detach. + + - When this ioctl returns with a valid fd2 and the return flag + DK_CXLFLASH_APP_CLOSE_ADAP_FD is present, the application _must_ + close fd2 in the following circumstances: + + + Following a successful detach of the last user of the context + + Following a successful recovery on the context's original fd2 + + In the child process of a fork(), following a clone ioctl, + on the fd2 associated with the source context -- A close on fd2 will invalidate the tokens. This operation is not - required by the user. +- At any time, a close on fd2 will invalidate the tokens. Applications + should exercise caution to only close fd2 when appropriate (outlined + in the previous bullet) to avoid premature loss of I/O. DK_CXLFLASH_USER_DIRECT --- @@ -254,6 +273,10 @@ DK_CXLFLASH_DETACH success, all "tokens" which had been provided to the user from the DK_CXLFLASH_ATTACH onward are no longer valid. +When the DK_CXLFLASH_APP_CLOSE_ADAP_FD flag was returned on a successful +attach, the application _must_ close the fd2 associated with the context +following the detach of the final user of the context. + DK_CXLFLASH_VLUN_CLONE -- This ioctl is responsible for cloning a previously created @@ -261,7 +284,7 @@ DK_CXLFLASH_VLUN_CLONE support maintaining user space access to storage after a process forks. Upon success, the child process (which invoked the ioctl) will have access to the same LUNs via the same resource handle(s) -and fd2 as the parent, but under a different context. +as the parent, but under a different context. Context sharing across processes is not supported with CXL and therefore each fork must be met with establishing a new context @@ -275,6 +298,12 @@ DK_CXLFLASH_VLUN_CLONE translation tables are copied from the parent context to the child's and then synced with the AFU. +When the DK_CXLFLASH_APP_CLOSE_ADAP_FD flag was returned on a successful +attach, the application _must_ close the fd2 associated with the source +context (still resident/accessible in the parent process) following the +clone. This is to avoid a stale entry in the file descriptor table of the +child process. + DK_CXLFLASH_VERIFY -- This ioctl is used to detect various changes such as the capacity of @@ -309,6 +338,11 @@ DK_CXLFLASH_RECOVER_AFU at which time the context/resources they held will be freed as part of the release fop. +When the DK_CXLFLASH_APP_CLOSE_ADAP_FD flag was returned on
[PATCH 3/6] cxlflash: Add kref to context
Currently, context user references are tracked via the list of LUNs that have attached to the context. While convenient, this is not intuitive without a deep study of the code and is inconsistent with the existing reference tracking patterns within the kernel. This design choice can lead to future bug injection. To improve code comprehension and better protect against future bugs, add explicit reference counting to contexts and migrate the context removal code to the kref release handler. Inspired-by: Al Viro <v...@zeniv.linux.org.uk> Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> --- drivers/scsi/cxlflash/superpipe.c | 87 +++ drivers/scsi/cxlflash/superpipe.h | 1 + 2 files changed, 53 insertions(+), 35 deletions(-) diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c index 640c3a2..be7522a 100644 --- a/drivers/scsi/cxlflash/superpipe.c +++ b/drivers/scsi/cxlflash/superpipe.c @@ -808,11 +808,56 @@ static void init_context(struct ctx_info *ctxi, struct cxlflash_cfg *cfg, ctxi->file = file; ctxi->initialized = true; mutex_init(>mutex); + kref_init(>kref); INIT_LIST_HEAD(>luns); INIT_LIST_HEAD(>list); /* initialize for list_empty() */ } /** + * remove_context() - context kref release handler + * @kref: Kernel reference associated with context to be removed. + * + * When a context no longer has any references it can safely be removed + * from global access and destroyed. Note that it is assumed the thread + * relinquishing access to the context holds its mutex. + */ +static void remove_context(struct kref *kref) +{ + struct ctx_info *ctxi = container_of(kref, struct ctx_info, kref); + struct cxlflash_cfg *cfg = ctxi->cfg; + int lfd; + u64 ctxid = DECODE_CTXID(ctxi->ctxid); + + /* Remove context from table/error list */ + WARN_ON(!mutex_is_locked(>mutex)); + ctxi->unavail = true; + mutex_unlock(>mutex); + mutex_lock(>ctx_tbl_list_mutex); + mutex_lock(>mutex); + + if (!list_empty(>list)) + list_del(>list); + cfg->ctx_tbl[ctxid] = NULL; + mutex_unlock(>ctx_tbl_list_mutex); + mutex_unlock(>mutex); + + /* Context now completely uncoupled/unreachable */ + lfd = ctxi->lfd; + destroy_context(cfg, ctxi); + + /* +* As a last step, clean up external resources when not +* already on an external cleanup thread, i.e.: close(adap_fd). +* +* NOTE: this will free up the context from the CXL services, +* allowing it to dole out the same context_id on a future +* (or even currently in-flight) disk_attach operation. +*/ + if (lfd != -1) + sys_close(lfd); +} + +/** * _cxlflash_disk_detach() - detaches a LUN from a context * @sdev: SCSI device associated with LUN. * @ctxi: Context owning resources. @@ -837,7 +882,6 @@ static int _cxlflash_disk_detach(struct scsi_device *sdev, int i; int rc = 0; - int lfd; u64 ctxid = DECODE_CTXID(detach->context_id), rctxid = detach->context_id; @@ -879,40 +923,12 @@ static int _cxlflash_disk_detach(struct scsi_device *sdev, break; } - /* Tear down context following last LUN cleanup */ - if (list_empty(>luns)) { - ctxi->unavail = true; - mutex_unlock(>mutex); - mutex_lock(>ctx_tbl_list_mutex); - mutex_lock(>mutex); - - /* Might not have been in error list so conditionally remove */ - if (!list_empty(>list)) - list_del(>list); - cfg->ctx_tbl[ctxid] = NULL; - mutex_unlock(>ctx_tbl_list_mutex); - mutex_unlock(>mutex); - - lfd = ctxi->lfd; - destroy_context(cfg, ctxi); - ctxi = NULL; - put_ctx = false; - - /* -* As a last step, clean up external resources when not -* already on an external cleanup thread, i.e.: close(adap_fd). -* -* NOTE: this will free up the context from the CXL services, -* allowing it to dole out the same context_id on a future -* (or even currently in-flight) disk_attach operation. -*/ - if (lfd != -1) - sys_close(lfd); - } - - /* Release the sdev reference that bound this LUN to the context */ + /* +* Release the context reference and the sdev reference that +* bound this LUN to the context. +*/ + put_ctx = !kref_put(>kref, remove_context); scsi_device_put(sdev); - out: if (put_ctx)
[PATCH 6/6] cxlflash: Update documentation
Update the block library link in the API documentation. Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> --- Documentation/powerpc/cxlflash.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/powerpc/cxlflash.txt b/Documentation/powerpc/cxlflash.txt index f4c1190..6d9a2ed 100644 --- a/Documentation/powerpc/cxlflash.txt +++ b/Documentation/powerpc/cxlflash.txt @@ -121,7 +121,7 @@ Block library API below. The block library can be found on GitHub: -http://www.github.com/mikehollinger/ibmcapikv +http://github.com/open-power/capiflash CXL Flash Driver IOCTLs -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/6] cxlflash: Cache owning adapter within context
The context removal routine requires access to the owning adapter structure to reset the context within the AFU as part of the tear down sequence. In order to support kref adoption, the owning adapter must be accessible from the release handler. As the kref framework only provides the kref reference as the sole parameter, another means is needed to derive the owning adapter. As a remedy, the owning adapter reference is saved off within the context during initialization. Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> --- drivers/scsi/cxlflash/superpipe.c | 1 + drivers/scsi/cxlflash/superpipe.h | 1 + 2 files changed, 2 insertions(+) diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c index ab5c893..640c3a2 100644 --- a/drivers/scsi/cxlflash/superpipe.c +++ b/drivers/scsi/cxlflash/superpipe.c @@ -804,6 +804,7 @@ static void init_context(struct ctx_info *ctxi, struct cxlflash_cfg *cfg, ctxi->lfd = adap_fd; ctxi->pid = current->tgid; /* tgid = pid */ ctxi->ctx = ctx; + ctxi->cfg = cfg; ctxi->file = file; ctxi->initialized = true; mutex_init(>mutex); diff --git a/drivers/scsi/cxlflash/superpipe.h b/drivers/scsi/cxlflash/superpipe.h index 5f9a091..61404f2 100644 --- a/drivers/scsi/cxlflash/superpipe.h +++ b/drivers/scsi/cxlflash/superpipe.h @@ -107,6 +107,7 @@ struct ctx_info { bool err_recovery_active; struct mutex mutex; /* Context protection */ struct cxl_context *ctx; + struct cxlflash_cfg *cfg; struct list_head luns; /* LUNs attached to this context */ const struct vm_operations_struct *cxl_mmap_vmops; struct file *file; -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/6] cxlflash: Improvements and cleanup
This patch set contains various code improvements and cleanups that were inspired by Al Viro upon reviewing the cxlflash driver. The core improvement is that the driver will no longer cache the adapter file descriptor associated with a context. This results in a user API change that is documented alongside the modifications. The series is based upon 4.8-rc1, intended for 4.9, and is bisectable. Matthew R. Ochs (6): cxlflash: Avoid mutex when destroying context cxlflash: Cache owning adapter within context cxlflash: Add kref to context cxlflash: Transition to application close model cxlflash: Remove adapter file descriptor cache cxlflash: Update documentation Documentation/powerpc/cxlflash.txt | 44 - drivers/scsi/cxlflash/superpipe.c | 181 - drivers/scsi/cxlflash/superpipe.h | 3 +- drivers/scsi/cxlflash/vlun.c | 13 +-- include/uapi/scsi/cxlflash_ioctl.h | 19 +++- 5 files changed, 135 insertions(+), 125 deletions(-) -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] MAINTAINERS: Update cxlflash maintainers
> On Jul 21, 2016, at 3:44 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > Adding myself as a cxlflash maintainer. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] cxlflash: Verify problem state area is mapped before notifying shutdown
> On Jul 21, 2016, at 3:44 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > If an EEH or some other hard error occurs while the > adapter instance was being initialized, on the subsequent > shutdown of the device, the system could crash with: > > [c00f1da03b60] c05eccfc pci_device_shutdown+0x6c/0x100 > [c00f1da03ba0] c06d67d4 device_shutdown+0x1b4/0x2c0 > [c00f1da03c40] c00ea30c kernel_restart_prepare+0x5c/0x80 > [c00f1da03c70] c00ea48c kernel_restart+0x2c/0xc0 > [c00f1da03ce0] c00ea970 SyS_reboot+0x1c0/0x2d0 > [c00f1da03e30] c0009204 system_call+0x38/0xb4 > > This crash is due to the AFU not being mapped when the shutdown > notification routine is called and is a regression that was inserted > recently with Commit 704c4b0ddc03 ("cxlflash: Shutdown notify support > for CXL Flash cards"). > > As a fix, shutdown notification should only occur when the AFU is mapped. > > Fixes: 704c4b0ddc03 ("cxlflash: Shutdown notify support for CXL Flash cards") > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] cxl: remove dead Kconfig options
> On Jul 4, 2016, at 2:12 AM, Andrew Donnellan <andrew.donnel...@au1.ibm.com> > wrote: > > Remove the CXL_KERNEL_API and CXL_EEH Kconfig options, as they were only > needed to coordinate the merging of the cxlflash driver. Also remove the > stub implementation of cxl_perst_reloads_same_image() in cxlflash which is > only used if CXL_EEH isn't defined (i.e. never). > > Suggested-by: Ian Munsie <imun...@au1.ibm.com> > Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] cxlflash: Shutdown notify support for CXL Flash cards
> On Jun 15, 2016, at 6:49 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > Some CXL Flash cards need notification of device shutdown > in order to flush pending I/Os. > > A PCI notification hook for shutdown has been added where > the driver notifies the card and returns. When the device > is removed in the PCI remove path, notification code will > wait for shutdown processing to complete. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] cxlflash: Fix to drain operations from previous reset
> On Jun 15, 2016, at 6:49 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > From: "Manoj N. Kumar" <ma...@linux.vnet.ibm.com> > > While running 'sg_reset -H' in a loop with a user-space application active, > hit the following exception: > > cpu 0x2: Vector: 300 (Data Access) >pc: : afu_attach+0x50/0x240 [cxlflash] >lr: : cxlflash_afu_recover+0x3dc/0x7d0 [cxlflash] >pid = 20365, comm = run_block_fvt > > Linux version 4.5.0-491-26f710d+ > > cxlflash_afu_recover+0x3dc/0x7d0 [cxlflash] > cxlflash_ioctl+0x5a8/0x6f0 [cxlflash] > scsi_ioctl+0x3b0/0x4c0 > sd_ioctl+0x110/0x190 > blkdev_ioctl+0x28c/0xc20 > block_ioctl+0xa4/0xd0 > do_vfs_ioctl+0xd8/0x8c0 > SyS_ioctl+0xd4/0xf0 > system_call+0x38/0xb4 > > The problem here is that the problem space area is unmapped while the > application issues the DK_CXLFLASH_RECOVER_AFU ioctl. > > This is the order I observe: > > proc1 proc2 > 1) sg_reset > 2) ioctl(DK_CXLFLASH_RECOVER_AFU) > 3) sg_reset again > causing a PSA unmap > 4) continues RECOVER_AFU processing > > The resolution to this problem is to have the reset handler drain all > outstanding user space initiated ioctls before proceeding. It is safe > to drain after the state has been changed to STATE_RESET. Also since > drain_ioctls() was static, it had to be moved up a bit to be before > cxlflash_eh_host_reset_handler(). > > Signed-off-by: Manoj N. Kumar <ma...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/1] cxlflash: Fix to resolve dead-lock during EEH recovery
Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] cxlflash: Move to exponential back-off when cmd_room is not available
> On Mar 23, 2016, at 2:50 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > From: "Manoj N. Kumar" <ma...@linux.vnet.ibm.com> > > While profiling the cxlflash_queuecommand() path under a heavy load it > was found that number of retries to find cmd_room was fairly high. > > There are two problems with the current back-off: > a) It starts with a udelay of 0 > b) It backs-off linearly > > Tried several approaches (a higher multiple 10*n, 100*n, as well as n^2, > 2^n) and found that the exponential back-off(2^n) approach had the least > overall cost. Cost as being defined as overall time spent waiting. > > The fix is to change the linear back-off to an exponential back-off. > This solution also takes care of the problem with the initial > delay (starts with 1 usec). > > Signed-off-by: Manoj N. Kumar <ma...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] cxlflash: Fix regression issue with re-ordering patch
> On Mar 23, 2016, at 2:50 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > From: "Manoj N. Kumar" <ma...@linux.vnet.ibm.com> > > While running 'sg_reset -H' back to back the following exception was seen: > > [ 735.115695] Faulting instruction address: 0xd98c0864 > cpu 0x0: Vector: 300 (Data Access) at [c00afa80] >pc: d98c0864: cxlflash_async_err_irq+0x84/0x5c0 [cxlflash] >lr: c013aed0: handle_irq_event_percpu+0xa0/0x310 >sp: c00afd00 > msr: 90009033 > dar: 201 > dsisr: 4000 > current = 0xc1510880 > paca= 0xcfb8 softe: 0irq_happened: 0x01 >pid = 0, comm = swapper/0 > > Linux version 4.5.0-491-26f710d+ > > enter ? for help > [c00afe10] c013aed0 handle_irq_event_percpu+0xa0/0x310 > [c00afed0] c013b1a8 handle_irq_event+0x68/0xc0 > [c00aff00] c01404ec handle_fasteoi_irq+0xec/0x2a0 > [c00aff30] c013a084 generic_handle_irq+0x54/0x80 > [c00aff60] c0011130 __do_irq+0x80/0x1d0 > [c00aff90] c0024d40 call_do_irq+0x14/0x24 > [c1573a20] c0011318 do_IRQ+0x98/0x140 > [c1573a70] c0002594 hardware_interrupt_common+0x114/0x180 > > This exception is being hit because the async_err interrupt path performs > an MMIO to read the interrupt status register. The MMIO region in this > case is not available. > > Commit 6ded8b3cbd9a ("cxlflash: Unmap problem state area before detaching > master context") re-ordered the sequence in which term_mc() and stop_afu() > are called. This introduces a window for interrupts to come in with the > problem space area unmapped, that did not exist previously. > > The fix is to separate the disabling of all AFU interrupts to a distinct > function, term_intr() so that it is the first thing that is done in the > tear down process. > > To keep the initialization process symmetric, separate the AFU interrupt > setup also to a distinct function: init_intr(). > > Fixes: 6ded8b3cbd9a ("cxlflash: Unmap problem state area before detaching > master context") > Signed-off-by: Manoj N. Kumar <ma...@linux.vnet.ibm.com> Looks good, thanks for the quick turnaround! Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/7] cxlflash: Increase cmd_per_lun for better throughput
> On Mar 4, 2016, at 3:55 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > From: "Manoj N. Kumar" <ma...@linux.vnet.ibm.com> > > With the current value of cmd_per_lun at 16, the throughput > over a single adapter is limited to around 150kIOPS. > > Increase the value of cmd_per_lun to 256 to improve > throughput. With this change a single adapter is able to > attain close to the maximum throughput (380kIOPS). > Also change the number of RRQ entries that can be queued. > > Signed-off-by: Manoj N. Kumar <ma...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/7] cxlflash: Fix to avoid unnecessary scan with internal LUNs
> On Mar 4, 2016, at 3:55 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > From: "Manoj N. Kumar" <ma...@linux.vnet.ibm.com> > > When switching to the internal LUN defined on the > IBM CXL flash adapter, there is an unnecessary > scan occurring on the second port. This scan leads > to the following extra lines in the log: > > Dec 17 10:09:00 tul83p1 kernel: [ 3708.561134] cxlflash 0008:00:00.0: > cxlflash_queuecommand: (scp=c000fc1f0f00) 11/1/0/0 > cdb=(A000--1000-) > Dec 17 10:09:00 tul83p1 kernel: [ 3708.561147] process_cmd_err: cmd failed > afu_rc=32 scsi_rc=0 fc_rc=0 afu_extra=0xE, scsi_extra=0x0, fc_extra=0x0 > > By definition, both of the internal LUNs are on the first port/channel. > > When the lun_mode is switched to internal LUN the > same value for host->max_channel is retained. This > causes an unnecessary scan over the second port/channel. > > This fix alters the host->max_channel to 0 (1 port), if internal > LUNs are configured and switches it back to 1 (2 ports) while > going back to external LUNs. > > Signed-off-by: Manoj N. Kumar <ma...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/7] cxlflash: Reorder user context initialization
> On Mar 4, 2016, at 3:55 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > In order to support cxlflash in the PowerVM environment, underlying > hypervisor APIs have imposed a kernel API ordering change. > > For the superpipe access to LUN, user applications need a context. > The cxlflash module creates this context by making a sequence of > cxl calls. In the current code, a context is initialized via > cxl_dev_context_init() followed by cxl_process_element(), a function > that obtains the process element id. Finally, cxl_start_work() > is called to attach the process element. > > In the PowerVM environment, a process element id cannot be obtained > from the hypervisor until the process element is attached. The > cxlflash module is unable to create contexts without a valid > process element id. > > To fix this problem, cxl_start_work() is called before obtaining > the process element id. > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/7] cxlflash: Unmap problem state area before detaching master context
> On Mar 4, 2016, at 3:55 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > When operating in the PowerVM environment, the cxlflash module can > receive an error from the hypervisor indicating that there are > existing mappings in the page table for the process MMIO space. > > This issue exists because term_afu() currently invokes term_mc() > before stop_afu(), allowing for the master context to be detached > first and the problem state area to be unmapped second. > > To resolve this issue, stop_afu() should be called before term_mc(). > > Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/7] cxlflash: Simplify PCI registration
> On Mar 4, 2016, at 3:55 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote: > > From: "Manoj N. Kumar" <ma...@linux.vnet.ibm.com> > > The calls to pci_request_regions(), pci_resource_start(), > pci_set_dma_mask(), pci_set_master() and pci_save_state() are all > unnecessary for the IBM CXL flash adapter since data buffers > are not required to be mapped to the device's memory. > > The use of services such as pci_set_dma_mask() are problematic on > hypervisor managed systems as the IBM CXL flash adapter is operating > under a virtual PCI Host Bridge (virtual PHB) which does not support > these services. > > cxlflash 0001:00:00.0: init_pci: Failed to set PCI DMA mask rc=-5 > > The resolution is to simplify init_pci(), to a point where it does the > bare minimum (pci_enable_device). Similarly, remove the call the > pci_release_regions() from cxlflash_remove(). > > Signed-off-by: Manoj N. Kumar <ma...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v6 19/20] cxlflash: Use new cxl_pci_read_adapter_vpd() API
This patch should have also been sent to the SCSI list (included now). Changes look fine. Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> > On Mar 4, 2016, at 5:26 AM, Frederic Barrat <fbar...@linux.vnet.ibm.com> > wrote: > > To read the adapter VPD, drivers can't rely on pci config APIs, as it > wouldn't work on powerVM. cxl introduced a new kernel API especially > for this, so start using it. > > Co-authored-by: Christophe Lombard <clomb...@linux.vnet.ibm.com> > Signed-off-by: Frederic Barrat <fbar...@linux.vnet.ibm.com> > Signed-off-by: Christophe Lombard <clomb...@linux.vnet.ibm.com> > --- > drivers/scsi/cxlflash/common.h | 1 - > drivers/scsi/cxlflash/main.c | 18 ++ > 2 files changed, 2 insertions(+), 17 deletions(-) > > diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h > index 5ada926..580f370 100644 > --- a/drivers/scsi/cxlflash/common.h > +++ b/drivers/scsi/cxlflash/common.h > @@ -106,7 +106,6 @@ struct cxlflash_cfg { > atomic_t scan_host_needed; > > struct cxl_afu *cxl_afu; > - struct pci_dev *parent_dev; > > atomic_t recovery_threads; > struct mutex ctx_recovery_mutex; > diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c > index f6d90ce..e04aae7 100644 > --- a/drivers/scsi/cxlflash/main.c > +++ b/drivers/scsi/cxlflash/main.c > @@ -1407,7 +1407,7 @@ static int start_context(struct cxlflash_cfg *cfg) > */ > static int read_vpd(struct cxlflash_cfg *cfg, u64 wwpn[]) > { > - struct pci_dev *dev = cfg->parent_dev; > + struct pci_dev *dev = cfg->dev; > int rc = 0; > int ro_start, ro_size, i, j, k; > ssize_t vpd_size; > @@ -1416,7 +1416,7 @@ static int read_vpd(struct cxlflash_cfg *cfg, u64 > wwpn[]) > char *wwpn_vpd_tags[NUM_FC_PORTS] = { "V5", "V6" }; > > /* Get the VPD data from the device */ > - vpd_size = pci_read_vpd(dev, 0, sizeof(vpd_data), vpd_data); > + vpd_size = cxl_read_adapter_vpd(dev, vpd_data, sizeof(vpd_data)); > if (unlikely(vpd_size <= 0)) { > dev_err(>dev, "%s: Unable to read VPD (size = %ld)\n", > __func__, vpd_size); > @@ -2392,7 +2392,6 @@ static int cxlflash_probe(struct pci_dev *pdev, > { > struct Scsi_Host *host; > struct cxlflash_cfg *cfg = NULL; > - struct device *phys_dev; > struct dev_dependent_vals *ddv; > int rc = 0; > > @@ -2458,19 +2457,6 @@ static int cxlflash_probe(struct pci_dev *pdev, > > pci_set_drvdata(pdev, cfg); > > - /* > - * Use the special service provided to look up the physical > - * PCI device, since we are called on the probe of the virtual > - * PCI host bus (vphb) > - */ > - phys_dev = cxl_get_phys_dev(pdev); > - if (!dev_is_pci(phys_dev)) { > - dev_err(>dev, "%s: not a pci dev\n", __func__); > - rc = -ENODEV; > - goto out_remove; > - } > - cfg->parent_dev = to_pci_dev(phys_dev); > - > cfg->cxl_afu = cxl_pci_to_afu(pdev); > > rc = init_pci(cfg); > -- > 1.9.1 > > ___ > Linuxppc-dev mailing list > linuxppc-...@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/linuxppc-dev -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] osd: remove deadcode
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] imm: check parport_claim
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] hpsa: update MAINTAINERS with new e-mail
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/7] hpsa: update copyright information
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/7] hpsa: remove function definition for sanitize_inquiry_string
Thanks again for implementing this feedback! Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/7] hpsa: check for a null phys_disk pointer in ioaccel2 path
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/7] hpsa: correct abort tmf for hba devices
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/7] hpsa: correct lun data caching bitmap definition
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/7] hpsa: add SMR drive support
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/7] hpsa: do not get enclosure info for external devices
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] lpfc: Remove redundant code block in lpfc_scsi_cmd_iocb_cmpl
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] sd: Optimal I/O size is in bytes, not sectors
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] scsi: export function scsi_scan.c:sanitize_inquiry_string
> On Jan 20, 2016, at 1:41 PM, Don Brace <don.br...@pmcs.com> wrote: > > The hpsa driver uses this function to cleanup inquiry > data. Our new pqi driver will also use this > function. This function was copied into both drivers. > > This patch exports sanitize_inquiry_string so the hpsa > and the pqi drivers can use this function directly. > > Hannes recommended the change in review: > https://www.marc.info/?l=linux-scsi=144619249003064=2 > that using the existing function in scsi_scan would be > preferrable. I also made this suggestion when reviewing the same patch: https://www.marc.info/?l=linux-scsi=144613827316617=2 Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> > > Suggested-by: Hannes Reinecke <h...@suse.de> > Reviewed-by: Kevin Barnett <kevin.barn...@pmcs.com> > Reviewed-by: Justin Lindley <justin.lind...@pmcs.com> > Reviewed-by: Scott Teel <scott.t...@pmcs.com> > Reviewed-by: Hannes Reinecke <h...@suse.de> > Signed-off-by: Don Brace <don.br...@pmcs.com> > --- > drivers/scsi/scsi_scan.c | 12 +++- > 1 file changed, 7 insertions(+), 5 deletions(-) > > diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c > index 6a82066..1f02e84 100644 > --- a/drivers/scsi/scsi_scan.c > +++ b/drivers/scsi/scsi_scan.c > @@ -518,7 +518,8 @@ void scsi_target_reap(struct scsi_target *starget) > } > > /** > - * sanitize_inquiry_string - remove non-graphical chars from an INQUIRY > result string > + * scsi_sanitize_inquiry_string - remove non-graphical chars from an > + *INQUIRY result string > * @s: INQUIRY result string to sanitize > * @len: length of the string > * > @@ -531,7 +532,7 @@ void scsi_target_reap(struct scsi_target *starget) > *string terminator, so all the following characters are set to > *spaces. > **/ > -static void sanitize_inquiry_string(unsigned char *s, int len) > +void scsi_sanitize_inquiry_string(unsigned char *s, int len) > { > int terminated = 0; > > @@ -542,6 +543,7 @@ static void sanitize_inquiry_string(unsigned char *s, int > len) > *s = ' '; > } > } > +EXPORT_SYMBOL(scsi_sanitize_inquiry_string); > > /** > * scsi_probe_lun - probe a single LUN using a SCSI INQUIRY > @@ -627,9 +629,9 @@ static int scsi_probe_lun(struct scsi_device *sdev, > unsigned char *inq_result, > } > > if (result == 0) { > - sanitize_inquiry_string(_result[8], 8); > - sanitize_inquiry_string(_result[16], 16); > - sanitize_inquiry_string(_result[32], 4); > + scsi_sanitize_inquiry_string(_result[8], 8); > + scsi_sanitize_inquiry_string(_result[16], 16); > + scsi_sanitize_inquiry_string(_result[32], 4); > > response_len = inq_result[4] + 5; > if (response_len > 255) > > -- > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] ipr: fix out-of-bounds null overwrite
On Jan 6, 2016, at 7:53 AM, Insu Yun <wuni...@gmail.com> wrote: > > Return value of snprintf is not bound by size value, 2nd argument. > (https://www.kernel.org/doc/htmldocs/kernel-api/API-snprintf.html). > Return value is number of printed chars, can be larger than 2nd argument. > Therefore, it can write null byte out of bounds ofbuffer. > Since snprintf puts null, it does not need to put additional null byte. > > Signed-off-by: Insu Yun <wuni...@gmail.com> > --- > drivers/scsi/ipr.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c > index 536cd5a..caf63f9 100644 > --- a/drivers/scsi/ipr.c > +++ b/drivers/scsi/ipr.c > @@ -4003,13 +4003,12 @@ static ssize_t ipr_store_update_fw(struct device *dev, > struct ipr_sglist *sglist; > char fname[100]; > char *src; > - int len, result, dnld_size; > + int result, dnld_size; > > if (!capable(CAP_SYS_ADMIN)) > return -EACCES; > > - len = snprintf(fname, 99, "%s", buf); > - fname[len-1] = '\0'; > + snprintf(fname, 99, "%s", buf); Changing the 99 to sizeof(fname) as part of this change would also make for cleaner code in my opinion. Will leave it up to you if you'd like to do a v3. Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 3/3] hpsa: add box and bay information for enclosure devices
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/3] hpsa: fix path_info_show
> On Dec 21, 2015, at 11:21 AM, Don Brace <don.br...@pmcs.com> wrote: > > left off some changes from Rasmus Villemoes where he changed > snprintf to scnprintf > > Suggested-by: Rasmus Villemoes <li...@rasmusvillemoes.dk> > Reviewed-by: Justin Lindley <justin.lind...@pmcs.com> > Reviewed-by: Kevin Barnett <kevin.barn...@pmcs.com> > Reviewed-by: Scott Teel <scott.t...@pmcs.com> > Reviewed-by: Rasmus Villemoes <li...@rasmusvillemoes.dk> > Reviewed-by: Hannes Reinecke <h...@suse.com> > Signed-off-by: Don Brace <don.br...@pmcs.com> FYI, I did review and stamp this one back on 12/9. Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] hpsa: change SAS transport devices to bus 0.
> On Dec 15, 2015, at 9:25 AM, Don Brace <brace77...@gmail.com> wrote: > > On 12/09/2015 05:21 PM, Matthew R. Ochs wrote: >>> On Dec 9, 2015, at 3:18 PM, Don Brace <don.br...@pmcs.com> wrote: >>> >>> sas transport places devices on bus 0 but driver was setting >>> the bus to 3. >>> >>> Reviewed-by: Justin Lindley <justin.lind...@pmcs.com> >>> Reviewed-by: Kevin Barnett <kevin.barn...@pmcs.com> >>> Reviewed-by: Scott Teel <scott.t...@pmcs.com> >>> Signed-off-by: Don Brace <don.br...@pmcs.com> >>> --- >>> drivers/scsi/hpsa.h |2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h >>> index ae5beda..fdd39fc 100644 >>> --- a/drivers/scsi/hpsa.h >>> +++ b/drivers/scsi/hpsa.h >>> @@ -400,7 +400,7 @@ struct offline_device_entry { >>> #define HPSA_PHYSICAL_DEVICE_BUS0 >>> #define HPSA_RAID_VOLUME_BUS1 >>> #define HPSA_EXTERNAL_RAID_VOLUME_BUS 2 >>> -#define HPSA_HBA_BUS 3 >>> +#define HPSA_HBA_BUS 0 >> Is this not the same as using HPSA_PHYSICAL_DEVICE_BUS? > I was just trying to minimize the changes. I can update the driver > is necessary. Understandable. Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] hpsa: add box and bay information for enclosure devices
Hi Don, Did you see these comments I had from my review of this patch? -matt > On Dec 9, 2015, at 5:41 PM, Matthew R. Ochs <mro...@linux.vnet.ibm.com> wrote: > >> On Dec 9, 2015, at 3:18 PM, Don Brace <don.br...@pmcs.com> wrote: > > No commit message? You can disregard this one as I saw you added a message in v1. > >> >> Reviewed-by: Justin Lindley <justin.lind...@pmcs.com> >> Reviewed-by: Kevin Barnett <kevin.barn...@pmcs.com> >> Reviewed-by: Scott Teel <scott.t...@pmcs.com> >> Signed-off-by: Don Brace <don.br...@pmcs.com> >> --- >> drivers/scsi/hpsa.c | 108 >> --- >> drivers/scsi/hpsa_cmd.h | 13 ++ >> 2 files changed, 114 insertions(+), 7 deletions(-) >> >> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c >> index f8370b8..6f84ec7 100644 >> --- a/drivers/scsi/hpsa.c >> +++ b/drivers/scsi/hpsa.c >> @@ -750,7 +750,6 @@ static ssize_t >> host_show_hp_ssd_smart_path_enabled(struct device *dev, >> } >> >> #define MAX_PATHS 8 >> - >> static ssize_t path_info_show(struct device *dev, >> struct device_attribute *attr, char *buf) >> { >> @@ -792,9 +791,7 @@ static ssize_t path_info_show(struct device *dev, >> hdev->bus, hdev->target, hdev->lun, >> scsi_device_type(hdev->devtype)); >> >> -if (hdev->external || >> -hdev->devtype == TYPE_RAID || >> -is_logical_device(hdev)) { >> +if (hdev->devtype == TYPE_RAID || is_logical_device(hdev)) { > > How does removing the check for external here relate to the rest of this > commit? > >> output_len += scnprintf(buf + output_len, >> PAGE_SIZE - output_len, >> "%s\n", active); >> @@ -808,8 +805,7 @@ static ssize_t path_info_show(struct device *dev, >> phys_connector[0] = '0'; >> if (phys_connector[1] < '0') >> phys_connector[1] = '0'; >> -if (hdev->phys_connector[i] > 0) >> -output_len += scnprintf(buf + output_len, >> +output_len += scnprintf(buf + output_len, > > Same question regarding the removal of the > 0 check. > >> PAGE_SIZE - output_len, >> "PORT: %.2s ", >> phys_connector); >> @@ -3191,6 +3187,90 @@ out: >> return rc; >> } >> >> +/* >> + * get enclosure information >> + * struct ReportExtendedLUNdata *rlep - Used for BMIC drive number >> + * struct hpsa_scsi_dev_t *encl_dev - device entry for enclosure >> + * Uses id_physical_device to determine the box_index. >> + */ >> +static int hpsa_get_enclosure_info(struct ctlr_info *h, >> +unsigned char *scsi3addr, >> +struct ReportExtendedLUNdata *rlep, int rle_index, >> +struct hpsa_scsi_dev_t *encl_dev) >> +{ >> +int rc = -1; >> +struct CommandList *c = NULL; >> +struct ErrorInfo *ei = NULL; >> +struct bmic_sense_storage_box_params *bssbp = NULL; >> +struct bmic_identify_physical_device *id_phys = NULL; >> +struct ext_report_lun_entry *rle = >LUN[rle_index]; >> +u16 bmic_device_index = 0; >> + >> + >> +bssbp = kzalloc(sizeof(*bssbp), GFP_KERNEL); >> +if (!bssbp) >> +goto out; >> + >> +id_phys = kzalloc(sizeof(*id_phys), GFP_KERNEL); >> +if (!id_phys) >> +goto out; >> + >> +bmic_device_index = GET_BMIC_DRIVE_NUMBER(>lunid[0]); >> + >> +if (bmic_device_index == 0xFF00) >> +goto out; > > Why not put this check before the memory allocations so you can > avoid them if this condition is not met? > >> + >> +memset(id_phys, 0, sizeof(*id_phys)); > > Didn't you just obtain zeroed memory from kzalloc()? > >> +rc = hpsa_bmic_id_physical_device(h, scsi3addr, bmic_device_index, >> +id_phys, sizeof(*id_phys)); >> +if (rc) { >> +dev_warn(>pdev->dev, "%s: id_phys failed %d bdi[0x%x]\n", >> +__func__, encl_dev->external, bmic_device_index); >> +goto out; >> +} >&
Re: [PATCH 16/17] lpfc: Use kzalloc instead of kmalloc
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 3/6] cxlflash: Removed driver date print
Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/6] cxlflash: Fix to resolve cmd leak after host reset
Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/6] cxlflash: Resolve oops in wait_port_offline
Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/6] cxlflash: Enable device id for future IBM CXL adapter
Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/6] cxlflash: Fix to escalate LINK_RESET also on port 1
Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] hpsa: fix path_info_show
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] hpsa: change SAS transport devices to bus 0.
> On Dec 9, 2015, at 3:18 PM, Don Bracewrote: > > sas transport places devices on bus 0 but driver was setting > the bus to 3. > > Reviewed-by: Justin Lindley > Reviewed-by: Kevin Barnett > Reviewed-by: Scott Teel > Signed-off-by: Don Brace > --- > drivers/scsi/hpsa.h |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h > index ae5beda..fdd39fc 100644 > --- a/drivers/scsi/hpsa.h > +++ b/drivers/scsi/hpsa.h > @@ -400,7 +400,7 @@ struct offline_device_entry { > #define HPSA_PHYSICAL_DEVICE_BUS 0 > #define HPSA_RAID_VOLUME_BUS 1 > #define HPSA_EXTERNAL_RAID_VOLUME_BUS 2 > -#define HPSA_HBA_BUS 3 > +#define HPSA_HBA_BUS 0 Is this not the same as using HPSA_PHYSICAL_DEVICE_BUS? -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] hpsa: add box and bay information for enclosure devices
> On Dec 9, 2015, at 3:18 PM, Don Bracewrote: No commit message? > > Reviewed-by: Justin Lindley > Reviewed-by: Kevin Barnett > Reviewed-by: Scott Teel > Signed-off-by: Don Brace > --- > drivers/scsi/hpsa.c | 108 --- > drivers/scsi/hpsa_cmd.h | 13 ++ > 2 files changed, 114 insertions(+), 7 deletions(-) > > diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c > index f8370b8..6f84ec7 100644 > --- a/drivers/scsi/hpsa.c > +++ b/drivers/scsi/hpsa.c > @@ -750,7 +750,6 @@ static ssize_t host_show_hp_ssd_smart_path_enabled(struct > device *dev, > } > > #define MAX_PATHS 8 > - > static ssize_t path_info_show(struct device *dev, >struct device_attribute *attr, char *buf) > { > @@ -792,9 +791,7 @@ static ssize_t path_info_show(struct device *dev, > hdev->bus, hdev->target, hdev->lun, > scsi_device_type(hdev->devtype)); > > - if (hdev->external || > - hdev->devtype == TYPE_RAID || > - is_logical_device(hdev)) { > + if (hdev->devtype == TYPE_RAID || is_logical_device(hdev)) { How does removing the check for external here relate to the rest of this commit? > output_len += scnprintf(buf + output_len, > PAGE_SIZE - output_len, > "%s\n", active); > @@ -808,8 +805,7 @@ static ssize_t path_info_show(struct device *dev, > phys_connector[0] = '0'; > if (phys_connector[1] < '0') > phys_connector[1] = '0'; > - if (hdev->phys_connector[i] > 0) > - output_len += scnprintf(buf + output_len, > + output_len += scnprintf(buf + output_len, Same question regarding the removal of the > 0 check. > PAGE_SIZE - output_len, > "PORT: %.2s ", > phys_connector); > @@ -3191,6 +3187,90 @@ out: > return rc; > } > > +/* > + * get enclosure information > + * struct ReportExtendedLUNdata *rlep - Used for BMIC drive number > + * struct hpsa_scsi_dev_t *encl_dev - device entry for enclosure > + * Uses id_physical_device to determine the box_index. > + */ > +static int hpsa_get_enclosure_info(struct ctlr_info *h, > + unsigned char *scsi3addr, > + struct ReportExtendedLUNdata *rlep, int rle_index, > + struct hpsa_scsi_dev_t *encl_dev) > +{ > + int rc = -1; > + struct CommandList *c = NULL; > + struct ErrorInfo *ei = NULL; > + struct bmic_sense_storage_box_params *bssbp = NULL; > + struct bmic_identify_physical_device *id_phys = NULL; > + struct ext_report_lun_entry *rle = >LUN[rle_index]; > + u16 bmic_device_index = 0; > + > + > + bssbp = kzalloc(sizeof(*bssbp), GFP_KERNEL); > + if (!bssbp) > + goto out; > + > + id_phys = kzalloc(sizeof(*id_phys), GFP_KERNEL); > + if (!id_phys) > + goto out; > + > + bmic_device_index = GET_BMIC_DRIVE_NUMBER(>lunid[0]); > + > + if (bmic_device_index == 0xFF00) > + goto out; Why not put this check before the memory allocations so you can avoid them if this condition is not met? > + > + memset(id_phys, 0, sizeof(*id_phys)); Didn't you just obtain zeroed memory from kzalloc()? > + rc = hpsa_bmic_id_physical_device(h, scsi3addr, bmic_device_index, > + id_phys, sizeof(*id_phys)); > + if (rc) { > + dev_warn(>pdev->dev, "%s: id_phys failed %d bdi[0x%x]\n", > + __func__, encl_dev->external, bmic_device_index); > + goto out; > + } > + > + c = cmd_alloc(h); > + > + rc = fill_cmd(c, BMIC_SENSE_STORAGE_BOX_PARAMS, h, bssbp, > + sizeof(*bssbp), 0, RAID_CTLR_LUNID, TYPE_CMD); > + > + if (rc) > + goto out; > + > + if (id_phys->phys_connector[1] == 'E') > + c->Request.CDB[5] = id_phys->box_index; > + else > + c->Request.CDB[5] = 0; > + > + rc = hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE, > + NO_TIMEOUT); > + if (rc) > + goto out; > + > + ei = c->err_info; > + if (ei->CommandStatus != 0 && ei->CommandStatus != CMD_DATA_UNDERRUN) { > + rc = -1; > + goto out; > + } > + > + encl_dev->box[id_phys->active_path_number] = bssbp->phys_box_on_port; > + memcpy(_dev->phys_connector[id_phys->active_path_number], > + bssbp->phys_connector, sizeof(bssbp->phys_connector)); > + > + rc = IO_OK; > +out: > + kfree(bssbp); > + kfree(id_phys); > + > + if (c) >
[PATCH RESEND] cxlflash: drop unlikely before IS_ERR_OR_NULL
From: Geliang Tang <geliangt...@163.com> IS_ERR_OR_NULL already contain an unlikely compiler flag. Drop it. Signed-off-by: Geliang Tang <geliangt...@163.com> Acked-by: Manoj Kumar <ma...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> --- This patch was originally sent by Geliang Tang in October 2015. As a valid simplification, I'd like to see this make it into 'next'. I've gone ahead and performed the rebase so that it applies cleanly. drivers/scsi/cxlflash/superpipe.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c index 34b21a0..f4020db 100644 --- a/drivers/scsi/cxlflash/superpipe.c +++ b/drivers/scsi/cxlflash/superpipe.c @@ -1372,7 +1372,7 @@ static int cxlflash_disk_attach(struct scsi_device *sdev, } ctx = cxl_dev_context_init(cfg->dev); - if (unlikely(IS_ERR_OR_NULL(ctx))) { + if (IS_ERR_OR_NULL(ctx)) { dev_err(dev, "%s: Could not initialize context %p\n", __func__, ctx); rc = -ENODEV; @@ -1500,7 +1500,7 @@ static int recover_context(struct cxlflash_cfg *cfg, struct ctx_info *ctxi) struct afu *afu = cfg->afu; ctx = cxl_dev_context_init(cfg->dev); - if (unlikely(IS_ERR_OR_NULL(ctx))) { + if (IS_ERR_OR_NULL(ctx)) { dev_err(dev, "%s: Could not initialize context %p\n", __func__, ctx); rc = -ENODEV; -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH RESEND] cxlflash: a couple off by one bugs
From: Dan Carpenter <dan.carpen...@oracle.com> The "> MAX_CONTEXT" should be ">= MAX_CONTEXT". Otherwise we go one step beyond the end of the cfg->ctx_tbl[] array. Signed-off-by: Dan Carpenter <dan.carpen...@oracle.com> Reviewed-by: Manoj Kumar <ma...@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> --- This patch was originally sent by Dan Carpenter in September 2015. I had based my large patch series that went into 4.4 off of it but this patch appears to have not made it in. As a valid fix, I'd like to see this make it into 'next'. I've gone ahead and performed the rebase so that it applies cleanly. drivers/scsi/cxlflash/superpipe.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c index cac2e6a..34b21a0 100644 --- a/drivers/scsi/cxlflash/superpipe.c +++ b/drivers/scsi/cxlflash/superpipe.c @@ -1380,7 +1380,7 @@ static int cxlflash_disk_attach(struct scsi_device *sdev, } ctxid = cxl_process_element(ctx); - if (unlikely((ctxid > MAX_CONTEXT) || (ctxid < 0))) { + if (unlikely((ctxid >= MAX_CONTEXT) || (ctxid < 0))) { dev_err(dev, "%s: ctxid (%d) invalid!\n", __func__, ctxid); rc = -EPERM; goto err2; @@ -1508,7 +1508,7 @@ static int recover_context(struct cxlflash_cfg *cfg, struct ctx_info *ctxi) } ctxid = cxl_process_element(ctx); - if (unlikely((ctxid > MAX_CONTEXT) || (ctxid < 0))) { + if (unlikely((ctxid >= MAX_CONTEXT) || (ctxid < 0))) { dev_err(dev, "%s: ctxid (%d) invalid!\n", __func__, ctxid); rc = -EPERM; goto err1; -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 18/27] hpsa: refactor hpsa_figure_bus_target_lun
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 20/27] hpsa: generalize external arrays
continue; > } > > + /* Determine if this is a lun from an external target array */ > + tmpdevice->external = > + figure_external_status(h, raid_ctlr_position, i, > + nphysicals, nlocal_logicals); > + > figure_bus_target_lun(h, lunaddrbytes, tmpdevice); > hpsa_update_device_supports_aborts(h, tmpdevice, lunaddrbytes); > this_device = currentsd[ncurrent]; > @@ -3962,6 +4031,7 @@ out: > kfree(currentsd); > kfree(physdev_list); > kfree(logdev_list); > + kfree(id_ctlr); > kfree(id_phys); > } > > @@ -6414,6 +6484,23 @@ static int fill_cmd(struct CommandList *c, u8 cmd, > struct ctlr_info *h, > c->Request.CDB[7] = (size >> 16) & 0xFF; > c->Request.CDB[8] = (size >> 8) & 0XFF; > break; > + case BMIC_IDENTIFY_CONTROLLER: > + c->Request.CDBLen = 10; > + c->Request.type_attr_dir = > + TYPE_ATTR_DIR(cmd_type, ATTR_SIMPLE, XFER_READ); > + c->Request.Timeout = 0; > + c->Request.CDB[0] = BMIC_READ; > + c->Request.CDB[1] = 0; > + c->Request.CDB[2] = 0; > + c->Request.CDB[3] = 0; > + c->Request.CDB[4] = 0; > + c->Request.CDB[5] = 0; > + c->Request.CDB[6] = BMIC_IDENTIFY_CONTROLLER; > + c->Request.CDB[7] = (size >> 16) & 0xFF; > + c->Request.CDB[8] = (size >> 8) & 0XFF; > + c->Request.CDB[9] = 0; > + break; > + For consistency with the other fill_cmd() cases these unnecessary zeroing statements should be removed, either now or at some point in the future. Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> > default: > dev_warn(>pdev->dev, "unknown command 0x%c\n", cmd); > BUG(); > diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h > index a4cab12..ffcd4cb 100644 > --- a/drivers/scsi/hpsa.h > +++ b/drivers/scsi/hpsa.h > @@ -77,6 +77,7 @@ struct hpsa_scsi_dev_t { > struct hpsa_scsi_dev_t *phys_disk[RAID_MAP_MAX_ENTRIES]; > int nphysical_disks; > int supports_aborts; > + int external; /* 1-from external array 0-not <0-unknown */ > }; > > struct reply_queue_buffer { > diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h > index c2c0737..c83eaf1 100644 > --- a/drivers/scsi/hpsa_cmd.h > +++ b/drivers/scsi/hpsa_cmd.h > @@ -286,6 +286,7 @@ struct SenseSubsystem_info { > #define BMIC_FLASH_FIRMWARE 0xF7 > #define BMIC_SENSE_CONTROLLER_PARAMETERS 0x64 > #define BMIC_IDENTIFY_PHYSICAL_DEVICE 0x15 > +#define BMIC_IDENTIFY_CONTROLLER 0x11 > > /* Command List Structure */ > union SCSI3Addr { > @@ -682,6 +683,16 @@ struct hpsa_pci_info { > u32 board_id; > }; > > +struct bmic_identify_controller { > + u8 configured_logical_drive_count; /* offset 0 */ > + u8 pad1[153]; > + __le16 extended_logical_unit_count;/* offset 154 */ > + u8 pad2[136]; > + u8 controller_mode;/* offset 292 */ > + u8 pad3[32]; > +}; > + > + > struct bmic_identify_physical_device { > u8scsi_bus; /* SCSI Bus number on controller */ > u8scsi_id; /* SCSI ID on this bus */ > > -- > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] st: trivial: remove form feed characters
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1 24/25] hpsa: add in sas transport class
> On Oct 30, 2015, at 5:00 PM, Don Brace <brace77...@gmail.com> wrote: > On 10/30/2015 03:07 PM, Matthew R. Ochs wrote: >>> On Oct 28, 2015, at 5:06 PM, Don Brace <don.br...@pmcs.com> wrote: >>> >>> From: Kevin Barnett <kevin.barn...@pmcs.com> >>> >>> Reviewed-by: Scott Teel <scott.t...@pmcs.com> >>> Reviewed-by: Justin Lindley <justin.lind...@pmcs.com> >>> Reviewed-by: Kevin Barnett <kevin.barn...@pmcs.com> >>> Signed-off-by: Don Brace <don.br...@pmcs.com> >>> --- >>> drivers/scsi/hpsa.c | 535 >>> +-- >>> drivers/scsi/hpsa.h | 27 ++ >>> drivers/scsi/hpsa_cmd.h | 14 + >>> 3 files changed, 555 insertions(+), 21 deletions(-) >>> >>> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c >>> index 56526312..ca38a00 100644 >>> --- a/drivers/scsi/hpsa.c >>> +++ b/drivers/scsi/hpsa.c >>> @@ -41,6 +41,7 @@ >>> >>> +static u64 hpsa_get_sas_address_from_report_physical(struct ctlr_info *h, >>> + unsigned char *scsi3addr) >>> +{ >>> + struct ReportExtendedLUNdata *physdev; >>> + u32 nphysicals; >>> + u64 sa = 0; >>> + int i; >>> + >>> + physdev = kzalloc(sizeof(*physdev), GFP_KERNEL); >>> + if (!physdev) >>> + return 0; >>> + >>> + if (hpsa_scsi_do_report_phys_luns(h, physdev, sizeof(*physdev))) { >>> + dev_err(>pdev->dev, "report physical LUNs failed.\n"); >>> + kfree(physdev); >>> + return 0; >>> + } >>> + nphysicals = get_unaligned_be32(physdev->LUNListLength) / 24; >>> + >>> + for (i = 0; i < nphysicals; i++) >>> + if (!memcmp(>LUN[i].lunid[0], scsi3addr, 8)) >>> + sa = get_unaligned_be64(>LUN[i].wwid[0]); >> Don't you want to break out here if you found a match? > Agreed. >> >>> + >>> + kfree(physdev); >>> + >>> + return sa; >>> +} >>> + >>> +static void hpsa_get_sas_address(struct ctlr_info *h, unsigned char >>> *scsi3addr, >>> + struct hpsa_scsi_dev_t *dev) >>> +{ >>> + int rc; >>> + u64 sa = 0; >>> + >>> + if (is_hba_lunid(scsi3addr)) { >>> + struct bmic_sense_subsystem_info *ssi; >>> + >>> + ssi = kzalloc(sizeof(*ssi), GFP_KERNEL); >> What happens if this allocation fails? If the I/O can succeed without the >> DMA buffer then you will run into trouble when deriving sa. > Added check for NULL. With these changes Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1 17/25] hpsa: move scsi_add_device and scsi_remove_device calls to new function
> On Oct 29, 2015, at 3:30 PM, Don Brace <brace77...@gmail.com> wrote: > > On 10/29/2015 12:21 PM, Matthew R. Ochs wrote: >>> On Oct 28, 2015, at 5:06 PM, Don Brace <don.br...@pmcs.com> wrote: >>> >>> From: Kevin Barnett <kevin.barn...@pmcs.com> >>> >>> preparation for adding the sas transport class >>> >>> Reviewed-by: Scott Teel <scott.t...@pmcs.com> >>> Reviewed-by: Justin Lindley <justin.lind...@pmcs.com> >>> Reviewed-by: Kevin Barnett <kevin.barn...@pmcs.com> >>> Signed-off-by: Don Brace <don.br...@pmcs.com> >>> --- >>> drivers/scsi/hpsa.c | 65 >>> +++ >>> 1 file changed, 39 insertions(+), 26 deletions(-) >>> >>> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c >>> index 24b3c8c..06207e2 100644 >>> --- a/drivers/scsi/hpsa.c >>> +++ b/drivers/scsi/hpsa.c >>> @@ -1660,6 +1660,37 @@ static void >>> hpsa_update_log_drive_phys_drive_ptrs(struct ctlr_info *h, >>> } >>> } >>> >>> +static int hpsa_add_device(struct ctlr_info *h, struct hpsa_scsi_dev_t >>> *device) >>> +{ >>> + int rc = 0; >>> + >>> + rc = scsi_add_device(h->scsi_host, device->bus, >>> + device->target, device->lun); >>> + return rc; >>> +} >>> + >>> +static void hpsa_remove_device(struct ctlr_info *h, >>> + struct hpsa_scsi_dev_t *device) >>> +{ >>> + struct scsi_device *sdev = NULL; >>> + >>> + sdev = scsi_device_lookup(h->scsi_host, device->bus, >>> + device->target, device->lun); >>> + >>> + if (sdev) { >>> + scsi_remove_device(sdev); >>> + scsi_device_put(sdev); >>> + } else { >>> + /* >>> +* We don't expect to get here. Future commands >>> +* to this device will get a selection timeout as >>> +* if the device were gone. >>> +*/ >>> + hpsa_show_dev_msg(KERN_WARNING, h, device, >>> + "didn't find device for removal."); >>> + } >>> +} >>> + >>> static void adjust_hpsa_scsi_table(struct ctlr_info *h, int hostno, >>> struct hpsa_scsi_dev_t *sd[], int nsds) >>> { >>> @@ -1672,7 +1703,6 @@ static void adjust_hpsa_scsi_table(struct ctlr_info >>> *h, int hostno, >>> unsigned long flags; >>> struct hpsa_scsi_dev_t **added, **removed; >>> int nadded, nremoved; >>> - struct Scsi_Host *sh = NULL; >>> >>> /* >>> * A reset can cause a device status to change >>> @@ -1792,46 +1822,29 @@ static void adjust_hpsa_scsi_table(struct ctlr_info >>> *h, int hostno, >>> if (hostno == -1 || !changes) >>> goto free_and_out; >>> >>> - sh = h->scsi_host; >>> - if (sh == NULL) { >>> - dev_warn(>pdev->dev, "%s: scsi_host is null\n", __func__); >>> - return; >>> - } >> Are we guaranteed that h->scsi_host will never be NULL when running in here? >> Or when >> the newly introduced hpsa_remove_device() is invoked elsewhere for that >> matter? >> >> This commit loses this check and scsi_device_lookup() is not tolerant of a >> NULL >> scsi_host * (first action is to grab the host_lock). > Better to be safe. I'll add a check in both functions. With these changes Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1 21/25] hpsa: disable report lun data caching
D_CACHING; > + > + if (fill_cmd(c, BMIC_SET_DIAG_OPTIONS, h, options, 4, 0, > + RAID_CTLR_LUNID, TYPE_CMD)) > + goto errout; > + > + rc = hpsa_scsi_do_simple_cmd_with_retry(h, c, > + PCI_DMA_TODEVICE, NO_TIMEOUT); > + if ((rc != 0) || (c->err_info->CommandStatus != 0)) > + goto errout; > + > + /* Now verify that it got set: */ > + if (fill_cmd(c, BMIC_SENSE_DIAG_OPTIONS, h, options, 4, 0, > + RAID_CTLR_LUNID, TYPE_CMD)) > + goto errout; > + > + rc = hpsa_scsi_do_simple_cmd_with_retry(h, c, > + PCI_DMA_FROMDEVICE, NO_TIMEOUT); > + if ((rc != 0) || (c->err_info->CommandStatus != 0)) > + goto errout; > + > + if (*options && HPSA_DIAG_OPTS_DISABLE_RLD_CACHING) > + goto out; > + else { > +errout: > + dev_err(>pdev->dev, > + "Error: failed to disable report lun data caching.\n"); > + } I agree with the simplification suggested by Tomas Henzl. Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> > +out: > + cmd_free(h, c); > + kfree(options); > +} > + > static void hpsa_shutdown(struct pci_dev *pdev) > { > struct ctlr_info *h; > diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h > index c83eaf1..4910344 100644 > --- a/drivers/scsi/hpsa_cmd.h > +++ b/drivers/scsi/hpsa_cmd.h > @@ -287,6 +287,9 @@ struct SenseSubsystem_info { > #define BMIC_SENSE_CONTROLLER_PARAMETERS 0x64 > #define BMIC_IDENTIFY_PHYSICAL_DEVICE 0x15 > #define BMIC_IDENTIFY_CONTROLLER 0x11 > +#define BMIC_SET_DIAG_OPTIONS 0xF4 > +#define BMIC_SENSE_DIAG_OPTIONS 0xF5 > +#define HPSA_DIAG_OPTS_DISABLE_RLD_CACHING 0x4000 > > /* Command List Structure */ > union SCSI3Addr { > > -- > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html