Re: [PATCH] scsi: cxlflash: fix assignment of the backend operations

2018-07-05 Thread Matthew R. Ochs
On Thu, Jul 05, 2018 at 07:26:23AM +0200, Cédric Le Goater wrote:
> commit cd43c221bb5e ("scsi: cxlflash: Isolate external module
> dependencies") introduced the use of ifdefs to avoid compilation
> errors when one of the possible backend driver, CXL or OCXL, is not
> compiled.
> 
> Unfortunately, the wrong defines are used and the backend ops are
> never assigned, leading to a kernel crash in any case when the
> cxlflash module is loaded.
> 
> Signed-off-by: Cédric Le Goater 

Odd, I know Uma tested this but perhaps not with a clean build...

Thanks for catching this Cedric!

Acked-by: Matthew R. Ochs 



Re: [PATCH 2/6] scsi: cxlflash: Drop unused sense buffers

2018-05-22 Thread Matthew R. Ochs
On Tue, May 22, 2018 at 11:15:08AM -0700, Kees Cook wrote:
> This removes the unused sense buffer in read_cap16() and write_same16().
> 
> Signed-off-by: Kees Cook <keesc...@chromium.org>

Looks good.

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 7/7] cxlflash: Isolate external module dependencies

2018-05-16 Thread Matthew R. Ochs
On Fri, May 11, 2018 at 02:06:19PM -0500, Uma Krishnan wrote:
> Depending on the underlying transport, cxlflash has a dependency on either
> the CXL or OCXL drivers, which are enabled via their Kconfig option.
> Instead of having a module wide dependency on these config options, it is
> better to isolate the object modules that are dependent on the CXL and OCXL
> drivers and adjust the module dependencies accordingly.
> 
> This commit isolates the object files that are dependent on CXL and/or
> OCXL. The cxl/ocxl fops used in the core driver are tucked under an ifdef
> to avoid compilation errors.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 6/7] cxlflash: Abstract hardware dependent assignments

2018-05-16 Thread Matthew R. Ochs
On Fri, May 11, 2018 at 02:06:05PM -0500, Uma Krishnan wrote:
> As a staging cleanup to support transport specific builds of the cxlflash
> module, relocate device dependent assignments to header files. This will
> avoid littering the core driver with conditional compilation logic.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 5/7] cxlflash: Add include guards to backend.h

2018-05-16 Thread Matthew R. Ochs
On Fri, May 11, 2018 at 02:05:51PM -0500, Uma Krishnan wrote:
> The new header file, backend.h, that was recently added is missing
> the include guards. This commit adds the guards.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 3/7] cxlflash: Acquire semaphore before invoking ioctl services

2018-05-16 Thread Matthew R. Ochs
On Fri, May 11, 2018 at 02:05:22PM -0500, Uma Krishnan wrote:
> When a superpipe process that makes use of virtual LUNs is terminated or
> killed abruptly, there is a possibility that the cxlflash driver could
> hang and deprive other operations on the adapter.
> 
> The release fop registered to be invoked on a context close, detaches
> every LUN associated with the context. The underlying service to detach
> the LUN assumes it has been called with the read semaphore held, and
> releases the semaphore before any operation that could be time consuming.
> 
> When invoked without holding the read semaphore, an opportunity is created
> for the semaphore's count to become negative when it is temporarily
> released during one of these potential lengthy operations. This negative
> count results in subsequent acquisition attempts taking forever, leading
> to the hang.
> 
> To support the current design point of holding the semaphore on the
> ioctl() paths, the release fop should acquire it before invoking any
> ioctl services.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 2/7] cxlflash: Limit the debug logs in the IO path

2018-05-16 Thread Matthew R. Ochs
On Fri, May 11, 2018 at 02:05:08PM -0500, Uma Krishnan wrote:
> The kernel log can get filled with debug messages from send_cmd_ioarrin()
> when dynamic debug is enabled for the cxlflash module and there is a lot
> of legacy I/O traffic.
> 
> While these messages are necessary to debug issues that involve command
> tracking, the abundance of data can overwrite other useful data in the
> log. The best option available is to limit the messages that should
> serve most of the common use cases.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 1/7] cxlflash: Yield to active send threads

2018-05-16 Thread Matthew R. Ochs
On Fri, May 11, 2018 at 02:04:46PM -0500, Uma Krishnan wrote:
> The following Oops may be encountered if the device is reset, i.e. EEH
> recovery, while there is heavy I/O traffic:
> 
> 59:mon> t
> [c000200db64bb680] c00809264c40 cxlflash_queuecommand+0x3b8/0x500
>   [cxlflash]
> [c000200db64bb770] c090d3b0 scsi_dispatch_cmd+0x130/0x2f0
> [c000200db64bb7f0] c090fdd8 scsi_request_fn+0x3c8/0x8d0
> [c000200db64bb900] c067f528 __blk_run_queue+0x68/0xb0
> [c000200db64bb930] c067ab80 __elv_add_request+0x140/0x3c0
> [c000200db64bb9b0] c068daac blk_execute_rq_nowait+0xec/0x1a0
> [c000200db64bba00] c068dbb0 blk_execute_rq+0x50/0xe0
> [c000200db64bba50] c06b2040 sg_io+0x1f0/0x520
> [c000200db64bbaf0] c06b2e94 scsi_cmd_ioctl+0x534/0x610
> [c000200db64bbc20] c0926208 sd_ioctl+0x118/0x280
> [c000200db64bbcc0] c069f7ac blkdev_ioctl+0x7fc/0xe30
> [c000200db64bbd20] c0439204 block_ioctl+0x84/0xa0
> [c000200db64bbd40] c03f8514 do_vfs_ioctl+0xd4/0xa00
> [c000200db64bbde0] c03f8f04 SyS_ioctl+0xc4/0x130
> [c000200db64bbe30] c000b184 system_call+0x58/0x6c
> 
> When there is no room to send the I/O request, the cached room is refreshed
> by reading the memory mapped command room value from the AFU. The AFU
> register mapping is refreshed during a reset, creating a race condition
> that can lead to the Oops above.
> 
> During a device reset, the AFU should not be unmapped until all the active
> send threads quiesce. An atomic counter, cmds_active, is currently used to
> track internal AFU commands and quiesce during reset. This same counter can
> also be used for the active send threads.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH v3 41/41] cxlflash: Handle spurious interrupts

2018-03-28 Thread Matthew R. Ochs
On Mon, Mar 26, 2018 at 11:35:42AM -0500, Uma Krishnan wrote:
> The following Oops can occur when there is heavy I/O traffic and the host
> is reset by a tool such as sg_reset.
> 
> [c000200fff3fbc90] c0081690117c process_cmd_doneq+0x104/0x500
>[cxlflash] (unreliable)
> [c000200fff3fbd80] c00816901648 cxlflash_rrq_irq+0xd0/0x150 [cxlflash]
> [c000200fff3fbde0] c0193130 __handle_irq_event_percpu+0xa0/0x310
> [c000200fff3fbea0] c01933d8 handle_irq_event_percpu+0x38/0x90
> [c000200fff3fbee0] c0193494 handle_irq_event+0x64/0xb0
> [c000200fff3fbf10] c0198ea0 handle_fasteoi_irq+0xc0/0x230
> [c000200fff3fbf40] c019182c generic_handle_irq+0x4c/0x70
> [c000200fff3fbf60] c001794c __do_irq+0x7c/0x1c0
> [c000200fff3fbf90] c002a390 call_do_irq+0x14/0x24
> [c000200e5828fab0] c0017b2c do_IRQ+0x9c/0x130
> [c000200e5828fb00] c0009b04 h_virt_irq_common+0x114/0x120
> 
> When a context is reset, the pending commands are flushed and the AFU
> is notified. Before the AFU handles this request there could be command
> completion interrupts queued to PHB which are yet to be delivered to the
> context. In this scenario, a context could receive an interrupt for a
> command that has been flushed, leading to a possible crash when the memory
> for the flushed command is accessed.
> 
> To resolve this problem, a boolean will indicate if the hardware queue is
> ready to process interrupts or not. This can be evaluated in the interrupt
> handler before proessing an interrupt.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH v3 40/41] cxlflash: Remove commmands from pending list on timeout

2018-03-28 Thread Matthew R. Ochs
On Mon, Mar 26, 2018 at 11:35:34AM -0500, Uma Krishnan wrote:
> The following Oops can occur if an internal command sent to the AFU does
> not complete within the timeout:
> 
> [c00ff101b810] c00816020d94 term_mc+0xfc/0x1b0 [cxlflash]
> [c00ff101b8a0] c00816020fb0 term_afu+0x168/0x280 [cxlflash]
> [c00ff101b930] c008160232ec cxlflash_pci_error_detected+0x184/0x230
>[cxlflash]
> [c00ff101b9e0] c0080d95d468 cxl_vphb_error_detected+0x90/0x150[cxl]
> [c00ff101ba20] c0080d95f27c cxl_pci_error_detected+0xa4/0x240 [cxl]
> [c00ff101bac0] c003eaf8 eeh_report_error+0xd8/0x1b0
> [c00ff101bb20] c003d0b8 eeh_pe_dev_traverse+0x98/0x170
> [c00ff101bbb0] c003f438 eeh_handle_normal_event+0x198/0x580
> [c00ff101bc60] c003fba4 eeh_handle_event+0x2a4/0x338
> [c00ff101bd10] c00400b8 eeh_event_handler+0x1f8/0x200
> [c00ff101bdc0] c013da48 kthread+0x1a8/0x1b0
> [c00ff101be30] c000b528 ret_from_kernel_thread+0x5c/0xb4
> 
> When an internal command times out, the command buffer is freed while it
> is still in the pending commands list of the context. This corrupts the
> list and when the context is cleaned up, a crash is encountered.
> 
> To resolve this issue, when an AFU command or TMF command times out, the
> command should be deleted from the hardware queue pending command list
> before freeing the buffer.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH v3 39/41] cxlflash: Synchronize reset and remove ops

2018-03-28 Thread Matthew R. Ochs
On Mon, Mar 26, 2018 at 11:35:27AM -0500, Uma Krishnan wrote:
> The following Oops can be encountered if a device removal or system
> shutdown is initiated while an EEH recovery is in process:
> 
> [c00ff2f479c0] c00815256f18 cxlflash_pci_slot_reset+0xa0/0x100
>   [cxlflash]
> [c00ff2f47a30] c0080dae22e0 cxl_pci_slot_reset+0x168/0x290 [cxl]
> [c00ff2f47ae0] c003ef1c eeh_report_reset+0xec/0x170
> [c00ff2f47b20] c003d0b8 eeh_pe_dev_traverse+0x98/0x170
> [c00ff2f47bb0] c003f80c eeh_handle_normal_event+0x56c/0x580
> [c00ff2f47c60] c003fba4 eeh_handle_event+0x2a4/0x338
> [c00ff2f47d10] c00400b8 eeh_event_handler+0x1f8/0x200
> [c00ff2f47dc0] c013da48 kthread+0x1a8/0x1b0
> [c00ff2f47e30] c000b528 ret_from_kernel_thread+0x5c/0xb4
> 
> The remove handler frees AFU memory while the EEH recovery is in progress,
> leading to a race condition. This can result in a crash if the recovery
> thread tries to access this memory.
> 
> To resolve this issue, the cxlflash remove handler will evaluate the
> device state and yield to any active reset or probing threads.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Looks good!

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 00/38] cxlflash: OpenCXL transport support

2018-02-26 Thread Matthew R. Ochs
On Thu, Feb 22, 2018 at 04:20:10PM -0600, Uma Krishnan wrote:
> This patch series adds OpenCXL support to the cxlflash driver. With
> this support, new devices using the OpenCXL transport will be supported
> by the cxlflash driver along with the existing CXL devices. An effort is
> made to keep this transport specific function independent of the existing
> core driver that communicates with the AFU.
> 
> The first three patches contain a minor fix and staging improvements.
> 
> This series is intended for 4.17 and is bisectable.

I'm fine with the entire series and agree that the OpenCXL terminology
should be consistent throughout the kernel.

So...contingent upon a V2 with OpenCXL references being replaced by OCXL:

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

> 
> Matthew R. Ochs (1):
>   cxlflash: Avoid clobbering context control register value
> 
> Uma Krishnan (37):
>   cxlflash: Preserve number of interrupts for master contexts
>   cxlflash: Add argument identifier names
>   cxlflash: Introduce OpenCXL backend
>   cxlflash: Hardware AFU for OpenCXL
>   cxlflash: Read host function configuration
>   cxlflash: Setup function acTag range
>   cxlflash: Read host AFU configuration
>   cxlflash: Setup AFU acTag range
>   cxlflash: Setup AFU PASID
>   cxlflash: Adapter context support for OpenCXL
>   cxlflash: Use IDR to manage adapter contexts
>   cxlflash: Support adapter file descriptors for OpenCXL
>   cxlflash: Support adapter context discovery
>   cxlflash: Support image reload policy modification
>   cxlflash: MMIO map the AFU
>   cxlflash: Support starting an adapter context
>   cxlflash: Support process specific mappings
>   cxlflash: Support AFU state toggling
>   cxlflash: Support reading adapter VPD data
>   cxlflash: Setup function OpenCXL link
>   cxlflash: Setup OpenCXL transaction layer
>   cxlflash: Support process element lifecycle
>   cxlflash: Support AFU interrupt management
>   cxlflash: Support AFU interrupt mapping and registration
>   cxlflash: Support starting user contexts
>   cxlflash: Support adapter context polling
>   cxlflash: Support adapter context reading
>   cxlflash: Support adapter context mmap and release
>   cxlflash: Support file descriptor mapping
>   cxlflash: Introduce object handle fop
>   cxlflash: Setup LISNs for user contexts
>   cxlflash: Setup LISNs for master contexts
>   cxlflash: Update synchronous interrupt status bits
>   cxlflash: Introduce OCXL context state machine
>   cxlflash: Register for translation errors
>   cxlflash: Support AFU reset
>   cxlflash: Enable OpenCXL operations
> 
>  drivers/scsi/cxlflash/Kconfig |2 +-
>  drivers/scsi/cxlflash/Makefile|2 +-
>  drivers/scsi/cxlflash/backend.h   |   50 +-
>  drivers/scsi/cxlflash/common.h|   10 +-
>  drivers/scsi/cxlflash/cxl_hw.c|   13 +
>  drivers/scsi/cxlflash/main.c  |   55 +-
>  drivers/scsi/cxlflash/main.h  |1 +
>  drivers/scsi/cxlflash/ocxl_hw.c   | 1428 
> +
>  drivers/scsi/cxlflash/ocxl_hw.h   |   76 ++
>  drivers/scsi/cxlflash/sislite.h   |   41 +-
>  drivers/scsi/cxlflash/superpipe.c |   14 +
>  11 files changed, 1644 insertions(+), 48 deletions(-)
>  create mode 100644 drivers/scsi/cxlflash/ocxl_hw.c
>  create mode 100644 drivers/scsi/cxlflash/ocxl_hw.h
> 
> -- 
> 2.1.0
> 



Re: [PATCH] scsi: cxlflash: Select SCSI_SCAN_ASYNC

2018-02-20 Thread Matthew R. Ochs
On Tue, Feb 20, 2018 at 07:56:35PM +1100, Michael Ellerman wrote:
> Vaibhav Jain  writes:
> 
> > The cxlflash driver uses "Asynchronous SCSI scanning" enabled by
> > CONFIG_SCSI_SCAN_ASYNC. Without this enabled the modprobe of cxlflash
> > module gets hung with following backtrace:
> >
> > Call Trace:
> >  __switch_to+0x2cc/0x470
> >  __schedule+0x288/0xab0
> >  schedule+0x40/0xc0
> >  schedule_timeout+0x254/0x4f0
> >  wait_for_common+0xdc/0x260
> >  flush_work+0x140/0x2a0
> >  work_on_cpu+0x88/0xb0
> >  pci_device_probe+0x1d0/0x220
> >  driver_probe_device+0x408/0x5b0
> >  __driver_attach+0x16c/0x1a0
> >  bus_for_each_dev+0xb8/0x110
> >  driver_attach+0x3c/0x60
> >  bus_add_driver+0x1d8/0x370
> >  driver_register+0x9c/0x180
> >  __pci_register_driver+0x74/0xa0
> >  init_cxlflash+0x158/0x1cc
> >  do_one_initcall+0x68/0x1e0
> >  do_init_module+0x90/0x254
> >  load_module+0x2f8c/0x3720
> >  SyS_finit_module+0xcc/0x140
> >  system_call+0x58/0x6c
> 
> Why does it "hang"? That's kind of bizarre, I would expect either a
> build or runtime failure if a feature the driver requires is missing.
> 

It hangs due to a bug in the driver. I briefly looked at it several
months back before getting distracted with other items. IIRC there
was an issue with the state machine.

> > diff --git a/drivers/scsi/cxlflash/Kconfig b/drivers/scsi/cxlflash/Kconfig
> > index a011c5dbf214..f054c1b0fff3 100644
> > --- a/drivers/scsi/cxlflash/Kconfig
> > +++ b/drivers/scsi/cxlflash/Kconfig
> > @@ -6,6 +6,7 @@ config CXLFLASH
> > tristate "Support for IBM CAPI Flash"
> > depends on PCI && SCSI && CXL && EEH
> > select IRQ_POLL
> > +   select SCSI_SCAN_ASYNC
> 
> It's user configurable, so it's rude to select it. It can also be
> disabled on the kernel command line, so this seems like a fragile
> solution.
>
I think Vaibhav's intention here was to avoid the hang while the bug
is still present - I believe he has encountered it several times recently.

The proper solution would be to fix the bug.



Re: [PATCH 5/6] cxlflash: Adapter context init can return error

2018-01-07 Thread Matthew R. Ochs
On Wed, Jan 03, 2018 at 04:55:04PM -0600, Uma Krishnan wrote:
> Adapter context creation can return either NULL or an error pointer.
> Updating the check condition to reflect this.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 2/6] cxlflash: Update cxl-specific arguments to generic cookie

2018-01-07 Thread Matthew R. Ochs
On Wed, Jan 03, 2018 at 04:54:25PM -0600, Uma Krishnan wrote:
> Convert cxl-specific pointers to generic cookies to facilitate future
> enhancements.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 1/6] cxlflash: Reset command ioasc

2018-01-07 Thread Matthew R. Ochs
On Thu, Jan 04, 2018 at 05:33:48PM +1100, Andrew Donnellan wrote:
> On 04/01/18 09:54, Uma Krishnan wrote:
> >In the event of a command failure, cxlflash returns the failure to the
> >upper layers to process. After processing the error, when the command is
> >queued again, the private command structure will not be zeroed and the
> >ioasc could be stale. Per the SISLite specification, the AFU only sets the
> >ioasc in the presence of a failure. Thus, even though the original command
> >succeeds the second time, the command is considered a failure due to stale
> >ioasc. This cycle repeats indefinitely and can cause a hang or IO failure.
> >
> >To fix the issue, clear the ioasc before queuing any command.
> >
> >Fixes: 479ad8e9d48c ("scsi: cxlflash: Remove zeroing of private command
> >data")
> >Signed-off-by: Uma Krishnan 
> 
> Should this go to stable?

Not a bad idea.



Re: [PATCH 1/6] cxlflash: Reset command ioasc

2018-01-07 Thread Matthew R. Ochs
On Wed, Jan 03, 2018 at 04:54:02PM -0600, Uma Krishnan wrote:
> In the event of a command failure, cxlflash returns the failure to the
> upper layers to process. After processing the error, when the command is
> queued again, the private command structure will not be zeroed and the
> ioasc could be stale. Per the SISLite specification, the AFU only sets the
> ioasc in the presence of a failure. Thus, even though the original command
> succeeds the second time, the command is considered a failure due to stale
> ioasc. This cycle repeats indefinitely and can cause a hang or IO failure.
> 
> To fix the issue, clear the ioasc before queuing any command.
> 
> Fixes: 479ad8e9d48c ("scsi: cxlflash: Remove zeroing of private command
> data")
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH] scsi: cxlflash: Fix an error handling path in 'cxlflash_disk_attach()'

2017-08-15 Thread Matthew R. Ochs
> On Aug 15, 2017, at 3:18 PM, Christophe JAILLET 
> <christophe.jail...@wanadoo.fr> wrote:
> 
> 'rc' is known to be 0 at this point.
> If 'create_context()' fails, returns -ENOMEM instead of 0 which means
> success.
> 
> Signed-off-by: Christophe JAILLET <christophe.jail...@wanadoo.fr>

Yep, that's a bug.

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH] cxlflash: return -EFAULT if copy_from_user() fails

2017-07-05 Thread Matthew R. Ochs

> On Jun 30, 2017, at 3:01 AM, Dan Carpenter <dan.carpen...@oracle.com> wrote:
> 
> The copy_from/to_user() functions return the number of bytes remaining
> to be copied but we had intended to return -EFAULT here.
> 
> Fixes: bc88ac47d5cb ("scsi: cxlflash: Support AFU debug")
> Signed-off-by: Dan Carpenter <dan.carpen...@oracle.com>

Good catch Dan!

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



[PATCH 3/3] cxlflash: Update debug prints in reset handlers

2017-06-28 Thread Matthew R. Ochs
The device and host reset handler contain debug prints to help
identify the entities being reset. Today these reset handlers
are based on a SCSI EH design that uses a SCSI command reference
as a means of identifying the target entity. As such, the debug
trace includes the SCSI command pointer and associated CDB. This
is not necessary as the SCSI command is simply the messenger in
these scenarios.

Refactor the debug prints in the host and reset handlers to only
present information that is applicable given the function scope.

Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 18 +++---
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 7054e11..077f62e 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -2435,14 +2435,8 @@ static int cxlflash_eh_device_reset_handler(struct 
scsi_cmnd *scp)
struct device *dev = >dev->dev;
int rcr = 0;
 
-   dev_dbg(dev, "%s: (scp=%p) %d/%d/%d/%llu "
-   "cdb=(%08x-%08x-%08x-%08x)\n", __func__, scp, host->host_no,
-   sdev->channel, sdev->id, sdev->lun,
-   get_unaligned_be32(&((u32 *)scp->cmnd)[0]),
-   get_unaligned_be32(&((u32 *)scp->cmnd)[1]),
-   get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
-   get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
-
+   dev_dbg(dev, "%s: %d/%d/%d/%llu\n", __func__,
+   host->host_no, sdev->channel, sdev->id, sdev->lun);
 retry:
switch (cfg->state) {
case STATE_NORMAL:
@@ -2483,13 +2477,7 @@ static int cxlflash_eh_host_reset_handler(struct 
scsi_cmnd *scp)
struct cxlflash_cfg *cfg = shost_priv(host);
struct device *dev = >dev->dev;
 
-   dev_dbg(dev, "%s: (scp=%p) %d/%d/%d/%llu "
-   "cdb=(%08x-%08x-%08x-%08x)\n", __func__, scp, host->host_no,
-   scp->device->channel, scp->device->id, scp->device->lun,
-   get_unaligned_be32(&((u32 *)scp->cmnd)[0]),
-   get_unaligned_be32(&((u32 *)scp->cmnd)[1]),
-   get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
-   get_unaligned_be32(&((u32 *)scp->cmnd)[3]));
+   dev_dbg(dev, "%s: %d\n", __func__, host->host_no);
 
switch (cfg->state) {
case STATE_NORMAL:
-- 
2.1.0



[PATCH 2/3] cxlflash: Update send_tmf() parameters

2017-06-28 Thread Matthew R. Ochs
The current send_tmf() implementation is based on the caller providing
a SCSI command reference. In reality all that is needed is a SCSI device
reference as the routine uses a private command.

Refactor send_tmf() to pass the private adapter configuration reference
and a SCSI device reference. As a nice side effect, this will ease the
burden of converting caller routines to be based solely off of a SCSI
device reference.

Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 27 +--
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 455564f..7054e11 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -459,21 +459,20 @@ static u32 cmd_to_target_hwq(struct Scsi_Host *host, 
struct scsi_cmnd *scp,
 
 /**
  * send_tmf() - sends a Task Management Function (TMF)
- * @afu:   AFU to checkout from.
- * @scp:   SCSI command from stack describing target.
+ * @cfg:   Internal structure associated with the host.
+ * @sdev:  SCSI device destined for TMF.
  * @tmfcmd:TMF command to send.
  *
  * Return:
  * 0 on success, SCSI_MLQUEUE_HOST_BUSY or -errno on failure
  */
-static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
+static int send_tmf(struct cxlflash_cfg *cfg, struct scsi_device *sdev,
+   u64 tmfcmd)
 {
-   struct Scsi_Host *host = scp->device->host;
-   struct cxlflash_cfg *cfg = shost_priv(host);
+   struct afu *afu = cfg->afu;
struct afu_cmd *cmd = NULL;
struct device *dev = >dev->dev;
-   int hwq_index = cmd_to_target_hwq(host, scp, afu);
-   struct hwq *hwq = get_hwq(afu, hwq_index);
+   struct hwq *hwq = get_hwq(afu, PRIMARY_HWQ);
char *buf = NULL;
ulong lock_flags;
int rc = 0;
@@ -500,12 +499,12 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd 
*scp, u64 tmfcmd)
 
cmd->parent = afu;
cmd->cmd_tmf = true;
-   cmd->hwq_index = hwq_index;
+   cmd->hwq_index = hwq->index;
 
cmd->rcb.ctx_id = hwq->ctx_hndl;
cmd->rcb.msi = SISL_MSI_RRQ_UPDATED;
-   cmd->rcb.port_sel = CHAN2PORTMASK(scp->device->channel);
-   cmd->rcb.lun_id = lun_to_lunid(scp->device->lun);
+   cmd->rcb.port_sel = CHAN2PORTMASK(sdev->channel);
+   cmd->rcb.lun_id = lun_to_lunid(sdev->lun);
cmd->rcb.req_flags = (SISL_REQ_FLAGS_PORT_LUN_ID |
  SISL_REQ_FLAGS_SUP_UNDERRUN |
  SISL_REQ_FLAGS_TMF_CMD);
@@ -2430,15 +2429,15 @@ static int cxlflash_eh_abort_handler(struct scsi_cmnd 
*scp)
 static int cxlflash_eh_device_reset_handler(struct scsi_cmnd *scp)
 {
int rc = SUCCESS;
-   struct Scsi_Host *host = scp->device->host;
+   struct scsi_device *sdev = scp->device;
+   struct Scsi_Host *host = sdev->host;
struct cxlflash_cfg *cfg = shost_priv(host);
struct device *dev = >dev->dev;
-   struct afu *afu = cfg->afu;
int rcr = 0;
 
dev_dbg(dev, "%s: (scp=%p) %d/%d/%d/%llu "
"cdb=(%08x-%08x-%08x-%08x)\n", __func__, scp, host->host_no,
-   scp->device->channel, scp->device->id, scp->device->lun,
+   sdev->channel, sdev->id, sdev->lun,
get_unaligned_be32(&((u32 *)scp->cmnd)[0]),
get_unaligned_be32(&((u32 *)scp->cmnd)[1]),
get_unaligned_be32(&((u32 *)scp->cmnd)[2]),
@@ -2447,7 +2446,7 @@ static int cxlflash_eh_device_reset_handler(struct 
scsi_cmnd *scp)
 retry:
switch (cfg->state) {
case STATE_NORMAL:
-   rcr = send_tmf(afu, scp, TMF_LUN_RESET);
+   rcr = send_tmf(cfg, sdev, TMF_LUN_RESET);
if (unlikely(rcr))
rc = FAILED;
break;
-- 
2.1.0



[PATCH 1/3] cxlflash: Avoid double free of character device

2017-06-28 Thread Matthew R. Ochs
The device_unregister() service used when cleaning up the character
device is already responsible for the internal state associated with
the device upon successful creation. As the cxlflash driver does not
obtain a second reference to the character device, the explicit call
to put_device() is not required and can lead to an inconsistent sysfs
among other issues as the reference is no longer valid after the first
put_device() is performed.

Remove the unnecessary put_device() to remedy this issue.

Fixes: a834a36b57d9 ("scsi: cxlflash: Create character device to provide host 
management interface")
Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index 7a787b6..455564f 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -923,7 +923,6 @@ static void cxlflash_put_minor(int minor)
  */
 static void cxlflash_release_chrdev(struct cxlflash_cfg *cfg)
 {
-   put_device(cfg->chardev);
device_unregister(cfg->chardev);
cfg->chardev = NULL;
cdev_del(>cdev);
-- 
2.1.0



[PATCH 0/3] cxlflash: Minor fix and EH refactoring

2017-06-28 Thread Matthew R. Ochs
This small series fixes a recently injected double free and also includes
two patches to refactor the host and reset handlers to ease the burden of
making this driver compatible with an updated SCSI EH framework.

This series is intended for 4.13 and is bisectable.

Matthew R. Ochs (3):
  cxlflash: Avoid double free of character device
  cxlflash: Update send_tmf() parameters
  cxlflash: Update debug prints in reset handlers

 drivers/scsi/cxlflash/main.c | 44 +++-
 1 file changed, 15 insertions(+), 29 deletions(-)

-- 
2.1.0



Re: [PATCH 33/47] cxlflash: use dedicated reset command in send_tmf()

2017-06-28 Thread Matthew R. Ochs
Hi Hannes,

We actually just reworked these paths in a series that
was pulled in on Monday. While testing, I came across a
bug in that series and was planning on sending out a fix
for it. I'll include some patches that will ease the burden
of incorporating your EH updates for cxlflash.

Many thanks for leading this effort, it really helps to clean
up the SCSI EH paths! =)


-matt

> On Jun 28, 2017, at 3:32 AM, Hannes Reinecke  wrote:
> 
> From: Hannes Reinecke 
> 
> Reduce the queue depth by 1, and use this command as a dedicated
> reset command send_tmf().
> 
> Signed-off-by: Hannes Reinecke 
> ---
> drivers/scsi/cxlflash/common.h |  3 ++-
> drivers/scsi/cxlflash/main.c   | 23 ---
> 2 files changed, 18 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
> index 6d95e8e..b2b3bdc 100644
> --- a/drivers/scsi/cxlflash/common.h
> +++ b/drivers/scsi/cxlflash/common.h
> @@ -54,7 +54,8 @@
> 
> /* Command management definitions */
> #define CXLFLASH_MAX_CMDS   256
> -#define CXLFLASH_MAX_CMDS_PER_LUN   CXLFLASH_MAX_CMDS
> +#define CXLFLASH_MAX_CMDS_PER_LUN   CXLFLASH_MAX_CMDS - 1
> +#define CXLFLASH_RESET_CMD  255
> 
> /* RRQ for master issued cmds */
> #define NUM_RRQ_ENTRY   CXLFLASH_MAX_CMDS
> diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
> index 462e8fc..b33e3e7 100644
> --- a/drivers/scsi/cxlflash/main.c
> +++ b/drivers/scsi/cxlflash/main.c
> @@ -23,6 +23,7 @@
> 
> #include 
> #include 
> +#include 
> #include 
> 
> #include "main.h"
> @@ -460,15 +461,16 @@ static u32 cmd_to_target_hwq(struct Scsi_Host *host, 
> struct scsi_cmnd *scp,
> /**
>  * send_tmf() - sends a Task Management Function (TMF)
>  * @afu:  AFU to checkout from.
> - * @scp: SCSI command from stack describing target.
> + * @sdev:SCSI device to reset.
>  * @tmfcmd:   TMF command to send.
>  *
>  * Return:
>  *0 on success, SCSI_MLQUEUE_HOST_BUSY or -errno on failure
>  */
> -static int send_tmf(struct afu *afu, struct scsi_cmnd *scp, u64 tmfcmd)
> +static int send_tmf(struct afu *afu, struct scsi_device *sdev, u64 tmfcmd)
> {
> - struct Scsi_Host *host = scp->device->host;
> + struct scsi_cmnd *scp;
> + struct Scsi_Host *host = sdev->host;
>   struct cxlflash_cfg *cfg = shost_priv(host);
>   struct afu_cmd *cmd = NULL;
>   struct device *dev = >dev->dev;
> @@ -498,14 +500,21 @@ static int send_tmf(struct afu *afu, struct scsi_cmnd 
> *scp, u64 tmfcmd)
>   cfg->tmf_active = true;
>   spin_unlock_irqrestore(>tmf_slock, lock_flags);
> 
> + scp = scsi_host_find_tag(host, CXLFLASH_RESET_CMD);
> + scp->device = sdev;
> + cmd = sc_to_afucz(scp);
> + hwq_index = cmd_to_target_hwq(host, scp, afu);
> + hwq = get_hwq(afu, hwq_index);
> +
> + cmd->scp = scp;
>   cmd->parent = afu;
>   cmd->cmd_tmf = true;
>   cmd->hwq_index = hwq_index;
> 
>   cmd->rcb.ctx_id = hwq->ctx_hndl;
>   cmd->rcb.msi = SISL_MSI_RRQ_UPDATED;
> - cmd->rcb.port_sel = CHAN2PORTMASK(scp->device->channel);
> - cmd->rcb.lun_id = lun_to_lunid(scp->device->lun);
> + cmd->rcb.port_sel = CHAN2PORTMASK(sdev->channel);
> + cmd->rcb.lun_id = lun_to_lunid(sdev->lun);
>   cmd->rcb.req_flags = (SISL_REQ_FLAGS_PORT_LUN_ID |
> SISL_REQ_FLAGS_SUP_UNDERRUN |
> SISL_REQ_FLAGS_TMF_CMD);
> @@ -2448,7 +2457,7 @@ static int cxlflash_eh_device_reset_handler(struct 
> scsi_cmnd *scp)
> retry:
>   switch (cfg->state) {
>   case STATE_NORMAL:
> - rcr = send_tmf(afu, scp, TMF_LUN_RESET);
> + rcr = send_tmf(afu, sdev, TMF_LUN_RESET);
>   if (unlikely(rcr))
>   rc = FAILED;
>   break;
> @@ -3139,7 +3148,7 @@ static ssize_t mode_show(struct device *dev,
>   .eh_host_reset_handler = cxlflash_eh_host_reset_handler,
>   .change_queue_depth = cxlflash_change_queue_depth,
>   .cmd_per_lun = CXLFLASH_MAX_CMDS_PER_LUN,
> - .can_queue = CXLFLASH_MAX_CMDS,
> + .can_queue = CXLFLASH_MAX_CMDS - 1,
>   .cmd_size = sizeof(struct afu_cmd) + __alignof__(struct afu_cmd) - 1,
>   .this_id = -1,
>   .sg_tablesize = 1,  /* No scatter gather support */
> -- 
> 1.8.5.6
> 



Re: [PATCH 09/17] cxlflash: Create character device to provide host management interface

2017-06-22 Thread Matthew R. Ochs
> On Jun 21, 2017, at 9:15 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> The cxlflash driver currently lacks host management interface. Future
> devices supported by cxlflash will provide a variety of host-wide
> management functions. Examples include LUN provisioning, hardware debug
> support, and firmware download.
> 
> In order to provide a way to manage the device, a character device will
> be created during probe of each adapter. This device will support a set of
> ioctls defined in the SISLite specification from which administrators can
> manage the adapter.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 08/17] cxlflash: Add scsi command abort handler

2017-06-22 Thread Matthew R. Ochs
> On Jun 21, 2017, at 9:15 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> To date, CXL flash devices do not support a single command abort operation.
> Instead, the SISLite specification provides a context reset operation to
> cleanup all pending commands for a given context.
> 
> When a context reset is successful, it is guaranteed that the AFU has
> aborted all currently pending I/O. This sequence is less invasive than a
> device or host reset and can be executed to support scsi command abort
> requests. Add eh_abort_handler callback support to process command timeouts
> and abort requests.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 06/17] cxlflash: Track pending scsi commands in each hardware queue

2017-06-22 Thread Matthew R. Ochs
> On Jun 21, 2017, at 9:14 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> Currently, there is no book keeping of the pending scsi commands in the
> cxlflash driver. This lack of tracking in-flight requests is too
> restrictive and requires a heavy-hammer reset each time an adapter error is
> encountered. Additionally, it does not allow for commands to be properly
> retried.
> 
> In order to avoid this problem and to better handle error path command
> cleanup, introduce a linked list for each hardware queue that tracks
> pending commands.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 07/17] cxlflash: Flush pending commands in cleanup path

2017-06-22 Thread Matthew R. Ochs
> On Jun 21, 2017, at 9:14 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> When the AFU is reset in an error path, pending scsi commands can be
> silently dropped without completion or a formal abort. This puts the onus
> on the cxlflash driver to notify mid-layer and indicating that the command
> can be retried.
> 
> Once the card has been quiesced, the hardware send queue lock is acquired
> to prevent any data movement while the pending commands are processed.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 05/17] cxlflash: Handle AFU sync failures

2017-06-22 Thread Matthew R. Ochs
> On Jun 21, 2017, at 9:14 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> AFU sync operations are not currently evaluated for failure. This is
> acceptable for paths where there is not a dependency on the AFU being
> consistent with the host. Examples include link reset events and LUN
> cleanup operations. On paths where there is a dependency, such as a LUN
> open, a sync failure should be acted upon.
> 
> In the event of AFU sync failures, either log or cleanup as appropriate for
> operations that are dependent on a successful sync completion.
> 
> Update documentation to reflect behavior in the event of an AFU sync
> failure.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 03/17] cxlflash: Reset hardware queue context via specified register

2017-06-22 Thread Matthew R. Ochs
> On Jun 21, 2017, at 9:14 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> Per the SISLite specification, context_reset() writes 0x1 to the LSB of the
> reset register. When the AFU processes this reset request, it is expected
> to clear the bit after reset is complete. The current implementation simply
> checks that the entire value read back is not 1, instead of masking off the
> LSB and evaluating it for a change to 0. Should the AFU manipulate other
> bits during the reset (reading back a value of 0xF for example), successful
> completion will be prematurely indicated given the existing logic.
> 
> Additionally, in the event that the context reset operation fails, there
> does not currently exist a way to provide feedback to the initiator of the
> reset. This poses a problem for the rare case that a context reset fails as
> the caller will proceed on the assumption that all is well.
> 
> To remedy these issues, refactor the context reset routine to only mask off
> the LSB when evaluating for success and return status to the caller. Also
> update the context reset handler parameters to pass a hardware queue
> reference instead of a single command to better reflect that the entire
> queue associated with the context is impacted by the reset.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 04/17] cxlflash: Schedule asynchronous reset of the host

2017-06-22 Thread Matthew R. Ochs
> On Jun 21, 2017, at 9:14 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> A context reset failure indicates the AFU is in a bad state. At present,
> when such a situation occurs, no further action is taken. This leaves the
> adapter in an unusable state with no recoverable actions.
> 
> To avoid this situation, context reset failures will be escalated to a host
> reset operation. This will be done asynchronously to allow the acting
> thread to return to the user with a failure.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 02/17] cxlflash: Update cxlflash_afu_sync() to return errno

2017-06-22 Thread Matthew R. Ochs
> On Jun 21, 2017, at 9:13 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> The cxlflash_afu_sync() routine returns a negative one to indicate any kind
> of failure. This makes it impossible to establish why the error occurred.
> 
> Update the return codes to clearly indicate the failure cause to the
> caller.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 01/17] cxlflash: Combine the send queue locks

2017-06-22 Thread Matthew R. Ochs
> On Jun 21, 2017, at 9:13 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> Currently there are separate spin locks for the two supported I/O queueing
> models. This makes it difficult to serialize with paths outside the enqueue
> path.
> 
> As a design simplification and to support serialization with enqueue
> operations, move to only a single lock that is used for enqueueing
> regardless of the queueing model.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH] cxlflash: Select IRQ_POLL

2017-05-08 Thread Matthew R. Ochs
> On May 5, 2017, at 6:31 PM, Guenter Roeck <li...@roeck-us.net> wrote:
> 
> The driver now uses IRQ_POLL and needs to select it to avoid the following
> build error.
> 
> ERROR: ".irq_poll_complete" [drivers/scsi/cxlflash/cxlflash.ko] undefined!
> ERROR: ".irq_poll_sched" [drivers/scsi/cxlflash/cxlflash.ko] undefined!
> ERROR: ".irq_poll_disable" [drivers/scsi/cxlflash/cxlflash.ko] undefined!
> ERROR: ".irq_poll_init" [drivers/scsi/cxlflash/cxlflash.ko] undefined!
> 
> Fixes: cba06e6de403 ("scsi: cxlflash: Implement IRQ polling for RRQ 
> processing")
> Signed-off-by: Guenter Roeck <li...@roeck-us.net>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>



Re: [PATCH 15/17] cxlflash: Support multiple hardware queues

2017-04-13 Thread Matthew R. Ochs
> On Apr 12, 2017, at 2:15 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> Introduce multiple hardware queues to improve legacy I/O path performance.
> Each hardware queue is comprised of a master context and associated I/O
> resources. The hardware queues are initially implemented as a static array
> embedded in the AFU. This will be transitioned to a dynamic allocation in a
> later series to improve the memory footprint of the driver.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Looks good!

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>


Re: [PATCH 10/17] cxlflash: Fence EEH during probe

2017-04-13 Thread Matthew R. Ochs
> On Apr 13, 2017, at 1:27 AM, Andrew Donnellan <andrew.donnel...@au1.ibm.com> 
> wrote:
> 
> On 13/04/17 05:14, Uma Krishnan wrote:
>> From: "Matthew R. Ochs" <mro...@linux.vnet.ibm.com>
>> 
>> An EEH during probe can lead to a crash as the recovery thread races
>> with the probe thread. To avoid this issue, introduce new states to
>> fence out EEH recovery until probe has completed. Also ensure the reset
>> wait queue is flushed during device removal to avoid orphaned threads.
>> 
>> Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
>> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>
> 
> The use of the term "fence" in the commit message might be a bit confusing 
> given how the term is already used in the context of EEH :)

Hi Andrew,

I agree that it could be a bit misleading but think the commit message is
descriptive enough to avoid confusion. If you feel strongly about it I'm happy
to reword the commit.


-matt


Re: [PATCH 4/4] cxlflash: Cancel scheduled workers before stopping AFU

2017-01-06 Thread Matthew R. Ochs
> On Jan 6, 2017, at 8:05 AM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> When processing an AFU asynchronous interrupt, if the action results in an
> operation that requires off level processing (a link reset for example),
> the worker thread is scheduled. In the meantime a reset event (i.e.: EEH)
> could unmap the AFU to recover. This results in an Oops when the worker
> thread tries to access the AFU mapping.
> 
> [c00f17e03b90] d7cd5978 cxlflash_worker_thread+0x268/0x550
> [c00f17e03c40] c011883c process_one_work+0x1dc/0x680
> [c00f17e03ce0] c0118e80 worker_thread+0x1a0/0x520
> [c00f17e03d80] c0126174 kthread+0xf4/0x100
> [c00f17e03e30] c000a47c ret_from_kernel_thread+0x5c/0xe0
> 
> In an effort to avoid this, a mapcount was introduced in
> commit b45cdbaf9f7f ("cxlflash: Resolve oops in wait_port_offline")
> but due to the race condition described above, this solution is incomplete.
> 
> In order to fully resolve this problem and to simplify things, this commit
> removes the mapcount solution. Instead, the scheduled worker thread is
> cancelled after interrupts have been disabled and prior to the mapping
> being freed.
> 
> Fixes: b45cdbaf9f7f ("cxlflash: Resolve oops in wait_port_offline")
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Looks good.

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 04/14] cxlflash: Avoid command room violation

2016-11-29 Thread Matthew R. Ochs
Uma,

This looks better, thanks for reworking.


-matt

> On Nov 28, 2016, at 6:41 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> During test, a command room violation interrupt is occasionally seen
> for the master context when the CXL flash devices are stressed.
> 
> After studying the code, there could be gaps in the way command room
> value is being cached in cxlflash. When the cached command room is zero
> the thread attempting to send becomes burdened with updating the cached
> value with the actual value from the AFU. Today, this is handled with an
> atomic set operation of the raw value read. Following the atomic update,
> the thread proceeds to send.
> 
> This behavior is incorrect on two counts:
> 
>   - The update fails to take into account the current thread and its
> consumption of one of the hardware commands.
> 
>   - The update does not take into account other threads also atomically
> updating. Per design, a worker thread updates the cached value when a
> send thread times out. By not protecting the update with a lock, the
> cached value can be incorrectly clobbered.
> 
> To correct these issues, the update of the cached command room has been
> simplified and also protected using a spin lock which is held until the
> MMIO is complete. This ensures the command room is properly consumed by
> the same thread. Update of cached value also takes into account the
> current thread consuming a hardware command.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 04/14] cxlflash: Avoid command room violation

2016-11-17 Thread Matthew R. Ochs
Hi Uma,

I do see a potential hang issue with this patch. See my comments below.


-matt
 
> On Nov 15, 2016, at 5:14 PM, Uma Krishnan  wrote:
> 
> During test, a command room violation interrupt is occasionally seen
> for the master context when the CXL flash devices are stressed.
> 
> After studying the code, there could be gaps in the way command room
> value is being cached in cxlflash. When the cached command room is zero
> the thread attempting to send becomes burdened with updating the cached
> value with the actual value from the AFU. Today, this is handled with
> an atomic set operation of the raw value read. Following the atomic
> update, the thread proceeds to send.
> 
> This behavior is incorrect on two counts:
> 
>   - The update fails to take into account the current thread and its
> consumption of one of the hardware commands.
> 
>   - The update does not take into account other threads also atomically
> updating. Per design, a worker thread updates the cached value when
> a send thread times out. By not performing an atomic compare/exchange,
> the cached value can be incorrectly clobbered.
> 
> To correct these issues, the runtime updates of the cached command room
> are updated to use atomic64_cmpxchg() and the send routine is updated to
> take into account the current thread consuming a hardware command.
> 
> Signed-off-by: Uma Krishnan 
> ---
> drivers/scsi/cxlflash/main.c | 16 ++--
> 1 file changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
> index 6d33d8c..1a32e8b 100644
> --- a/drivers/scsi/cxlflash/main.c
> +++ b/drivers/scsi/cxlflash/main.c
> @@ -322,9 +322,10 @@ static int send_cmd(struct afu *afu, struct afu_cmd *cmd)
>   if (!newval) {

When this path is invoked, the current thread is consuming the last entry
available entry before the room must be read again. While the change
below is fine for circumstances where the hardware queue has room for
more than one command, consider a scenario where the queue has room
for only 1 command (the command that you just consumed via the atomic
but are not really consuming with a MMIO due to the revised goto).

In such a scenario this code would loop endlessly, bypassing the timeout
logic completely, until the read room reflected a value greater than 1.

>   do {
>   room = readq_be(>host_map->cmd_room);
> - atomic64_set(>room, room);
> - if (room)
> - goto write_ioarrin;
> + if (room) {
> + atomic64_cmpxchg(>room, 0, room);
> + goto retry;
> + }

If you instead fully consume the entry (goto write_ioarrin - similar as it was
before) and take into account the consumption when you update the cached
value (i.e.: cmpxchg(..., 0, room - 1) the scenario described above will not 
occur.


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 03/14] cxlflash: Improve context_reset() logic

2016-11-17 Thread Matthew R. Ochs
> On Nov 15, 2016, at 5:14 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> Currently, the context reset routine waits for command room to
> be available before sending the reset request. Per review of the
> SISLite specification and clarifications from the CXL Flash AFU
> designers, this wait is unnecessary. The reset request can be
> sent anytime regardless of command room, so long as only a single
> reset request is active at any one point in time.
> 
> This commit simplifies the reset routine by removing the wait for
> command room. Additionally it adds a debug trace to help pinpoint
> hardware errors when a context reset does not complete.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/14] cxlflash: Fix crash in cxlflash_restore_luntable()

2016-11-17 Thread Matthew R. Ochs
> On Nov 15, 2016, at 5:14 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> During test, the following crash was observed:
> 
> [34538.981505] Faulting instruction address: 0xd7c9c870
> cpu 0x9: Vector: 300 (Data Access) at [c007f1e8f590]
>pc: d7c9c870: cxlflash_restore_luntable+0x70/0x1d0 [cxlflash]
>lr: d7c9c84c: cxlflash_restore_luntable+0x4c/0x1d0 [cxlflash]
>sp: c007f1e8f810
>   msr: 90019033
>   dar: c0171d637438
> dsisr: 4000
>  current = 0xc007f1e43f90
>  paca= 0xc7b25100   softe: 0irq_happened: 0x01
>pid   = 493, comm = eehd
> enter ? for help
> [c007f1e8f8a0] d7c940b0 init_afu+0xd60/0x1200 [cxlflash]
> [c007f1e8f9a0] d7c945a8 cxlflash_pci_slot_reset+0x58/0xe0 
> [cxlflash]
> [c007f1e8fa20] d715f790 cxl_pci_slot_reset+0x230/0x340 [cxl]
> [c007f1e8fae0] c0040dd4 eeh_report_reset+0x144/0x180
> [c007f1e8fb20] c003f708 eeh_pe_dev_traverse+0x98/0x170
> [c007f1e8fbb0] c0041618 eeh_handle_normal_event+0x328/0x410
> [c007f1e8fc30] c0041db8 eeh_handle_event+0x178/0x330
> [c007f1e8fce0] c0042118 eeh_event_handler+0x1a8/0x1b0
> [c007f1e8fd80] c011420c kthread+0xec/0x100
> [c007f1e8fe30] c000a47c ret_from_kernel_thread+0x5c/0xe0
> 
> When superpipe mode is disabled for a LUN, the references for the
> local lun are deleted but the LUN is still identified as being present
> in the LUN table. This mismatched state can result in the above crash
> when the LUN table is restored during an error recovery operation.
> 
> To fix this issue, the local LUN information structure is updated to
> reflect the LUN is no longer in the LUN table once all references to
> the LUN are gone.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 01/14] cxlflash: Set sg_tablesize to 1 instead of SG_NONE

2016-11-17 Thread Matthew R. Ochs
> On Nov 15, 2016, at 5:13 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> The following Oops is encountered when blk_mq is enabled with the
> cxlflash driver:
> 
> [ 2960.817172] Oops: Kernel access of bad area, sig: 11 [#5]
> [ 2960.817309] NIP  __blk_mq_run_hw_queue+0x278/0x4c0
> [ 2960.817313] LR __blk_mq_run_hw_queue+0x2bc/0x4c0
> [ 2960.817314] Call Trace:
> [ 2960.817320] __blk_mq_run_hw_queue+0x2bc/0x4c0 (unreliable)
> [ 2960.817324] blk_mq_run_hw_queue+0xd8/0x100
> [ 2960.817329] blk_mq_insert_requests+0x14c/0x1f0
> [ 2960.817333] blk_mq_flush_plug_list+0x150/0x190
> [ 2960.817338] blk_flush_plug_list+0x11c/0x2b0
> [ 2960.817344] blk_finish_plug+0x58/0x80
> [ 2960.817348] __do_page_cache_readahead+0x1c0/0x2e0
> [ 2960.817352] force_page_cache_readahead+0x68/0xd0
> [ 2960.817356] generic_file_read_iter+0x43c/0x6a0
> [ 2960.817359] blkdev_read_iter+0x68/0xa0
> [ 2960.817361] __vfs_read+0x11c/0x180
> [ 2960.817364] vfs_read+0xa4/0x1c0
> [ 2960.817366] SyS_read+0x6c/0x110
> [ 2960.817369] system_call+0x38/0xb4
> 
> The SCSI blk_mq stack assumes that sg_tablesize is always a non-zero
> value with scsi_mq_setup_tags() allocating tags using sg_tablesize.
> The cxlflash driver currently uses SG_NONE (0) for the sg_tablesize
> as the devices it supports are not capable of scatter gather. This
> mismatch of values results in the Oops above.
> 
> To resolve this issue, sg_tablesize for cxlflash can simply be set
> to 1, a value which satisfies the constraints in cxlflash and the
> lack of support of SG_NONE in SCSI blk_mq.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/6] cxlflash: Scan host only after the port is ready for I/O

2016-09-07 Thread Matthew R. Ochs
> On Sep 2, 2016, at 3:38 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> When a port link is established, the AFU sends a 'link up' interrupt.
> After the link is up, corresponding initialization steps are performed
> on the card. Following that, when the card is ready for I/O, the AFU
> sends 'login succeeded' interrupt. Today, cxlflash invokes
> scsi_scan_host() upon receipt of both interrupts.
> 
> SCSI commands sent to the port prior to the 'login succeeded' interrupt
> will fail with 'port not available' error. This is not desirable.
> Moreover, when async_scan is active for the host, subsequent scan calls
> are terminated with error. Due to this, the scsi_scan_host() call
> performed after 'login succeeded' interrupt could portentially return
> error and the devices may not be scanned properly.
> 
> To avoid this problem, scsi_scan_host() should be called only after the
> 'login succeeded' interrupt.
> 
> Signed-off-by: Uma Krishna <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/6] cxlflash: Remove the device cleanly in the system shutdown path

2016-09-07 Thread Matthew R. Ochs
> On Sep 2, 2016, at 3:39 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> Commit 704c4b0ddc03 ("cxlflash: Shutdown notify support for CXL Flash
> cards") was recently introduced to notify the AFU when a system is going
> down. Due to the position of the cxlflash driver in the device stack,
> cxlflash devices are _always_ removed during a reboot/shutdown. This can
> lead to a crash if the cxlflash shutdown hook is invoked _after_ the
> shutdown hook for the owning virtual PHB. Furthermore, the current
> implementation of shutdown/remove hooks for cxlflash are not tolerant to
> being invoked when the device is not enabled. This can also lead to a
> crash in situations where the remove hook is invoked after the device has
> been removed via the vPHBs shutdown hook. An example of this scenario
> would be an EEH reset failure while a reboot/shutdown is in progress.
> 
> To solve both problems, the shutdown hook for cxlflash is updated to
> simply remove the device. This path already includes the AFU notification
> and thus this solution will continue to perform the original intent. At
> the same time, the remove hook is updated to protect against being
> called when the device is not enabled.
> 
> Fixes: 704c4b0ddc03 ("cxlflash: Shutdown notify support for CXL Flash
> cards")
> Signed-off-by: Uma Krishna <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/6] cxlflash: Remove adapter file descriptor cache

2016-08-09 Thread Matthew R. Ochs
The adapter file descriptor was previously cached within the kernel
for a given context in order to support performing a close on behalf
of an application. This is no longer needed as applications are now
required to perform a close on the adapter file descriptor.

Inspired-by: Al Viro <v...@zeniv.linux.org.uk>
Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/superpipe.c | 26 +-
 drivers/scsi/cxlflash/superpipe.h |  1 -
 2 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/cxlflash/superpipe.c 
b/drivers/scsi/cxlflash/superpipe.c
index b3bb90d..c91fe6f 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -788,20 +788,18 @@ err:
  * @cfg:   Internal structure associated with the host.
  * @ctx:   Previously obtained CXL context reference.
  * @ctxid: Previously obtained process element associated with CXL context.
- * @adap_fd:   Previously obtained adapter fd associated with CXL context.
  * @file:  Previously obtained file associated with CXL context.
  * @perms: User-specified permissions.
  */
 static void init_context(struct ctx_info *ctxi, struct cxlflash_cfg *cfg,
-struct cxl_context *ctx, int ctxid, int adap_fd,
-struct file *file, u32 perms)
+struct cxl_context *ctx, int ctxid, struct file *file,
+u32 perms)
 {
struct afu *afu = cfg->afu;
 
ctxi->rht_perms = perms;
ctxi->ctrl_map = >afu_map->ctrls[ctxid].ctrl;
ctxi->ctxid = ENCODE_CTXID(ctxi, ctxid);
-   ctxi->lfd = adap_fd;
ctxi->pid = current->tgid; /* tgid = pid */
ctxi->ctx = ctx;
ctxi->cfg = cfg;
@@ -1086,8 +1084,7 @@ static int cxlflash_mmap_fault(struct vm_area_struct 
*vma, struct vm_fault *vmf)
goto err;
}
 
-   dev_dbg(dev, "%s: fault(%d) for context %d\n",
-   __func__, ctxi->lfd, ctxid);
+   dev_dbg(dev, "%s: fault for context %d\n", __func__, ctxid);
 
if (likely(!ctxi->err_recovery_active)) {
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
@@ -1162,8 +1159,7 @@ static int cxlflash_cxl_mmap(struct file *file, struct 
vm_area_struct *vma)
goto out;
}
 
-   dev_dbg(dev, "%s: mmap(%d) for context %d\n",
-   __func__, ctxi->lfd, ctxid);
+   dev_dbg(dev, "%s: mmap for context %d\n", __func__, ctxid);
 
rc = cxl_fd_mmap(file, vma);
if (likely(!rc)) {
@@ -1406,7 +1402,7 @@ static int cxlflash_disk_attach(struct scsi_device *sdev,
perms = SISL_RHT_PERM(attach->hdr.flags + 1);
 
/* Context mutex is locked upon return */
-   init_context(ctxi, cfg, ctx, ctxid, fd, file, perms);
+   init_context(ctxi, cfg, ctx, ctxid, file, perms);
 
rc = afu_attach(cfg, ctxi);
if (unlikely(rc)) {
@@ -1488,12 +1484,15 @@ err:
  * recover_context() - recovers a context in error
  * @cfg:   Internal structure associated with the host.
  * @ctxi:  Context to release.
+ * @adap_fd:   Adapter file descriptor associated with new/recovered context.
  *
  * Restablishes the state for a context-in-error.
  *
  * Return: 0 on success, -errno on failure
  */
-static int recover_context(struct cxlflash_cfg *cfg, struct ctx_info *ctxi)
+static int recover_context(struct cxlflash_cfg *cfg,
+  struct ctx_info *ctxi,
+  int *adap_fd)
 {
struct device *dev = >dev->dev;
int rc = 0;
@@ -1546,7 +1545,6 @@ static int recover_context(struct cxlflash_cfg *cfg, 
struct ctx_info *ctxi)
 * visible to user space and can't be undone safely on this thread.
 */
ctxi->ctxid = ENCODE_CTXID(ctxi, ctxid);
-   ctxi->lfd = fd;
ctxi->ctx = ctx;
ctxi->file = file;
 
@@ -1563,6 +1561,7 @@ static int recover_context(struct cxlflash_cfg *cfg, 
struct ctx_info *ctxi)
cfg->ctx_tbl[ctxid] = ctxi;
mutex_unlock(>ctx_tbl_list_mutex);
fd_install(fd, file);
+   *adap_fd = fd;
 out:
dev_dbg(dev, "%s: returning ctxid=%d fd=%d rc=%d\n",
__func__, ctxid, fd, rc);
@@ -1621,6 +1620,7 @@ static int cxlflash_afu_recover(struct scsi_device *sdev,
rctxid = recover->context_id;
long reg;
int lretry = 20; /* up to 2 seconds */
+   int new_adap_fd = -1;
int rc = 0;
 
atomic_inc(>recovery_threads);
@@ -1650,7 +1650,7 @@ retry:
 
if (ctxi->err_recovery_active) {
 retry_recover:
-   rc = recover_context(cfg, ctxi);
+   rc = recover_context(cfg, ctxi, _adap_fd);
if (unlikely(rc)) {
dev_err(dev, "%s: Recovery failed for context %

[PATCH 1/6] cxlflash: Avoid mutex when destroying context

2016-08-09 Thread Matthew R. Ochs
Context information structures are protected by a mutex that is held
when accessing/manipulating the context. When the code that manages
these structures was authored, a decision was made to include taking
the mutex as part of the allocation/initialization sequence and also
handle the scenario where the mutex was already held when freeing the
context.

While not a problem outright, this design decision has been deemed as
too flexible and the code should be made more rigid to avoid future bugs.
In addition, further review of the code yields that the existing mutex
manipulations in both of these context management paths are superfluous.

This commit removes the obtaining of the context mutex in the context
initialization routine and assumes the mutex is not held in the context
free path.

Inspired-by: Al Viro <v...@zeniv.linux.org.uk>
Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/superpipe.c | 26 --
 1 file changed, 8 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/cxlflash/superpipe.c 
b/drivers/scsi/cxlflash/superpipe.c
index ce15070..ab5c893 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -709,14 +709,13 @@ int cxlflash_disk_release(struct scsi_device *sdev,
  * @cfg:   Internal structure associated with the host.
  * @ctxi:  Context to release.
  *
- * This routine is safe to be called with a a non-initialized context
- * and is tolerant of being called with the context's mutex held (it
- * will be unlocked if necessary before freeing). Also note that the
- * routine conditionally checks for the existence of the context control
- * map before clearing the RHT registers and context capabilities because
- * it is possible to destroy a context while the context is in the error
- * state (previous mapping was removed [so there is no need to worry about
- * clearing] and context is waiting for a new mapping).
+ * This routine is safe to be called with a a non-initialized context.
+ * Also note that the routine conditionally checks for the existence
+ * of the context control map before clearing the RHT registers and
+ * context capabilities because it is possible to destroy a context
+ * while the context is in the error state (previous mapping was
+ * removed [so there is no need to worry about clearing] and context
+ * is waiting for a new mapping).
  */
 static void destroy_context(struct cxlflash_cfg *cfg,
struct ctx_info *ctxi)
@@ -732,9 +731,6 @@ static void destroy_context(struct cxlflash_cfg *cfg,
writeq_be(0, >ctrl_map->rht_cnt_id);
writeq_be(0, >ctrl_map->ctx_cap);
}
-
-   if (mutex_is_locked(>mutex))
-   mutex_unlock(>mutex);
}
 
/* Free memory associated with context */
@@ -795,9 +791,6 @@ err:
  * @adap_fd:   Previously obtained adapter fd associated with CXL context.
  * @file:  Previously obtained file associated with CXL context.
  * @perms: User-specified permissions.
- *
- * Upon return, the context is marked as initialized and the context's mutex
- * is locked.
  */
 static void init_context(struct ctx_info *ctxi, struct cxlflash_cfg *cfg,
 struct cxl_context *ctx, int ctxid, int adap_fd,
@@ -816,8 +809,6 @@ static void init_context(struct ctx_info *ctxi, struct 
cxlflash_cfg *cfg,
mutex_init(>mutex);
INIT_LIST_HEAD(>luns);
INIT_LIST_HEAD(>list); /* initialize for list_empty() */
-
-   mutex_lock(>mutex);
 }
 
 /**
@@ -1445,7 +1436,6 @@ static int cxlflash_disk_attach(struct scsi_device *sdev,
 * knows about us yet; we can be the only one holding our mutex.
 */
list_add(_access->list, >luns);
-   mutex_unlock(>mutex);
mutex_lock(>ctx_tbl_list_mutex);
mutex_lock(>mutex);
cfg->ctx_tbl[ctxid] = ctxi;
@@ -1494,7 +1484,7 @@ err:
file = NULL;
}
 
-   /* Cleanup our context; safe to call even with mutex locked */
+   /* Cleanup our context */
if (ctxi) {
destroy_context(cfg, ctxi);
ctxi = NULL;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/6] cxlflash: Transition to application close model

2016-08-09 Thread Matthew R. Ochs
Caching the adapter file descriptor and performing a close on behalf
of an application is a poor design. This is due to the fact that once
a file descriptor in installed, it is free to be altered without the
knowledge of the cxlflash driver. This can lead to inconsistencies
between the application and kernel. Furthermore, the nature of the
former design is more exploitable and thus should be abandoned.

To support applications performing a close on the adapter file that
is associated with a context, a new flag is introduced to the user
API to indicate to applications that they are responsible for the
close following the cleanup (detach) of a context. The documentation
is also updated to reflect this change in behavior.

Inspired-by: Al Viro <v...@zeniv.linux.org.uk>
Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
---
 Documentation/powerpc/cxlflash.txt | 42 +++---
 drivers/scsi/cxlflash/superpipe.c  | 71 ++
 drivers/scsi/cxlflash/vlun.c   | 13 ++-
 include/uapi/scsi/cxlflash_ioctl.h | 19 +++---
 4 files changed, 73 insertions(+), 72 deletions(-)

diff --git a/Documentation/powerpc/cxlflash.txt 
b/Documentation/powerpc/cxlflash.txt
index 4202d1b..f4c1190 100644
--- a/Documentation/powerpc/cxlflash.txt
+++ b/Documentation/powerpc/cxlflash.txt
@@ -171,11 +171,30 @@ DK_CXLFLASH_ATTACH
   destroyed, the tokens are to be considered stale and subsequent
   usage will result in errors.
 
+   - A valid adapter file descriptor (fd2 >= 0) is only returned on
+ the initial attach for a context. Subsequent attaches to an
+ existing context (DK_CXLFLASH_ATTACH_REUSE_CONTEXT flag present)
+ do not provide the adapter file descriptor as it was previously
+ made known to the application.
+
 - When a context is no longer needed, the user shall detach from
-  the context via the DK_CXLFLASH_DETACH ioctl.
+  the context via the DK_CXLFLASH_DETACH ioctl. When this ioctl
+ returns with a valid adapter file descriptor and the return flag
+ DK_CXLFLASH_APP_CLOSE_ADAP_FD is present, the application _must_
+ close the adapter file descriptor following a successful detach.
+
+   - When this ioctl returns with a valid fd2 and the return flag
+ DK_CXLFLASH_APP_CLOSE_ADAP_FD is present, the application _must_
+ close fd2 in the following circumstances:
+
+   + Following a successful detach of the last user of the context
+   + Following a successful recovery on the context's original fd2
+   + In the child process of a fork(), following a clone ioctl,
+ on the fd2 associated with the source context
 
-- A close on fd2 will invalidate the tokens. This operation is not
-  required by the user.
+- At any time, a close on fd2 will invalidate the tokens. Applications
+ should exercise caution to only close fd2 when appropriate (outlined
+ in the previous bullet) to avoid premature loss of I/O.
 
 DK_CXLFLASH_USER_DIRECT
 ---
@@ -254,6 +273,10 @@ DK_CXLFLASH_DETACH
 success, all "tokens" which had been provided to the user from the
 DK_CXLFLASH_ATTACH onward are no longer valid.
 
+When the DK_CXLFLASH_APP_CLOSE_ADAP_FD flag was returned on a successful
+attach, the application _must_ close the fd2 associated with the context
+following the detach of the final user of the context.
+
 DK_CXLFLASH_VLUN_CLONE
 --
 This ioctl is responsible for cloning a previously created
@@ -261,7 +284,7 @@ DK_CXLFLASH_VLUN_CLONE
 support maintaining user space access to storage after a process
 forks. Upon success, the child process (which invoked the ioctl)
 will have access to the same LUNs via the same resource handle(s)
-and fd2 as the parent, but under a different context.
+as the parent, but under a different context.
 
 Context sharing across processes is not supported with CXL and
 therefore each fork must be met with establishing a new context
@@ -275,6 +298,12 @@ DK_CXLFLASH_VLUN_CLONE
 translation tables are copied from the parent context to the child's
 and then synced with the AFU.
 
+When the DK_CXLFLASH_APP_CLOSE_ADAP_FD flag was returned on a successful
+attach, the application _must_ close the fd2 associated with the source
+context (still resident/accessible in the parent process) following the
+clone. This is to avoid a stale entry in the file descriptor table of the
+child process.
+
 DK_CXLFLASH_VERIFY
 --
 This ioctl is used to detect various changes such as the capacity of
@@ -309,6 +338,11 @@ DK_CXLFLASH_RECOVER_AFU
 at which time the context/resources they held will be freed as part of
 the release fop.
 
+When the DK_CXLFLASH_APP_CLOSE_ADAP_FD flag was returned on

[PATCH 3/6] cxlflash: Add kref to context

2016-08-09 Thread Matthew R. Ochs
Currently, context user references are tracked via the list of LUNs
that have attached to the context. While convenient, this is not
intuitive without a deep study of the code and is inconsistent with
the existing reference tracking patterns within the kernel. This design
choice can lead to future bug injection.

To improve code comprehension and better protect against future bugs, add
explicit reference counting to contexts and migrate the context removal
code to the kref release handler.

Inspired-by: Al Viro <v...@zeniv.linux.org.uk>
Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/superpipe.c | 87 +++
 drivers/scsi/cxlflash/superpipe.h |  1 +
 2 files changed, 53 insertions(+), 35 deletions(-)

diff --git a/drivers/scsi/cxlflash/superpipe.c 
b/drivers/scsi/cxlflash/superpipe.c
index 640c3a2..be7522a 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -808,11 +808,56 @@ static void init_context(struct ctx_info *ctxi, struct 
cxlflash_cfg *cfg,
ctxi->file = file;
ctxi->initialized = true;
mutex_init(>mutex);
+   kref_init(>kref);
INIT_LIST_HEAD(>luns);
INIT_LIST_HEAD(>list); /* initialize for list_empty() */
 }
 
 /**
+ * remove_context() - context kref release handler
+ * @kref:  Kernel reference associated with context to be removed.
+ *
+ * When a context no longer has any references it can safely be removed
+ * from global access and destroyed. Note that it is assumed the thread
+ * relinquishing access to the context holds its mutex.
+ */
+static void remove_context(struct kref *kref)
+{
+   struct ctx_info *ctxi = container_of(kref, struct ctx_info, kref);
+   struct cxlflash_cfg *cfg = ctxi->cfg;
+   int lfd;
+   u64 ctxid = DECODE_CTXID(ctxi->ctxid);
+
+   /* Remove context from table/error list */
+   WARN_ON(!mutex_is_locked(>mutex));
+   ctxi->unavail = true;
+   mutex_unlock(>mutex);
+   mutex_lock(>ctx_tbl_list_mutex);
+   mutex_lock(>mutex);
+
+   if (!list_empty(>list))
+   list_del(>list);
+   cfg->ctx_tbl[ctxid] = NULL;
+   mutex_unlock(>ctx_tbl_list_mutex);
+   mutex_unlock(>mutex);
+
+   /* Context now completely uncoupled/unreachable */
+   lfd = ctxi->lfd;
+   destroy_context(cfg, ctxi);
+
+   /*
+* As a last step, clean up external resources when not
+* already on an external cleanup thread, i.e.: close(adap_fd).
+*
+* NOTE: this will free up the context from the CXL services,
+* allowing it to dole out the same context_id on a future
+* (or even currently in-flight) disk_attach operation.
+*/
+   if (lfd != -1)
+   sys_close(lfd);
+}
+
+/**
  * _cxlflash_disk_detach() - detaches a LUN from a context
  * @sdev:  SCSI device associated with LUN.
  * @ctxi:  Context owning resources.
@@ -837,7 +882,6 @@ static int _cxlflash_disk_detach(struct scsi_device *sdev,
 
int i;
int rc = 0;
-   int lfd;
u64 ctxid = DECODE_CTXID(detach->context_id),
rctxid = detach->context_id;
 
@@ -879,40 +923,12 @@ static int _cxlflash_disk_detach(struct scsi_device *sdev,
break;
}
 
-   /* Tear down context following last LUN cleanup */
-   if (list_empty(>luns)) {
-   ctxi->unavail = true;
-   mutex_unlock(>mutex);
-   mutex_lock(>ctx_tbl_list_mutex);
-   mutex_lock(>mutex);
-
-   /* Might not have been in error list so conditionally remove */
-   if (!list_empty(>list))
-   list_del(>list);
-   cfg->ctx_tbl[ctxid] = NULL;
-   mutex_unlock(>ctx_tbl_list_mutex);
-   mutex_unlock(>mutex);
-
-   lfd = ctxi->lfd;
-   destroy_context(cfg, ctxi);
-   ctxi = NULL;
-   put_ctx = false;
-
-   /*
-* As a last step, clean up external resources when not
-* already on an external cleanup thread, i.e.: close(adap_fd).
-*
-* NOTE: this will free up the context from the CXL services,
-* allowing it to dole out the same context_id on a future
-* (or even currently in-flight) disk_attach operation.
-*/
-   if (lfd != -1)
-   sys_close(lfd);
-   }
-
-   /* Release the sdev reference that bound this LUN to the context */
+   /*
+* Release the context reference and the sdev reference that
+* bound this LUN to the context.
+*/
+   put_ctx = !kref_put(>kref, remove_context);
scsi_device_put(sdev);
-
 out:
if (put_ctx)
 

[PATCH 6/6] cxlflash: Update documentation

2016-08-09 Thread Matthew R. Ochs
Update the block library link in the API documentation.

Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
---
 Documentation/powerpc/cxlflash.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/powerpc/cxlflash.txt 
b/Documentation/powerpc/cxlflash.txt
index f4c1190..6d9a2ed 100644
--- a/Documentation/powerpc/cxlflash.txt
+++ b/Documentation/powerpc/cxlflash.txt
@@ -121,7 +121,7 @@ Block library API
 below.
 
 The block library can be found on GitHub:
-http://www.github.com/mikehollinger/ibmcapikv
+http://github.com/open-power/capiflash
 
 
 CXL Flash Driver IOCTLs
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/6] cxlflash: Cache owning adapter within context

2016-08-09 Thread Matthew R. Ochs
The context removal routine requires access to the owning adapter
structure to reset the context within the AFU as part of the tear
down sequence. In order to support kref adoption, the owning adapter
must be accessible from the release handler. As the kref framework
only provides the kref reference as the sole parameter, another means
is needed to derive the owning adapter.

As a remedy, the owning adapter reference is saved off within the
context during initialization.

Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/superpipe.c | 1 +
 drivers/scsi/cxlflash/superpipe.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/scsi/cxlflash/superpipe.c 
b/drivers/scsi/cxlflash/superpipe.c
index ab5c893..640c3a2 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -804,6 +804,7 @@ static void init_context(struct ctx_info *ctxi, struct 
cxlflash_cfg *cfg,
ctxi->lfd = adap_fd;
ctxi->pid = current->tgid; /* tgid = pid */
ctxi->ctx = ctx;
+   ctxi->cfg = cfg;
ctxi->file = file;
ctxi->initialized = true;
mutex_init(>mutex);
diff --git a/drivers/scsi/cxlflash/superpipe.h 
b/drivers/scsi/cxlflash/superpipe.h
index 5f9a091..61404f2 100644
--- a/drivers/scsi/cxlflash/superpipe.h
+++ b/drivers/scsi/cxlflash/superpipe.h
@@ -107,6 +107,7 @@ struct ctx_info {
bool err_recovery_active;
struct mutex mutex; /* Context protection */
struct cxl_context *ctx;
+   struct cxlflash_cfg *cfg;
struct list_head luns;  /* LUNs attached to this context */
const struct vm_operations_struct *cxl_mmap_vmops;
struct file *file;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/6] cxlflash: Improvements and cleanup

2016-08-09 Thread Matthew R. Ochs
This patch set contains various code improvements and cleanups that
were inspired by Al Viro upon reviewing the cxlflash driver. The core
improvement is that the driver will no longer cache the adapter file
descriptor associated with a context. This results in a user API change
that is documented alongside the modifications.

The series is based upon 4.8-rc1, intended for 4.9, and is bisectable.

Matthew R. Ochs (6):
  cxlflash: Avoid mutex when destroying context
  cxlflash: Cache owning adapter within context
  cxlflash: Add kref to context
  cxlflash: Transition to application close model
  cxlflash: Remove adapter file descriptor cache
  cxlflash: Update documentation

 Documentation/powerpc/cxlflash.txt |  44 -
 drivers/scsi/cxlflash/superpipe.c  | 181 -
 drivers/scsi/cxlflash/superpipe.h  |   3 +-
 drivers/scsi/cxlflash/vlun.c   |  13 +--
 include/uapi/scsi/cxlflash_ioctl.h |  19 +++-
 5 files changed, 135 insertions(+), 125 deletions(-)

-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] MAINTAINERS: Update cxlflash maintainers

2016-07-22 Thread Matthew R. Ochs
> On Jul 21, 2016, at 3:44 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> Adding myself as a cxlflash maintainer.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] cxlflash: Verify problem state area is mapped before notifying shutdown

2016-07-22 Thread Matthew R. Ochs
> On Jul 21, 2016, at 3:44 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> If an EEH or some other hard error occurs while the
> adapter instance was being initialized, on the subsequent
> shutdown of the device, the system could crash with:
> 
> [c00f1da03b60] c05eccfc pci_device_shutdown+0x6c/0x100
> [c00f1da03ba0] c06d67d4 device_shutdown+0x1b4/0x2c0
> [c00f1da03c40] c00ea30c kernel_restart_prepare+0x5c/0x80
> [c00f1da03c70] c00ea48c kernel_restart+0x2c/0xc0
> [c00f1da03ce0] c00ea970 SyS_reboot+0x1c0/0x2d0
> [c00f1da03e30] c0009204 system_call+0x38/0xb4
> 
> This crash is due to the AFU not being mapped when the shutdown
> notification routine is called and is a regression that was inserted
> recently with Commit 704c4b0ddc03 ("cxlflash: Shutdown notify support
> for CXL Flash cards").
> 
> As a fix, shutdown notification should only occur when the AFU is mapped.
> 
> Fixes: 704c4b0ddc03 ("cxlflash: Shutdown notify support for CXL Flash cards")
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] cxl: remove dead Kconfig options

2016-07-12 Thread Matthew R. Ochs
> On Jul 4, 2016, at 2:12 AM, Andrew Donnellan <andrew.donnel...@au1.ibm.com> 
> wrote:
> 
> Remove the CXL_KERNEL_API and CXL_EEH Kconfig options, as they were only
> needed to coordinate the merging of the cxlflash driver. Also remove the
> stub implementation of cxl_perst_reloads_same_image() in cxlflash which is
> only used if CXL_EEH isn't defined (i.e. never).
> 
> Suggested-by: Ian Munsie <imun...@au1.ibm.com>
> Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] cxlflash: Shutdown notify support for CXL Flash cards

2016-06-20 Thread Matthew R. Ochs
> On Jun 15, 2016, at 6:49 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> Some CXL Flash cards need notification of device shutdown
> in order to flush pending I/Os.
> 
> A PCI notification hook for shutdown has been added where
> the driver notifies the card and returns. When the device
> is removed in the PCI remove path, notification code will
> wait for shutdown processing to complete.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] cxlflash: Fix to drain operations from previous reset

2016-06-20 Thread Matthew R. Ochs
> On Jun 15, 2016, at 6:49 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> From: "Manoj N. Kumar" <ma...@linux.vnet.ibm.com>
> 
> While running 'sg_reset -H' in a loop with a user-space application active,
> hit the following exception:
> 
> cpu 0x2: Vector: 300 (Data Access)
>pc: : afu_attach+0x50/0x240 [cxlflash]
>lr: : cxlflash_afu_recover+0x3dc/0x7d0 [cxlflash]
>pid   = 20365, comm = run_block_fvt
> 
> Linux version 4.5.0-491-26f710d+
> 
> cxlflash_afu_recover+0x3dc/0x7d0 [cxlflash]
> cxlflash_ioctl+0x5a8/0x6f0 [cxlflash]
> scsi_ioctl+0x3b0/0x4c0
> sd_ioctl+0x110/0x190
> blkdev_ioctl+0x28c/0xc20
> block_ioctl+0xa4/0xd0
> do_vfs_ioctl+0xd8/0x8c0
> SyS_ioctl+0xd4/0xf0
> system_call+0x38/0xb4
> 
> The problem here is that the problem space area is unmapped while the
> application issues the DK_CXLFLASH_RECOVER_AFU ioctl.
> 
> This is the order I observe:
> 
> proc1 proc2
> 1) sg_reset
>   2) ioctl(DK_CXLFLASH_RECOVER_AFU)
> 3) sg_reset again
>   causing a PSA unmap
>   4) continues RECOVER_AFU processing
> 
> The resolution to this problem is to have the reset handler drain all
> outstanding user space initiated ioctls before proceeding.  It is safe
> to drain after the state has been changed to STATE_RESET. Also since
> drain_ioctls() was static, it had to be moved up a bit to be before
> cxlflash_eh_host_reset_handler().
> 
> Signed-off-by: Manoj N. Kumar <ma...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] cxlflash: Fix to resolve dead-lock during EEH recovery

2016-05-04 Thread Matthew R. Ochs
Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] cxlflash: Move to exponential back-off when cmd_room is not available

2016-03-23 Thread Matthew R. Ochs
> On Mar 23, 2016, at 2:50 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> From: "Manoj N. Kumar" <ma...@linux.vnet.ibm.com>
> 
> While profiling the cxlflash_queuecommand() path under a heavy load it
> was found that number of retries to find cmd_room was fairly high.
> 
> There are two problems with the current back-off:
> a) It starts with a udelay of 0
> b) It backs-off linearly
> 
> Tried several approaches (a higher multiple 10*n, 100*n, as well as n^2,
> 2^n) and found that the exponential back-off(2^n) approach had the least
> overall cost. Cost as being defined as overall time spent waiting.
> 
> The fix is to change the linear back-off to an exponential back-off.
> This solution also takes care of the problem with the initial
> delay (starts with 1 usec).
> 
> Signed-off-by: Manoj N. Kumar <ma...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] cxlflash: Fix regression issue with re-ordering patch

2016-03-23 Thread Matthew R. Ochs
> On Mar 23, 2016, at 2:50 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> From: "Manoj N. Kumar" <ma...@linux.vnet.ibm.com>
> 
> While running 'sg_reset -H' back to back the following exception was seen:
> 
> [  735.115695] Faulting instruction address: 0xd98c0864
> cpu 0x0: Vector: 300 (Data Access) at [c00afa80]
>pc: d98c0864: cxlflash_async_err_irq+0x84/0x5c0 [cxlflash]
>lr: c013aed0: handle_irq_event_percpu+0xa0/0x310
>sp: c00afd00
>   msr: 90009033
>   dar: 201
> dsisr: 4000
>  current = 0xc1510880
>  paca= 0xcfb8   softe: 0irq_happened: 0x01
>pid   = 0, comm = swapper/0
> 
> Linux version 4.5.0-491-26f710d+
> 
> enter ? for help
> [c00afe10] c013aed0 handle_irq_event_percpu+0xa0/0x310
> [c00afed0] c013b1a8 handle_irq_event+0x68/0xc0
> [c00aff00] c01404ec handle_fasteoi_irq+0xec/0x2a0
> [c00aff30] c013a084 generic_handle_irq+0x54/0x80
> [c00aff60] c0011130 __do_irq+0x80/0x1d0
> [c00aff90] c0024d40 call_do_irq+0x14/0x24
> [c1573a20] c0011318 do_IRQ+0x98/0x140
> [c1573a70] c0002594 hardware_interrupt_common+0x114/0x180
> 
> This exception is being hit because the async_err interrupt path performs
> an MMIO to read the interrupt status register. The MMIO region in this
> case is not available.
> 
> Commit 6ded8b3cbd9a ("cxlflash: Unmap problem state area before detaching
> master context") re-ordered the sequence in which term_mc() and stop_afu()
> are called. This introduces a window for interrupts to come in with the
> problem space area unmapped, that did not exist previously.
> 
> The fix is to separate the disabling of all AFU interrupts to a distinct
> function, term_intr() so that it is the first thing that is done in the
> tear down process.
> 
> To keep the initialization process symmetric, separate the AFU interrupt
> setup also to a distinct function: init_intr().
> 
> Fixes: 6ded8b3cbd9a ("cxlflash: Unmap problem state area before detaching 
> master context")
> Signed-off-by: Manoj N. Kumar <ma...@linux.vnet.ibm.com>

Looks good, thanks for the quick turnaround!

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/7] cxlflash: Increase cmd_per_lun for better throughput

2016-03-07 Thread Matthew R. Ochs
> On Mar 4, 2016, at 3:55 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> From: "Manoj N. Kumar" <ma...@linux.vnet.ibm.com>
> 
> With the current value of cmd_per_lun at 16, the throughput
> over a single adapter is limited to around 150kIOPS.
> 
> Increase the value of cmd_per_lun to 256 to improve
> throughput. With this change a single adapter is able to
> attain close to the maximum throughput (380kIOPS).
> Also change the number of RRQ entries that can be queued.
> 
> Signed-off-by: Manoj N. Kumar <ma...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/7] cxlflash: Fix to avoid unnecessary scan with internal LUNs

2016-03-07 Thread Matthew R. Ochs
> On Mar 4, 2016, at 3:55 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> From: "Manoj N. Kumar" <ma...@linux.vnet.ibm.com>
> 
> When switching to the internal LUN defined on the
> IBM CXL flash adapter, there is an unnecessary
> scan occurring on the second port. This scan leads
> to the following extra lines in the log:
> 
> Dec 17 10:09:00 tul83p1 kernel: [ 3708.561134] cxlflash 0008:00:00.0: 
> cxlflash_queuecommand: (scp=c000fc1f0f00) 11/1/0/0 
> cdb=(A000--1000-)
> Dec 17 10:09:00 tul83p1 kernel: [ 3708.561147] process_cmd_err: cmd failed 
> afu_rc=32 scsi_rc=0 fc_rc=0 afu_extra=0xE, scsi_extra=0x0, fc_extra=0x0
> 
> By definition, both of the internal LUNs are on the first port/channel.
> 
> When the lun_mode is switched to internal LUN the
> same value for host->max_channel is retained. This
> causes an unnecessary scan over the second port/channel.
> 
> This fix alters the host->max_channel to 0 (1 port), if internal
> LUNs are configured and switches it back to 1 (2 ports) while
> going back to external LUNs.
> 
> Signed-off-by: Manoj N. Kumar <ma...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/7] cxlflash: Reorder user context initialization

2016-03-07 Thread Matthew R. Ochs
> On Mar 4, 2016, at 3:55 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> In order to support cxlflash in the PowerVM environment, underlying
> hypervisor APIs have imposed a kernel API ordering change.
> 
> For the superpipe access to LUN, user applications need a context.
> The cxlflash module creates this context by making a sequence of
> cxl calls. In the current code, a context is initialized via
> cxl_dev_context_init() followed by cxl_process_element(), a function
> that obtains the process element id. Finally, cxl_start_work()
> is called to attach the process element.
> 
> In the PowerVM environment, a process element id cannot be obtained
> from the hypervisor until the process element is attached. The
> cxlflash module is unable to create contexts without a valid
> process element id.
> 
> To fix this problem, cxl_start_work() is called before obtaining
> the process element id.
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/7] cxlflash: Unmap problem state area before detaching master context

2016-03-07 Thread Matthew R. Ochs
> On Mar 4, 2016, at 3:55 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> When operating in the PowerVM environment, the cxlflash module can
> receive an error from the hypervisor indicating that there are
> existing mappings in the page table for the process MMIO space.
> 
> This issue exists because term_afu() currently invokes term_mc()
> before stop_afu(), allowing for the master context to be detached
> first and the problem state area to be unmapped second.
> 
> To resolve this issue, stop_afu() should be called before term_mc().
> 
> Signed-off-by: Uma Krishnan <ukri...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/7] cxlflash: Simplify PCI registration

2016-03-07 Thread Matthew R. Ochs
> On Mar 4, 2016, at 3:55 PM, Uma Krishnan <ukri...@linux.vnet.ibm.com> wrote:
> 
> From: "Manoj N. Kumar" <ma...@linux.vnet.ibm.com>
> 
> The calls to pci_request_regions(), pci_resource_start(),
> pci_set_dma_mask(), pci_set_master() and pci_save_state() are all
> unnecessary for the IBM CXL flash adapter since data buffers
> are not required to be mapped to the device's memory.
> 
> The use of services such as pci_set_dma_mask() are problematic on
> hypervisor managed systems as the IBM CXL flash adapter is operating
> under a virtual PCI Host Bridge (virtual PHB) which does not support
> these services.
> 
> cxlflash 0001:00:00.0: init_pci: Failed to set PCI DMA mask rc=-5
> 
> The resolution is to simplify init_pci(), to a point where it does the
> bare minimum (pci_enable_device). Similarly, remove the call the
> pci_release_regions() from cxlflash_remove().
> 
> Signed-off-by: Manoj N. Kumar <ma...@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 19/20] cxlflash: Use new cxl_pci_read_adapter_vpd() API

2016-03-04 Thread Matthew R. Ochs
This patch should have also been sent to the SCSI list (included now).

Changes look fine.

Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

> On Mar 4, 2016, at 5:26 AM, Frederic Barrat <fbar...@linux.vnet.ibm.com> 
> wrote:
> 
> To read the adapter VPD, drivers can't rely on pci config APIs, as it
> wouldn't work on powerVM. cxl introduced a new kernel API especially
> for this, so start using it.
> 
> Co-authored-by: Christophe Lombard <clomb...@linux.vnet.ibm.com>
> Signed-off-by: Frederic Barrat <fbar...@linux.vnet.ibm.com>
> Signed-off-by: Christophe Lombard <clomb...@linux.vnet.ibm.com>
> ---
> drivers/scsi/cxlflash/common.h |  1 -
> drivers/scsi/cxlflash/main.c   | 18 ++
> 2 files changed, 2 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
> index 5ada926..580f370 100644
> --- a/drivers/scsi/cxlflash/common.h
> +++ b/drivers/scsi/cxlflash/common.h
> @@ -106,7 +106,6 @@ struct cxlflash_cfg {
>   atomic_t scan_host_needed;
> 
>   struct cxl_afu *cxl_afu;
> - struct pci_dev *parent_dev;
> 
>   atomic_t recovery_threads;
>   struct mutex ctx_recovery_mutex;
> diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
> index f6d90ce..e04aae7 100644
> --- a/drivers/scsi/cxlflash/main.c
> +++ b/drivers/scsi/cxlflash/main.c
> @@ -1407,7 +1407,7 @@ static int start_context(struct cxlflash_cfg *cfg)
>  */
> static int read_vpd(struct cxlflash_cfg *cfg, u64 wwpn[])
> {
> - struct pci_dev *dev = cfg->parent_dev;
> + struct pci_dev *dev = cfg->dev;
>   int rc = 0;
>   int ro_start, ro_size, i, j, k;
>   ssize_t vpd_size;
> @@ -1416,7 +1416,7 @@ static int read_vpd(struct cxlflash_cfg *cfg, u64 
> wwpn[])
>   char *wwpn_vpd_tags[NUM_FC_PORTS] = { "V5", "V6" };
> 
>   /* Get the VPD data from the device */
> - vpd_size = pci_read_vpd(dev, 0, sizeof(vpd_data), vpd_data);
> + vpd_size = cxl_read_adapter_vpd(dev, vpd_data, sizeof(vpd_data));
>   if (unlikely(vpd_size <= 0)) {
>   dev_err(>dev, "%s: Unable to read VPD (size = %ld)\n",
>  __func__, vpd_size);
> @@ -2392,7 +2392,6 @@ static int cxlflash_probe(struct pci_dev *pdev,
> {
>   struct Scsi_Host *host;
>   struct cxlflash_cfg *cfg = NULL;
> - struct device *phys_dev;
>   struct dev_dependent_vals *ddv;
>   int rc = 0;
> 
> @@ -2458,19 +2457,6 @@ static int cxlflash_probe(struct pci_dev *pdev,
> 
>   pci_set_drvdata(pdev, cfg);
> 
> - /*
> -  * Use the special service provided to look up the physical
> -  * PCI device, since we are called on the probe of the virtual
> -  * PCI host bus (vphb)
> -  */
> - phys_dev = cxl_get_phys_dev(pdev);
> - if (!dev_is_pci(phys_dev)) {
> - dev_err(>dev, "%s: not a pci dev\n", __func__);
> - rc = -ENODEV;
> - goto out_remove;
> - }
> - cfg->parent_dev = to_pci_dev(phys_dev);
> -
>   cfg->cxl_afu = cxl_pci_to_afu(pdev);
> 
>   rc = init_pci(cfg);
> -- 
> 1.9.1
> 
> ___
> Linuxppc-dev mailing list
> linuxppc-...@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] osd: remove deadcode

2016-02-24 Thread Matthew R. Ochs
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] imm: check parport_claim

2016-02-24 Thread Matthew R. Ochs
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] hpsa: update MAINTAINERS with new e-mail

2016-02-23 Thread Matthew R. Ochs
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/7] hpsa: update copyright information

2016-02-23 Thread Matthew R. Ochs
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/7] hpsa: remove function definition for sanitize_inquiry_string

2016-02-23 Thread Matthew R. Ochs
Thanks again for implementing this feedback!

Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/7] hpsa: check for a null phys_disk pointer in ioaccel2 path

2016-02-23 Thread Matthew R. Ochs
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/7] hpsa: correct abort tmf for hba devices

2016-02-23 Thread Matthew R. Ochs
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/7] hpsa: correct lun data caching bitmap definition

2016-02-23 Thread Matthew R. Ochs
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/7] hpsa: add SMR drive support

2016-02-23 Thread Matthew R. Ochs
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/7] hpsa: do not get enclosure info for external devices

2016-02-23 Thread Matthew R. Ochs
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] lpfc: Remove redundant code block in lpfc_scsi_cmd_iocb_cmpl

2016-01-20 Thread Matthew R. Ochs
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sd: Optimal I/O size is in bytes, not sectors

2016-01-20 Thread Matthew R. Ochs
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: export function scsi_scan.c:sanitize_inquiry_string

2016-01-20 Thread Matthew R. Ochs
> On Jan 20, 2016, at 1:41 PM, Don Brace <don.br...@pmcs.com> wrote:
> 
> The hpsa driver uses this function to cleanup inquiry
> data. Our new pqi driver will also use this
> function. This function was copied into both drivers.
> 
> This patch exports sanitize_inquiry_string so the hpsa
> and the pqi drivers can use this function directly.
> 
> Hannes recommended the change in review:
> https://www.marc.info/?l=linux-scsi=144619249003064=2
> that using the existing function in scsi_scan would be
> preferrable.

I also made this suggestion when reviewing the same patch:
https://www.marc.info/?l=linux-scsi=144613827316617=2

Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

> 
> Suggested-by: Hannes Reinecke <h...@suse.de>
> Reviewed-by: Kevin Barnett <kevin.barn...@pmcs.com>
> Reviewed-by: Justin Lindley <justin.lind...@pmcs.com>
> Reviewed-by: Scott Teel <scott.t...@pmcs.com>
> Reviewed-by: Hannes Reinecke <h...@suse.de>
> Signed-off-by: Don Brace <don.br...@pmcs.com>
> ---
> drivers/scsi/scsi_scan.c |   12 +++-
> 1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index 6a82066..1f02e84 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -518,7 +518,8 @@ void scsi_target_reap(struct scsi_target *starget)
> }
> 
> /**
> - * sanitize_inquiry_string - remove non-graphical chars from an INQUIRY 
> result string
> + * scsi_sanitize_inquiry_string - remove non-graphical chars from an
> + *INQUIRY result string
>  * @s: INQUIRY result string to sanitize
>  * @len: length of the string
>  *
> @@ -531,7 +532,7 @@ void scsi_target_reap(struct scsi_target *starget)
>  *string terminator, so all the following characters are set to
>  *spaces.
>  **/
> -static void sanitize_inquiry_string(unsigned char *s, int len)
> +void scsi_sanitize_inquiry_string(unsigned char *s, int len)
> {
>   int terminated = 0;
> 
> @@ -542,6 +543,7 @@ static void sanitize_inquiry_string(unsigned char *s, int 
> len)
>   *s = ' ';
>   }
> }
> +EXPORT_SYMBOL(scsi_sanitize_inquiry_string);
> 
> /**
>  * scsi_probe_lun - probe a single LUN using a SCSI INQUIRY
> @@ -627,9 +629,9 @@ static int scsi_probe_lun(struct scsi_device *sdev, 
> unsigned char *inq_result,
>   }
> 
>   if (result == 0) {
> - sanitize_inquiry_string(_result[8], 8);
> - sanitize_inquiry_string(_result[16], 16);
> - sanitize_inquiry_string(_result[32], 4);
> + scsi_sanitize_inquiry_string(_result[8], 8);
> + scsi_sanitize_inquiry_string(_result[16], 16);
> + scsi_sanitize_inquiry_string(_result[32], 4);
> 
>   response_len = inq_result[4] + 5;
>   if (response_len > 255)
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] ipr: fix out-of-bounds null overwrite

2016-01-06 Thread Matthew R. Ochs
On Jan 6, 2016, at 7:53 AM, Insu Yun <wuni...@gmail.com> wrote:
> 
> Return value of snprintf is not bound by size value, 2nd argument.
> (https://www.kernel.org/doc/htmldocs/kernel-api/API-snprintf.html).
> Return value is number of printed chars, can be larger than 2nd argument.
> Therefore, it can write null byte out of bounds ofbuffer.
> Since snprintf puts null, it does not need to put additional null byte.
> 
> Signed-off-by: Insu Yun <wuni...@gmail.com>
> ---
> drivers/scsi/ipr.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
> index 536cd5a..caf63f9 100644
> --- a/drivers/scsi/ipr.c
> +++ b/drivers/scsi/ipr.c
> @@ -4003,13 +4003,12 @@ static ssize_t ipr_store_update_fw(struct device *dev,
>   struct ipr_sglist *sglist;
>   char fname[100];
>   char *src;
> - int len, result, dnld_size;
> + int result, dnld_size;
> 
>   if (!capable(CAP_SYS_ADMIN))
>   return -EACCES;
> 
> - len = snprintf(fname, 99, "%s", buf);
> - fname[len-1] = '\0';
> + snprintf(fname, 99, "%s", buf);

Changing the 99 to sizeof(fname) as part of this change would also
make for cleaner code in my opinion. Will leave it up to you if you'd
like to do a v3.

Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com> 
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/3] hpsa: add box and bay information for enclosure devices

2015-12-21 Thread Matthew R. Ochs
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/3] hpsa: fix path_info_show

2015-12-21 Thread Matthew R. Ochs
> On Dec 21, 2015, at 11:21 AM, Don Brace <don.br...@pmcs.com> wrote:
> 
> left off some changes from Rasmus Villemoes where he changed
> snprintf to scnprintf
> 
> Suggested-by: Rasmus Villemoes <li...@rasmusvillemoes.dk>
> Reviewed-by: Justin Lindley <justin.lind...@pmcs.com>
> Reviewed-by: Kevin Barnett <kevin.barn...@pmcs.com>
> Reviewed-by: Scott Teel <scott.t...@pmcs.com>
> Reviewed-by: Rasmus Villemoes <li...@rasmusvillemoes.dk>
> Reviewed-by: Hannes Reinecke <h...@suse.com>
> Signed-off-by: Don Brace <don.br...@pmcs.com>

FYI, I did review and stamp this one back on 12/9.

Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] hpsa: change SAS transport devices to bus 0.

2015-12-18 Thread Matthew R. Ochs
> On Dec 15, 2015, at 9:25 AM, Don Brace <brace77...@gmail.com> wrote:
> 
> On 12/09/2015 05:21 PM, Matthew R. Ochs wrote:
>>> On Dec 9, 2015, at 3:18 PM, Don Brace <don.br...@pmcs.com> wrote:
>>> 
>>> sas transport places devices on bus 0 but driver was setting
>>> the bus to 3.
>>> 
>>> Reviewed-by: Justin Lindley <justin.lind...@pmcs.com>
>>> Reviewed-by: Kevin Barnett <kevin.barn...@pmcs.com>
>>> Reviewed-by: Scott Teel <scott.t...@pmcs.com>
>>> Signed-off-by: Don Brace <don.br...@pmcs.com>
>>> ---
>>> drivers/scsi/hpsa.h |2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>> 
>>> diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
>>> index ae5beda..fdd39fc 100644
>>> --- a/drivers/scsi/hpsa.h
>>> +++ b/drivers/scsi/hpsa.h
>>> @@ -400,7 +400,7 @@ struct offline_device_entry {
>>> #define HPSA_PHYSICAL_DEVICE_BUS0
>>> #define HPSA_RAID_VOLUME_BUS1
>>> #define HPSA_EXTERNAL_RAID_VOLUME_BUS   2
>>> -#define HPSA_HBA_BUS           3
>>> +#define HPSA_HBA_BUS   0
>> Is this not the same as using HPSA_PHYSICAL_DEVICE_BUS?
> I was just trying to minimize the changes. I can update the driver
> is necessary.

Understandable.

Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] hpsa: add box and bay information for enclosure devices

2015-12-18 Thread Matthew R. Ochs
Hi Don,

Did you see these comments I had from my review of this patch?


-matt

> On Dec 9, 2015, at 5:41 PM, Matthew R. Ochs <mro...@linux.vnet.ibm.com> wrote:
> 
>> On Dec 9, 2015, at 3:18 PM, Don Brace <don.br...@pmcs.com> wrote:
> 
> No commit message?

You can disregard this one as I saw you added a message in v1.

> 
>> 
>> Reviewed-by: Justin Lindley <justin.lind...@pmcs.com>
>> Reviewed-by: Kevin Barnett <kevin.barn...@pmcs.com>
>> Reviewed-by: Scott Teel <scott.t...@pmcs.com>
>> Signed-off-by: Don Brace <don.br...@pmcs.com>
>> ---
>> drivers/scsi/hpsa.c |  108 
>> ---
>> drivers/scsi/hpsa_cmd.h |   13 ++
>> 2 files changed, 114 insertions(+), 7 deletions(-)
>> 
>> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
>> index f8370b8..6f84ec7 100644
>> --- a/drivers/scsi/hpsa.c
>> +++ b/drivers/scsi/hpsa.c
>> @@ -750,7 +750,6 @@ static ssize_t 
>> host_show_hp_ssd_smart_path_enabled(struct device *dev,
>> }
>> 
>> #define MAX_PATHS 8
>> -
>> static ssize_t path_info_show(struct device *dev,
>>   struct device_attribute *attr, char *buf)
>> {
>> @@ -792,9 +791,7 @@ static ssize_t path_info_show(struct device *dev,
>>  hdev->bus, hdev->target, hdev->lun,
>>  scsi_device_type(hdev->devtype));
>> 
>> -if (hdev->external ||
>> -hdev->devtype == TYPE_RAID ||
>> -is_logical_device(hdev)) {
>> +if (hdev->devtype == TYPE_RAID || is_logical_device(hdev)) {
> 
> How does removing the check for external here relate to the rest of this 
> commit?
> 
>>  output_len += scnprintf(buf + output_len,
>>  PAGE_SIZE - output_len,
>>  "%s\n", active);
>> @@ -808,8 +805,7 @@ static ssize_t path_info_show(struct device *dev,
>>  phys_connector[0] = '0';
>>  if (phys_connector[1] < '0')
>>  phys_connector[1] = '0';
>> -if (hdev->phys_connector[i] > 0)
>> -output_len += scnprintf(buf + output_len,
>> +output_len += scnprintf(buf + output_len,
> 
> Same question regarding the removal of the > 0 check.
> 
>>  PAGE_SIZE - output_len,
>>  "PORT: %.2s ",
>>  phys_connector);
>> @@ -3191,6 +3187,90 @@ out:
>>  return rc;
>> }
>> 
>> +/*
>> + * get enclosure information
>> + * struct ReportExtendedLUNdata *rlep - Used for BMIC drive number
>> + * struct hpsa_scsi_dev_t *encl_dev - device entry for enclosure
>> + * Uses id_physical_device to determine the box_index.
>> + */
>> +static int hpsa_get_enclosure_info(struct ctlr_info *h,
>> +unsigned char *scsi3addr,
>> +struct ReportExtendedLUNdata *rlep, int rle_index,
>> +struct hpsa_scsi_dev_t *encl_dev)
>> +{
>> +int rc = -1;
>> +struct CommandList *c = NULL;
>> +struct ErrorInfo *ei = NULL;
>> +struct bmic_sense_storage_box_params *bssbp = NULL;
>> +struct bmic_identify_physical_device *id_phys = NULL;
>> +struct ext_report_lun_entry *rle = >LUN[rle_index];
>> +u16 bmic_device_index = 0;
>> +
>> +
>> +bssbp = kzalloc(sizeof(*bssbp), GFP_KERNEL);
>> +if (!bssbp)
>> +goto out;
>> +
>> +id_phys = kzalloc(sizeof(*id_phys), GFP_KERNEL);
>> +if (!id_phys)
>> +goto out;
>> +
>> +bmic_device_index = GET_BMIC_DRIVE_NUMBER(>lunid[0]);
>> +
>> +if (bmic_device_index == 0xFF00)
>> +goto out;
> 
> Why not put this check before the memory allocations so you can
> avoid them if this condition is not met?
> 
>> +
>> +memset(id_phys, 0, sizeof(*id_phys));
> 
> Didn't you just obtain zeroed memory from kzalloc()?
> 
>> +rc = hpsa_bmic_id_physical_device(h, scsi3addr, bmic_device_index,
>> +id_phys, sizeof(*id_phys));
>> +if (rc) {
>> +dev_warn(>pdev->dev, "%s: id_phys failed %d bdi[0x%x]\n",
>> +__func__, encl_dev->external, bmic_device_index);
>> +goto out;
>> +}
>&

Re: [PATCH 16/17] lpfc: Use kzalloc instead of kmalloc

2015-12-16 Thread Matthew R. Ochs
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/6] cxlflash: Removed driver date print

2015-12-14 Thread Matthew R. Ochs
Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/6] cxlflash: Fix to resolve cmd leak after host reset

2015-12-14 Thread Matthew R. Ochs
Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/6] cxlflash: Resolve oops in wait_port_offline

2015-12-14 Thread Matthew R. Ochs
Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/6] cxlflash: Enable device id for future IBM CXL adapter

2015-12-14 Thread Matthew R. Ochs
Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/6] cxlflash: Fix to escalate LINK_RESET also on port 1

2015-12-14 Thread Matthew R. Ochs
Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] hpsa: fix path_info_show

2015-12-09 Thread Matthew R. Ochs
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] hpsa: change SAS transport devices to bus 0.

2015-12-09 Thread Matthew R. Ochs
> On Dec 9, 2015, at 3:18 PM, Don Brace  wrote:
> 
> sas transport places devices on bus 0 but driver was setting
> the bus to 3.
> 
> Reviewed-by: Justin Lindley 
> Reviewed-by: Kevin Barnett 
> Reviewed-by: Scott Teel 
> Signed-off-by: Don Brace 
> ---
> drivers/scsi/hpsa.h |2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
> index ae5beda..fdd39fc 100644
> --- a/drivers/scsi/hpsa.h
> +++ b/drivers/scsi/hpsa.h
> @@ -400,7 +400,7 @@ struct offline_device_entry {
> #define HPSA_PHYSICAL_DEVICE_BUS  0
> #define HPSA_RAID_VOLUME_BUS  1
> #define HPSA_EXTERNAL_RAID_VOLUME_BUS 2
> -#define HPSA_HBA_BUS 3
> +#define HPSA_HBA_BUS 0

Is this not the same as using HPSA_PHYSICAL_DEVICE_BUS?

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] hpsa: add box and bay information for enclosure devices

2015-12-09 Thread Matthew R. Ochs
> On Dec 9, 2015, at 3:18 PM, Don Brace  wrote:

No commit message?

> 
> Reviewed-by: Justin Lindley 
> Reviewed-by: Kevin Barnett 
> Reviewed-by: Scott Teel 
> Signed-off-by: Don Brace 
> ---
> drivers/scsi/hpsa.c |  108 ---
> drivers/scsi/hpsa_cmd.h |   13 ++
> 2 files changed, 114 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
> index f8370b8..6f84ec7 100644
> --- a/drivers/scsi/hpsa.c
> +++ b/drivers/scsi/hpsa.c
> @@ -750,7 +750,6 @@ static ssize_t host_show_hp_ssd_smart_path_enabled(struct 
> device *dev,
> }
> 
> #define MAX_PATHS 8
> -
> static ssize_t path_info_show(struct device *dev,
>struct device_attribute *attr, char *buf)
> {
> @@ -792,9 +791,7 @@ static ssize_t path_info_show(struct device *dev,
>   hdev->bus, hdev->target, hdev->lun,
>   scsi_device_type(hdev->devtype));
> 
> - if (hdev->external ||
> - hdev->devtype == TYPE_RAID ||
> - is_logical_device(hdev)) {
> + if (hdev->devtype == TYPE_RAID || is_logical_device(hdev)) {

How does removing the check for external here relate to the rest of this commit?

>   output_len += scnprintf(buf + output_len,
>   PAGE_SIZE - output_len,
>   "%s\n", active);
> @@ -808,8 +805,7 @@ static ssize_t path_info_show(struct device *dev,
>   phys_connector[0] = '0';
>   if (phys_connector[1] < '0')
>   phys_connector[1] = '0';
> - if (hdev->phys_connector[i] > 0)
> - output_len += scnprintf(buf + output_len,
> + output_len += scnprintf(buf + output_len,

Same question regarding the removal of the > 0 check.

>   PAGE_SIZE - output_len,
>   "PORT: %.2s ",
>   phys_connector);
> @@ -3191,6 +3187,90 @@ out:
>   return rc;
> }
> 
> +/*
> + * get enclosure information
> + * struct ReportExtendedLUNdata *rlep - Used for BMIC drive number
> + * struct hpsa_scsi_dev_t *encl_dev - device entry for enclosure
> + * Uses id_physical_device to determine the box_index.
> + */
> +static int hpsa_get_enclosure_info(struct ctlr_info *h,
> + unsigned char *scsi3addr,
> + struct ReportExtendedLUNdata *rlep, int rle_index,
> + struct hpsa_scsi_dev_t *encl_dev)
> +{
> + int rc = -1;
> + struct CommandList *c = NULL;
> + struct ErrorInfo *ei = NULL;
> + struct bmic_sense_storage_box_params *bssbp = NULL;
> + struct bmic_identify_physical_device *id_phys = NULL;
> + struct ext_report_lun_entry *rle = >LUN[rle_index];
> + u16 bmic_device_index = 0;
> +
> +
> + bssbp = kzalloc(sizeof(*bssbp), GFP_KERNEL);
> + if (!bssbp)
> + goto out;
> +
> + id_phys = kzalloc(sizeof(*id_phys), GFP_KERNEL);
> + if (!id_phys)
> + goto out;
> +
> + bmic_device_index = GET_BMIC_DRIVE_NUMBER(>lunid[0]);
> +
> + if (bmic_device_index == 0xFF00)
> + goto out;

Why not put this check before the memory allocations so you can
avoid them if this condition is not met?

> +
> + memset(id_phys, 0, sizeof(*id_phys));

Didn't you just obtain zeroed memory from kzalloc()?

> + rc = hpsa_bmic_id_physical_device(h, scsi3addr, bmic_device_index,
> + id_phys, sizeof(*id_phys));
> + if (rc) {
> + dev_warn(>pdev->dev, "%s: id_phys failed %d bdi[0x%x]\n",
> + __func__, encl_dev->external, bmic_device_index);
> + goto out;
> + }
> +
> + c = cmd_alloc(h);
> +
> + rc = fill_cmd(c, BMIC_SENSE_STORAGE_BOX_PARAMS, h, bssbp,
> + sizeof(*bssbp), 0, RAID_CTLR_LUNID, TYPE_CMD);
> +
> + if (rc)
> + goto out;
> +
> + if (id_phys->phys_connector[1] == 'E')
> + c->Request.CDB[5] = id_phys->box_index;
> + else
> + c->Request.CDB[5] = 0;
> +
> + rc = hpsa_scsi_do_simple_cmd_with_retry(h, c, PCI_DMA_FROMDEVICE,
> + NO_TIMEOUT);
> + if (rc)
> + goto out;
> +
> + ei = c->err_info;
> + if (ei->CommandStatus != 0 && ei->CommandStatus != CMD_DATA_UNDERRUN) {
> + rc = -1;
> + goto out;
> + }
> +
> + encl_dev->box[id_phys->active_path_number] = bssbp->phys_box_on_port;
> + memcpy(_dev->phys_connector[id_phys->active_path_number],
> + bssbp->phys_connector, sizeof(bssbp->phys_connector));
> +
> + rc = IO_OK;
> +out:
> + kfree(bssbp);
> + kfree(id_phys);
> +
> + if (c)
> 

[PATCH RESEND] cxlflash: drop unlikely before IS_ERR_OR_NULL

2015-12-03 Thread Matthew R. Ochs
From: Geliang Tang <geliangt...@163.com>

IS_ERR_OR_NULL already contain an unlikely compiler flag. Drop it.

Signed-off-by: Geliang Tang <geliangt...@163.com>
Acked-by: Manoj Kumar <ma...@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
---
This patch was originally sent by Geliang Tang in October 2015. As a
valid simplification, I'd like to see this make it into 'next'. I've
gone ahead and performed the rebase so that it applies cleanly.

 drivers/scsi/cxlflash/superpipe.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/cxlflash/superpipe.c 
b/drivers/scsi/cxlflash/superpipe.c
index 34b21a0..f4020db 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -1372,7 +1372,7 @@ static int cxlflash_disk_attach(struct scsi_device *sdev,
}
 
ctx = cxl_dev_context_init(cfg->dev);
-   if (unlikely(IS_ERR_OR_NULL(ctx))) {
+   if (IS_ERR_OR_NULL(ctx)) {
dev_err(dev, "%s: Could not initialize context %p\n",
__func__, ctx);
rc = -ENODEV;
@@ -1500,7 +1500,7 @@ static int recover_context(struct cxlflash_cfg *cfg, 
struct ctx_info *ctxi)
struct afu *afu = cfg->afu;
 
ctx = cxl_dev_context_init(cfg->dev);
-   if (unlikely(IS_ERR_OR_NULL(ctx))) {
+   if (IS_ERR_OR_NULL(ctx)) {
dev_err(dev, "%s: Could not initialize context %p\n",
__func__, ctx);
rc = -ENODEV;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RESEND] cxlflash: a couple off by one bugs

2015-12-02 Thread Matthew R. Ochs
From: Dan Carpenter <dan.carpen...@oracle.com>

The "> MAX_CONTEXT" should be ">= MAX_CONTEXT".  Otherwise we go one
step beyond the end of the cfg->ctx_tbl[] array.

Signed-off-by: Dan Carpenter <dan.carpen...@oracle.com>
Reviewed-by: Manoj Kumar <ma...@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
---
This patch was originally sent by Dan Carpenter in September 2015. I had
based my large patch series that went into 4.4 off of it but this patch
appears to have not made it in. As a valid fix, I'd like to see this make
it into 'next'. I've gone ahead and performed the rebase so that it
applies cleanly.

 drivers/scsi/cxlflash/superpipe.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/cxlflash/superpipe.c 
b/drivers/scsi/cxlflash/superpipe.c
index cac2e6a..34b21a0 100644
--- a/drivers/scsi/cxlflash/superpipe.c
+++ b/drivers/scsi/cxlflash/superpipe.c
@@ -1380,7 +1380,7 @@ static int cxlflash_disk_attach(struct scsi_device *sdev,
}
 
ctxid = cxl_process_element(ctx);
-   if (unlikely((ctxid > MAX_CONTEXT) || (ctxid < 0))) {
+   if (unlikely((ctxid >= MAX_CONTEXT) || (ctxid < 0))) {
dev_err(dev, "%s: ctxid (%d) invalid!\n", __func__, ctxid);
rc = -EPERM;
goto err2;
@@ -1508,7 +1508,7 @@ static int recover_context(struct cxlflash_cfg *cfg, 
struct ctx_info *ctxi)
}
 
ctxid = cxl_process_element(ctx);
-   if (unlikely((ctxid > MAX_CONTEXT) || (ctxid < 0))) {
+   if (unlikely((ctxid >= MAX_CONTEXT) || (ctxid < 0))) {
dev_err(dev, "%s: ctxid (%d) invalid!\n", __func__, ctxid);
rc = -EPERM;
goto err1;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 18/27] hpsa: refactor hpsa_figure_bus_target_lun

2015-11-05 Thread Matthew R. Ochs
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 20/27] hpsa: generalize external arrays

2015-11-05 Thread Matthew R. Ochs
  continue;
>   }
> 
> + /* Determine if this is a lun from an external target array */
> + tmpdevice->external =
> + figure_external_status(h, raid_ctlr_position, i,
> + nphysicals, nlocal_logicals);
> +
>   figure_bus_target_lun(h, lunaddrbytes, tmpdevice);
>   hpsa_update_device_supports_aborts(h, tmpdevice, lunaddrbytes);
>   this_device = currentsd[ncurrent];
> @@ -3962,6 +4031,7 @@ out:
>   kfree(currentsd);
>   kfree(physdev_list);
>   kfree(logdev_list);
> + kfree(id_ctlr);
>   kfree(id_phys);
> }
> 
> @@ -6414,6 +6484,23 @@ static int fill_cmd(struct CommandList *c, u8 cmd, 
> struct ctlr_info *h,
>   c->Request.CDB[7] = (size >> 16) & 0xFF;
>   c->Request.CDB[8] = (size >> 8) & 0XFF;
>   break;
> + case BMIC_IDENTIFY_CONTROLLER:
> + c->Request.CDBLen = 10;
> + c->Request.type_attr_dir =
> + TYPE_ATTR_DIR(cmd_type, ATTR_SIMPLE, XFER_READ);
> + c->Request.Timeout = 0;
> + c->Request.CDB[0] = BMIC_READ;
> + c->Request.CDB[1] = 0;
> + c->Request.CDB[2] = 0;
> + c->Request.CDB[3] = 0;
> + c->Request.CDB[4] = 0;
> + c->Request.CDB[5] = 0;
> + c->Request.CDB[6] = BMIC_IDENTIFY_CONTROLLER;
> + c->Request.CDB[7] = (size >> 16) & 0xFF;
> + c->Request.CDB[8] = (size >> 8) & 0XFF;
> + c->Request.CDB[9] = 0;
> + break;
> +

For consistency with the other fill_cmd() cases these unnecessary zeroing 
statements
should be removed, either now or at some point in the future.

Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

>   default:
>   dev_warn(>pdev->dev, "unknown command 0x%c\n", cmd);
>   BUG();
> diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
> index a4cab12..ffcd4cb 100644
> --- a/drivers/scsi/hpsa.h
> +++ b/drivers/scsi/hpsa.h
> @@ -77,6 +77,7 @@ struct hpsa_scsi_dev_t {
>   struct hpsa_scsi_dev_t *phys_disk[RAID_MAP_MAX_ENTRIES];
>   int nphysical_disks;
>   int supports_aborts;
> + int external;   /* 1-from external array 0-not <0-unknown */
> };
> 
> struct reply_queue_buffer {
> diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
> index c2c0737..c83eaf1 100644
> --- a/drivers/scsi/hpsa_cmd.h
> +++ b/drivers/scsi/hpsa_cmd.h
> @@ -286,6 +286,7 @@ struct SenseSubsystem_info {
> #define BMIC_FLASH_FIRMWARE 0xF7
> #define BMIC_SENSE_CONTROLLER_PARAMETERS 0x64
> #define BMIC_IDENTIFY_PHYSICAL_DEVICE 0x15
> +#define BMIC_IDENTIFY_CONTROLLER 0x11
> 
> /* Command List Structure */
> union SCSI3Addr {
> @@ -682,6 +683,16 @@ struct hpsa_pci_info {
>   u32 board_id;
> };
> 
> +struct bmic_identify_controller {
> + u8  configured_logical_drive_count; /* offset 0 */
> + u8  pad1[153];
> + __le16  extended_logical_unit_count;/* offset 154 */
> + u8  pad2[136];
> + u8  controller_mode;/* offset 292 */
> + u8  pad3[32];
> +};
> +
> +
> struct bmic_identify_physical_device {
>   u8scsi_bus;  /* SCSI Bus number on controller */
>   u8scsi_id;   /* SCSI ID on this bus */
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] st: trivial: remove form feed characters

2015-11-04 Thread Matthew R. Ochs
Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1 24/25] hpsa: add in sas transport class

2015-10-30 Thread Matthew R. Ochs
> On Oct 30, 2015, at 5:00 PM, Don Brace <brace77...@gmail.com> wrote:
> On 10/30/2015 03:07 PM, Matthew R. Ochs wrote:
>>> On Oct 28, 2015, at 5:06 PM, Don Brace <don.br...@pmcs.com> wrote:
>>> 
>>> From: Kevin Barnett <kevin.barn...@pmcs.com>
>>> 
>>> Reviewed-by: Scott Teel <scott.t...@pmcs.com>
>>> Reviewed-by: Justin Lindley <justin.lind...@pmcs.com>
>>> Reviewed-by: Kevin Barnett <kevin.barn...@pmcs.com>
>>> Signed-off-by: Don Brace <don.br...@pmcs.com>
>>> ---
>>> drivers/scsi/hpsa.c |  535 
>>> +--
>>> drivers/scsi/hpsa.h |   27 ++
>>> drivers/scsi/hpsa_cmd.h |   14 +
>>> 3 files changed, 555 insertions(+), 21 deletions(-)
>>> 
>>> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
>>> index 56526312..ca38a00 100644
>>> --- a/drivers/scsi/hpsa.c
>>> +++ b/drivers/scsi/hpsa.c
>>> @@ -41,6 +41,7 @@
>>> 
>>> +static u64 hpsa_get_sas_address_from_report_physical(struct ctlr_info *h,
>>> +   unsigned char *scsi3addr)
>>> +{
>>> +   struct ReportExtendedLUNdata *physdev;
>>> +   u32 nphysicals;
>>> +   u64 sa = 0;
>>> +   int i;
>>> +
>>> +   physdev = kzalloc(sizeof(*physdev), GFP_KERNEL);
>>> +   if (!physdev)
>>> +   return 0;
>>> +
>>> +   if (hpsa_scsi_do_report_phys_luns(h, physdev, sizeof(*physdev))) {
>>> +   dev_err(>pdev->dev, "report physical LUNs failed.\n");
>>> +   kfree(physdev);
>>> +   return 0;
>>> +   }
>>> +   nphysicals = get_unaligned_be32(physdev->LUNListLength) / 24;
>>> +
>>> +   for (i = 0; i < nphysicals; i++)
>>> +   if (!memcmp(>LUN[i].lunid[0], scsi3addr, 8))
>>> +   sa = get_unaligned_be64(>LUN[i].wwid[0]);
>> Don't you want to break out here if you found a match?
> Agreed.
>> 
>>> +
>>> +   kfree(physdev);
>>> +
>>> +   return sa;
>>> +}
>>> +
>>> +static void hpsa_get_sas_address(struct ctlr_info *h, unsigned char 
>>> *scsi3addr,
>>> +   struct hpsa_scsi_dev_t *dev)
>>> +{
>>> +   int rc;
>>> +   u64 sa = 0;
>>> +
>>> +   if (is_hba_lunid(scsi3addr)) {
>>> +   struct bmic_sense_subsystem_info *ssi;
>>> +
>>> +   ssi = kzalloc(sizeof(*ssi), GFP_KERNEL);
>> What happens if this allocation fails? If the I/O can succeed without the
>> DMA buffer then you will run into trouble when deriving sa.
> Added check for NULL.

With these changes

Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1 17/25] hpsa: move scsi_add_device and scsi_remove_device calls to new function

2015-10-30 Thread Matthew R. Ochs
> On Oct 29, 2015, at 3:30 PM, Don Brace <brace77...@gmail.com> wrote:
> 
> On 10/29/2015 12:21 PM, Matthew R. Ochs wrote:
>>> On Oct 28, 2015, at 5:06 PM, Don Brace <don.br...@pmcs.com> wrote:
>>> 
>>> From: Kevin Barnett <kevin.barn...@pmcs.com>
>>> 
>>> preparation for adding the sas transport class
>>> 
>>> Reviewed-by: Scott Teel <scott.t...@pmcs.com>
>>> Reviewed-by: Justin Lindley <justin.lind...@pmcs.com>
>>> Reviewed-by: Kevin Barnett <kevin.barn...@pmcs.com>
>>> Signed-off-by: Don Brace <don.br...@pmcs.com>
>>> ---
>>> drivers/scsi/hpsa.c |   65 
>>> +++
>>> 1 file changed, 39 insertions(+), 26 deletions(-)
>>> 
>>> diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
>>> index 24b3c8c..06207e2 100644
>>> --- a/drivers/scsi/hpsa.c
>>> +++ b/drivers/scsi/hpsa.c
>>> @@ -1660,6 +1660,37 @@ static void 
>>> hpsa_update_log_drive_phys_drive_ptrs(struct ctlr_info *h,
>>> }
>>> }
>>> 
>>> +static int hpsa_add_device(struct ctlr_info *h, struct hpsa_scsi_dev_t 
>>> *device)
>>> +{
>>> +   int rc = 0;
>>> +
>>> +   rc = scsi_add_device(h->scsi_host, device->bus,
>>> +   device->target, device->lun);
>>> +   return rc;
>>> +}
>>> +
>>> +static void hpsa_remove_device(struct ctlr_info *h,
>>> +   struct hpsa_scsi_dev_t *device)
>>> +{
>>> +   struct scsi_device *sdev = NULL;
>>> +
>>> +   sdev = scsi_device_lookup(h->scsi_host, device->bus,
>>> +   device->target, device->lun);
>>> +
>>> +   if (sdev) {
>>> +   scsi_remove_device(sdev);
>>> +   scsi_device_put(sdev);
>>> +   } else {
>>> +   /*
>>> +* We don't expect to get here.  Future commands
>>> +* to this device will get a selection timeout as
>>> +* if the device were gone.
>>> +*/
>>> +   hpsa_show_dev_msg(KERN_WARNING, h, device,
>>> +   "didn't find device for removal.");
>>> +   }
>>> +}
>>> +
>>> static void adjust_hpsa_scsi_table(struct ctlr_info *h, int hostno,
>>> struct hpsa_scsi_dev_t *sd[], int nsds)
>>> {
>>> @@ -1672,7 +1703,6 @@ static void adjust_hpsa_scsi_table(struct ctlr_info 
>>> *h, int hostno,
>>> unsigned long flags;
>>> struct hpsa_scsi_dev_t **added, **removed;
>>> int nadded, nremoved;
>>> -   struct Scsi_Host *sh = NULL;
>>> 
>>> /*
>>>  * A reset can cause a device status to change
>>> @@ -1792,46 +1822,29 @@ static void adjust_hpsa_scsi_table(struct ctlr_info 
>>> *h, int hostno,
>>> if (hostno == -1 || !changes)
>>> goto free_and_out;
>>> 
>>> -   sh = h->scsi_host;
>>> -   if (sh == NULL) {
>>> -   dev_warn(>pdev->dev, "%s: scsi_host is null\n", __func__);
>>> -   return;
>>> -   }
>> Are we guaranteed that h->scsi_host will never be NULL when running in here? 
>> Or when
>> the newly introduced hpsa_remove_device() is invoked elsewhere for that 
>> matter?
>> 
>> This commit loses this check and scsi_device_lookup() is not tolerant of a 
>> NULL
>> scsi_host * (first action is to grab the host_lock).
> Better to be safe. I'll add a check in both functions.

With these changes

Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1 21/25] hpsa: disable report lun data caching

2015-10-30 Thread Matthew R. Ochs
D_CACHING;
> +
> + if (fill_cmd(c, BMIC_SET_DIAG_OPTIONS, h, options, 4, 0,
> + RAID_CTLR_LUNID, TYPE_CMD))
> + goto errout;
> +
> + rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
> + PCI_DMA_TODEVICE, NO_TIMEOUT);
> + if ((rc != 0)  || (c->err_info->CommandStatus != 0))
> + goto errout;
> +
> + /* Now verify that it got set: */
> + if (fill_cmd(c, BMIC_SENSE_DIAG_OPTIONS, h, options, 4, 0,
> + RAID_CTLR_LUNID, TYPE_CMD))
> + goto errout;
> +
> +     rc = hpsa_scsi_do_simple_cmd_with_retry(h, c,
> + PCI_DMA_FROMDEVICE, NO_TIMEOUT);
> + if ((rc != 0)  || (c->err_info->CommandStatus != 0))
> + goto errout;
> +
> + if (*options && HPSA_DIAG_OPTS_DISABLE_RLD_CACHING)
> + goto out;
> + else {
> +errout:
> + dev_err(>pdev->dev,
> + "Error: failed to disable report lun data caching.\n");
> + }

I agree with the simplification suggested by Tomas Henzl.

Reviewed-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>

> +out:
> + cmd_free(h, c);
> + kfree(options);
> +}
> +
> static void hpsa_shutdown(struct pci_dev *pdev)
> {
>   struct ctlr_info *h;
> diff --git a/drivers/scsi/hpsa_cmd.h b/drivers/scsi/hpsa_cmd.h
> index c83eaf1..4910344 100644
> --- a/drivers/scsi/hpsa_cmd.h
> +++ b/drivers/scsi/hpsa_cmd.h
> @@ -287,6 +287,9 @@ struct SenseSubsystem_info {
> #define BMIC_SENSE_CONTROLLER_PARAMETERS 0x64
> #define BMIC_IDENTIFY_PHYSICAL_DEVICE 0x15
> #define BMIC_IDENTIFY_CONTROLLER 0x11
> +#define BMIC_SET_DIAG_OPTIONS 0xF4
> +#define BMIC_SENSE_DIAG_OPTIONS 0xF5
> +#define HPSA_DIAG_OPTS_DISABLE_RLD_CACHING 0x4000
> 
> /* Command List Structure */
> union SCSI3Addr {
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   3   4   5   >