Re: [GIT PULL] target fixes for v4.11-rc3

2017-03-18 Thread Nicholas A. Bellinger
On Sat, 2017-03-18 at 19:08 -0700, Nicholas A. Bellinger wrote:
> Hello Linus,
> 
> Here are the target-pending fixes for v4.11-rc3 code.  Please go ahead
> and pull from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git master
> 
> The bulk of the changes are in qla2xxx target driver code to address
> various issues found during Cavium/QLogic's internal testing (stable
> CC's included), along with a few other stability and smaller
> miscellaneous improvements.

Wrt the other qla2xxx improvements in this PULL request..

They were originally intended for v4.11-rc1, but missed the merge
window due to a regression reported on the list.

Since then, they have gone through 4 revisions over the last 7 weeks,
and Cavium/QLogic folks have addressed the outstanding regression +
feedback, and this series has passed their internal Q/A testing.

So while I acknowledge they aren't all strictly bug-fixes (5 of them are
CC'ed for stable), all are self contained within qla2xxx and I do trust
the Cavium/QLogic team's sign-off for the complete series.

Please consider pulling the series.



[GIT PULL] target fixes for v4.11-rc3

2017-03-18 Thread Nicholas A. Bellinger
Hello Linus,

Here are the target-pending fixes for v4.11-rc3 code.  Please go ahead
and pull from:

  git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git master

The bulk of the changes are in qla2xxx target driver code to address
various issues found during Cavium/QLogic's internal testing (stable
CC's included), along with a few other stability and smaller
miscellaneous improvements.

There are also a couple of different patch sets from Mike Christie,
which have been a result of his work to use target-core ALUA logic
together with tcm-user backend driver.

Finally, a patch to address some long standing issues with pass-through
SCSI export of TYPE_TAPE + TYPE_MEDIUM_CHANGER devices, which will make
folks using physical (or virtual) magnetic tape happy.

Thank you,

--nab

Anil Gurumurthy (1):
  qla2xxx: Export DIF stats via debugfs

Himanshu Madhani (2):
  qla2xxx: Add DebugFS node to display Port Database
  qla2xxx: Update driver version to 9.00.00.00-k

Joe Carnuccio (1):
  qla2xxx: Allow vref count to timeout on vport delete.

Max Lohrmann (1):
  target: Fix VERIFY_16 handling in sbc_parse_cdb

Mike Christie (10):
  tcmu: allow hw_max_sectors greater than 128
  tcmu: return on first Opt parse failure
  target: allow ALUA setup for some passthrough backends
  target: fail ALUA transitions for pscsi
  target: Use system workqueue for ALUA transitions
  target: fix ALUA transition timeout handling
  target: allow userspace to set state to transitioning
  target: fix race during implicit transition work flushes
  tcmu: add helper to check if dev was configured
  tcmu: make cmd timeout configurable

Nicholas Bellinger (3):
  target/pscsi: Fix TYPE_TAPE + TYPE_MEDIMUM_CHANGER export
  target: Drop pointless tfo->check_stop_free check
  tcmu: Convert cmd_time_out into backend device attribute

Quinn Tran (10):
  qla2xxx: Fix memory leak for abts processing
  qla2xxx: Fix request queue corruption.
  qla2xxx: Fix inadequate lock protection for ABTS.
  qla2xxx: Fix sess_lock & hardware_lock lock order problem.
  qla2xxx: Allow relogin to proceed if remote login did not finish
  qla2xxx: Improve T10-DIF/PI handling in driver.
  qla2xxx: Add async new target notification
  qla2xxx: Use IOCB interface to submit non-critical MBX.
  qla2xxx: Change scsi host lookup method.
  qla2xxx: Fix delayed response to command for loop mode/direct connect.

 drivers/scsi/qla2xxx/Kconfig           |   1 +
 drivers/scsi/qla2xxx/qla_attr.c        |   4 +-
 drivers/scsi/qla2xxx/qla_dbg.h         |   1 +
 drivers/scsi/qla2xxx/qla_def.h         |  56 ++-
 drivers/scsi/qla2xxx/qla_dfs.c         | 107 -
 drivers/scsi/qla2xxx/qla_gbl.h         |  18 +-
 drivers/scsi/qla2xxx/qla_init.c        |  85 ++--
 drivers/scsi/qla2xxx/qla_iocb.c        |  13 +-
 drivers/scsi/qla2xxx/qla_isr.c         |  41 +-
 drivers/scsi/qla2xxx/qla_mbx.c         | 304 --
 drivers/scsi/qla2xxx/qla_mid.c         |  14 +-
 drivers/scsi/qla2xxx/qla_os.c          |  23 +-
 drivers/scsi/qla2xxx/qla_target.c      | 748 +
 drivers/scsi/qla2xxx/qla_target.h      |  39 +-
 drivers/scsi/qla2xxx/qla_version.h     |   6 +-
 drivers/scsi/qla2xxx/tcm_qla2xxx.c     |  49 ++-
 drivers/target/target_core_alua.c      |  82 ++--
 drivers/target/target_core_configfs.c  |   4 +
 drivers/target/target_core_pscsi.c     |  50 +--
 drivers/target/target_core_sbc.c       |  10 +-
 drivers/target/target_core_tpg.c       |   3 +-
 drivers/target/target_core_transport.c |   3 +-
 drivers/target/target_core_user.c      | 152 +--
 include/target/target_core_backend.h   |   7 +-
 include/target/target_core_base.h      |   2 +-
 25 files changed, 1274 insertions(+), 548 deletions(-)



Re: [PATCH v4 00/14] qla2xxx: Bug Fixes and updates for target.

2017-03-18 Thread Nicholas A. Bellinger
On Wed, 2017-03-15 at 09:48 -0700, Himanshu Madhani wrote:
> Hi Nic,
> 
> Please consider this series for inclusion in target-pending.
> 
> This series contains following changes.
> 
> o Fix for the deadlock because of inconsistent lock usage reported by Bart.
> o Added patch to submit non-critical MBX command via IOCB path.
> o Improved T10-DIF/PI handling with target stack.
> o Changed scsi host lookup method.
> o Some minor bug fixes.
> 
> Changes from v3 --> v4
> 
> o Fixed regression reported by Bart in following patch.
> "qla2xxx: Fix delayed response to command for loop mode/direct connect."
> 
> Note: Patch order has been changed as well from last series. This is to
> isolate bug-fixes in front of improvements to narrow down offending patch.
> Let me know if you would prefer me to submit series with same patch order
> as v3.

Thanks for breaking these up to put bug-fixes with stable CC's at the
head of series, followed by the other bug-fixes + improvements.

As-is, I think it's OK to merge the full series for v4.11-rc.

That said, merged to target-pending/master.

Thanks Himanshu & Co.







Re: [PATCH 0/3] Unblock SCSI devices even if the LLD is being unloaded

2017-03-18 Thread Bart Van Assche
On Sat, 2017-03-18 at 07:44 -0500, James Bottomley wrote:
> On Fri, 2017-03-17 at 16:40 +0000, Bart Van Assche wrote:
> > Does your comment mean that you think there is a scenario in which
> > scsi_target_block() or scsi_target_unblock() can be called while the 
> > text area of a SCSI LLD is being released? I have reviewed all the 
> > callers of these functions but I have not found such a scenario.
> > scsi_target_block() and scsi_target_unblock() are either called from 
> > a SCSI transport layer implementation (FC, iSCSI, SRP) or from a SCSI 
> > LLD kernel module (snic_disc). All these kernel modules only call 
> > scsi_target_*block() for resources (rport or SCSI target
> > respectively) that are removed before the code area of
> > these modules is released. This is why I think that calling
> > scsi_target_*block() without increasing the SCSI LLD module reference
> > count is safe.
> 
> The transport code is above the HBA module code and in that code
> unblock could be racing with module removal.  The original premise was
> that once the dev/target/host goes into DEL, nothing can call into
> queuecommand or get a reference to the device, so nothing halts removal
> after that, but you changed that with your code, which is why it's now
> unsafe.

Hello James,

Thank you for having provided more background information about the design
goals of the SCSI code. However, regarding scsi_target_*block(), I think it
is safe even for transport modules to call these functions without obtaining
an additional reference on the SCSI LLD kernel module. The transport
implementation won't attempt to block or unblock a SCSI target anymore once
unloading of the LLD text has started. SCSI LLDs release the attached
transport before unloading of the SCSI LLD kernel module text starts and
SCSI transport modules guarantee that scsi_target_*block() won't be called
anymore after the transport module has been released.
  
If you do not agree with the above please provide a call sequence for an
existing SCSI LLD or transport module that illustrates the race you
described.

Thanks,

Bart.

Re: [PATCH 3/3] Ensure that scsi_target_unblock() examines all devices

2017-03-18 Thread kbuild test robot
Hi Bart,

[auto build test WARNING on scsi/for-next]
[also build test WARNING on v4.11-rc2 next-20170310]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Bart-Van-Assche/Unblock-SCSI-devices-even-if-the-LLD-is-being-unloaded/20170318-214016
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
reproduce: make htmldocs

All warnings (new ones prefixed by >>):

   drivers/scsi/scsi.c:669: warning: No description found for parameter 'sdev'
   drivers/scsi/scsi.c:679: warning: No description found for parameter 'sdev'
>> drivers/scsi/scsi_lib.c:3087: warning: No description found for parameter 'dev'
>> drivers/scsi/scsi_lib.c:3087: warning: No description found for parameter 'data'
   drivers/scsi/constants.c:1: warning: no structured comments found

vim +/dev +3087 drivers/scsi/scsi_lib.c

5d9fb5cc1 Mike Christie   2012-05-17  3071      scsi_internal_device_unblock(sdev, *(enum scsi_device_state *)data);
^1da177e4 Linus Torvalds  2005-04-16  3072  }
^1da177e4 Linus Torvalds  2005-04-16  3073  
ae188a841 Bart Van Assche 2017-03-16  3074  /**
ae188a841 Bart Van Assche 2017-03-16  3075   * target_unblock() - unblock all devices associated with a SCSI target
ae188a841 Bart Van Assche 2017-03-16  3076   *
ae188a841 Bart Van Assche 2017-03-16  3077   * Notes:
ae188a841 Bart Van Assche 2017-03-16  3078   * - Do not use scsi_device_get() nor any of the macros that use this
ae188a841 Bart Van Assche 2017-03-16  3079   *   function from inside scsi_target_block() because otherwise this function
ae188a841 Bart Van Assche 2017-03-16  3080   *   won't have any effect when called while the SCSI LLD is being unloaded.
ae188a841 Bart Van Assche 2017-03-16  3081   * - Do not hold the host lock around the device_unblock() calls because at
ae188a841 Bart Van Assche 2017-03-16  3082   *   least for blk-sq the block layer queue lock is the outer lock and the
ae188a841 Bart Van Assche 2017-03-16  3083   *   SCSI host lock is the inner lock.
ae188a841 Bart Van Assche 2017-03-16  3084   */
^1da177e4 Linus Torvalds  2005-04-16  3085  static int
^1da177e4 Linus Torvalds  2005-04-16  3086  target_unblock(struct device *dev, void *data)
^1da177e4 Linus Torvalds  2005-04-16 @3087  {
^1da177e4 Linus Torvalds  2005-04-16  3088      if (scsi_is_target_device(dev))
ae188a841 Bart Van Assche 2017-03-16  3089          starget_for_all_devices(to_scsi_target(dev), data,
^1da177e4 Linus Torvalds  2005-04-16  3090                                  device_unblock);
^1da177e4 Linus Torvalds  2005-04-16  3091      return 0;
^1da177e4 Linus Torvalds  2005-04-16  3092  }
^1da177e4 Linus Torvalds  2005-04-16  3093  
ae188a841 Bart Van Assche 2017-03-16  3094  /**
ae188a841 Bart Van Assche 2017-03-16  3095   * scsi_target_unblock() - unblock all devices associated with a SCSI target

:: The code at line 3087 was first introduced by commit
:: 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 Linux-2.6.12-rc2

:: TO: Linus Torvalds <torva...@ppc970.osdl.org>
:: CC: Linus Torvalds <torva...@ppc970.osdl.org>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation




Re: [PATCH 2/3] Introduce starget_for_all_devices() and shost_for_all_devices()

2017-03-18 Thread kbuild test robot
Hi Bart,

[auto build test WARNING on scsi/for-next]
[also build test WARNING on v4.11-rc2 next-20170310]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Bart-Van-Assche/Unblock-SCSI-devices-even-if-the-LLD-is-being-unloaded/20170318-214016
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
reproduce: make htmldocs

All warnings (new ones prefixed by >>):

>> drivers/scsi/scsi.c:669: warning: No description found for parameter 'sdev'
   drivers/scsi/scsi.c:679: warning: No description found for parameter 'sdev'
   drivers/scsi/constants.c:1: warning: no structured comments found

vim +/sdev +669 drivers/scsi/scsi.c

   653  struct scsi_device *sdev;
   654  
   655  shost_for_each_device(sdev, shost) {
   656  if ((sdev->channel == starget->channel) &&
   657  (sdev->id == starget->id))
   658  fn(sdev, data);
   659  }
   660  }
   661  EXPORT_SYMBOL(starget_for_each_device);
   662  
   663  /**
   664   * scsi_device_get_any() - get a reference to @sdev even if it is being deleted
   665   *
   666   * See also scsi_device_get().
   667   */
   668  static int scsi_device_get_any(struct scsi_device *sdev)
 > 669  {
   670  return get_device(&sdev->sdev_gendev) ? 0 : -ENXIO;
   671  }
   672  
   673  /**
   674   * scsi_device_put_any() - drop a reference obtained by scsi_device_get_any()
   675   *
   676   * See also scsi_device_put().
   677   */

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation




Re: [PATCH v2] scsi_sysfs: fix hang when removing scsi device

2017-03-18 Thread Bart Van Assche
On Sat, 2017-03-18 at 12:17 +0100, Hannes Reinecke wrote:
> On 03/17/2017 12:00 AM, Bart Van Assche wrote:
> > I hadn't seen this crash before kernel v4.11-rc1 but with kernel v4.11-rc1
> > and later I see it if I let the srp-test scripts run for a few minutes. The
> > patch I used to disable async aborts on my test setup is as follows:
> 
> Thing is, I didn't change anything in the async abort case; all my
> patches haven't been merged yet.
> So I would rather think this is a side effect of something else.
> 
> And I just noticed that you found the real issue with your alua fixes,
> so I guess this can be ignored, right?

Yes - sorry for the noise. This crash did not occur with the ALUA fixes
applied and async aborts enabled.

Bart.

Re: [PATCH 0/3] Unblock SCSI devices even if the LLD is being unloaded

2017-03-18 Thread James Bottomley
On Fri, 2017-03-17 at 16:40 +0000, Bart Van Assche wrote:
> On Fri, 2017-03-17 at 05:54 -0700, James Bottomley wrote:
> > So it's better to use the module without a reference in place and 
> > take the risk that it may exit and release its code area while 
> > we're calling it?
> 
> Hello James,
> 
> My understanding of scsi_device_get() / scsi_device_put() is that the 
> reason why these manipulate the module reference count is to avoid 
> that a SCSI LLD module can be unloaded while a SCSI device is being 
> used from a context that is not notified about SCSI LLD unloading 
> (e.g. a file handle controlled by the sd driver or a SCSI ALUA device
> handler worker thread).

Not just that: it's so the underlying module is pinned for every
potential user as well.  The unblock code is called in places outside
the actual hba driver module, so it needs that protection.

> Does your comment mean that you think there is a scenario in which
> scsi_target_block() or scsi_target_unblock() can be called while the 
> text area of a SCSI LLD is being released? I have reviewed all the 
> callers of these functions but I have not found such a scenario.
> scsi_target_block() and scsi_target_unblock() are either called from 
> a SCSI transport layer implementation (FC, iSCSI, SRP) or from a SCSI 
> LLD kernel module (snic_disc). All these kernel modules only call 
> scsi_target_*block() for resources (rport or SCSI target
> respectively) that are removed before the code area of
> these modules is released. This is why I think that calling
> scsi_target_*block() without increasing the SCSI LLD module reference
> count is safe.

The transport code is above the HBA module code and in that code
unblock could be racing with module removal.  The original premise was
that once the dev/target/host goes into DEL, nothing can call into
queuecommand or get a reference to the device, so nothing halts removal
after that, but you changed that with your code, which is why it's now
unsafe.

> > diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> > index 82dfe07..fd1ba1d 100644
> > --- a/drivers/scsi/scsi_sysfs.c
> > +++ b/drivers/scsi/scsi_sysfs.c
> > @@ -39,6 +39,7 @@ static const struct {
> > { SDEV_TRANSPORT_OFFLINE, "transport-offline" },
> > { SDEV_BLOCK,   "blocked" },
> > { SDEV_CREATED_BLOCK, "created-blocked" },
> > +   { SDEV_CANCEL_BLOCK, "blocked" },
> >  };
> 
> The multipathd function path_offline() translates "blocked" into 
> PATH_PENDING. Shouldn't SDEV_CANCEL_BLOCK be translated by multipathd 
> into PATH_DOWN? There might be other user space applications that 
> interpret the SCSI device state and that I am not aware of.

Given it's very short lived, I don't think it much matters, but if you
think it does, that can become cancel.

> Additionally, your patch does not modify scsi_device_get() and hence
> will cause scsi_device_get() to succeed for devices that are in state
> SDEV_CANCEL_BLOCK. I think that's a subtle behavior change.

Yes, otherwise device unblock wouldn't work (on the off chance the
device is unblocked before we get to the sync cache).  Again, it's like
the open race: you could have got the reference just before the device
went into cancel and you're in the same position so there's actually no
substantive behaviour change at all, it just elongates the window where
you get a reference to a device you can't send commands to.

James


> Thanks,
> 
> Bart.



Re: [PATCH 3/3] scsi_dh_alua: Warn if the first argument of alua_rtpg_queue() is NULL

2017-03-18 Thread Hannes Reinecke
On 03/18/2017 01:02 AM, Bart Van Assche wrote:
> Callers must provide a valid port group to alua_rtpg_queue().
> Issue a kernel warning if that is not the case.
> 
> Signed-off-by: Bart Van Assche 
> Cc: Hannes Reinecke 
> Cc: Tang Junhui 
> ---
>  drivers/scsi/device_handler/scsi_dh_alua.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
> index b6849d3ecefe..c01b47e5b55a 100644
> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
> @@ -876,7 +876,7 @@ static bool alua_rtpg_queue(struct alua_port_group *pg,
>   unsigned long flags;
>   struct workqueue_struct *alua_wq = kaluad_wq;
>  
> - if (!pg || scsi_device_get(sdev))
> + if (WARN_ON_ONCE(!pg) || scsi_device_get(sdev))
>   return false;
>  
>   spin_lock_irqsave(&pg->lock, flags);
> 
Reviewed-by: Hannes Reinecke 
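
For readers unfamiliar with the idiom, the warn-once check in the hunk above can be approximated in user space. This is only a minimal sketch under stated assumptions: warn_once(), warn_count, struct port_group and rtpg_queue_sketch() are hypothetical stand-ins and not the kernel's WARN_ON_ONCE() or struct alua_port_group. It shows the two properties the patch relies on: the check still evaluates to the condition every time, but the warning is emitted only on the first hit at a given call site.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical counter so a caller can observe how many warnings fired. */
static int warn_count;

/*
 * Returns cond unchanged; prints a warning only the first time cond is
 * true for the flag passed in (one flag per call site).
 */
static bool warn_once(bool cond, bool *already_warned, const char *expr)
{
	if (cond && !*already_warned) {
		*already_warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", expr);
	}
	return cond;
}

/* Stand-in for the port group argument; not the kernel structure. */
struct port_group {
	int id;
};

/* Same shape as the patched check: refuse to queue without a port group. */
static bool rtpg_queue_sketch(struct port_group *pg)
{
	static bool warned;	/* per-call-site state */

	if (warn_once(pg == NULL, &warned, "!pg"))
		return false;
	return true;
}
```

Calling rtpg_queue_sketch(NULL) repeatedly returns false each time but logs a single warning, which keeps a buggy caller visible without flooding the log.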

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


Re: [PATCH 2/3] scsi_dh_alua: Ensure that alua_activate() calls the completion function

2017-03-18 Thread Hannes Reinecke
On 03/18/2017 01:02 AM, Bart Van Assche wrote:
> Callers of scsi_dh_activate(), e.g. dm-mpath, assume that this
> function either returns an error code or calls the completion
> function. Make alua_activate() call the completion function even
> if scsi_device_get() fails.
> 
> Signed-off-by: Bart Van Assche 
> Cc: Hannes Reinecke 
> Cc: Tang Junhui 
> Cc: 
> ---
>  drivers/scsi/device_handler/scsi_dh_alua.c | 20 +++-
>  1 file changed, 15 insertions(+), 5 deletions(-)
> 
Reviewed-by: Hannes Reinecke 
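
The contract described in the commit message (either return an error or call the completion function) can be illustrated with a small user-space sketch. All names here (fake_sdev, fake_device_get(), activate_sketch(), SKETCH_ERR_DEV_OFFLINED, record_result()) are hypothetical, and the completion is invoked synchronously for simplicity, whereas real code would typically defer it to a worker. It is not the alua_activate() implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Completion callback type: receives caller data and an error code. */
typedef void (*activate_complete)(void *data, int err);

/* Hypothetical error code reported when the device reference fails. */
#define SKETCH_ERR_DEV_OFFLINED 1

/* Stand-in for a SCSI device; not the kernel structure. */
struct fake_sdev {
	bool being_deleted;
	int refs;
};

/* Fails (non-zero) when the device is going away, like a failed get. */
static int fake_device_get(struct fake_sdev *sdev)
{
	if (sdev->being_deleted)
		return -1;
	sdev->refs++;
	return 0;
}

static void fake_device_put(struct fake_sdev *sdev)
{
	sdev->refs--;
}

/*
 * Always returns 0, so it must always call fn, including on the path
 * where taking the device reference fails.
 */
static int activate_sketch(struct fake_sdev *sdev, activate_complete fn,
			   void *data)
{
	int err = 0;

	if (fake_device_get(sdev) != 0) {
		err = SKETCH_ERR_DEV_OFFLINED;
		goto out;
	}
	/* Real code would do or queue the activation work here. */
	fake_device_put(sdev);
out:
	if (fn)
		fn(data, err);
	return 0;
}

/* Example callback that records how often it ran and the last error. */
struct result {
	int calls;
	int err;
};

static void record_result(void *data, int err)
{
	struct result *r = data;

	r->calls++;
	r->err = err;
}
```

A caller such as a multipath layer can then wait on the callback unconditionally: with a device that is being deleted, record_result() still runs exactly once and carries the error.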

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


Re: [PATCH v2] scsi_sysfs: fix hang when removing scsi device

2017-03-18 Thread Hannes Reinecke
On 03/17/2017 12:00 AM, Bart Van Assche wrote:
> On Mon, 2017-03-13 at 14:55 -0700, James Bottomley wrote:
>> On Mon, 2017-03-13 at 20:33 +0000, Bart Van Assche wrote:
>>> On Mon, 2017-03-13 at 12:23 -0700, James Bottomley wrote:
>>>> On Mon, 2017-03-13 at 18:49 +0000, Bart Van Assche wrote:
>>>>> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
>>>>> index 7bfbcfa7af40..b3bb49d06943 100644
>>>>> --- a/drivers/scsi/scsi.c
>>>>> +++ b/drivers/scsi/scsi.c
>>>>> @@ -602,7 +602,7 @@ EXPORT_SYMBOL(scsi_device_get);
>>>>>   */
>>>>>  void scsi_device_put(struct scsi_device *sdev)
>>>>>  {
>>>>> -   module_put(sdev->host->hostt->module);
>>>>> +   module_put(sdev->hostt->module);
>>>>>     put_device(&sdev->sdev_gendev);
>>>>>  }
>>>>>  EXPORT_SYMBOL(scsi_device_put);
>>>>> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
>>>>> index 6f7128f49c30..7134487abbb1 100644
>>>>> --- a/drivers/scsi/scsi_scan.c
>>>>> +++ b/drivers/scsi/scsi_scan.c
>>>>> @@ -227,6 +227,7 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
>>>>>     sdev->model = scsi_null_device_strs;
>>>>>     sdev->rev = scsi_null_device_strs;
>>>>>     sdev->host = shost;
>>>>> +   sdev->hostt = shost->hostt;
>>>>>     sdev->queue_ramp_up_period = SCSI_DEFAULT_RAMP_UP_PERIOD;
>>>>>     sdev->id = starget->id;
>>>>>     sdev->lun = lun;
>>>>> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
>>>>> index 6f22b39f1b0c..cda620ed5922 100644
>>>>> --- a/include/scsi/scsi_device.h
>>>>> +++ b/include/scsi/scsi_device.h
>>>>> @@ -82,6 +82,7 @@ struct scsi_event {
>>>>>  
>>>>>  struct scsi_device {
>>>>>     struct Scsi_Host *host;
>>>>> +   struct scsi_host_template *hostt;
>>>>>     struct request_queue *request_queue;
>>>>>  
>>>>
>>>> The apparent assumption behind this patch is that sdev->host can be
>>>> freed but the sdev will still exist?  That shouldn't be correct: the
>>>> rule for struct devices is that the child always holds the parent and
>>>> the host is parented (albeit not necessarily directly) to the sdev, so
>>>> it looks like something has gone wrong if the host had been freed
>>>> before the sdev.
>>>
>>> Hello James,
>>>
>>> scsi_remove_host() decreases the sdev reference count but does not 
>>> wait until the sdev release work has finished. This is why the SCSI
>>> host can already have disappeared before the last scsi_device_put()
>>> call occurs.
>>
>> This is true, but I don't see how it can cause the host to be freed
>> before the sdev.  The memory for struct Scsi_Host is freed in the
>> shost_gendev release routine, which should be pinned by the parent
>> traversal from sdev.  So it should not be possible for
>>  scsi_host_dev_release() to be called before
>>  scsi_device_dev_release_usercontext() because the latter has the final
>> put of the parent device.
> 
> Hello Hannes,
> 
> The following crash only occurs with async aborts enabled:
> 
> general protection fault: 0000 [#1] SMP
> RIP: 0010:scsi_device_put+0xb/0x30
> Call Trace:
>  scsi_disk_put+0x2d/0x40
>  sd_release+0x3d/0xb0
>  __blkdev_put+0x29e/0x360
>  blkdev_put+0x49/0x170
>  dm_put_table_device+0x58/0xc0 [dm_mod]
>  dm_put_device+0x70/0xc0 [dm_mod]
>  free_priority_group+0x92/0xc0 [dm_multipath]
>  free_multipath+0x70/0xc0 [dm_multipath]
>  multipath_dtr+0x19/0x20 [dm_multipath]
>  dm_table_destroy+0x67/0x120 [dm_mod]
>  dev_suspend+0xde/0x240 [dm_mod]
>  ctl_ioctl+0x1f5/0x520 [dm_mod]
>  dm_ctl_ioctl+0xe/0x20 [dm_mod]
>  do_vfs_ioctl+0x8f/0x700
>  SyS_ioctl+0x3c/0x70
>  entry_SYSCALL_64_fastpath+0x18/0xad
> 
> I hadn't seen this crash before kernel v4.11-rc1 but with kernel v4.11-rc1
> and later I see it if I let the srp-test scripts run for a few minutes. The
> patch I used to disable async aborts on my test setup is as follows:
> 
How utterly curious.

Thing is, I didn't change anything in the async abort case; all my
patches haven't been merged yet.
So I would rather think this is a side effect of something else.

And I just noticed that you found the real issue with your alua fixes,
so I guess this can be ignored, right?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
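
The parent-pinning rule James describes in the quoted discussion (a child device holds a reference on its parent, and the final put of the parent happens in the child's release path) can be modelled in a few lines of user-space C. This is a sketch under stated assumptions: struct node, node_init() and node_put() are hypothetical names, and the model ignores locking and deferred release work, which is exactly where the real code gets subtle.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal reference-counted object with an optional parent. */
struct node {
	int refcount;
	struct node *parent;	/* a child pins its parent with one reference */
	bool released;
};

/* Start with one reference for the creator; take one on the parent. */
static void node_init(struct node *n, struct node *parent)
{
	n->refcount = 1;
	n->parent = parent;
	n->released = false;
	if (parent)
		parent->refcount++;
}

/*
 * Drop one reference. When the count hits zero the node is released and
 * the reference it held on its parent is dropped in turn, so a parent
 * can never be released before its last child.
 */
static void node_put(struct node *n)
{
	while (n && --n->refcount == 0) {
		struct node *parent = n->parent;

		n->released = true;
		n = parent;
	}
}
```

With a host node and an sdev node created as its child, dropping the host's own reference first leaves the host alive; it is released only when the sdev's last reference goes away.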


Re: [PATCH 1/3] scsi_dh_alua: Check scsi_device_get() return value

2017-03-18 Thread Hannes Reinecke
On 03/18/2017 01:02 AM, Bart Van Assche wrote:
> Do not queue ALUA work nor call scsi_device_put() if the scsi_device_get()
> call fails. This patch fixes the following crash:
> 
> general protection fault: 0000 [#1] SMP
> RIP: 0010:scsi_device_put+0xb/0x30
> Call Trace:
>  scsi_disk_put+0x2d/0x40
>  sd_release+0x3d/0xb0
>  __blkdev_put+0x29e/0x360
>  blkdev_put+0x49/0x170
>  dm_put_table_device+0x58/0xc0 [dm_mod]
>  dm_put_device+0x70/0xc0 [dm_mod]
>  free_priority_group+0x92/0xc0 [dm_multipath]
>  free_multipath+0x70/0xc0 [dm_multipath]
>  multipath_dtr+0x19/0x20 [dm_multipath]
>  dm_table_destroy+0x67/0x120 [dm_mod]
>  dev_suspend+0xde/0x240 [dm_mod]
>  ctl_ioctl+0x1f5/0x520 [dm_mod]
>  dm_ctl_ioctl+0xe/0x20 [dm_mod]
>  do_vfs_ioctl+0x8f/0x700
>  SyS_ioctl+0x3c/0x70
>  entry_SYSCALL_64_fastpath+0x18/0xad
> 
> Fixes: commit 03197b61c5ec ("scsi_dh_alua: Use workqueue for RTPG")
> Signed-off-by: Bart Van Assche 
> Cc: Hannes Reinecke 
> Cc: Tang Junhui 
> Cc: 
> ---
>  drivers/scsi/device_handler/scsi_dh_alua.c | 18 +-
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
Reviewed-by: Hannes Reinecke 
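
The rule stated in the commit message (no queued work and no put when the get fails) amounts to keeping get and put calls balanced. A minimal user-space sketch follows; struct fake_dev, dev_get(), dev_put(), queue_work_sketch() and work_done_sketch() are hypothetical names chosen for illustration and do not reflect the actual scsi_dh_alua code.

```c
#include <stdbool.h>

/* Stand-in for a device whose reference can no longer be taken. */
struct fake_dev {
	bool going_away;
	int refs;
};

/* Returns 0 and takes a reference, or non-zero without taking one. */
static int dev_get(struct fake_dev *d)
{
	if (d->going_away)
		return -1;
	d->refs++;
	return 0;
}

static void dev_put(struct fake_dev *d)
{
	d->refs--;
}

/*
 * Queue work only if a reference was actually obtained. On failure
 * nothing is queued, so no later put can run for a reference that was
 * never taken.
 */
static bool queue_work_sketch(struct fake_dev *d, int *queued)
{
	if (dev_get(d) != 0)
		return false;
	(*queued)++;
	return true;
}

/* Runs when the queued work finishes and drops the matching reference. */
static void work_done_sketch(struct fake_dev *d)
{
	dev_put(d);
}
```

Ignoring the return value of the get would let work_done_sketch() drive the count negative for a device that is going away, which is the kind of imbalance that can show up as a crash in a later put.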

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)