** Changed in: ubuntu-power-systems
       Status: New => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1696445

Title:
  OpenPower: Some multipaths temporarily have only a single path

Status in The Ubuntu-power-systems project:
  Fix Committed
Status in linux package in Ubuntu:
  New
Status in linux source package in Yakkety:
  Fix Committed
Status in linux source package in Zesty:
  Fix Committed

Bug description:
  [Impact]

   * The SES driver causes a long delay in disk discovery when
     a large number of disks is present in the disk enclosure,
     which increases with the number of disks attached.

   * This delays the addition and visibility of the disk devices
     to userspace, so that, among other things, multipath devices
     do not have all of their paths until disk discovery finally
     finishes.

   * The fix significantly shortens the time taken by the SES
     driver to handle disk discovery, causing no extra delays,
     by removing a superfluous SCSI command sent to enclosure.

  [Test Case]

   * Load the module to access the enclosure and its disks; e.g.,

     $ sudo modprobe mpt3sas

   * Notice the interval between the discovery of each disk in dmesg; e.g.,

     $ dmesg -T | grep 'Attached SCSI disk' | tail -n2
     [Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
     [Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk

   * With the fix, consecutive disks should be discovered within about
     the same second; e.g.,

     $ dmesg -T | grep 'Attached SCSI disk' | tail -n2
     [Wed Jun  7 13:11:59 2017] sd 18:0:176:0: [sdly] Attached SCSI disk
     [Wed Jun  7 13:11:59 2017] sd 18:0:175:0: [sdlx] Attached SCSI disk
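
  The comparison in the test case above can also be done mechanically.
  The following is only a sketch, assuming the `dmesg -T` timestamp
  layout shown in this report; the function names are made up for
  illustration and are not part of any tool mentioned here.

```python
import re
from datetime import datetime

# Matches the bracketed human-readable timestamp printed by `dmesg -T`
# on lines that report a newly attached SCSI disk.
LINE_RE = re.compile(r"^\s*\[(?P<ts>[^\]]+)\].*Attached SCSI disk")


def attach_times(lines):
    """Return the timestamps of all 'Attached SCSI disk' lines, in order."""
    times = []
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            # Collapse padding such as "Jun  7" into single spaces.
            ts = " ".join(m.group("ts").split())
            times.append(datetime.strptime(ts, "%a %b %d %H:%M:%S %Y"))
    return times


def gaps_in_seconds(times):
    """Return the delay between each pair of consecutive timestamps."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# Lines copied from the test case in this report.
before_fix = [
    "[Thu Jun 1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk",
    "[Thu Jun 1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk",
]
after_fix = [
    "[Wed Jun  7 13:11:59 2017] sd 18:0:176:0: [sdly] Attached SCSI disk",
    "[Wed Jun  7 13:11:59 2017] sd 18:0:175:0: [sdlx] Attached SCSI disk",
]

print(gaps_in_seconds(attach_times(before_fix)))  # [5.0]
print(gaps_in_seconds(attach_times(after_fix)))   # [0.0]
```

  Note that `dmesg -T` timestamps have one-second resolution, so a gap
  of 0.0 only means "within the same second".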

  [Regression Potential]

   * The power status of the disks in the enclosure is no longer
     checked at probe time.  However, the patch demonstrates that
     the initial value was never used in any way, so there is little
     regression potential.

   * Nonetheless, users of SES enclosures who verify the power status
     of disks in the enclosure might _theoretically_ see a problem if
     the fix has a problem (none has been found so far).

  [Other Info]

   * None at this time.

  
  Problem Description:
  ====================

  This week, I went ahead and scaled up my test configuration to the
  maximum configuration (2x5U84_Enclosures,_MaxCfg_168HDDs). This time,
  it hit a different issue: some multipaths have only a single path and
  no redundancy, while others have multiple paths and redundancy.

  Checkpoint #1:
  ==============
  - system reboot around 2pm (14:00)

  Checkpoint #2:
  ==============
  - It took several minutes for the first disk to be detected.

  
  root@smb1p1:~# multipath -ll|grep dm |wc -l
  103

  root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' |  grep 'Attached SCSI disk' | tail
  [Thu Jun  1 14:18:30 2017] sd 17:0:100:0: [sdcr] Attached SCSI disk
  [Thu Jun  1 14:18:35 2017] sd 17:0:101:0: [sdcs] Attached SCSI disk
  [Thu Jun  1 14:18:40 2017] sd 17:0:102:0: [sdct] Attached SCSI disk
  [Thu Jun  1 14:18:44 2017] sd 17:0:103:0: [sdcu] Attached SCSI disk
  [Thu Jun  1 14:18:54 2017] sd 17:0:105:0: [sdcv] Attached SCSI disk
  [Thu Jun  1 14:18:59 2017] sd 17:0:106:0: [sdcw] Attached SCSI disk
  [Thu Jun  1 14:19:04 2017] sd 17:0:107:0: [sdcx] Attached SCSI disk
  [Thu Jun  1 14:19:09 2017] sd 17:0:108:0: [sdcy] Attached SCSI disk
  [Thu Jun  1 14:19:14 2017] sd 17:0:109:0: [sdcz] Attached SCSI disk
  [Thu Jun  1 14:19:19 2017] sd 17:0:110:0: [sdda] Attached SCSI disk
  root@smb1p1:~#

  ...

  root@smb1p1:~# multipath -ll|grep dm |wc -l
  142
  root@smb1p1:~# dmesg -T | grep 'sd 1[78]:' |  grep 'Attached SCSI disk' | tail
  [Thu Jun  1 14:21:54 2017] sd 17:0:141:0: [sdee] Attached SCSI disk
  [Thu Jun  1 14:21:58 2017] sd 17:0:142:0: [sdef] Attached SCSI disk
  [Thu Jun  1 14:22:04 2017] sd 17:0:143:0: [sdeg] Attached SCSI disk
  [Thu Jun  1 14:22:08 2017] sd 17:0:144:0: [sdeh] Attached SCSI disk
  [Thu Jun  1 14:22:14 2017] sd 17:0:145:0: [sdei] Attached SCSI disk
  [Thu Jun  1 14:22:18 2017] sd 17:0:146:0: [sdej] Attached SCSI disk
  [Thu Jun  1 14:22:24 2017] sd 17:0:147:0: [sdek] Attached SCSI disk
  [Thu Jun  1 14:22:29 2017] sd 17:0:148:0: [sdel] Attached SCSI disk
  [Thu Jun  1 14:22:34 2017] sd 17:0:149:0: [sdem] Attached SCSI disk
  [Thu Jun  1 14:22:39 2017] sd 17:0:150:0: [sden] Attached SCSI disk
  root@smb1p1:~#

  
  ...

  - After 43 minutes, the multipath -ll command shows some multipaths
  with only a single path and no redundancy, and others with multiple
  paths and redundancy.

  root@smb1p1:~# date
  Thu Jun  1 14:43:00 CDT 2017
  root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
  252
  root@smb1p1:~#
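
  To list which multipaths currently have only a single path, the same
  two patterns used in the commands above (`dm` for map lines,
  `sd[a-z]+` for path lines) can be applied per map. This is only a
  sketch: the sample text below is made up for illustration (it is not
  output from the system in this report), and it assumes the usual
  `multipath -ll` layout in which each map starts with a header line
  containing a `dm-N` field.

```python
import re

# A path line names a SCSI disk such as "sdcr"; a map header line
# carries a device-mapper name such as "dm-0".
PATH_RE = re.compile(r"\bsd[a-z]+\b")
DM_RE = re.compile(r"dm-\d+")


def paths_per_map(lines):
    """Count the path lines listed under each multipath map header."""
    counts = {}
    current = None
    for line in lines:
        fields = line.split()
        if any(DM_RE.fullmatch(f) for f in fields):
            current = fields[0]          # map name or WWID, first field
            counts[current] = 0
        elif current is not None and PATH_RE.search(line):
            counts[current] += 1
    return counts


# Hypothetical sample in the assumed `multipath -ll` layout.
sample = """\
mpatha (wwid-a) dm-0 VENDOR,MODEL
size=1.0T features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| `- 17:0:100:0 sdcr 69:176 active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `- 18:0:100:0 sdly 68:416 active ready running
mpathb (wwid-b) dm-1 VENDOR,MODEL
size=1.0T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 17:0:101:0 sdcs 69:192 active ready running
""".splitlines()

counts = paths_per_map(sample)
single_path = [name for name, n in counts.items() if n == 1]
print(counts)       # {'mpatha': 2, 'mpathb': 1}
print(single_path)  # ['mpathb']
```

  On a live system the lines could be taken from the output of
  `multipath -ll` instead of the sample.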

  ...

  - After 47 minutes, the multipath -ll command still shows some
  multipaths with only a single path and no redundancy.

  
  root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
  288
  root@smb1p1:~#


  - 51 minutes after system reboot, it looks like all disks are
  discovered and the multipath is correctly built.

  root@smb1p1:~# multipath -ll | grep -c 'sd[a-z]\+'
  336
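
  The path counts above can be compared with the expected total. This
  is a small sketch that assumes two paths per disk, which is inferred
  from the final count (168 HDDs, 336 paths) and is not stated
  explicitly in the report.

```python
DISKS = 168           # from the configuration name (MaxCfg_168HDDs)
PATHS_PER_DISK = 2    # assumption: inferred from the final count of 336
EXPECTED_PATHS = DISKS * PATHS_PER_DISK

# Minutes after reboot -> path count reported by
# `multipath -ll | grep -c 'sd[a-z]\+'` in this report.
observed = {43: 252, 47: 288, 51: 336}


def missing_paths(count, expected=EXPECTED_PATHS):
    """Return how many paths have not been discovered yet."""
    return expected - count


for minutes, count in observed.items():
    print(f"{minutes} min: {count}/{EXPECTED_PATHS} paths, "
          f"{missing_paths(count)} missing")
```

  Under that assumption, 84 paths were still missing at 43 minutes, 48
  at 47 minutes, and none at 51 minutes.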

  
  == Comment: #24 - Mauricio Faria De Oliveira  - 2017-06-06 11:42:59 ==
  Hi Paul,

  Per your logs, yes, it's the slowness with the SES driver.

  I'll ask Canonical to pick it up for 16.10 and 17.04 so it makes it
  into 16.04.2 and 16.04.3.

  Thanks,
  Mauricio

  == Comment: #26 - Mauricio Faria De Oliveira <mauri...@br.ibm.com> - 2017-06-06 12:06:32 ==
  The patch applies cleanly in the master-next branch of ubuntu-zesty.git
  and ubuntu-yakkety.git.
  Mirroring to Canonical to get a LP bug number, required in the
  submission process.

  == Comment: #27 - Mauricio Faria De Oliveira <mauri...@br.ibm.com> - 2017-06-06 12:07:58 ==
  The commit is [1].

  commit 75106523f39751390b5789b36ee1d213b3af1945
  Author: Mauricio Faria de Oliveira <mauri...@linux.vnet.ibm.com>
  Date:   Wed Apr 5 12:18:19 2017 -0300

      scsi: ses: don't get power status of SES device slot on probe

  [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75106523f39751390b5789b36ee1d213b3af1945

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1696445/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
