[Kernel-packages] [Bug 1709889] Comment bridged from LTC Bugzilla

2018-08-06 Thread bugproxy
--- Comment From dougm...@us.ibm.com 2018-08-06 09:43 EDT---
We are also seeing this behavior on Bionic 18.04.

I don't understand the request for better comments. The Launchpad
comments seem to include a fairly complete description of the problem.
Are there specific questions about this problem? I guess there are a lot
of comments in the beginning that show the progression of the diagnosis
efforts, so perhaps it requires more reading to reach the full problem
description.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1709889

Title:
  Ubuntu 18.04: Bug in cfq scheduler, I/Os do not get submitted to
  adapter for a very long time.

Status in Linux:
  Unknown
Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  New
Status in linux source package in Zesty:
  Won't Fix
Status in linux source package in Bionic:
  Triaged

Bug description:
  ---Problem Description---
  When running a stress test, we sometimes see hung I/O reported in dmesg,
  or a "Host adapter abort request" error.

  ---Steps to Reproduce---
   There are two ways to re-create the issue:
  (1) Run HTX; an I/O timeout backtrace appears in dmesg within several hours.
  (2) Run an I/O test, then reboot the system, and repeat these two steps.
      This method takes a long time to re-create the issue.
   
  ---uname output---
  4.10.0-11-generic

  The bulk of the effort for this issue is currently being worked in
  MicroSemi's JIRA https://jira.pmcs.com/browse/ESDIBMOP-133.

  Ran an interesting test: ran HTX until I started getting the "stall"
  messages on the console, then shut down HTX and examined the I/O
  counters for the tested disks in sysfs:

  root@bostonp15:~# for i in /sys/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:[2345]/0:2:[2345]:0; do echo ${i##*/} $(<${i}/iorequest_cnt) $(<${i}/iodone_cnt); done
  0:2:2:0 0x5eba3d 0x5eba3d
  0:2:3:0 0x773cc9 0x773cc9
  0:2:4:0 0x782c61 0x782c61
  0:2:5:0 0x5ca134 0x5ca134
  root@bostonp15:~#

  So, none of the disks showed any evidence of having lost an I/O. I
  then restarted HTX and, aside from having to manually restart one of
  the disks, saw no problems with the testing. It appears that what was
  "hung" was purely in userland.

  This does not absolve the kernel or aacraid driver from blame, but it
  shows that the OS "believes" that it completed the I/O and thus
  removed it from the queue. What we don't know is whether the OS truly
  notified HTX about the completion, or if HTX (or userland libraries)
  just failed to process the notification.
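
The counter comparison in the transcript above can be wrapped in a small
reusable check. This is only a sketch: check_io_counters is a
hypothetical helper name, and it assumes each directory passed in exposes
iorequest_cnt and iodone_cnt as hexadecimal strings, as the devices in
the transcript do.

```shell
#!/bin/sh
# check_io_counters (hypothetical helper): for each SCSI device directory
# given, compare iorequest_cnt with iodone_cnt and report any difference.
# Returns non-zero if at least one device has outstanding requests.
check_io_counters() {
    rc=0
    for dev in "$@"; do
        # Skip paths that do not expose both counters.
        [ -r "$dev/iorequest_cnt" ] && [ -r "$dev/iodone_cnt" ] || continue
        req=$(cat "$dev/iorequest_cnt")
        done_cnt=$(cat "$dev/iodone_cnt")
        # Shell arithmetic accepts the 0x prefix, so the hex strings
        # are compared as numbers rather than as text.
        if [ "$(( $req ))" -eq "$(( $done_cnt ))" ]; then
            echo "${dev##*/} ok $req"
        else
            echo "${dev##*/} outstanding $(( $req - $done_cnt ))"
            rc=1
        fi
    done
    return $rc
}
```

It could be called with the same device paths used in the transcript. A
non-zero return would point at requests the kernel still considers
outstanding, which is not what was observed here.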

  Tests are running again, will see what happens next.

  Update from JIRA:

  I have run some more experiments. Not sure what it tells us, but
  here's what I've seen.

  First test: ran until I got kernel messages about stalled tasks, then
  shut down HTX. After HTX was down, I checked the above-mentioned
  counters and found that on each disk iorequest_cnt matched iodone_cnt.
  The disks were usable and I could restart HTX. This suggests that the
  problem is not in the PM8069 firmware, and makes the case for the
  aacraid driver having a bug somewhat weaker. However, this merely says
  that the driver "completed" the I/O as far as the kernel is concerned,
  not that a completion rippled back to the application.

  I restarted HTX and ran it until errors appeared. This time, I am
  leaving HTX running and observing. Two of the disks reached the HTX
  error threshold and the testers stopped (those 2 disks are now idle).
  Another disk saw errors, but the errors then stopped and it appears to
  be running fine now. The last disk has not seen any errors (yet). On
  the two idle (errored-out) disks I see that iorequest_cnt matches
  iodone_cnt. I am able to "terminate and restart" the two idle disks and
  HTX appears to be testing them again "normally". Note that no reboot
  was required, further supporting the evidence that, as far as the
  kernel is concerned, there is nothing wrong with the disks and their
  I/O paths.

  So, I don't believe this completely eliminates aacraid from the
  picture, especially given we don't see this behavior on other
  systems/drivers. But, it probably moves the focus of the investigation
  away from the adapter firmware.

  Tried building the upstream 4.11 kernel on Ubuntu. This still gets the
  hangs. Both Ubuntu 4.10 and upstream 4.11 have aacraid driver
  1.2.1[50792]-custom.

  Good news/bad news... While doing an initial evaluation of the LSI-3008
  SAS HBA on Boston and Ubuntu 17.04, I am hitting this same problem.
  So, it appears to have nothing specific to do with the PM8069 or
  aacraid driver.

  Some notes on reproducing this. I have been using the GitHub release of
  HTX, built using the following steps:

  1. apt install make gcc g++ git libncurses5-dev libcxl-dev libdapl-dev (others may be required)
  2. git clone https://github.com/open-power/HTX
  3. cd HTX
  4. make
  5. make deb

  The

[Kernel-packages] [Bug 1709889] Comment bridged from LTC Bugzilla

2018-07-31 Thread bugproxy
--- Comment From frede...@fr.ibm.com 2018-07-31 13:10 EDT---
Doug,
- Do we have patches for this issue? I saw you mention some, but as I
  understand it they do not seem satisfactory.
- Can any of the cfq tunables help with this?
- If not, do we have extended test results for the deadline scheduler on
  Power? If it were set as the default, would it introduce other issues
  in place of the ones it solves?

Canonical,
can we mark this issue as also happening on 18.04 (4.15.0-26-generic)?
Launchpad only shows it as affecting Zesty.

Fred
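
For anyone wanting to look at the cfq tunables on a test system, here is
a minimal sketch. It assumes the legacy block-layer sysfs layout, where
queue/scheduler shows the active scheduler in brackets and queue/iosched
holds one file per tunable. Both function names are hypothetical
helpers, not part of any existing tool.

```shell
#!/bin/sh
# active_scheduler (hypothetical helper): print the bracketed entry of a
# queue/scheduler file, e.g. "noop deadline [cfq]" gives "cfq".
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}

# list_iosched_tunables (hypothetical helper): print name=value for each
# regular file in an iosched directory.
list_iosched_tunables() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        printf '%s=%s\n' "${f##*/}" "$(cat "$f")"
    done
}
```

On a real system the arguments would be something like
/sys/block/sdX/queue/scheduler and /sys/block/sdX/queue/iosched, with
sdX standing in for one of the disks under test.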

Title:
  Ubuntu 17.04: Bug in cfq scheduler, I/Os do not get submitted to
  adapter for a very long time.

Status in Linux:
  Unknown
Status in The Ubuntu-power-systems project:
  Invalid
Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Zesty:
  Won't Fix

[Kernel-packages] [Bug 1709889] Comment bridged from LTC Bugzilla

2018-07-16 Thread bugproxy
--- Comment From dougm...@us.ibm.com 2018-07-16 09:02 EDT---
*** Bug 169550 has been marked as a duplicate of this bug. ***

Title:
  Ubuntu 17.04: Bug in cfq scheduler, I/Os do not get submitted to
  adapter for a very long time.

Status in Linux:
  Unknown
Status in The Ubuntu-power-systems project:
  Invalid
Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Zesty:
  Won't Fix

Bug description:
  Some notes on reproducing this. I have been using the GitHub release of
  HTX, built using the following steps:

  1. apt install make gcc g++ git libncurses5-dev libcxl-dev libdapl-dev (others may be required)
  2. git clone https://github.com/open-power/HTX
  3. cd HTX
  4. make
  5. make deb

  Then install the resulting "htxubuntu.deb" package.

  Note, HTX will not test disks that have a filesystem or OS installed,
  so there must be at least two disks made available to HTX by clearing
  any previous data. A partition table is optional; in my testing I have
  none.

  Also, it may be desirable to run HTX somewhere other than the console,
  leaving the console free to watch for messages.

  

[Kernel-packages] [Bug 1709889] Comment bridged from LTC Bugzilla

2017-08-23 Thread bugproxy
--- Comment From dougm...@us.ibm.com 2017-08-23 12:13 EDT---
By the way, I would not classify this behavior I'm seeing as a performance 
issue. There are hundreds of I/Os per second on each disk, and most of them are 
being submitted right away. But a subset of those I/Os are getting delayed for 
10 minutes - if not over an hour - which is a huge disparity. The effect this 
can have on applications could be significant, if not critical. Some of the 
I/Os are taking at least 3 orders of magnitude longer than the rest. And since 
no timeouts are in place at this stage, I wonder if any other test frameworks 
even notice it. You could be hitting this in your test cases but are not aware. 
Consider what would happen in a database server if the rollback segment I/Os 
were getting delayed like this.

Some things to think about.
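
A test framework without timeouts could still flag these outliers after
the fact. The sketch below assumes a hypothetical log with one I/O per
line, holding an identifier followed by the elapsed time in seconds. It
prints every I/O that took longer than a limit, which defaults to 600
seconds to match the 10-minute delays described above.
flag_delayed_ios is an illustrative name only.

```shell
#!/bin/sh
# flag_delayed_ios (hypothetical helper): read "id seconds" lines on
# stdin and print those whose elapsed time exceeds the limit.
# $1 is the limit in seconds; it defaults to 600 (10 minutes).
flag_delayed_ios() {
    awk -v limit="${1:-600}" '$2 + 0 > limit + 0 { print $1, $2 }'
}
```

For example, feeding it a log where most I/Os complete in milliseconds
and a few take over ten minutes would print only the delayed ones.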

Title:
  Ubuntu 17.04: Bug in cfq scheduler, I/Os do not get submitted to
  adapter for a very long time.

Status in Linux:
  Unknown
Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Zesty:
  In Progress

[Kernel-packages] [Bug 1709889] Comment bridged from LTC Bugzilla

2017-08-23 Thread bugproxy
--- Comment From dougm...@us.ibm.com 2017-08-23 08:45 EDT---
I tried running 4.13-rc6 with "cfq" set as default scheduler. The problem is 
even worse. I/O delays show up almost immediately. Many exceed the HTX 
10-minute limit. It seems that CFQ is even more broken on the latest kernels.

Title:
  Ubuntu 17.04: Bug in cfq scheduler, I/Os do not get submitted to
  adapter for a very long time.

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Zesty:
  In Progress

[Kernel-packages] [Bug 1709889] Comment bridged from LTC Bugzilla

2017-08-22 Thread bugproxy
--- Comment From dougm...@us.ibm.com 2017-08-22 09:22 EDT---
Can we get some answers as to why CFQ is the default scheduler? It seems like 
the expedient fix is to change the default to "deadline".
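
Until the default changes, the scheduler can be switched per device at
runtime by writing its name to the device's queue/scheduler file in
sysfs. A minimal sketch, assuming the legacy block layer and root
privileges; set_scheduler is a hypothetical helper name. The change does
not persist across reboots. A boot-time default is usually set
separately, for example with the elevator= kernel parameter on kernels
that support it.

```shell
#!/bin/sh
# set_scheduler (hypothetical helper): write the scheduler name given in
# $1 to the scheduler file of every queue directory that follows.
set_scheduler() {
    sched=$1
    shift
    for q in "$@"; do
        # Skip queues whose scheduler file is missing or not writable.
        [ -w "$q/scheduler" ] || continue
        printf '%s\n' "$sched" > "$q/scheduler"
    done
}
```

A call such as set_scheduler deadline /sys/block/sd*/queue would switch
every sd disk. Reading the scheduler file back shows whether the kernel
accepted the name.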

Title:
  Ubuntu 17.04: Bug in cfq scheduler, I/Os do not get submitted to
  adapter for a very long time.

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Zesty:
  In Progress

[Kernel-packages] [Bug 1709889] Comment bridged from LTC Bugzilla

2017-08-18 Thread bugproxy
--- Comment From dougm...@us.ibm.com 2017-08-18 07:51 EDT---
Test results with the binary kernel package show the same symptoms: I/Os
getting delayed longer than 10 minutes. It seems that those three patches
together cause a regression of the "cfq-iosched: fix the delay of
cfq_group's vdisktime under iops mode" patch.

So, in summary, with *only* the patch:

5be6b75610cefd1e21b98a218211922c2feb6e08
  "cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode"

I see some improvement in the I/O delays, although the delays are still
too long. But after adding these two patches:

4d608baac5f4e72b033a122b2d6d9499532c3afc
  "block: Initialize cfqq->ioprio_class in cfq_get_queue()"
142bbdfccc8b3e9f7342f2ce8422e76a3b45beae
  "cfq: Disable writeback throttling by default"

I see the delays regress to what I was seeing without any patches.

Title:
  Ubuntu 17.04: Bug in cfq scheduler, I/Os do not get submitted to
  adapter for a very long time.

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Zesty:
  In Progress

Bug description:
  ---Problem Description---
  When running a stress test, we sometimes see hung I/O reported in dmesg or a
  "Host adapter abort request" error.

  ---Steps to Reproduce---
   There are two ways to re-create the issue:
  (1) Run HTX; an I/O timeout backtrace appears in dmesg within several hours.
  (2) Run some I/O test, then reboot the system, and repeat these two steps. It
  takes a long time to re-create the issue this way.
   
  ---uname output---
  4.10.0-11-generic

  The bulk of the effort for this issue is currently being worked in
  MicroSemi's JIRA https://jira.pmcs.com/browse/ESDIBMOP-133.

  Ran an interesting test: ran HTX until I started getting the "stall"
  messages on the console, then shut down HTX and examined the I/O
  counters for the tested disks in sysfs:

  root@bostonp15:~# for i in /sys/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:[2345]/0:2:[2345]:0; do echo ${i##*/} $(<${i}/iorequest_cnt) $(<${i}/iodone_cnt); done
  0:2:2:0 0x5eba3d 0x5eba3d
  0:2:3:0 0x773cc9 0x773cc9
  0:2:4:0 0x782c61 0x782c61
  0:2:5:0 0x5ca134 0x5ca134
  root@bostonp15:~#

  So, none of the disks showed any evidence of having lost an I/O. I
  then restarted HTX and, aside from having to manually restart one of
  the disks, saw no problems with the testing. It appears that what was
  "hung" was purely in userland.
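
The counter comparison above can be wrapped in a small shell function so it can be re-run after each stall. This is only a sketch: the function name check_io_counters is made up for illustration, and it assumes each argument is a SCSI device directory in sysfs that exposes iorequest_cnt and iodone_cnt as hex values, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of HTX or the kernel): for each SCSI device
# directory given, print its name, iorequest_cnt, iodone_cnt and the
# difference between the two counters.
check_io_counters() {
    for dev in "$@"; do
        # Skip anything that does not expose both counters.
        [ -r "$dev/iorequest_cnt" ] && [ -r "$dev/iodone_cnt" ] || continue
        req=$(cat "$dev/iorequest_cnt")
        fin=$(cat "$dev/iodone_cnt")
        # The counters are printed as hex (0x...), which shell arithmetic
        # accepts directly.
        echo "${dev##*/} $req $fin $(( $req - $fin ))"
    done
}
```

Called with the same sysfs glob as the loop above, a last column of 0 for every disk once the workload has stopped would match the observation that no I/O was lost at this level; a nonzero value would suggest requests that were issued but never completed.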

  This does not absolve the kernel or aacraid driver from blame, but it
  shows that the OS "believes" that it completed the I/O and thus
  removed it from the queue. What we don't know is whether the OS truly
  notified HTX about the completion, or if HTX (or userland libraries)
  just failed to process the notification.

  Tests are running again, will see what happens next.

  Update from JIRA:

  I have run some more experiments. Not sure what it tells us, but
  here's what I've seen.

  First test: ran until I got kernel messages about stalled tasks, then
  shut down HTX. After HTX was down, I checked the above-mentioned
  counters and found that on each disk iorequest_cnt matched iodone_cnt.
  The disks were usable and I could restart HTX. This suggests that the
  problem is not in the PM8069 firmware, and makes the case for the
  aacraid driver having a bug somewhat weaker. However, this merely says
  that the driver "completed" the I/O as far as the kernel is concerned,
  not that a completion rippled back to the application.

  I restarted HTX and ran it until errors occurred. This time, I am leaving HTX
  running and observing. Two of the disks reached the HTX error
  threshold and the testers stopped (those 2 disks are now idle).
  Another disk saw errors, but they then stopped and it appears to be running
  fine now. The last disk has not seen any errors (yet). On the two idle
  (errored-out) disks I see that iorequest_cnt matches iodone_cnt. I am able
  to "terminate and restart" the two idle disks and HTX appears to be
  testing them again "normally". Note that no reboot was required,
  further supporting the evidence that, as far as the kernel is
  concerned, there is nothing wrong with the disks and their I/O paths.

  So, I don't believe this completely eliminates aacraid from the
  picture, especially given we don't see this behavior on other
  systems/drivers. But it probably moves the focus of the investigation
  away from the adapter firmware.

  Tried building the upstream 4.11 kernel on Ubuntu. This still gets the hangs.
  Both Ubuntu 4.10 and upstream 4.11 have aacraid driver
  1.2.1[50792]-custom.

  Good news/bad news... While doing an initial evaluation of the LSI-3008
  SAS HBA on Boston and Ubuntu 17.04, I am hitting this same problem.
  So, it appears to have nothing specific to do with the PM8069 o

[Kernel-packages] [Bug 1709889] Comment bridged from LTC Bugzilla

2017-08-17 Thread bugproxy
--- Comment From dougm...@us.ibm.com 2017-08-17 15:03 EDT---
I have started a test run using the proposed binary kernel, and will see if
there is evidence of the problem. Results will be available tomorrow.

[Kernel-packages] [Bug 1709889] Comment bridged from LTC Bugzilla

2017-08-17 Thread bugproxy
--- Comment From dougm...@us.ibm.com 2017-08-17 08:42 EDT---
Those three patches, at least in the kernel I am running, actually make things
worse. The characteristics have changed: there appears to be a general
slow-down of disk I/O (it took over 12 hours to hit the first set of severe
stalls), but the delays, when they do occur, are much worse. I saw I/Os
getting delayed for over 40 minutes.

I have double-checked that the patches are installed. But in spite of
having the patch for the delay length
(5be6b75610cefd1e21b98a218211922c2feb6e08) the behavior is back to what
I was seeing before that patch alone.

I'm attaching the combined diff of the changes I made to the kernel.
Note, the only difference between the "worse" run and the previous
"better" one was the addition of these two patches:

4d608baac5f4e72b033a122b2d6d9499532c3afc  "block: Initialize cfqq->ioprio_class in cfq_get_queue()"
142bbdfccc8b3e9f7342f2ce8422e76a3b45beae  "cfq: Disable writeback throttling by default"

I can't explain this, as I don't see how either of those should have
made things worse.

Maybe I need the actual source for your test kernel so I can add my
debug-monitoring code and run. With 40-minute delays the debug-
monitoring code is technically not needed, as HTX will complain. But if,
as I was seeing on the previous kernel, the delays are below 10 minutes
then HTX will never notice and there will be no obvious indication of
the more subtle issue.

[Kernel-packages] [Bug 1709889] Comment bridged from LTC Bugzilla

2017-08-16 Thread bugproxy
--- Comment From dougm...@us.ibm.com 2017-08-16 08:17 EDT---
Even with all three of those patches, I still am seeing delayed I/Os. The 
pattern looks the same, but I will run for 12+ hours to collect more data. At 
this point, though, I believe that "cfq" should not be the default scheduler. 
Are there reasons that it should be the default? What is the background behind 
the choice to make it the default in Ubuntu? Looking at the upstream code, it
seems that if no config parameter is used, the default will be "deadline"
(mq-deadline).
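
For anyone wanting to confirm which scheduler a given disk is actually using, here is a minimal sketch. It relies on the sysfs file queue/scheduler, which lists the available schedulers and marks the active one in square brackets. The function name active_scheduler is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the active I/O scheduler named in a
# queue/scheduler file, i.e. the entry shown in square brackets.
active_scheduler() {
    # $1 is the path to a queue/scheduler file.
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}
```

For example, on a disk whose /sys/block/<dev>/queue/scheduler reads "noop deadline [cfq]", this prints cfq. Writing one of the other listed names into that file as root switches the scheduler for that device until the next reboot, which is a quick way to compare behavior without rebuilding the kernel.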

[Kernel-packages] [Bug 1709889] Comment bridged from LTC Bugzilla

2017-08-16 Thread bugproxy
--- Comment From dougm...@us.ibm.com 2017-08-16 07:23 EDT---
Oh, there is no source code in the linux-source package. I will have to add 
those patches to my own source - which won't exactly test your kernel.

[Kernel-packages] [Bug 1709889] Comment bridged from LTC Bugzilla

2017-08-15 Thread bugproxy
--- Comment From dougm...@us.ibm.com 2017-08-15 16:36 EDT---
I will start a test with that kernel tomorrow. I will be adding my debug code 
to it, so that I can track delayed I/Os if they occur. I see you posted source 
code, so I can do that easily.

[Kernel-packages] [Bug 1709889] Comment bridged from LTC Bugzilla

2017-08-15 Thread bugproxy
--- Comment From dougm...@us.ibm.com 2017-08-15 09:32 EDT---
I should also add that the data indicates the delay is unstable: the initial
stalled I/Os maxed out at about 70 seconds, the next cycle (~10 hours later)
and the subsequent two show maximum values in the 90-100 second range, and the
following five cycles show numbers around 110 seconds (but tapering off). I
think this unpredictability is a further indication of a bug in CFQ. Clearly it
does not live up to its name, at least for these particular I/Os.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1709889

Title:
  Ubuntu 17.04: Bug in cfq scheduler, I/Os do not get submitted to
  adapter for a very long time.

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Zesty:
  In Progress

Bug description:
  ---Problem Description---
  When running a stress test, we sometimes see hung I/O reported in dmesg or a
  "Host adapter abort request" error.

  ---Steps to Reproduce---
   There are two ways to re-create the issue:
  (1) Run HTX; an I/O timeout backtrace appears in dmesg within several hours.
  (2) Run some I/O test, then reboot the system, and repeat these two steps. It
  takes a long time to re-create the issue this way.
   
  ---uname output---
  4.10.0-11-generic

  The bulk of the effort for this issue is currently being worked in
  MicroSemi's JIRA https://jira.pmcs.com/browse/ESDIBMOP-133.

  Ran an interesting test: ran HTX until I started getting the "stall"
  messages on the console, then shut down HTX and examined the I/O
  counters for the tested disks in sysfs:

  root@bostonp15:~# for i in /sys/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:[2345]/0:2:[2345]:0; do echo ${i##*/} $(<${i}/iorequest_cnt) $(<${i}/iodone_cnt); done
  0:2:2:0 0x5eba3d 0x5eba3d
  0:2:3:0 0x773cc9 0x773cc9
  0:2:4:0 0x782c61 0x782c61
  0:2:5:0 0x5ca134 0x5ca134
  root@bostonp15:~#

  So, none of the disks showed any evidence of having lost an I/O. I
  then restarted HTX and, aside from having to manually restart one of
  the disks, saw no problems with the testing. It appears that what was
  "hung" was purely in userland.

  This does not absolve the kernel or aacraid driver from blame, but it
  shows that the OS "believes" that it completed the I/O and thus
  removed it from the queue. What we don't know is whether the OS truly
  notified HTX about the completion, or if HTX (or userland libraries)
  just failed to process the notification.

  Tests are running again, will see what happens next.

  Update from JIRA:

  I have run some more experiments. Not sure what it tells us, but
  here's what I've seen.

  First test: ran until I got kernel messages about stalled tasks, then
  shut down HTX. After HTX was down, I checked the above-mentioned
  counters and found that on each disk iorequest_cnt matched iodone_cnt.
  The disks were usable and I could restart HTX. This suggests that the
  problem is not in the PM8069 firmware, and makes the case for the
  aacraid driver having a bug somewhat weaker. However, this merely says
  that the driver "completed" the I/O as far as the kernel is concerned,
  not that a completion rippled back to the application.

  I restarted HTX and have run until errors. This time, I am leaving HTX
  running and observing. Two of the disks reached the HTX error
  threshold and the testers stopped (those 2 disks are now idle).
  Another disks saw errors but then stopped and appears to be running
  fine now. The last disk has not seen any errors (yet). On the two idle
  (errored-out) disks I see  iorequest_cnt matches iodone_cnt. I am able
  to "terminate and restart" the two idle disks and HTX appears to be
  testing them again "normally". Note that no reboot was required,
  further supporting the evidence that, as far as the kernel is
  concerned, there is nothing wrong with the disks and their I/O paths.

  So, I don't believe this completely eliminates aacraid from the
  picture, especially given we don't see this behavior on other
  systems/drivers. But, it probably moves the focus of the investigation
  away form the adapter firmware.

  Tried build upstream 4.11 kernel on Ubuntu. This still gets the hangs.
  Both Ubuntu 4.10 and upstream 4.11 have aacraid driver
  1.2.1[50792]-custom.

  Good new/bad news... While doing an initial evaluation of the LSI-3008
  SAS HBA on Boston and Ubuntu 17.04, I am hitting this same problem.
  So, it appears to have nothing specific to do with the PM8069 or
  aacraid driver.

  Some notes on reproduce this. I have been using the github release of
  HTX, built using the following steps:

  1. apt install make gcc g++ git libncurses5-dev libcxl-dev libdapl-dev 
(others may be required)
  2. git clone https://github.com/open-power/HTX
  3. cd HTX
  4. make
  5. make deb

  Then install

[Kernel-packages] [Bug 1709889] Comment bridged from LTC Bugzilla

2017-08-15 Thread bugproxy
--- Comment From dougm...@us.ibm.com 2017-08-15 09:26 EDT---
Here is what I am observing, and what leads me to think that "cfq" may
not (yet) be a good choice for the default I/O scheduler.

The test exerciser, HTX (https://github.com/open-power/HTX - POWER arch
only), causes stress on CFQ during certain cycles. I set the debug
timeout threshold for completion of I/Os at 60 seconds (upon timeout,
debugging is printed and then io_schedule() is called, after which more
debugging is printed). I/O delays seem to vary continuously throughout
the range, but using a timeout lower than 60 seconds just produced too
much output.

During certain cycles, where about 1 million I/Os per hour are being
performed on each disk, we see timeouts being triggered. Essentially,
the timeout happens because CFQ has not even submitted the I/O to SCSI
yet. Earlier debugging showed that once the I/O actually gets
submitted to SCSI, it completes promptly.

Without the patch 5be6b75610ce these I/Os could (sometimes) take an hour
or more to get submitted to SCSI. With the patch, that delay seems
to max out at around 110 seconds, which is a great improvement but
still indicates a problem.

I typically see about 400-500 I/Os trip the 60-second timeout during a
given ~2 hour cycle (estimated 4 million I/Os total), so it is not a
huge percentage. However, the I/Os affected seem to be related, possibly
by process or thread, and so this could be detrimental to an
application. Note, the number of I/Os taking between 30 and 60 seconds
is not known, but is expected to be much higher. Even 30 seconds may be
an undesirable number. It's not clear just how CFQ chooses the delay
value and what overrides it.
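
To put those figures in proportion, a small shell sketch of the arithmetic follows. The function name "ppm" is hypothetical, and the inputs are only the rough estimates quoted above (400-500 timeouts out of an estimated 4 million I/Os), not measured totals.

```shell
# Parts per million of I/Os that tripped the timeout.
# $1 = number of timed-out I/Os, $2 = total I/Os (both are estimates).
ppm() {
    echo $(( $1 * 1000000 / $2 ))
}

ppm 400 4000000   # prints 100 (0.01%)
ppm 500 4000000   # prints 125 (0.0125%)
```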

On a run with the scheduler set to "deadline", I never see any I/Os trip
the 60-second timeout.
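
For anyone repeating the comparison, here is a sketch of how the active scheduler can be read from sysfs. It assumes the usual /sys/block/<dev>/queue/scheduler file, in which the kernel marks the active entry with square brackets. The function names and the device name "sdb" are placeholders, not taken from the test setup.

```shell
# Print the active scheduler from a sysfs "scheduler" line, where the
# active entry is enclosed in square brackets, e.g. "noop deadline [cfq]".
active_sched() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Hypothetical helper: report the active scheduler for one block device.
# $1 = device name such as sdb (placeholder).
show_sched() {
    active_sched "$(cat "/sys/block/$1/queue/scheduler")"
}
```

Switching is done as root by writing the scheduler name to the same file, for example: echo deadline > /sys/block/sdb/queue/scheduler (sdb again being a placeholder).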

I think this shows undesirable behavior in CFQ, possibly a bug, and that
it should not be the default scheduler - especially for servers. Is
there some evidence that shows CFQ to be better than deadline in
general?


Title:
  Ubuntu 17.04: Bug in cfq scheduler, I/Os do not get submitted to
  adapter for a very long time.

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Zesty:
  In Progress


[Kernel-packages] [Bug 1709889] Comment bridged from LTC Bugzilla

2017-08-14 Thread bugproxy
--- Comment From dougm...@us.ibm.com 2017-08-14 10:33 EDT---
I am currently making a long test run to collect some data. It may be the
case that cfq delays actually increase over time, even with this fix. That
may be evidence that cfq is not a good default choice for I/O scheduler,
but I want to collect more data.


Title:
  Ubuntu 17.04: Bug in cfq scheduler, I/Os do not get submitted to
  adapter for a very long time.

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Zesty:
  In Progress

Bug description:
  ---Problem Description---
  When running a stress test, we sometimes see hung I/O reported in dmesg
  or a "Host adapter abort request" error.

  ---Steps to Reproduce---
   There are two ways to re-create the issue:
  (1) Run HTX; an I/O timeout backtrace appears in dmesg within several hours.
  (2) Run an I/O test, then reboot the system, and repeat these two steps; it
  takes a long time to re-create the issue this way.
   
  ---uname output---
  4.10.0-11-generic

  The bulk of the effort for this issue is currently being worked in
  MicroSemi's JIRA https://jira.pmcs.com/browse/ESDIBMOP-133.

  Ran an interesting test: ran HTX until I started getting the "stall"
  messages on the console, then shut down HTX and examined the I/O
  counters for the tested disks in sysfs:

  root@bostonp15:~# for i in /sys/devices/pci0003:00/0003:00:00.0/0003:01:00.0/host0/target0:2:[2345]/0:2:[2345]:0; do echo ${i##*/} $(<${i}/iorequest_cnt) $(<${i}/iodone_cnt); done
  0:2:2:0 0x5eba3d 0x5eba3d
  0:2:3:0 0x773cc9 0x773cc9
  0:2:4:0 0x782c61 0x782c61
  0:2:5:0 0x5ca134 0x5ca134
  root@bostonp15:~#

  So, none of the disks showed any evidence of having lost an I/O. I
  then restarted HTX and, aside from having to manually restart one of
  the disks, saw no problems with the testing. It appears that what was
  "hung" was purely in userland.

  This does not absolve the kernel or aacraid driver from blame, but it
  shows that the OS "believes" that it completed the I/O and thus
  removed it from the queue. What we don't know is whether the OS truly
  notified HTX about the completion, or if HTX (or userland libraries)
  just failed to process the notification.
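
The counter comparison above can also be scripted so that a mismatch stands out. This is only a sketch: the function names are hypothetical, and it assumes the iorequest_cnt and iodone_cnt files print hex values as in the output shown earlier.

```shell
# Outstanding I/Os = requests issued minus requests completed.
# Both arguments are hex strings as printed by sysfs (e.g. 0x5eba3d).
outstanding_ios() {
    echo $(( $1 - $2 ))
}

# Hypothetical helper: report outstanding I/Os for one SCSI device
# directory (the directory containing iorequest_cnt and iodone_cnt).
check_dev() {
    req=$(cat "$1/iorequest_cnt")
    fin=$(cat "$1/iodone_cnt")
    echo "${1##*/} outstanding=$(outstanding_ios "$req" "$fin")"
}
```

Calling check_dev on each of the device directories matched by the glob in the loop above would print one line per disk; a nonzero value would mean the kernel still considers some I/O in flight.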

  Tests are running again, will see what happens next.

  Update from JIRA:

  I have run some more experiments. Not sure what it tells us, but
  here's what I've seen.

  First test: ran until I got kernel messages about stalled tasks, then
  shut down HTX. After HTX was down, I checked the above-mentioned
  counters and found that on each disk iorequest_cnt matched iodone_cnt.
  The disks were usable and I could restart HTX. This suggests that the
  problem is not in the PM8069 firmware, and makes the case for the
  aacraid driver having a bug somewhat weaker. However, this merely says
  that the driver "completed" the I/O as far as the kernel is concerned,
  not that a completion rippled back to the application.

  I restarted HTX and have run until errors. This time, I am leaving HTX
  running and observing. Two of the disks reached the HTX error
  threshold and the testers stopped (those 2 disks are now idle).
  Another disk saw errors, which then stopped, and it appears to be
  running fine now. The last disk has not seen any errors (yet). On the
  two idle (errored-out) disks I see iorequest_cnt matches iodone_cnt.
  I am able
  to "terminate and restart" the two idle disks and HTX appears to be
  testing them again "normally". Note that no reboot was required,
  further supporting the evidence that, as far as the kernel is
  concerned, there is nothing wrong with the disks and their I/O paths.

  So, I don't believe this completely eliminates aacraid from the
  picture, especially given we don't see this behavior on other
  systems/drivers. But, it probably moves the focus of the investigation
  away from the adapter firmware.

  Tried building the upstream 4.11 kernel on Ubuntu. This still gets the hangs.
  Both Ubuntu 4.10 and upstream 4.11 have aacraid driver
  1.2.1[50792]-custom.

  Good news/bad news... While doing an initial evaluation of the LSI-3008
  SAS HBA on Boston and Ubuntu 17.04, I am hitting this same problem.
  So, it appears to have nothing specific to do with the PM8069 or
  aacraid driver.

  Some notes on reproducing this. I have been using the github release of
  HTX, built using the following steps:

  1. apt install make gcc g++ git libncurses5-dev libcxl-dev libdapl-dev
     (others may be required)
  2. git clone https://github.com/open-power/HTX
  3. cd HTX
  4. make
  5. make deb

  Then install the resulting "htxubuntu.deb" package.

  Note, HTX will not test disks that have a filesystem or OS installed,
  so there must be at least two disks made available to HTX by clearing
  any previous data. A partiti

[Kernel-packages] [Bug 1709889] Comment bridged from LTC Bugzilla

2017-08-11 Thread bugproxy
--- Comment From dougm...@us.ibm.com 2017-08-11 07:44 EDT---
Testing shows that this commit appears to fix the problem. After 20 hours,
there is no evidence of stalled I/Os.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5be6b75610cefd1e21b98a218211922c2feb6e08

This fixes a problem introduced by having two modes of operation for cfq
that each uses a different timebase, and not having separate scheduling
delay (time limit before forcing I/O submit) settings. What appears to
be the default mode, "iops", ended up using a delay that allowed I/Os to
be postponed for up to 2 jiffies (which is hundreds of hours).


Title:
  Ubuntu 17.04: Bug in cfq scheduler, I/Os do not get submitted to
  adapter for a very long time.

Status in The Ubuntu-power-systems project:
  New
Status in linux package in Ubuntu:
  New
