Public bug reported:

Somewhere between 4.15.0-112-generic and 5.3.0-62-generic the kernel
config option SATA_MOBILE_LPM_POLICY was changed from 0 (the upstream
default) to 3. This is causing frequent SATA link resets, resulting in
I/O stalls and errors. For example:

ata1.00: exception Emask 0x0 SAct 0xdc0000 SErr 0x50000 action 0x6 frozen
ata1: SError: { PHYRdyChg CommWake }
ata1.00: failed command: WRITE FPDMA QUEUED
ata1.00: cmd 61/20:90:d8:62:c6/00:00:24:00:00/40 tag 18 ncq dma 16384 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1.00: failed command: READ FPDMA QUEUED
ata1.00: cmd 60/20:98:60:26:1e/00:00:00:00:00/40 tag 19 ncq dma 16384 in
         res 40/00:01:06:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1.00: failed command: WRITE FPDMA QUEUED
ata1.00: cmd 61/08:a0:78:85:11/00:00:03:00:00/40 tag 20 ncq dma 4096 out
         res 40/00:00:00:4f:c2/00:01:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1.00: failed command: WRITE FPDMA QUEUED
ata1.00: cmd 61/10:b0:80:60:c6/00:00:24:00:00/40 tag 22 ncq dma 8192 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1.00: failed command: READ FPDMA QUEUED
ata1.00: cmd 60/08:b8:d0:13:ac/00:00:02:00:00/40 tag 23 ncq dma 4096 in
         res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1: hard resetting link
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: configured for UDMA/100
ata1.00: device reported invalid CHS sector 0
ata1.00: device reported invalid CHS sector 0
sd 0:0:0:0: [sda] tag#23 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 0:0:0:0: [sda] tag#23 Sense Key : Illegal Request [current] 
sd 0:0:0:0: [sda] tag#23 Add. Sense: Unaligned write command
sd 0:0:0:0: [sda] tag#23 CDB: Read(10) 28 00 02 ac 13 d0 00 00 08 00
blk_update_request: I/O error, dev sda, sector 44831696 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
ata1: EH complete
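
To gauge how often this happens, the pattern above can be counted in a
saved kernel log. The following is only a minimal sketch: the function
name summarize_ata_errors is hypothetical, and the two patterns are
taken directly from the messages quoted above, so other kernel versions
may word them differently.

```python
import re
from collections import Counter

# Patterns copied from the log excerpt above; other kernels may differ.
RESET_RE = re.compile(r"^(ata\d+): hard resetting link")
TIMEOUT_RE = re.compile(r"Emask 0x4 \(timeout\)")
# Optional dmesg timestamp prefix such as "[  123.456789] ".
TIMESTAMP_RE = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def summarize_ata_errors(lines):
    """Return (resets per ATA port, number of timed-out commands)."""
    resets = Counter()
    timeouts = 0
    for line in lines:
        text = TIMESTAMP_RE.sub("", line.strip())
        match = RESET_RE.match(text)
        if match:
            resets[match.group(1)] += 1
        elif TIMEOUT_RE.search(text):
            timeouts += 1
    return dict(resets), timeouts
```

It accepts any iterable of lines, for example an open file holding the
attached dmesg.txt.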

Available workarounds:

1) downgrading to 4.15.0-*-generic
2) appending 'ahci.mobile_lpm_policy=n' to the kernel command line, where 'n' 
is either 0, 1 or 2.
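
To confirm that workaround 2 is in effect after a reboot, the running
kernel's command line can be inspected. This is a minimal sketch: the
helper names are hypothetical, and it assumes that when the parameter
appears more than once the last occurrence is the one that applies.

```python
def lpm_policy_from_cmdline(cmdline):
    """Return the ahci.mobile_lpm_policy value as a string, or None if absent."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == "ahci.mobile_lpm_policy":
            value = val  # assumption: a later occurrence overrides an earlier one
    return value


def read_boot_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()
```

For example, lpm_policy_from_cmdline(read_boot_cmdline()) returns None
when the parameter was not passed at boot.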

The meanings of policy numbers can be found at
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/Kconfig?h=v5.8-rc7#n118:

        0 => Keep firmware settings
        1 => Maximum performance
        2 => Medium power
        3 => Medium power with Device Initiated PM enabled
        4 => Minimum power
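
The table above can be used to decode the value the running kernel is
using. This is a minimal sketch with hypothetical helper names; it
assumes the ahci driver exposes the parameter at
/sys/module/ahci/parameters/mobile_lpm_policy, which may not hold on
every kernel, so a missing file is reported as None.

```python
from pathlib import Path

# Policy numbers and descriptions as listed in the Kconfig help quoted above.
POLICY_DESCRIPTIONS = {
    0: "Keep firmware settings",
    1: "Maximum performance",
    2: "Medium power",
    3: "Medium power with Device Initiated PM enabled",
    4: "Minimum power",
}


def describe_policy(number):
    """Map a policy number to its description from the table above."""
    return POLICY_DESCRIPTIONS.get(number, "unknown policy")


def current_policy(path="/sys/module/ahci/parameters/mobile_lpm_policy"):
    """Return the policy number in use, or None if the file is absent.

    The default path is an assumption about where the parameter is exposed.
    """
    param = Path(path)
    if not param.exists():
        return None
    return int(param.read_text().strip())
```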

The computer in question is an Intel NUC DN2820FYK (running the latest
system firmware version) with an embedded Intel Corporation Atom
Processor E3800 Series SATA AHCI Controller (rev 0e). The hard drive
is a HITACHI HTS723232L9SA60.

I have confirmed that the issue persists in the latest mainline kernel
build (5.8.0-050800rc7-generic).

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-5.4.0-42-generic 5.4.0-42.46~18.04.1
ProcVersionSignature: Ubuntu 5.4.0-42.46~18.04.1-generic 5.4.44
Uname: Linux 5.4.0-42-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.15
Architecture: amd64
Date: Sat Aug  1 10:51:29 2020
SourcePackage: linux-signed-hwe-5.4
UpgradeStatus: Upgraded to bionic on 2020-06-28 (33 days ago)

** Affects: linux-signed-hwe-5.4 (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: amd64 apport-bug bionic

** Attachment added: "Kernel messages (contains further examples of the errors)"
   https://bugs.launchpad.net/bugs/1889968/+attachment/5397627/+files/dmesg.txt

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1889968

Title:
  [regression] Changed CONFIG_SATA_MOBILE_LPM_POLICY=3 default causes
  I/O errors
