ata1.00: failed command: WRITE FPDMA QUEUED on new AMD AM4 MSI B350 Motherboard
With both 4.11 and 4.12 kernels I get the following errors when doing heavy disk I/O, such as a kernel build with "make -j 15", or even just copying the kernel source tree from one place to another. The hardware is an MSI B350 Tomahawk Arctic motherboard with 16GB of memory and a Ryzen 1700 processor. The disk being used is a 160GB Seagate ST3160815AS that has error-free media according to "badblocks -w".

Jul 6 13:34:43 cpu0 kernel: ata1.00: exception Emask 0x11 SAct 0x7ffb SErr 0x40 action 0x6 frozen
Jul 6 13:34:43 cpu0 kernel: ata1.00: irq_stat 0x4808, interface fatal error
Jul 6 13:34:43 cpu0 kernel: ata1: SError: { Handshk }
Jul 6 13:34:43 cpu0 kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Jul 6 13:34:43 cpu0 kernel: ata1.00: cmd 61/08:00:57:89:90/00:00:03:00:00/40 tag 0 ncq dma 4096 out
         res 40/00:b8:2f:ff:b3/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
Jul 6 13:34:43 cpu0 kernel: ata1.00: status: { DRDY }
Jul 6 13:34:43 cpu0 kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Jul 6 13:34:43 cpu0 kernel: ata1.00: cmd 61/08:08:87:89:90/00:00:03:00:00/40 tag 1 ncq dma 4096 out
         res 40/00:b8:2f:ff:b3/00:00:02:00:00/40 Emask 0x10 (ATA bus error)
Jul 6 13:34:43 cpu0 kernel: ata1.00: status: { DRDY }
Jul 6 13:34:43 cpu0 kernel: ata1.00: failed command: WRITE FPDMA QUEUED
Jul 6 13:34:43 cpu0 kernel: ata1.00: cmd 61/20:10:97:89:90/00:00:03:00:00/40 tag 2 ncq dma 16384 out
         res 40/00:b8:2f:ff:b3/00:00:02:00:00/40 Emask 0x10 (ATA bus error)

When I set the kernel cmdline option libata.force=noncq, the messages change to:

[ 1724.372101] ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x40 action 0x6 frozen
[ 1724.375888] ata1.00: irq_stat 0x4801, interface fatal error
[ 1724.379721] ata1: SError: { Handshk }
[ 1724.383691] ata1.00: failed command: WRITE DMA EXT
[ 1724.383695] ata1.00: cmd 35/00:50:67:0d:e4/00:09:02:00:00/e0 tag 10 dma 1220608 out
         res 51/84:50:67:0d:e4/00:09:02:00:00/e0 Emask 0x10 (ATA bus error)
[ 1724.383699] ata1.00: status: { DRDY ERR }
[ 1724.383700] ata1.00: error: { ICRC ABRT }
[ 1724.383706] ata1: hard resetting link
[ 1724.850060] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 1724.959883] ata1.00: configured for UDMA/133
[ 1724.959910] ata1: EH complete
[ 1921.704356] ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x40 action 0x6 frozen
[ 1921.708292] ata1.00: irq_stat 0x4801, interface fatal error
[ 1921.712210] ata1: SError: { Handshk }
[ 1921.716294] ata1.00: failed command: WRITE DMA EXT
[ 1921.716297] ata1.00: cmd 35/00:90:ef:93:86/00:03:02:00:00/e0 tag 18 dma 466944 out
         res 51/84:90:ef:93:86/00:03:02:00:00/e0 Emask 0x10 (ATA bus error)
[ 1921.716298] ata1.00: status: { DRDY ERR }
[ 1921.716298] ata1.00: error: { ICRC ABRT }
[ 1921.716303] ata1: hard resetting link
[ 1922.175312] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 1922.284165] ata1.00: configured for UDMA/133
[ 1922.288602] ata1: EH complete

smartctl shows no issues with the drive. In fact, I can take this very drive and install it in an AM3 machine and everything works just fine. I have also installed a PCIe SATA card and connected the drive to that, and that works just fine as well.

So I have either a Linux kernel problem or a hardware problem on this brand new AM4 motherboard. I don't really know what it is, other than that it is something related to the AMD B350 chipset. It is a fairly new chipset, so I am suspicious of the kernel.

# smartctl -a /dev/sda
smartctl 6.2 2013-11-07 r3856 [x86_64-linux-4.11.6-lcrs] (SUSE RPM)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.10
Device Model:     ST3160815AS
Serial Number:    6RACD737
Firmware Version: 4.AAB
User Capacity:    160,041,885,696 bytes [160 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA/ATAPI-7 (minor revision not indicated)
Local Time is:    Fri Jul 7 13:50:50 2017 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 430) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection
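
For anyone trying to read the "cmd" fields in the error reports above, here is a minimal Python sketch that decodes them. It assumes the byte order libata uses when it prints a taskfile (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device) and only handles the NCQ and 48-bit cases seen in these logs. The function name decode_cmd and the returned dictionary keys are made up for illustration; they are not part of any kernel or tool interface.

```python
SECTOR_SIZE = 512  # bytes per logical sector, as reported by smartctl

# ATA opcodes for the NCQ commands READ/WRITE FPDMA QUEUED.
NCQ_COMMANDS = {0x60, 0x61}


def decode_cmd(field: str) -> dict:
    """Decode a libata 'cmd' taskfile string from an error report.

    Assumed byte order (hypothetical helper, not a kernel API):
    command/feature:nsect:lbal:lbam:lbah/
    hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
    """
    command, low, high, _device = field.split("/")
    opcode = int(command, 16)
    feature, nsect, lbal, lbam, lbah = (int(b, 16) for b in low.split(":"))
    hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(b, 16) for b in high.split(":")
    )

    # 48-bit LBA: the "hob" (high order byte) values hold bits 24..47.
    lba = (hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
           | lbah << 16 | lbam << 8 | lbal)

    if opcode in NCQ_COMMANDS:
        # NCQ: sector count is in the feature bytes, tag in nsect bits 7:3.
        sectors = hob_feature << 8 | feature
        tag = nsect >> 3
    else:
        # Assumes a 48-bit (EXT) command: sector count is in the nsect bytes.
        sectors = hob_nsect << 8 | nsect
        tag = None
    sectors = sectors or 65536  # a count of 0 encodes 65536 sectors

    return {"opcode": opcode, "tag": tag, "lba": lba,
            "sectors": sectors, "bytes": sectors * SECTOR_SIZE}


if __name__ == "__main__":
    for cmd in ("61/08:00:57:89:90/00:00:03:00:00/40",
                "35/00:50:67:0d:e4/00:09:02:00:00/e0"):
        print(cmd, decode_cmd(cmd))
```

The decoded byte counts can be checked against the "dma N out" value printed on the same log line, which is a quick way to confirm the field order assumption holds for a given report.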