This issue appears to be present again on kernels 3.13 (all), 3.16 (all),
and 3.17 (all).

Upon shifting the SATA link power policy from the min_power state to the
max_performance state, all of these kernels report some form of this error:

[   45.200582] ata3.00: exception Emask 0x10 SAct 0x8000 SErr 0x50000 action 0xe frozen
[   45.200586] ata3.00: irq_stat 0x00400000, PHY RDY changed
[   45.200589] ata3: SError: { PHYRdyChg CommWake }
[   45.200592] ata3.00: failed command: WRITE FPDMA QUEUED
[   45.200596] ata3.00: cmd 61/e8:78:00:3f:48/00:00:04:00:00/40 tag 15 ncq 118784 out
[   45.200596]          res 40/00:7c:00:3f:48/00:00:04:00:00/40 Emask 0x10 (ATA bus error)
[   45.200597] ata3.00: status: { DRDY }
[   45.200601] ata3: hard resetting link
[   45.925051] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   45.925911] ata3.00: configured for UDMA/133
[   45.941016] ahci 0000:00:1f.2: port does not support device sleep
[   45.941029] ata3: EH complete
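
The SErr and SAct values in the log above can be decoded by hand. The
following is a minimal sketch and not part of the original report: the bit
names are assumed to match the labels libata prints in its "SError: { ... }"
lines (the table was transcribed by hand), and the function names
decode_serr and active_ncq_tags are hypothetical.

```python
# SError bit positions mapped to the short labels libata prints.
# Hand-transcribed, so treat the table as an assumption, not a reference.
SERR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serr(serr):
    """Return the labels of all bits set in an SError value, lowest bit first."""
    names = []
    for bit in range(32):
        if serr & (1 << bit):
            # Bits missing from the table are reported by position.
            names.append(SERR_BITS.get(bit, "bit%d" % bit))
    return names


def active_ncq_tags(sact):
    """Return the NCQ tag numbers whose bits are set in an SAct value."""
    return [tag for tag in range(32) if sact & (1 << tag)]
```

With the values from the log, decode_serr(0x50000) gives PHYRdyChg and
CommWake, which agrees with the kernel's own SError line, and
active_ncq_tags(0x8000) gives tag 15, which agrees with the failed
WRITE FPDMA QUEUED command.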

The current 3.13 kernel reports the most severe errors, such as block
write failures.
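
The report does not say how the policy was switched. One common way is the
per-host sysfs attribute link_power_management_policy; the sketch below
assumes that interface and root privileges. The helper name set_link_policy
is hypothetical, and the sysfs_root parameter exists only so the function
can be tried against a scratch directory.

```python
import glob
import os

# Policy names accepted by kernels of this era (assumption for 3.13-3.17).
VALID_POLICIES = ("min_power", "medium_power", "max_performance")


def set_link_policy(policy, sysfs_root="/sys/class/scsi_host"):
    """Write `policy` to every host's link_power_management_policy file.

    Returns the list of files written, in sorted order.
    """
    if policy not in VALID_POLICIES:
        raise ValueError("unknown policy: %r" % policy)
    pattern = os.path.join(sysfs_root, "host*", "link_power_management_policy")
    changed = []
    for path in sorted(glob.glob(pattern)):
        with open(path, "w") as f:
            f.write(policy)
        changed.append(path)
    return changed
```

Calling set_link_policy("min_power") followed by
set_link_policy("max_performance") as root, while the disk is under write
load, would be one way to attempt the transition described above; whether it
triggers the error depends on the hardware.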

The machine this is being tested on is a Dell XPS13 (9333) with BIOS A05.

[    2.288104] ata3.00: ATA-8: LITEONIT LMT-256L9M-11 MSATA 256GB, HM8110B, max UDMA/133
[    2.288554] scsi 2:0:0:0: Direct-Access     ATA      LITEONIT LMT-256 10B  PQ: 0 ANSI: 5

As this machine is brand new, it's possible that the hardware is actually
failing; however, SMART doesn't indicate any problems with the block
device:

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.17.0-031700-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     LITEONIT LMT-256L9M-11 MSATA 256GB
Serial Number:    TW0N42H75508548P1854
Firmware Version: HM8110B
User Capacity:    256,060,514,304 bytes [256 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ATA8-ACS, ATA/ATAPI-7 T13/1532D revision 4a
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Oct 10 13:39:25 2014 MDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (   10) seconds.
Offline data collection
capabilities:                    (0x15) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Abort Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  10) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0003   100   100   000    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0003   100   100   000    Pre-fail  Always       -       46
175 Program_Fail_Count_Chip 0x0003   100   100   000    Pre-fail  Always       -       0
176 Erase_Fail_Count_Chip   0x0003   100   100   000    Pre-fail  Always       -       0
177 Wear_Leveling_Count     0x0003   100   100   000    Pre-fail  Always       -       1946
178 Used_Rsvd_Blk_Cnt_Chip  0x0003   100   100   000    Pre-fail  Always       -       0
179 Used_Rsvd_Blk_Cnt_Tot   0x0003   100   100   000    Pre-fail  Always       -       0
180 Unused_Rsvd_Blk_Cnt_Tot 0x0033   100   100   000    Pre-fail  Always       -       1216
181 Program_Fail_Cnt_Total  0x0003   100   100   000    Pre-fail  Always       -       0
182 Erase_Fail_Count_Total  0x0003   100   100   000    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0003   100   100   000    Pre-fail  Always       -       0
195 Hardware_ECC_Recovered  0x0003   100   100   000    Pre-fail  Always       -       0
241 Total_LBAs_Written      0x0003   100   100   000    Pre-fail  Always       -       8704
242 Total_LBAs_Read         0x0003   100   100   000    Pre-fail  Always       -       1385

SMART Error Log Version: 0
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         0         -
# 2  Short offline       Completed without error       00%         0         -

Selective Self-tests/Logging not supported

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to pm-utils in Ubuntu.
https://bugs.launchpad.net/bugs/539467

Title:
  SATA link power management causes disk errors and corruption

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/539467/+subscriptions
