Is this an SSD problem or a controller problem?

2013-02-14 Thread FF
This error showed up at boot, and then again when trying to start
smartd.

(aprobe0:ata2:0:0:0): SETFEATURES DISABLE SATA FEATURE. ACB: ef 90 00 00 00 40 00 00 00 00 02 00
(aprobe0:ata2:0:0:0): CAM status: ATA Status Error
(aprobe0:ata2:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 04 (ABRT )
(aprobe0:ata2:0:0:0): RES: 51 04 00 00 00 00 00 00 00 02 00
(aprobe0:ata2:0:0:0): Retrying command
(aprobe0:ata2:0:0:0): SETFEATURES DISABLE SATA FEATURE. ACB: ef 90 00 00 00 40 00 00 00 00 02 00
(aprobe0:ata2:0:0:0): CAM status: ATA Status Error
(aprobe0:ata2:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 04 (ABRT )
(aprobe0:ata2:0:0:0): RES: 51 04 00 00 00 00 00 00 00 02 00
(aprobe0:ata2:0:0:0): Error 5, Retries exhausted
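
For reference, the status and error bytes in the RES line can be decoded bit by bit. The following is a minimal Python sketch, not the kernel's own decoder. It assumes the first two RES bytes are the status and error registers, which matches the "ATA status: 51 ... error: 04" line above. The names for DRDY, SERV, ERR and ABRT are taken from the log itself; the remaining bit names are the conventional ATA ones and are an assumption here, and `decode_bits` and `decode_res` are hypothetical helper names.

```python
# Bit names: DRDY, SERV, ERR and ABRT are the labels printed in the log;
# the other names are conventional ATA register names (assumed, not from the log).
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "SERV",
               0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
              0x08: "MCR", 0x04: "ABRT", 0x02: "NM", 0x01: "ILI"}


def decode_bits(value, names):
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in sorted(names.items(), reverse=True)
            if value & mask]


def decode_res(res_line):
    """Decode the status and error bytes (first two bytes) of a RES hex dump."""
    raw = [int(byte, 16) for byte in res_line.split()]
    return decode_bits(raw[0], STATUS_BITS), decode_bits(raw[1], ERROR_BITS)


if __name__ == "__main__":
    status, error = decode_res("51 04 00 00 00 00 00 00 00 02 00")
    print("status:", " ".join(status))
    print("error: ", " ".join(error))
```

Run against the RES line from the log, this yields DRDY, SERV and ERR for the status byte and ABRT for the error byte, the same flags the kernel message reports.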


The device is an SSD on ada15.

smartctl 6.0 2012-10-10 r3643 [FreeBSD 9.1-RELEASE amd64] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Indilinx Everest/Martini based SSDs
Device Model:     OCZ-VERTEX4
Serial Number:    OCZ-7T2T10Q1P4294S68
LU WWN Device Id: 5 e83a97 7039f0a47
Firmware Version: 1.5
User Capacity:    128,035,676,160 bytes [128 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Feb 14 14:16:00 2013 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

dmesg | grep ada15
ada15 at ata2 bus 0 scbus12 target 0 lun 0
ada15: OCZ-VERTEX4 1.5 ATA-9 SATA 3.x device
ada15: 300.000MB/s transfers (SATA 2.x, UDMA5, PIO 8192bytes)
ada15: 122104MB (250069680 512 byte sectors: 16H 63S/T 16383C)
ada15: Previously was known as ad28

 smartctl -a /dev/ada15
smartctl 6.0 2012-10-10 r3643 [FreeBSD 9.1-RELEASE amd64] (local build)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Indilinx Everest/Martini based SSDs
Device Model:     OCZ-VERTEX4
Serial Number:    OCZ-7T2T10Q1P4294S68
LU WWN Device Id: 5 e83a97 7039f0a47
Firmware Version: 1.5
User Capacity:    128,035,676,160 bytes [128 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Feb 14 16:09:21 2013 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 249) Self-test routine in progress...
                                        90% of test remaining.
Total time to complete Offline
data collection:                 (   0) seconds.
Offline data collection
capabilities:                    (0x1d) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Abort Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x00) Error logging NOT supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   0) minutes.
Extended self-test routine
recommended polling time:        (   0) minutes.

SMART Attributes Data Structure revision number: 18
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG  VALUE WORST THRESH TYPE     UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x    006   000   000    Old_age  Offline  -           6
  3 Spin_Up_Time            0x    100   100   000    Old_age  Offline  -           0
  4 Start_Stop_Count        0x    100   100   000    Old_age  Offline  -           0
  5 Reallocated_Sector_Ct   0x    100   100   000    Old_age  Offline  -           0
  9 Power_On_Hours          0x    100   100   000    Old_age  Offline  -           3334
 12 Power_Cycle_Count       0x    100   100   000

Re: Is this an SSD problem or a controller problem?

2013-02-14 Thread Alexander Motin
Jeremy's analysis is probably right. The device returns errors in
response to commands that disable a capability it reported as supported
and enabled. That is not fatal, just annoying. Please send me the output
of 'camcontrol identify ada15 -v' so I can check.

In the smartctl case, I guess some additional problem causes a command
timeout, and the device reinitialization that follows produces the same
errors. That may already be fixed in the 9-stable branch.
On 15.02.2013 3:13, Jeremy Chadwick j...@koitsu.org wrote:

 (Please keep me CC'd as I am not subscribed to this list)

 (Also CC'ing mav@ since he can shed some light on this too)

 Re:
 http://lists.freebsd.org/pipermail/freebsd-questions/2013-February/249183.html

 It is neither an SSD problem nor a controller problem.

 FreeBSD is issuing a specific ATA CDB command to the SSD, and the SSD
 rejects this request, returning ABRT status.  This is perfectly normal
 per ATA specification; the error is harmless.

 You should open a PR on this matter, as FreeBSD should be adjusted in
 some manner to deal with this situation, either via appropriate
 workarounds or a drive quirk.  mav@ would know what's best.

 You will need to provide output from the following commands in your PR:

 * dmesg
 * camcontrol identify ada15
 * pciconf -lvcb
 * The same lines you included in your Email

 Further technical details, which you can put into the PR if you want:

 Looking at src/sys/cam/ata/ata_all.c, we can see that the ACB is printed
 byte by byte, as formatted by ata_cmd_string().  Thus:

  ACB: ef 90 00 00 00 40 00 00 00 00 02 00

 Decoding per T13/2015-D rev 3 (ATA8-ACS2) working draft spec:

 0xef           = command          = SET FEATURES
 0x90           = features         = Disable use of SATA feature
 0x00 0x00 0x00 = lba_*            = n/a
 0x40           = device           = n/a
 0x00 0x00 0x00 = lba_*_exp        = n/a
 0x00           = features_exp     = n/a
 0x02           = sector_count     = Enable/Disable DMA Setup FIS
                                     Auto-Activate Optimisation
 0x00           = sector_count_exp = n/a
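
 The decode above can be sketched in Python. This is only an illustration under stated assumptions: the field order is the one listed in the table, the individual names for the three lba_* bytes (low/mid/high) are made up for readability, and the 0x10 "enable" value is taken from the probestart() excerpt quoted later in this message. `parse_acb` and `describe` are hypothetical helper names, not part of any FreeBSD tool.

```python
# Field order follows the decode table in this message; the low/mid/high
# split of the lba_* bytes is an assumed naming, used only for readability.
ACB_FIELDS = [
    "command", "features",
    "lba_low", "lba_mid", "lba_high",
    "device",
    "lba_low_exp", "lba_mid_exp", "lba_high_exp",
    "features_exp", "sector_count", "sector_count_exp",
]

# 0x90 (disable) is from the log; 0x10 (enable) is from the quoted kernel code.
FEATURE_ACTIONS = {0x10: "enable", 0x90: "disable"}
SATA_FEATURES = {0x02: "DMA Setup FIS Auto-Activate optimization"}


def parse_acb(hex_bytes):
    """Split a 12-byte ACB hex dump into a dict of named register values."""
    values = [int(byte, 16) for byte in hex_bytes.split()]
    if len(values) != len(ACB_FIELDS):
        raise ValueError(f"expected {len(ACB_FIELDS)} bytes, got {len(values)}")
    return dict(zip(ACB_FIELDS, values))


def describe(acb):
    """Give a one-line description of a SET FEATURES ACB."""
    if acb["command"] != 0xEF:
        return "not a SET FEATURES command"
    action = FEATURE_ACTIONS.get(acb["features"], "unknown action")
    target = SATA_FEATURES.get(acb["sector_count"], "unknown SATA feature")
    return f"SET FEATURES: {action} {target}"


if __name__ == "__main__":
    acb = parse_acb("ef 90 00 00 00 40 00 00 00 00 02 00")
    print(describe(acb))
```

 Feeding it the ACB from the log gives a "disable" action on the DMA Setup FIS Auto-Activate feature, which is the request the drive answers with ABRT.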

 DMA Setup FIS is defined as:

 7.50.16.3 Enable/Disable DMA Setup FIS Auto-Activate Optimization

 A Count field value of 02h is used to enable or disable DMA Setup FIS
 Auto-Activate optimization. See SATA 2.6 for more information. The
 enable/disable state for the auto-activate optimization shall be
 preserved across software reset. The enable/disable state for the
 auto-activate optimization shall be reset to its default state upon
 COMRESET.

 This feature has to do with NCQ capability for certain types of DMA
 transfers.

 src/sys/cam/ata/ata_xpt.c contains the responsible code.  I could be
 wrong here (mav@ please correct me), but in probestart(), there is:

  case PROBE_SETDMAAA:
          cam_fill_ataio(ataio,
              1,
              probedone,
              CAM_DIR_NONE,
              0,
              NULL,
              0,
              30*1000);
          ata_28bit_cmd(ataio, ATA_SETFEATURES,
              (softc->caps & CTS_SATA_CAPS_H_DMAAA) ? 0x10 : 0x90,
              0, 0x02);
          break;

 CTS_SATA_CAPS_H_DMAAA is defined per include/cam/cam_ccb.h as
 Auto-activation, and its name implies DMA, so this would match the
 feature in question.
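
 The choice made in that excerpt can be mimicked in a few lines of Python to show where the 0x90 in the log comes from. This is a sketch, not the kernel logic: `HOST_CAP_DMAAA` is a placeholder bit chosen for illustration (the real CTS_SATA_CAPS_H_DMAAA value is not quoted here), the device byte 0x40 is simply copied from the logged ACB, and the function names are hypothetical.

```python
ATA_SETFEATURES = 0xEF   # SET FEATURES opcode, first byte of the logged ACB
HOST_CAP_DMAAA = 0x10    # placeholder bit for illustration only (assumption)


def setdmaaa_features(host_caps):
    """Pick the features byte: 0x10 enables, 0x90 disables the SATA feature."""
    return 0x10 if host_caps & HOST_CAP_DMAAA else 0x90


def build_setdmaaa_acb(host_caps):
    """Build the 12 ACB bytes in the order the kernel log prints them."""
    return [
        ATA_SETFEATURES,               # command
        setdmaaa_features(host_caps),  # features
        0x00, 0x00, 0x00,              # lba_*
        0x40,                          # device, copied from the logged ACB
        0x00, 0x00, 0x00,              # lba_*_exp
        0x00,                          # features_exp
        0x02,                          # sector_count: DMA Setup FIS Auto-Activate
        0x00,                          # sector_count_exp
    ]


def format_acb(acb):
    """Format ACB bytes as space-separated lowercase hex, like the log."""
    return " ".join(f"{byte:02x}" for byte in acb)


if __name__ == "__main__":
    print(format_acb(build_setdmaaa_acb(host_caps=0)))
    print(format_acb(build_setdmaaa_acb(host_caps=HOST_CAP_DMAAA)))
```

 With the capability bit clear, the bytes match the logged ACB, so the "disable" request suggests the host side is not advertising DMA Auto-Activation in this setup.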

 This would explain why you see it when the machine boots (xpt(4) probe),
 as well as when smartctl is run or smartd starts (uses xpt(4)).

 However, I noticed this piece of code in probedone():

          /*
           * Some HP SATA disks report supported DMA Auto-Activation,
           * but return ABORT on attempt to enable it.
           */
          } else if (softc->action == PROBE_SETDMAAA &&
              status == CAM_ATA_STATUS_ERROR) {
                  goto noerror;

 Which makes me scratch my head -- the comment and logic seem to imply
 there shouldn't be any error condition reported, but you do see one.

 This also implies that the drive advertises per SATA protocol DMA AA yet
 when xpt(4) tries to disable it the drive rejects that request with ABRT.

 I don't know why OCZ rejects disabling that feature, but whatever.

 Addendum note for mav@ -- we also need to add an ADA_Q_4K quirk entry to
 ata_da.c for Vertex 4 SSDs (OCZ-VERTEX4).

 --
 | Jeremy Chadwick                                  j...@koitsu.org |
 | UNIX Systems Administrator                http://jdc.koitsu.org/ |
 | Mountain View, CA, US                                            |
 | Making life hard for others since 1977.             PGP 4BD6C0CB |


___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org