Hmm, I rebooted this server for the first time since I was testing the SSD, and it marked the SSD faulty again :(

r...@ike ~ # fmadm faulty
--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Aug 19 19:46:15 091fd12e-0e26-49c4-87df-85e6b46d78fd DISK-8000-2J   Critical

Fault class : fault.io.disk.self-test-failure
Affects     : dev:///:devid=id1,s...@sata_____ssdsa2sh032g1gn___cvem902600j6032hgn//p...@2,0/pci1022,7...@8/pci11ab,1...@1/d...@0,0
                  faulted but still in service
FRU         : "HD_ID_4" (hc://:product-id=Sun-Fire-X4500:chassis-id=0819AMT059:server-id=ike:serial=CVEM902600J6032HGN:part=SSDSA2SH032G1GN-INTEL:revision=045C8626/bay=4/disk=0)
                  faulty

I'm going to mark it as repaired and see if it gets marked faulty again.

I never heard back from you as to a possible resolution to this? Any
progress?

Thanks...

On Tue, 28 Jul 2009, Paul B. Henson wrote:

> Just wondering if you've made any further progress on handling the buggy
> Intel firmware. So far I haven't had any further fma issues with the SSD,
> but on general principle it would be nice if everything worked the way
> it's supposed to :). Until then, I'll be sure to seed the self test log
> before putting a new SSD in...
>
> On Thu, 18 Jun 2009, Paul B. Henson wrote:
>
> > On Thu, 18 Jun 2009, Eric Schrock wrote:
> >
> > > totally invalid data in response to the ATA READ EXT LOG command for log
> > > 0x07 (Extended SMART self-test log). The spec defines that byte 0
> > > must be 0x1 and that byte 1 is reserved.
> > >
> > > You can see this from your previous smartctl output from Linux:
> >
> > Yes, I had noticed that.
> >
> > > This is apparently causing us to trip up in strange ways. I don't know
> > > how the hardware SATL translation is not getting tripped up. Some more
> > > investigation is necessary, but it's clear the firmware on this drive is
> > > quite broken.
> >
> > You don't happen to have a good contact at Intel I could complain to :)? I
> > somehow think my chances if I cold call their support line with this issue
> > are pretty slim to none :(.
> >
> > smartctl evidently works around this issue, in fact, on reviewing the
> > documentation, it looks like a *lot* of drives aren't exactly spec
> > compliant and there are numerous workarounds to try and do the right thing.
> > Is this something you think you would work around in Solaris code, or would
> > end resolution require Intel to fix their buggy firmware?
> >
> > Fortunately, after initiating the self tests under Linux, the incorrect
> > data being returned no longer causes a fault. And since nothing is
> > initiating self tests under Solaris, you don't really lose anything from
> > invalid self test results.
> >
> > Thanks again, and let me know if you need anything else.

-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  hen...@csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
_______________________________________________
fm-discuss mailing list
fm-discuss@opensolaris.org