> On Apr 10, 2017, at 4:30 PM, Machine Man <gearbo...@outlook.com> wrote:
>
> Do you select drives based on DWPD?

Not really. Inside a given product line, the difference in DWPD is a matter of
overprovisioning. You can adjust the overprovisioning yourself, if needed.

Note to the lurkers: overprovisioning also impacts the write performance of
garbage collection.

> I am struggling to find $500 - $700 drives in stock. I am limited to a number of
> distributors and pretty much unless it's HP, Cisco or Dell it's not kept in
> stock. On a number of disk options I got a ship date of late June and all 3
> distributors are indicating SSD drives are constrained.

Yes, there is a global shortage and all major vendors are on allocation.

> I am now down to adding a single SSD during busy hours or when the alerts
> start rolling in and removing the ZIL after hours or when the load reduces
> again.
>
> My only other options for the next 3 weeks are:
> 1 - add 15K drives for ZIL and see if that helps.
> 2 - Hope for the best on the single old OCZ Talos 2

I have bad luck with these.

> 3 - Mix SAS/SATA on the same backplane.

No guarantees, but for more modern expanders and HBAs, we see fewer problems
mixing. I wouldn’t attempt it for 3G SAS/SATA, but 12G seems more robust.

 -- richard

>
> I was 100% banking on the ZeusRAM since that is what I could get my hands
> on immediately.
> From: Richard Elling <richard.ell...@richardelling.com>
> Sent: Monday, April 10, 2017 5:49:55 PM
> To: Machine Man
> Cc: omnios-discuss@lists.omniti.com
> Subject: Re: [OmniOS-discuss] ZeusRAM - predictive failure
>
>
>> On Apr 10, 2017, at 2:39 PM, Machine Man <gearbo...@outlook.com> wrote:
>>
>> Thank you. I am sending it back to where we purchased it from. I thought
>> these were no longer avail, but the distributor still listed them and had
>> them in stock.
>> I was hesitant to purchase, but I am in desperate need of a ZIL.
>
> ZeusRAMs have been EOL for a year or more. AIUI, the parts are no longer
> available to build them.
> We do see better performance from the modern, enterprise-class, 12G SAS parts
> from HGST and Toshiba.
> Unfortunately, they are priced by $/GB and not $/latency, so the smaller
> capacity (GB) drives are also slower.
> -- richard
>
>>
>>
>> From: Richard Elling <richard.ell...@richardelling.com>
>> Sent: Monday, April 10, 2017 4:15:32 PM
>> To: Machine Man
>> Cc: omnios-discuss@lists.omniti.com
>> Subject: Re: [OmniOS-discuss] ZeusRAM - predictive failure
>>
>>
>>> On Apr 10, 2017, at 1:00 PM, Machine Man <gearbo...@outlook.com> wrote:
>>>
>>> Today I received one of the ZeusRAM that I ordered, both brand new. I was
>>> struggling to find SAS SSD drives that were available in my price range as
>>> I desperately need to add a ZIL.
>>> I decided to order ZeusRAM since they had one in stock and figured I'll add
>>> it while waiting for the other one as they really should not be prone
>>> to failure based on design. I have not used them and would normally just
>>> prefer to use regular SSD drives.
>>>
>>> Slotted ZeusRAM in and it began to rapidly blink the same as the disks that
>>> are currently in the pool on that backplane. Running the command format
>>> would never return with a list of disks. I left it for about 15 min and
>>> pulled it since it says on the disk that it can take up to 10 min for the
>>> caps. I could see there is an amber and green LED on the drive itself
>>> blinking, even when removed.
>>> I slotted it back in and the disk was then available. After a few min the
>>> fault light came on and the disk was unavailable due to the following:
>>>
>>> Fault class : fault.io.disk.predictive-failure
>>
>> This occurs when the drive responds to an I/O and indicates a predictive
>> failure, or the periodic query for drives sees a predicted failure. It is
>> the drive telling the OS that the drive thinks it will fail.
>> There is nothing you can do on the OS to “fix” this.
>>
>> It is possible that HGST (nee STEC) can help with further diagnosis using
>> the vendor-specific log pages. Several years ago, STEC helped us with root
>> cause of a failing ultracapacitor in a drive.
>> AFAIK, there is no publicly available decoder for those log pages.
>> -- richard
>>
>>
>>> Affects     : dev:///:devid=id1,sd@n5000a720300b3d57//pci@0,0/pci8086,340e@7/pci1000,3040@0/iport@f0/disk@w5000a72a300b3d57,0
>>>                   faulted and taken out of service
>>> FRU         : "Slot 09"
>>>               (hc://:product-id=LSI-SAS2X36:server-id=:chassis-id=50030480178cf57f:serial=STM000****:part=STEC-ZeusRAM:revision=C025/ses-enclosure=1/bay=8/disk=0)
>>>                   faulty
>>> Description : SMART health-monitoring firmware reported that a disk
>>>               failure is imminent.
>>>
>>>
>>> I cleared the fault and the drive was then usable again for a few min, then
>>> the same thing happened. Eventually the amber light on the disk itself (not
>>> the enclosure disk light) no longer blinked and the disk was online for
>>> quite some time before the alert above reappeared.
>>>
>>>
>>> === START OF INFORMATION SECTION ===
>>> Vendor:               STEC
>>> Product:              ZeusRAM
>>> Revision:             C025
>>> Compliance:           SPC-4
>>> User Capacity:        8,000,000,000 bytes [8.00 GB]
>>> Logical block size:   512 bytes
>>> Rotation Rate:        Solid State Device
>>> Form Factor:          3.5 inches
>>> Logical Unit id:      0x5000a720300b3d57
>>> Serial number:        STM000******
>>> Device type:          disk
>>> Transport protocol:   SAS (SPL-3)
>>> Local Time is:        Mon Apr 10 19:17:23 2017 UTC
>>> SMART support is:     Available - device has SMART capability.
>>> SMART support is:     Enabled
>>> Temperature Warning:  Enabled
>>> === START OF READ SMART DATA SECTION ===
>>> SMART Health Status: OK
>>> Current Drive Temperature:     40 C
>>> Drive Trip Temperature:        80 C
>>> Elements in grown defect list: 0
>>> Vendor (Seagate) cache information
>>>   Blocks sent to initiator = 0
>>>   Blocks sent to initiator = 0
>>> Error counter log:
>>>            Errors Corrected by           Total   Correction     Gigabytes    Total
>>>                ECC          rereads/    errors   algorithm      processed    uncorrected
>>>            fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
>>> read:          0        0         0         0          0         21.323           0
>>> write:         0        0         0         0          0         83.809           0
>>> Non-medium error count:        0
>>>
>>>
>>>
>>> Is there anything special that should be done for ZeusRAM in sd.conf? It's a
>>> node install and both nodes can see all the drives. I don't see any SMART
>>> errors listed, but running fmadm will show the disk as faulty due to
>>> predictive failure.
>>> OmniOS r20, all patches applied.
>>>
>>>
>>> thanks,
>>> _______________________________________________
>>> OmniOS-discuss mailing list
>>> OmniOS-discuss@lists.omniti.com
>>> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>> --
>>
>> richard.ell...@richardelling.com
>> +1-760-896-4422

--
richard.ell...@richardelling.com
+1-760-896-4422
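
For lurkers following Richard's point that DWPD differences within a product line come down to overprovisioning, here is a rough arithmetic sketch. It is not from the thread: the raw and usable capacities are hypothetical numbers (vendors rarely publish raw flash capacity), the function names are made up for illustration, and it assumes the common definition of overprovisioning as spare area divided by usable capacity. It also assumes the controller treats never-written or trimmed space as spare, which is typical but not guaranteed for a given drive. It only shows how the percentages move, not how much endurance you gain.

```python
def overprovisioning_pct(raw_gb: float, usable_gb: float) -> float:
    """Spare area as a percentage of usable capacity: (raw - usable) / usable."""
    if usable_gb <= 0 or usable_gb > raw_gb:
        raise ValueError("usable capacity must be greater than 0 and at most raw")
    return (raw_gb - usable_gb) / usable_gb * 100.0


def rated_writes_tb(dwpd: float, usable_gb: float, warranty_years: float) -> float:
    """Total host writes implied by a DWPD rating over the warranty period, in TB."""
    return dwpd * usable_gb * 365 * warranty_years / 1000.0


# Hypothetical drive: 512 GB of raw flash sold as a 480 GB part.
RAW_GB = 512.0
stock = overprovisioning_pct(RAW_GB, 480.0)

# Same hypothetical drive with only 400 GB ever partitioned and written.
trimmed = overprovisioning_pct(RAW_GB, 400.0)

print(f"stock OP:   {stock:.1f}%")
print(f"trimmed OP: {trimmed:.1f}%")
print(f"1 DWPD on 480 GB for 5 years: {rated_writes_tb(1, 480.0, 5):.0f} TB")
```

Shrinking the used area from 480 GB to 400 GB on this made-up drive takes the spare area from under 7% to 28%, which is the kind of adjustment being described. How that translates into DWPD or garbage-collection behavior depends on the specific controller.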