On 06/17/15 16:30, Mikael wrote:
> 2015-06-18 0:53 GMT+05:30 Theo de Raadt <dera...@cvs.openbsd.org>:
>
>> > 2) General on SSD: When an SSD starts to shrink because it starts to wear
>> > out, how is this handled and how does this appear to the OS, logs, and
>> > system software?
>>
>> Invisible. Even when a few drives make it visible in some way, it is
>> highly proprietary.
>>
>
> What is then proper behavior for a program or system using an SSD, to deal
> with SSD degradation?:
Replace the drive before it is an issue.

> So say you have a program altering a file's contents all the time, or you
> have file turnover on a system (rm f123; echo importantdata > f124). At
> some point the SSD will shrink and down the line reach zero capacity.

That's not how it works.

The SSD has some number of spare storage blocks. When it finds a bad block, it locks out the bad block and swaps in a good block. Curiously -- this is EXACTLY how modern "spinning rust" hard disks have worked for about ... 20 years (yeah. The "pre-modern" disks were more exciting). Write, verify; if there is an error on verify, write to another storage block and remap the new block to the old logical location. Nothing new here (this is why people say that "heads", "cylinders" and "sectors per track" have been meaningless for some time). When the disk runs out of places to write the good data, it throws a permanent write error back to the OS and you have a really bad day. The only difference with SSDs is the amount of storage dedicated to this spare pool (be scared?).

Neither SSDs nor magnetic disks "shrink" to the outside world. The moment they need a replacement block that doesn't exist, the disk has lost data for you and you should call it failed...it has not "shrunk".

Now, in both cases, this is assuming the drive fails in the way you expect -- that the "flaw" will be spotted on immediate read-after-write, while the data is still in the disk's cache or buffer. There is more than one way magnetic disks fail, and there's more than one way SSDs fail. People tend to hyperventilate over the one way and forget all the rest.

Run your SSDs in production servers for two or three years, then swap them out. That's about the warranty on the entire box, and the people who believe the manufacturer's warranty is the measure of suitability for production replace their machines then anyway. Zero your SSDs and give them to your staff to stick in their laptops or game computers, or use them for experimentation and dev systems after that. Don't hyperventilate over ONE mode of failure; the majority of your SSDs that fail will probably fail for other reasons.

[snip]

> 3) On OBSD, how would you generally suggest to make a magnet-SSD hybrid
> disk setup where the SSD gives the speed and magnet storage security?

Hybrid disks are a specific thing (or a few specific things) -- magnetic disks with an SSD cache, or magnetic/SSD combos where the first X% of the disk is SSD and the rest is magnetic (or vice versa, I guess, but I don't recall having seen that). An SSD cache you use like any other disk; split mode you use as multiple partitions, as appropriate.

You clarified this to be about a totally different thing...mirroring an SSD with a Rotating Rust disk. At this point, most RAID systems I've seen do not support a "preferred read" device. Maybe they should start thinking about that. Maybe they shouldn't -- most applications that NEED the SSD performance for something other than single-user jollies (e.g., a database server vs. having your laptop boot faster) will face-plant severely should performance suddenly drop by an order of magnitude. In many of these cases, performance drops to the point that the system death-spirals as queries come in faster than they can be answered. (This is why, when you have an imbalanced redundant pair of machines, the faster machine should always be the standby, not the primary. Sometimes "does the same job, just slower" is still quite effectively "down".)

Nick.
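
P.S. If you want to experiment with the mirror idea anyway, here is a minimal softraid(4) sketch. The device names are assumptions (sd0 for the SSD, sd1 for the spinning disk), and it presumes each disk already has an 'a' partition of fstype RAID created with disklabel(8):

    # assemble the two RAID partitions into one RAID 1 (mirror) volume
    bioctl -c 1 -l sd0a,sd1a softraid0
    # a new sdN device appears for the mirror; disklabel and newfs it
    # like any other disk

As far as I know, softraid gives you no way to mark the SSD as a preferred read device, so reads land on whichever half it picks; that is exactly the performance cliff described above.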
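
P.P.S. "Zero your SSDs" before passing them on can be as simple as overwriting the whole raw device. The drive name here (sd2) is only an example for the retired disk, so triple-check it before running, since this destroys everything on that drive:

    # overwrite the entire disk with zeros
    # (the 'c' partition spans the whole drive)
    dd if=/dev/zero of=/dev/rsd2c bs=1m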