Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
For having *guaranteed-intact* storage, what is the way then? This is against the background of recent discussions that touched on https://www.usenix.org/legacy/events/fast08/tech/full_papers/bairavasundaram/bairavasundaram_html/index.html and https://blog.algolia.com/when-solid-state-drives-are-not-that-solid/ . What about having *two SSDs* in softraid RAID1, and as soon as any IO failure is found on either SSD, that one would be replaced? If the underlying read operations are made from both SSDs each time and the machine has ECC RAM (??and UFS is checksummed enough??), then at least the OS would be able to detect corruption (??, fix anything??) and return proper read failures (or SIGSEGV). Mikael 2015-06-18 16:23 GMT+07:00 Karel Gardas gard...@gmail.com: On Thu, Jun 18, 2015 at 9:08 AM, David Dahlberg david.dahlb...@fkie.fraunhofer.de wrote: On Thursday, 18.06.2015, 02:15 +0530, Mikael wrote: 2015-06-18 2:07 GMT+05:30 Gareth Nelson gar...@garethnelson.com: No, I meant: you plug in a 2TB SSD and a 2TB magnetic HD, is there any way to make them properly mirror each other [so the SSD performance is delivered while the magnetic disk safeguards the contents] - would you use softraid here? No. If you use a RAID1, you'll get the performance of the worse of the two disks. Supporting multiple disks with different characteristics and getting the most out of them was AFAIK one of the motivations for Matthew Dillon to write HAMMER. I'm not sure about RAID1 in general, but I've been reading the softraid code recently, and based on it I would claim that you get the write performance of the slowest drive (assuming OpenBSD schedules writes to the different drives in parallel), but read performance slightly higher than the slower drive, since reads are done in round-robin fashion, so the SSD will speed things up a little. Anyway, the interesting question is whether it makes sense to balance this interleaved reading based on actual drive performance. AFAIK this should be possible, but IMHO it'll not be that reliable, i.e. it'll not provide that much added reliability. Since reliability is my concern, I'm more looking forward to seeing a kind of virtual drive with block checksumming implemented in OpenBSD, which IMHO would provide some added reliability when run, for example, in a RAID1 setup. Karel
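For illustration, the two-SSD mirror being discussed takes only a few commands with softraid(4); a minimal sketch, assuming sd0a and sd1a are already disklabel partitions of type RAID (device names are assumptions, not a tested recipe):

    # assemble a RAID 1 (mirror) volume from the two RAID partitions
    bioctl -c 1 -l /dev/sd0a,/dev/sd1a softraid0
    # the mirror attaches as a new sd(4) device; newfs and mount as usual.
    # member status (online/degraded/failed) can be checked with:
    bioctl softraid0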
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
On Thu, Jun 25, 2015 at 12:57 PM, Mikael mikael.tr...@gmail.com wrote: For having *guaranteed-intact* storage, what is the way then? This is against the background of recent discussions that touched on https://www.usenix.org/legacy/events/fast08/tech/full_papers/bairavasundaram/bairavasundaram_html/index.html and https://blog.algolia.com/when-solid-state-drives-are-not-that-solid/ . What about having *two SSDs* in softraid RAID1, and as soon as any IO failure is found on either SSD, that one would be replaced? If the underlying read operations are made from both SSDs each time and the machine has ECC RAM (??and UFS is checksummed enough??), then at least the OS would be able to detect corruption (??, fix anything??) and return proper read failures (or SIGSEGV). I'm afraid that as long as the SSD is not signalling any issue, you may end up with corrupted data in RAM, and even softraid RAID1 will not help you. AFAIK FFS does not provide any checksumming for user data, so this is the same issue again. I've been tinkering with an idea to enhance softraid RAID1 with checksumming support, and am currently reading papers and code to grasp some knowledge about the topic. The thread is here: https://www.marc.info/?l=openbsd-tech&m=143447306012773&w=1 -- if you are quicker than me implementing it, then great! I'll probably switch to some other task in the OpenBSD domain. :-) Cheers, Karel
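Until such checksumming exists inside softraid, divergence between two mirror members can at least be detected from userland; a rough sketch, assuming two idle (unmounted) members at the hypothetical paths rsd0d and rsd1d. Note it only finds mismatches; without a stored checksum it cannot say which copy is the good one, which is exactly the gap an in-softraid checksum would close:

    #!/bin/sh
    # hash fixed-size chunks of both mirror members and compare
    CHUNK=65536
    i=0
    while :; do
        dd if=/dev/rsd0d of=/tmp/a bs=$CHUNK skip=$i count=1 2>/dev/null
        dd if=/dev/rsd1d of=/tmp/b bs=$CHUNK skip=$i count=1 2>/dev/null
        [ -s /tmp/a ] || break                # ran past end of device
        [ "$(sha256 -q /tmp/a)" != "$(sha256 -q /tmp/b)" ] &&
            echo "chunk $i differs"
        i=$((i + 1))
    done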
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
Chris Cappuccio, 19 Jun 2015 09:59: The problem identified in this article is _NOT_ TRIM support. It's QUEUED TRIM support. It's an exotic firmware feature that is BROKEN. Suffice to say, if Windows doesn't exercise an exotic feature in PC hardware, it may not be well tested by anybody! The author has clarified in the comments below the article that TRIM was the issue and not QUEUED TRIM. -f -- you have 2 choices for dinner -- take it or leave it.
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
Mikael [mikael.tr...@gmail.com] wrote: 2015-06-18 2:07 GMT+05:30 Gareth Nelson gar...@garethnelson.com: On point 3, hybrid SSD drives usually just present a standard IDE interface - just use a SATA controller and you don't need to worry about it No, I meant: you plug in a 2TB SSD and a 2TB magnetic HD, is there any way to make them properly mirror each other [so the SSD performance is delivered while the magnetic disk safeguards the contents] - would you use softraid here? You would do nightly backups. RAID 1 would limit your write performance to that of the HDD.
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
Karel Gardas [gard...@gmail.com] wrote: Honestly, with ~20% over-provisioning, once your SSD starts to shrink, it's already good enough to be put into the dustbin. The recent SSD endurance reviews on the review sites seem to show that it takes a long, long, long time before a modern SSD indicates that it has to remap blocks due to errors, except for the Samsung TLC drives, which operate for a long time in this state. Most drives appear to indicate few to no remaps until they are close to the end of their useful life. Another question is this buggy TRIM, but I'm afraid this may be a hard fight even with replication and checksumming filesystems (ZFS/HAMMER/BTRFS). The problem identified in this article is _NOT_ TRIM support. It's QUEUED TRIM support. It's an exotic firmware feature that is BROKEN. Suffice to say, if Windows doesn't exercise an exotic feature in PC hardware, it may not be well tested by anybody! Queued TRIM support is overkill. Regular TRIM support could be achieved by just telling the drive which blocks are to be zeroed during idle times with a stand-alone utility. That is the most reliable way to use TRIM on all drives with the current state of the art.
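The batched idle-time approach is not hypothetical elsewhere: Linux ships it as fstrim(8), a stand-alone utility that walks a mounted filesystem's free space and issues plain (non-queued) TRIM for it, typically run from cron. OpenBSD has no equivalent; this is purely an illustration of the idea, not an available command here:

    # Linux, not OpenBSD: discard all free blocks of the fs mounted at /var
    fstrim -v /var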
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
On Wed, Jun 17, 2015 at 8:27 PM, Nick Holland n...@holland-consulting.net wrote: been meaningless for some time). When the disk runs out of places to write the good data, it throws a permanent write error back to the OS and you have a really bad day. The only difference in this with SSDs is the amount of storage dedicated to this (be scared?). I'm guessing that spare space management is typically handled entirely within the drive and is not exposed as an API, right? In other words, you can't say to the drive: "you say you're out of spare space, but let's take this space here that I'm not using and use it as new spare space so I can keep using this drive with a reduced capacity."
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
On 06/19/15 13:38, andrew fabbro wrote: On Wed, Jun 17, 2015 at 8:27 PM, Nick Holland n...@holland-consulting.net wrote: been meaningless for some time). When the disk runs out of places to write the good data, it throws a permanent write error back to the OS and you have a really bad day. The only difference in this with SSDs is the amount of storage dedicated to this (be scared?). I'm guessing that spare space management is typically handled entirely within the drive and is not exposed as an API, right? right. Just like a magnetic disk. In other words, you can't say to the drive: "you say you're out of spare space, but let's take this space here that I'm not using and use it as new spare space so I can keep using this drive with a reduced capacity." right. Just like a magnetic disk. Really. Not much new here, just faster. Seems the more people try to do special things for SSDs, the more they get into trouble. Stop. Just treat the SSD as a really fast disk, and you will be happy. SSDs -- overall -- will probably last through the first life (about three years) of a computer, like rotating rust (i.e., some failures, but nothing too surprising). For recycled hw, well, let's see how it works out, but I very often replace disks anyway on recycled computers -- that once-huge 120G disk is not impressive any more, so the old disks end up in things that don't matter much. With the current crop of sub-US$100 2T disks, I wonder how long they will last, too. Nick.
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
On 2015-06-18, Nick Holland n...@holland-consulting.net wrote: The SSD has some number of spare storage blocks. When it finds a bad block, it locks out the bad block and swaps in a good block. Curiously -- this is EXACTLY how modern spinning rust hard disks have worked for about ... 20 years Easily 25, for SCSI disks. Now, in both cases, this is assuming the drive fails in the way you expect -- that the flaw will be spotted on immediate read-after-write, while the data is still in the disk's cache or buffer. There is more than one way magnetic disks fail, and there's more than one way SSDs fail. People tend to hyperventilate over the one way and forget all the rest. They also tend to forget that magnetic disks also corrupt data, or never write it, or write it to the wrong place on disk. Time to remind people of this great paper: An Analysis of Data Corruption in the Storage Stack https://www.usenix.org/legacy/events/fast08/tech/full_papers/bairavasundaram/bairavasundaram_html/index.html If nothing else, read section 2.3, Corruption Classes. It should scare the bejesus out of you. -- Christian naddy Weisgerber na...@mips.inka.de
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
On Thu, Jun 18, 2015 at 1:53 PM, Christian Weisgerber na...@mips.inka.de wrote: They also tend to forget that magnetic disks also corrupt data, or never write it, or write it to the wrong place on disk. Time to remind people of this great paper: An Analysis of Data Corruption in the Storage Stack https://www.usenix.org/legacy/events/fast08/tech/full_papers/bairavasundaram/bairavasundaram_html/index.html If nothing else, read section 2.3 Corruption Classes. It should scare the bejesus out of you. Nice text! I especially like 6.2 Lessons Learned, thanks for sharing! Karel
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
On Thursday, 18.06.2015, 02:15 +0530, Mikael wrote: 2015-06-18 2:07 GMT+05:30 Gareth Nelson gar...@garethnelson.com: No, I meant: you plug in a 2TB SSD and a 2TB magnetic HD, is there any way to make them properly mirror each other [so the SSD performance is delivered while the magnetic disk safeguards the contents] - would you use softraid here? No. If you use a RAID1, you'll get the performance of the worse of the two disks. Supporting multiple disks with different characteristics and getting the most out of them was AFAIK one of the motivations for Matthew Dillon to write HAMMER. -- David Dahlberg Fraunhofer FKIE, Dept. Communication Systems (KOM) | Tel: +49-228-9435-845 Fraunhoferstr. 20, 53343 Wachtberg, Germany | Fax: +49-228-856277
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
On Thu, Jun 18, 2015 at 9:08 AM, David Dahlberg david.dahlb...@fkie.fraunhofer.de wrote: On Thursday, 18.06.2015, 02:15 +0530, Mikael wrote: 2015-06-18 2:07 GMT+05:30 Gareth Nelson gar...@garethnelson.com: No, I meant: you plug in a 2TB SSD and a 2TB magnetic HD, is there any way to make them properly mirror each other [so the SSD performance is delivered while the magnetic disk safeguards the contents] - would you use softraid here? No. If you use a RAID1, you'll get the performance of the worse of the two disks. Supporting multiple disks with different characteristics and getting the most out of them was AFAIK one of the motivations for Matthew Dillon to write HAMMER. I'm not sure about RAID1 in general, but I've been reading the softraid code recently, and based on it I would claim that you get the write performance of the slowest drive (assuming OpenBSD schedules writes to the different drives in parallel), but read performance slightly higher than the slower drive, since reads are done in round-robin fashion, so the SSD will speed things up a little. Anyway, the interesting question is whether it makes sense to balance this interleaved reading based on actual drive performance. AFAIK this should be possible, but IMHO it'll not be that reliable, i.e. it'll not provide that much added reliability. Since reliability is my concern, I'm more looking forward to seeing a kind of virtual drive with block checksumming implemented in OpenBSD, which IMHO would provide some added reliability when run, for example, in a RAID1 setup. Karel
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
On 17/06/15 08:05, frantisek holop wrote: https://blog.algolia.com/when-solid-state-drives-are-not-that-solid/ also note the part relating to ext4: I have to admit, I slept better before reading the changelog. fast, features, reliable: pick any 2. -f I don't think TRIM is to blame here. I don't understand why anyone in their sane mind would use the latest versions of Ubuntu and Linux for servers. And yes, I know Ubuntu for Servers is a thing, and yes, I know they fight this instability with redundancy, but still... About EXT4: it is not exactly the most trustworthy filesystem there is. Interesting reading, though.
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
1) From the article, what can we see that Ext4/Linux actually did wrong here? - Is it that the TRUNCATE command should be abandoned completely, or was it how it matched supported/unsupported drives, or something else? Mariano was being a jerk by assuming it is a bug in ext4 or other code. Bringing his biases to the table, perhaps? The problem is simple and comes as no surprise to many: until quite recently, many SSD drives had bugs in their TRIM support, probably because TRIM was underutilized by operating systems. Even when operating systems use TRIM, they are rather cautious, because [extremely long explanation full of pain deleted]. So it is not Linux filesystem code. The phase of the moon is not linked to these problems either. 2) General on SSD: When an SSD starts to shrink because it starts to wear out, how is this handled and how does this appear to the OS, logs, and system software? Invisible. Even when a few drives make it visible in some way, it is highly proprietary.
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
Wait, just for my (and I guess some others') clarity, three questions here: 1) From the article, what can we see that Ext4/Linux actually did wrong here? - Is it that the TRUNCATE command should be abandoned completely, or was it how it matched supported/unsupported drives, or something else? 2) General on SSD: When an SSD starts to shrink because it starts to wear out, how is this handled and how does this appear to the OS, logs, and system software? 3) On OBSD, how would you generally suggest to make a magnetic-SSD hybrid disk setup where the SSD gives the speed and the magnetic storage gives the safety? Thanks! 2015-06-17 23:17 GMT+05:30 Mariano Ignacio Baragiola mari...@baragiola.com.ar: On 17/06/15 08:05, frantisek holop wrote: https://blog.algolia.com/when-solid-state-drives-are-not-that-solid/ also note the part relating to ext4: I have to admit, I slept better before reading the changelog. fast, features, reliable: pick any 2. -f I don't think TRIM is to blame here. I don't understand why anyone in their sane mind would use the latest versions of Ubuntu and Linux for servers. And yes, I know Ubuntu for Servers is a thing, and yes, I know they fight this instability with redundancy, but still... About EXT4: it is not exactly the most trustworthy filesystem there is. Interesting reading, though.
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
2015-06-18 2:07 GMT+05:30 Gareth Nelson gar...@garethnelson.com: On point 3, hybrid SSD drives usually just present a standard IDE interface - just use a SATA controller and you don't need to worry about it No, I meant: you plug in a 2TB SSD and a 2TB magnetic HD, is there any way to make them properly mirror each other [so the SSD performance is delivered while the magnetic disk safeguards the contents] - would you use softraid here?
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
Honestly, with ~20% over-provisioning, once your SSD starts to shrink, it's already good enough to be put into the dustbin. Another question is this buggy TRIM, but I'm afraid this may be a hard fight even with replication and checksumming filesystems (ZFS/HAMMER/BTRFS). Cheers, Karel On Wed, Jun 17, 2015 at 10:30 PM, Mikael mikael.tr...@gmail.com wrote: 2015-06-18 0:53 GMT+05:30 Theo de Raadt dera...@cvs.openbsd.org: 2) General on SSD: When an SSD starts to shrink because it starts to wear out, how is this handled and how does this appear to the OS, logs, and system software? Invisible. Even when a few drives make it visible in some way, it is highly proprietary. What then is the proper behavior for a program or system using an SSD, to deal with SSD degradation? Say you have a program altering a file's contents all the time, or you have file turnover on a system (rm f123; echo importantdata > f124). At some point the SSD will shrink and down the line reach zero capacity. The degradation process will be such that there will be no file content loss as long as the shrinking doesn't exceed the FS total file size, right? (Spontaneously I'd presume the SSD informs the OS of shrinking at sector write time through failing sector writes, and the OS registers the shrunk parts in the FS as broken sectors.) Will the SSD+OS reflect the degradation status in getfsstat(2) f_blocks / df(1) blocks, so the program can just ensure there's some f_bavail / avail all the time by simply shrinking (ftruncate etc.) its files accordingly, and when f_blocks is too small shut down completely? 3) On OBSD, how would you generally suggest to make a magnetic-SSD hybrid disk setup where the SSD gives the speed and the magnetic storage gives the safety?
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
If I wanted a setup like that I'd just use RAID; note the obvious - write performance will be the same (or possibly slightly slower due to the added RAID layer) --- "Lanie, I'm going to print more printers. Lots more printers. One for everyone. That's worth going to jail for. That's worth anything." - Printcrime by Cory Doctorow Please avoid sending me Word or PowerPoint attachments. See http://www.gnu.org/philosophy/no-word-attachments.html On Wed, Jun 17, 2015 at 9:45 PM, Mikael mikael.tr...@gmail.com wrote: 2015-06-18 2:07 GMT+05:30 Gareth Nelson gar...@garethnelson.com: On point 3, hybrid SSD drives usually just present a standard IDE interface - just use a SATA controller and you don't need to worry about it No, I meant: you plug in a 2TB SSD and a 2TB magnetic HD, is there any way to make them properly mirror each other [so the SSD performance is delivered while the magnetic disk safeguards the contents] - would you use softraid here?
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
2015-06-18 0:53 GMT+05:30 Theo de Raadt dera...@cvs.openbsd.org: 2) General on SSD: When an SSD starts to shrink because it starts to wear out, how is this handled and how does this appear to the OS, logs, and system software? Invisible. Even when a few drives make it visible in some way, it is highly proprietary. What then is the proper behavior for a program or system using an SSD, to deal with SSD degradation? Say you have a program altering a file's contents all the time, or you have file turnover on a system (rm f123; echo importantdata > f124). At some point the SSD will shrink and down the line reach zero capacity. The degradation process will be such that there will be no file content loss as long as the shrinking doesn't exceed the FS total file size, right? (Spontaneously I'd presume the SSD informs the OS of shrinking at sector write time through failing sector writes, and the OS registers the shrunk parts in the FS as broken sectors.) Will the SSD+OS reflect the degradation status in getfsstat(2) f_blocks / df(1) blocks, so the program can just ensure there's some f_bavail / avail all the time by simply shrinking (ftruncate etc.) its files accordingly, and when f_blocks is too small shut down completely? 3) On OBSD, how would you generally suggest to make a magnetic-SSD hybrid disk setup where the SSD gives the speed and the magnetic storage gives the safety?
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
On 06/17/15 16:30, Mikael wrote: 2015-06-18 0:53 GMT+05:30 Theo de Raadt dera...@cvs.openbsd.org: 2) General on SSD: When an SSD starts to shrink because it starts to wear out, how is this handled and how does this appear to the OS, logs, and system software? Invisible. Even when a few drives make it visible in some way, it is highly proprietary. What then is the proper behavior for a program or system using an SSD, to deal with SSD degradation? replace drive before it is an issue. Say you have a program altering a file's contents all the time, or you have file turnover on a system (rm f123; echo importantdata > f124). At some point the SSD will shrink and down the line reach zero capacity. That's not how it works. The SSD has some number of spare storage blocks. When it finds a bad block, it locks out the bad block and swaps in a good block. Curiously -- this is EXACTLY how modern spinning rust hard disks have worked for about ... 20 years (yeah. The pre-modern disks were more exciting). Write, verify, if error on verify, write to another storage block, remap the new block to the old logical location. Nothing new here (this is why people say that heads, cylinders and sectors per track have been meaningless for some time). When the disk runs out of places to write the good data, it throws a permanent write error back to the OS and you have a really bad day. The only difference in this with SSDs is the amount of storage dedicated to this (be scared?). Neither SSDs nor magnetic disks shrink to the outside world. The moment they need a replacement block that doesn't exist, the disk has lost data for you and you should call it failed...it has not shrunk. Now, in both cases, this is assuming the drive fails in the way you expect -- that the flaw will be spotted on immediate read-after-write, while the data is still in the disk's cache or buffer. There is more than one way magnetic disks fail, and there's more than one way SSDs fail. People tend to hyperventilate over the one way and forget all the rest. Run your SSDs in production servers for two or three years, then swap them out. That's about the warranty on the entire box. The people that believe in the manufacturer's warranty being the measure of suitability for production replace their machines then anyway. Zero your SSDs, give them to your staff to stick in their laptops or game computers, or use them for experimentation and dev systems after that. Don't hyperventilate over ONE mode of failure; the majority of your SSDs that fail will probably fail for other reasons. [snip] 3) On OBSD, how would you generally suggest to make a magnetic-SSD hybrid disk setup where the SSD gives the speed and the magnetic storage gives the safety? Hybrid disks are a specific thing (or a few specific things) -- magnetic disks with an SSD cache, or magnetic/SSD combos where the first X% of the disk is SSD and the rest is magnetic (or vice-versa, I guess, but I don't recall having seen that). SSD cache, you use like any other disk. Split mode, you use as multiple partitions, as appropriate. You clarified this to be about a totally different thing...mirroring an SSD with a rotating rust disk. At this point, most RAID systems I've seen do not support a preferred read device. Maybe they should start thinking about that. Maybe they shouldn't -- most applications that NEED the SSD performance for something other than single-user jollies (i.e., a database server vs. having your laptop boot faster) will face-plant severely should performance suddenly drop by an order of magnitude.
In many of these cases, the performance drops to the point that the system death-spirals as queries come in faster than they are answered. (This is why, when you have an imbalanced redundant pair of machines, the faster machine should always be the standby machine, not the primary. Sometimes "does the same job, just slower" is still quite effectively down.) Nick.
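A practical way to act on the "replace the drive before it is an issue" advice above is to watch the remap counters that drives do expose through SMART. A sketch, assuming smartmontools from ports and a drive at /dev/sd0c; attribute names vary by vendor:

    smartctl -A /dev/sd0c | egrep -i 'realloc|spare|wear'
    # steadily rising reallocated/spare-consumed counts are the cue to
    # rotate the drive out while it is still readable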
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
On point 3, hybrid SSD drives usually just present a standard IDE interface - just use a SATA controller and you don't need to worry about it --- "Lanie, I'm going to print more printers. Lots more printers. One for everyone. That's worth going to jail for. That's worth anything." - Printcrime by Cory Doctorow Please avoid sending me Word or PowerPoint attachments. See http://www.gnu.org/philosophy/no-word-attachments.html On Wed, Jun 17, 2015 at 8:15 PM, Mikael mikael.tr...@gmail.com wrote: Wait, just for my (and I guess some others') clarity, three questions here: 1) From the article, what can we see that Ext4/Linux actually did wrong here? - Is it that the TRUNCATE command should be abandoned completely, or was it how it matched supported/unsupported drives, or something else? 2) General on SSD: When an SSD starts to shrink because it starts to wear out, how is this handled and how does this appear to the OS, logs, and system software? 3) On OBSD, how would you generally suggest to make a magnetic-SSD hybrid disk setup where the SSD gives the speed and the magnetic storage gives the safety? Thanks! 2015-06-17 23:17 GMT+05:30 Mariano Ignacio Baragiola mari...@baragiola.com.ar: On 17/06/15 08:05, frantisek holop wrote: https://blog.algolia.com/when-solid-state-drives-are-not-that-solid/ also note the part relating to ext4: I have to admit, I slept better before reading the changelog. fast, features, reliable: pick any 2. -f I don't think TRIM is to blame here. I don't understand why anyone in their sane mind would use the latest versions of Ubuntu and Linux for servers. And yes, I know Ubuntu for Servers is a thing, and yes, I know they fight this instability with redundancy, but still... About EXT4: it is not exactly the most trustworthy filesystem there is. Interesting reading, though.
Re: when SSDs are not so solid or why no TRIM support can be a good thing :)
Paranoia over SSDs messing up is why I got hybrid drives - still get a decent performance boost but all my data is on good old-fashioned magnetic platters --- "Lanie, I'm going to print more printers. Lots more printers. One for everyone. That's worth going to jail for. That's worth anything." - Printcrime by Cory Doctorow Please avoid sending me Word or PowerPoint attachments. See http://www.gnu.org/philosophy/no-word-attachments.html On Wed, Jun 17, 2015 at 12:05 PM, frantisek holop min...@obiit.org wrote: https://blog.algolia.com/when-solid-state-drives-are-not-that-solid/ also note the part relating to ext4: I have to admit, I slept better before reading the changelog. fast, features, reliable: pick any 2. -f -- think honk if you're a telepath.
when SSDs are not so solid or why no TRIM support can be a good thing :)
https://blog.algolia.com/when-solid-state-drives-are-not-that-solid/ also note the part relating to ext4: I have to admit, I slept better before reading the changelog. fast, features, reliable: pick any 2. -f -- think honk if you're a telepath.
Re: TRIM support?
On Tue, 20 Apr 2010 16:22:32 -0500 Marco Peereboom sl...@peereboom.us wrote: On Tue, Apr 20, 2010 at 03:01:58PM -0500, Marco Peereboom wrote: On Tue, Apr 20, 2010 at 03:48:23PM -0400, Ted Unangst wrote: On Tue, Apr 20, 2010 at 3:11 PM, Marco Peereboom sl...@peereboom.us wrote: And no, TRIM isn't supported. The problem is that we're copying the entire disk, so, as far as the disk (i.e., SSDs) is aware, that disk is 100% full-- all blocks are marked as used even if they're empty. If I understand correctly, how the controller handles block reallocation in this scenario depends how it is You are. The whole not write so often is really really really uninteresting. It's not about writing too often, it's about the performance hit of doing a read/modify/write when there are no free blocks. Like the 4k sector problem, but potentially even worse. On the other hand, it depends on how much writing your server will do in service. If you aren't writing large files, you won't notice much difference, and the benefit of ultra fast random access is more than worth it. I am 100% unconvinced. I *was* 100% unconvinced. I am much better educated now. Yes this could be neat :-) Heck, that didn't take long. ;) The impact of read/modify/write can be significant on SSDs, and performance of these devices degrades over time/use. The percentage of the over-time degradation attributed to the read/modify/write issue is typically unknown, so TRIM just helps but doesn't solve the whole enchilada. Unfortunately, just getting TRIM support implemented in the low levels doesn't solve the entire problem; you also have to teach the filesystems to use it. At this point, even if TRIM was supported, using currently available SSDs on a mail server with tons of small writes means you're doomed to constantly babysitting the performance of the system, since it will degrade. You might be better off in the long run using multiple rotating disks that are half as fast and half the price, but won't degrade. jcr -- The OpenBSD Journal - http://www.undeadly.org
Re: TRIM support?
On Tue, Apr 20, 2010 at 1:52 PM, Chris Dukes pak...@pr.neotoma.org wrote: Just rethink your deployment strategy to not use 'dd'. Even Windows cloning systems stopped trying to copy all bits on the disk 6+ years ago. 'dd' made some sense when the disk was mostly full and there was a huge penalty to keep seeking between data and metadata. 'dd' continues to make sense if you need to make a copy of everything before attempting to recover data or metadata. OSX idiocy FYI, Mac OS X still benefits from 'dd' because of arguably idiotic metadata. Trying to get files on an HFS+ volume to remain intact while copying to/from a non-HFS+ environment is the stuff of nightmares. Even if you think you have properly retained all that annoying metadata, you'll still have to extract it and test it under whatever application it was that created that metadata. Often, just linefeed conversion (or lack thereof) will break OSX applications. /OSX idiocy If you want to backup/restore data (for the purposes of imaging and deployment) you really should be using some kind of tar solution. rsync is nice, but tar works well with everything you'll have in sys and userland. Tar versions vary, but I disciplined myself from the start to understand the details of OpenBSD tar. I then use a -I /tmp/includes file for specialization. As long as you are creating and extracting your data sources from within OpenBSD (bsd.rd perhaps), tar should do the trick. Using the built-in 'make release' is even more fun.
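For the curious, the include-file trick sketched out (paths are examples, untested):

    # /tmp/includes lists one path per line: ./etc, ./home, site files, ...
    cd / && tar -czf /san/base.tgz -I /tmp/includes
    # restore onto a freshly newfs'd disk mounted at /mnt:
    cd /mnt && tar -xzpf /san/base.tgz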
Re: TRIM support?
On Tue, Apr 20, 2010 at 11:03:30PM -0700, J.C. Roberts wrote: degrade. You might be better off in the long run using multiple rotating disks that are half as fast, and half the price, but won't degrade. It's my understanding that if you have a decent SSD, write response times can (under some workloads) degrade, but never below the performance of even a very fast rotating disk. So, I've stopped worrying about things like TRIM and trying to avoid writes. I'll align my partitions, but apart from that I just enjoy the extremely low read response times, and my almost-always-quite-low write response times. :) -- Jurjen Oskam Savage's Law of Expediency: You want it bad, you'll get it bad.
Re: TRIM support?
Daniel Barowy [dbar...@barowy.net] wrote: The problem is that we're copying the entire disk, so, as far as the disk (i.e., SSDs) is aware, that disk is 100% full-- all blocks are To make your deployment faster, just use fdisk, disklabel, newfs to setup the disk and tar to copy the files. That's a much smarter/faster way to go, even with TRIM support. I automated it in 'growimg' for flashdist/flashrd, for instance. Of course that assumes you only have one disklabel partition, but your servers are probably more complicated than that.
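Spelled out, that flow is only a handful of commands; a sketch under the one-partition assumption above, with sd1 as a hypothetical target disk:

    fdisk -iy sd1                     # MBR with one OpenBSD partition
    disklabel -E sd1                  # add a single 'a' partition, type 4.2BSD
    newfs /dev/rsd1a
    mount /dev/sd1a /mnt
    (cd /mnt && tar -xzpf /san/base.tgz)   # only real files get written
    # plus installboot if the disk must boot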
Re: TRIM support?
On Wed, 21 Apr 2010, Chris Cappuccio wrote: To make your deployment faster, just use fdisk, disklabel, newfs to setup the disk and tar to copy the files. That's a much smarter/faster way to go, even with TRIM support. I automated it in 'growimg' for flashdist/flashrd, for instance. Of course that assumes you only have one disklabel partition, but your servers are probably more complicated than that. The nice thing about dd is that it is simple-- you can set up a system with a shell one-liner and after the reboot, just change a few config files. The idea is that novice administrators on our staff could get something up and running quickly. The reality is that our novice administrators rarely do any real server deployment-- it's really just me and another guy-- so when it comes down to it, this is just a time-saving measure for us. The genesis of it was from doing this with CF onto Soekris or other SBCs where actually doing an install directly onto the CF is a PITA (do the install in a VM, then dd the VM's disk). It doesn't need to be simple as long as it saves time and errors. So I think I will indeed look into ideas like yours. I like Chris Dukes' suggestion to replace baseXX.tgz and use the installer, since unlike Soekris boxes, these machines have CD readers and video hardware. Dan
Re: TRIM support?
On Wed, Apr 21, 2010 at 01:52:58PM -0400, Daniel Barowy wrote: The reality is that our novice administrators rarely do any real server deployment-- it's really just me and another guy-- so when it comes down to it, this is just a time-saving measure for us. The genesis of it was from doing this with CF onto Soekris or other SBCs where actually doing an install directly onto the CF is a PITA (do the install in a VM, then dd the VM's disk). It doesn't need to be simple as long as it saves time and errors. So I think I will indeed look into ideas like yours. I like Chris Dukes' suggestion to replace baseXX.tgz and use the installer, since unlike Soekris boxes, these machines have CD readers and video hardware. I also keep toying with the idea of an ugly little 'C' program that reads the source page by page. If the page is null, it does a seek instead of a write on the target. As for your soekris deploy method, I wonder if just telling the VM that the block device for the CF is the disk would suffice. -- Chris Dukes
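That read-page/seek-on-null idea already exists in GNU dd (not the base OpenBSD dd) as conv=sparse, for what it's worth:

    # GNU coreutils dd only: seek over all-zero blocks instead of writing
    # them; note skipped blocks keep whatever the target held before, so
    # start from a zeroed (or factory-fresh) drive
    dd if=base.img of=/dev/rsd1c bs=1M conv=sparse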
Re: TRIM support?
On 2010-04-21, Daniel Barowy dbar...@barowy.net wrote: On Wed, 21 Apr 2010, Chris Cappuccio wrote: To make your deployment faster, just use fdisk, disklabel, newfs to setup the disk and tar to copy the files. That's a much smarter/faster way to go, even with TRIM support. I automated it in 'growimg' for flashdist/flashrd, for instance. Of course that assumes you only have one disklabel partition, but your servers are probably more complicated than that. The nice thing about dd is that it is simple-- you can set up a system with a shell one-liner and after the reboot, just change a few config files. The idea is that novice administrators on our staff could get something up and running quickly. It's simpler, until your CF vendor decides to change to a slightly smaller device. And then you wish you spent the extra 2 minutes writing a quick shell script to untar things rather than dd, or a site*.tgz for the installer.. The reality is that our novice administrators rarely do any real server deployment-- it's really just me and another guy-- so when it comes down to it, this is just a time-saving measure for us. The genesis of it was from doing this with CF onto Soekris or other SBCs where actually doing an install directly onto the CF is a PITA (do the install in a VM, then dd the VM's disk). It doesn't need to be simple as long as it saves time and errors. So I think I will indeed look into ideas like yours. I like Chris Dukes' suggestion to replace baseXX.tgz and use the installer, since unlike Soekris boxes, these machines have CD readers and video hardware. Sounds like you *really* need to learn pxeboot(8). It's very very straightforward on OpenBSD.
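A sketch of how small the pxeboot(8) setup is, with paths and the server address as examples (see pxeboot(8) and dhcpd.conf(5) for the real details):

    # on the install server
    cp /usr/mdec/pxeboot bsd.rd /tftpboot/
    mkdir -p /tftpboot/etc
    echo 'boot tftp:/bsd.rd' > /tftpboot/etc/boot.conf
    # dhcpd.conf, inside the subnet declaration:
    #     filename "pxeboot";
    #     next-server 10.0.0.1;
    # start dhcpd and tftpd, then netboot the client into the installer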
Re: TRIM support?
On Wed, 21 Apr 2010 18:40:10 +0200 Jurjen Oskam jur...@stupendous.org wrote: On Tue, Apr 20, 2010 at 11:03:30PM -0700, J.C. Roberts wrote: degrade. You might be better off in the long run using multiple rotating disks that are half as fast, and half the price, but won't degrade. It's my understanding that if you have a decent SSD, write response times can (under some workloads) degrade, but never below the performance of even a very fast rotating disk. So, I've stopped worrying about things like TRIM and trying to avoid writes. I'll align my partitions, but apart from that I just enjoy the extremely low read response times, and my almost-always-quite-low write response times. :) I agree. If your application does not require pushing things to their limits, TRIM and degradation can be a non-issue. When you have the budget to over-allocate with SSDs to exceed your requirements (e.g. a plan to compensate for increased demand as well as degradation over time), then they are an excellent choice. The problems only arise when you don't have the budget, don't understand your workload, don't do adequate testing, and don't plan for the eventual degradation. Since all of these issues except degradation also affect rotating storage, it's mostly a planning problem where SSDs just add other variables to the analysis. The trouble is SSD vendors are not particularly forthcoming about the limitations, so they are not well understood by integrators and this can result in poor planning. jcr -- The OpenBSD Journal - http://www.undeadly.org
TRIM support?
Hello, Anyone know the status/plans of TRIM support in OpenBSD? I poked around a bit in ahci.c and scsi.c, but nothing pops out at me (I also don't really know what I'm looking for). Thanks, Dan
Re: TRIM support?
What problem are you trying to solve? And no, TRIM isn't supported. On Tue, Apr 20, 2010 at 01:58:30PM -0400, Daniel Barowy wrote: Hello, Anyone know the status/plans of TRIM support in OpenBSD? I poked around a bit in ahci.c and scsi.c, but nothing pops out at me (I also don't really know what I'm looking for). Thanks, Dan
Re: TRIM support?
On Tue, 20 Apr 2010, Marco Peereboom wrote: What problem are you trying to solve? And no, TRIM isn't supported. My concern is the procedure we've been using to deploy OpenBSD machines. We set up a base machine with a standard disk layout, utilities, admin account, etc... and then make a copy of the entire disk using dd. We save this on our SAN, and when we want a new machine, simply pull a disk off the shelf, copy the image to the disk, boot, then customize. The problem is that we're copying the entire disk, so, as far as the disk (i.e., SSDs) is aware, that disk is 100% full-- all blocks are marked as used even if they're empty. If I understand correctly, how the controller handles block reallocation in this scenario depends how it is implemented in the disk's firmware, with some being better than others. At present, we have Intel X25-E disks. So, if the above is correct, then I will need to either rethink our deployment strategy (like, always leave some space on the disk, untouched by dd), or else try not to write so often (like, using a ramdisk). I could also be overestimating the importance of all of this. Thanks, Dan
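Concretely, the clone is a one-liner in each direction (device and SAN path are examples), which is also why every block, empty or not, ends up looking used to the SSD:

    # capture the golden image
    dd if=/dev/rsd0c of=/san/base.img bs=1m
    # stamp it onto a fresh disk
    dd if=/san/base.img of=/dev/rsd0c bs=1m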
Re: TRIM support?
On Tue, Apr 20, 2010 at 02:56:11PM -0400, Daniel Barowy wrote: On Tue, 20 Apr 2010, Marco Peereboom wrote: What problem are you trying to solve? And no, TRIM isn't supported. My concern is the procedure we've been using to deploy OpenBSD machines. We set up a base machine with a standard disk layout, utilities, admin account, etc... and then make a copy of the entire disk using dd. We save this on our SAN, and when we want a new machine, simply pull a disk off the shelf, copy the image to the disk, boot, then customize. The problem is that we're copying the entire disk, so, as far as the disk (i.e., SSDs) is aware, that disk is 100% full-- all blocks are marked as used even if they're empty. If I understand correctly, how the controller handles block reallocation in this scenario depends how it is implemented in the disk's firmware, with some being better than others. At present, we have Intel X25-E disks. So, if the above is correct, then I will need to either rethink our deployment strategy (like, always leave some space on the disk, untouched by dd), or else try not to write so often (like, using a ramdisk). I could also be overestimating the importance of all of this. You are. The whole not write so often is really really really uninteresting. Thanks, Dan
Re: TRIM support?
On Tue, Apr 20, 2010 at 3:11 PM, Marco Peereboom sl...@peereboom.us wrote: And no, TRIM isn't supported. The problem is that we're copying the entire disk, so, as far as the disk (i.e., SSDs) is aware, that disk is 100% full-- all blocks are marked as used even if they're empty. If I understand correctly, how the controller handles block reallocation in this scenario depends how it is You are. The whole not write so often is really really really uninteresting. It's not about writing too often, it's about the performance hit of doing a read/modify/write when there are no free blocks. Like the 4k sector problem, but potentially even worse. On the other hand, it depends on how much writing your server will do in service. If you aren't writing large files, you won't notice much difference, and the benefit of ultra fast random access is more than worth it.
Re: TRIM support?
On Tue, Apr 20, 2010 at 02:56:11PM -0400, Daniel Barowy wrote: The problem is that we're copying the entire disk, so, as far as the disk (i.e., SSDs) is aware, that disk is 100% full-- all blocks are marked as used even if they're empty. If I understand correctly, how the controller handles block reallocation in this scenario depends how it is implemented in the disk's firmware, with some being better than others. At present, we have Intel X25-E disks. Err, just how frequently are you doing this? The answer is going to change a bit if you're doing this infrequently vs. doing this as part of manufacturing turnkey boxes. I am going to assume the former, not the latter. If you don't want as many blocks to appear as used, write to fewer blocks. I.e. partition it, slice it, mkfs it, and restore from a tarball. You can even put your gzipped tarball of the base system where the installer expects to find base##.tgz and tell it to only install your tarball. So, if the above is correct, then I will need to either rethink our deployment strategy (like, always leave some space on the disk, untouched by dd), or else try not to write so often (like, using a ramdisk). I could also be overestimating the importance of all of this. Just rethink your deployment strategy to not use 'dd'. Even Windows cloning systems stopped trying to copy all bits on the disk 6+ years ago. 'dd' made some sense when the disk was mostly full and there was a huge penalty to keep seeking between data and metadata. 'dd' continues to make sense if you need to make a copy of everything before attempting to recover data or metadata. -- Chris Dukes
Re: TRIM support?
On Tue, Apr 20, 2010 at 03:48:23PM -0400, Ted Unangst wrote: On Tue, Apr 20, 2010 at 3:11 PM, Marco Peereboom sl...@peereboom.us wrote: And no, TRIM isn't supported. The problem is that we're copying the entire disk, so, as far as the disk (i.e., SSDs) is aware, that disk is 100% full-- all blocks are marked as used even if they're empty. If I understand correctly, how the controller handles block reallocation in this scenario depends how it is You are. The whole not write so often is really really really uninteresting. It's not about writing too often, it's about the performance hit of doing a read/modify/write when there are no free blocks. Like the 4k sector problem, but potentially even worse. On the other hand, it depends on how much writing your server will do in service. If you aren't writing large files, you won't notice much difference, and the benefit of ultra fast random access is more than worth it. I am 100% unconvinced.
Re: TRIM support?
On Tue, 20 Apr 2010, Ted Unangst wrote: It's not about writing too often, it's about the performance hit doing a read/modify/write when there's no free blocks. Like the 4k sector problem, but potentially even worse. On the other hand, it depends on how much writing your server will do in service. If you aren't writing large files, you won't notice much difference, and the benefit of ultra fast random access is more than worth it. Right now, the machines I am working on are mail gateways. They'll need to do frequent small writes as mail is shuffled between various queues. As long as we keep up with incoming mail, we're fine-- this is less of an issue now that spamd turns away most connections before they submit any data for processing. We were looking for a general answer, though, since the same strategy is used to deploy machines for other purposes (databases, web servers, routers, etc), although any application that requires lots of storage will probably get a big disk (or more likely, NFS to a big disk) specifically for that purpose. Thanks for the answers, everyone. I have some good ideas to look into. Dan
Re: TRIM support?
On Tue, Apr 20, 2010 at 03:01:58PM -0500, Marco Peereboom wrote: On Tue, Apr 20, 2010 at 03:48:23PM -0400, Ted Unangst wrote: On Tue, Apr 20, 2010 at 3:11 PM, Marco Peereboom sl...@peereboom.us wrote: And no, TRIM isn't supported. The problem is that we're copying the entire disk, so, as far as the disk (i.e., SSDs) is aware, that disk is 100% full-- all blocks are marked as used even if they're empty. If I understand correctly, how the controller handles block reallocation in this scenario depends how it is You are. The whole not write so often is really really really uninteresting. It's not about writing too often, it's about the performance hit of doing a read/modify/write when there are no free blocks. Like the 4k sector problem, but potentially even worse. On the other hand, it depends on how much writing your server will do in service. If you aren't writing large files, you won't notice much difference, and the benefit of ultra fast random access is more than worth it. I am 100% unconvinced. I *was* 100% unconvinced. I am much better educated now. Yes this could be neat :-)
Re: TRIM support?
On 21/04/2010, at 3:58 AM, Daniel Barowy wrote: Hello, Anyone know the status/plans of TRIM support in OpenBSD? I poked around a bit in ahci.c and scsi.c, but nothing pops out at me (I also don't really know what I'm looking for). the status of TRIM support is that there is none. i have no plans currently, though that could change if i ever get gear that would make good use of it. tweaking the scsi and atascsi layers to support unmap and trim is simple, but making the block and fs layers make use of it would be interesting. dlg
Re: TRIM support?
On Tue, Apr 20, 2010 at 19:51, David Gwynne l...@animata.net wrote: On 21/04/2010, at 3:58 AM, Daniel Barowy wrote: Hello, Anyone know the status/plans of TRIM support in OpenBSD? I poked around a bit in ahci.c and scsi.c, but nothing pops out at me (I also don't really know what I'm looking for). the status of TRIM support is that there is none. i have no plans currently, though that could change if i ever get gear that would make good use of it. tweaking the scsi and atascsi layers to support unmap and trim is simple, but making the block and fs layers make use of it would be interesting. dlg looks like the new version of clonezilla supports OpenBSD...