Re: [zfs-discuss] SSD vs "hybrid" drive - any advice?
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Erik Trimble
>
> SSDs do Garbage Collection when the controller has spare cycles. I'm not
> certain if there is a time factor (i.e. is it periodic, or just when
> there's time in the controller's queue). So, theoretically, TRIM helps
> GC when the drive is at low utilization, but not when the SSD is under
> significant load. Under high load, the SSD doesn't have the luxury of
> searching the NAND for "unused" blocks, aggregating them, writing them
> to a new page, and then erasing the old location. It has to allocate
> stuff NOW, so it goes right to the dreaded read-modify-erase-write
> cycle. Even under high load, the SSD can "process" the TRIM request
> (i.e. it will mark a block as unused), but that's not useful until a GC
> is performed (unless you are so lucky as to mark an *entire* page as
> unused), so it doesn't really matter. The GC run is what "fixes" the
> NAND allocation, not the TRIM command itself.

This is really all a device-specific detail. If the SSD controller is
designed to do so, it can erase a block on one bank while simultaneously
performing a read-modify-erase-write (or any other operation) on another
bank. Again, I don't know if, when, or which drives are built with that
design, but I know it's not architecturally impossible. In that case,
TRIM would be useful even for devices working under maximum load.
Re: [zfs-discuss] SSD vs "hybrid" drive - any advice?
On 7/25/2011 8:03 AM, Orvar Korvar wrote:
> "There is at least a common perception (misperception?) that devices
> cannot process TRIM requests while they are 100% busy processing other
> tasks."
>
> Just to confirm: SSD disks can do TRIM while processing other tasks?
>
> I heard that Illumos is working on TRIM support for ZFS and will
> release something soon. Does anyone know more?

SSDs do Garbage Collection when the controller has spare cycles. I'm not
certain if there is a time factor (i.e. is it periodic, or just when
there's time in the controller's queue). So, theoretically, TRIM helps
GC when the drive is at low utilization, but not when the SSD is under
significant load. Under high load, the SSD doesn't have the luxury of
searching the NAND for "unused" blocks, aggregating them, writing them
to a new page, and then erasing the old location. It has to allocate
stuff NOW, so it goes right to the dreaded read-modify-erase-write
cycle. Even under high load, the SSD can "process" the TRIM request
(i.e. it will mark a block as unused), but that's not useful until a GC
is performed (unless you are so lucky as to mark an *entire* page as
unused), so it doesn't really matter. The GC run is what "fixes" the
NAND allocation, not the TRIM command itself.

I can't speak for the ZFS developers as to TRIM support. I *believe*
this would have to happen both at the device level and the filesystem
level, but I certainly could be wrong. (Illumos currently supports TRIM
in the SATA framework - not sure about the SAS framework.)

--
Erik Trimble
Java Platform Group Infrastructure
Mailstop: usca22-317    Phone: x67195
Santa Clara, CA
Timezone: US/Pacific (UTC-0800)
Re: [zfs-discuss] SSD vs "hybrid" drive - any advice?
On 7/25/2011 4:49 AM, joerg.schill...@fokus.fraunhofer.de wrote:
> Erik Trimble wrote:
>> On 7/25/2011 3:32 AM, Orvar Korvar wrote:
>>> How long have you been using a SSD? Do you see any performance
>>> decrease? I mean, ZFS does not support TRIM, so I wonder about long
>>> term effects...
>>
>> Frankly, for the kind of use that ZFS puts on a SSD, TRIM makes no
>> impact whatsoever.
>>
>> TRIM is primarily useful for low-volume changes - that is, for a
>> filesystem that generally has few deletes over time (i.e. the rate of
>> change is low).
>>
>> Using a SSD as a ZIL or L2ARC device puts a very high write load on
>> the device (even as an L2ARC, there is a considerably higher write
>> load than "typical" filesystem use). SSDs in such a configuration
>> can't really make use of TRIM, and depend on the internal SSD
>> controller's block re-allocation algorithms to improve block layout.
>>
>> Now, if you're using the SSD as primary media (i.e. in place of a
>> hard drive), there is a possibility that TRIM could help. I honestly
>> can't be sure that it would help, however, as ZFS's Copy-on-Write
>> nature means that it tends to write entire pages of blocks, rather
>> than just small blocks. Which is fine from the SSD's standpoint.
>
> Writing to an SSD is: clear + write + verify.
>
> As the SSD cannot know that the rewritten blocks have been unused for
> a while, it cannot handle the clear operation at a time when there is
> no interest in the block; the TRIM command is needed to give this
> knowledge to the SSD.
>
> Jörg

Except that in many cases with ZFS, that data is irrelevant by the time
it can be used, or is much less useful than with other filesystems.
Copy-on-Write tends to end up with whole SSD pages of blocks being
rendered "unused", rather than individual blocks inside pages. So, the
SSD can often avoid the read-erase-modify-write cycle, and just do
erase-write instead.

TRIM *might* help somewhat when you have a relatively quiet ZFS
filesystem, but I'm not really convinced of how much of a benefit it
would be. As I've mentioned in other posts, ZIL and L2ARC are too "hot"
for TRIM to have any noticeable impact - the SSD is constantly being
used, and has no time for GC. It's stuck in the read-erase-modify-write
cycle even with TRIM.

--
Erik Trimble
Java Platform Group Infrastructure
Mailstop: usca22-317    Phone: x67195
Santa Clara, CA
Timezone: US/Pacific (UTC-0800)
Re: [zfs-discuss] SSD vs "hybrid" drive - any advice?
On 7/25/2011 4:28 AM, Tomas Ögren wrote:
> On 25 July, 2011 - Erik Trimble sent me these 2,0K bytes:
>> On 7/25/2011 3:32 AM, Orvar Korvar wrote:
>>> How long have you been using a SSD? Do you see any performance
>>> decrease? I mean, ZFS does not support TRIM, so I wonder about long
>>> term effects...
>>
>> Frankly, for the kind of use that ZFS puts on a SSD, TRIM makes no
>> impact whatsoever.
>>
>> TRIM is primarily useful for low-volume changes - that is, for a
>> filesystem that generally has few deletes over time (i.e. the rate of
>> change is low).
>>
>> Using a SSD as a ZIL or L2ARC device puts a very high write load on
>> the device (even as an L2ARC, there is a considerably higher write
>> load than "typical" filesystem use). SSDs in such a configuration
>> can't really make use of TRIM, and depend on the internal SSD
>> controller's block re-allocation algorithms to improve block layout.
>>
>> Now, if you're using the SSD as primary media (i.e. in place of a
>> hard drive), there is a possibility that TRIM could help. I honestly
>> can't be sure that it would help, however, as ZFS's Copy-on-Write
>> nature means that it tends to write entire pages of blocks, rather
>> than just small blocks. Which is fine from the SSD's standpoint.
>
> You still need the flash erase cycle.
>
>> On a related note: I've been using an OCZ Vertex 2 as my primary
>> drive in a laptop which runs Windows XP (no TRIM support). I haven't
>> noticed any drop-off in performance in the year it's been in service.
>> I'm doing typical productivity laptop-ish things (no compiling,
>> etc.), so it appears that the internal SSD controller is more than
>> smart enough to compensate even without TRIM.
>>
>> Honestly, I think TRIM isn't really useful for anyone. It took too
>> long to get pushed out to the OSes, and the SSD vendors seem to have
>> just compensated by making a smarter controller able to do better
>> reallocation. Which, to me, is the better idea, in any case.
>
> Bullshit. I just got an OCZ Vertex 3, and the first fill was
> 450-500MB/s. Second and subsequent fills are at half that speed. I'm
> quite confident that it's due to the flash erase cycle that's needed,
> and if stuff can be TRIMmed (and thus flash erased as well), speed
> would be regained.
>
> Overwriting a previously used block requires a flash erase, and if
> that can be done in the background when the timing is not critical,
> instead of just before you can actually write the block you want,
> performance will increase.
>
> /Tomas

I should have been more clear: I consider the "native" speed of an SSD
to be that which is obtained AFTER you've filled the entire drive once -
that is, once you've blown through the extra reserve NAND and are into
the full read/erase/write cycle. IMHO, that is the real sustained
performance of an SSD, not the bogus numbers reported by vendors.

TRIM is really only useful for drives which have a low enough load
factor to do background GC on unused blocks. For ZFS, that *might* be
the case when the SSD is used as primary backing store, but it certainly
isn't the case when it's used as a ZIL or L2ARC.

Even with TRIM, performance after a complete fill of the SSD will drop
noticeably, as the SSD has to do GC sometime. You might not notice it
right away given your usage pattern, but, with OR without TRIM, a "used"
SSD under load will perform the same.

--
Erik Trimble
Java Platform Group Infrastructure
Mailstop: usca22-317    Phone: x67195
Santa Clara, CA
Timezone: US/Pacific (UTC-0800)
Re: [zfs-discuss] SSD vs "hybrid" drive - any advice?
On 7/25/2011 6:43 AM, Edward Ned Harvey wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Erik Trimble
>>
>> Honestly, I think TRIM isn't really useful for anyone.
>
> I'm going to have to disagree. There are only two times when TRIM
> isn't useful:
>
> 1) Your demand of the system is consistently so low that it never adds
> up to anything meaningful... Basically, you always have free unused
> blocks, so adding more unused blocks to the pile doesn't matter at
> all; or you never bother to delete anything; or it's just a
> lightweight server processing requests where network latency greatly
> outweighs any disk latency, etc. In other words, your demand is very
> low.
>
> or
>
> 2) Your demand of the system is consistently so high that even with
> TRIM, the device would never be able to find any idle time to perform
> an erase cycle on blocks marked for TRIM.
>
> In case #2, it is at least theoretically possible for devices to
> become smart enough to process the TRIM block erasures in parallel
> even while there are other operations taking place simultaneously. I
> don't know if device mfgrs implement things that way today. There is
> at least a common perception (misperception?) that devices cannot
> process TRIM requests while they are 100% busy processing other tasks.
>
> Or your disk is always 100% full. I guess that makes 3 cases, but the
> 3rd one is esoteric.

What I'm saying is that #2 occurs all the time with ZFS, at least for a
ZIL or L2ARC device. TRIM is really only useful when the SSD has some
"downtime" in which to work. As a ZIL or L2ARC, the SSD *has* no pauses,
and can't usefully do GC in the background (which is what TRIM helps
with). Instead, what I've seen is that the increased "smarts" of the
new-generation SSD controllers do a better job of on-the-fly
reallocation.

--
Erik Trimble
Java Platform Group Infrastructure
Mailstop: usca22-317    Phone: x67195
Santa Clara, CA
Timezone: US/Pacific (UTC-0800)
Re: [zfs-discuss] Large scale performance query
> Even with a controller per JBOD, you'll be limited by the SAS
> connection. The 7k3000 has throughput from 115 - 150 MB/s, meaning
> each of your JBODs will be capable of 5.2 GB/sec - 6.8 GB/sec, roughly
> 10 times the bandwidth of a single SAS 6g connection. Use multipathing
> if you can to increase the bandwidth to each JBOD.

With (something like) an LSI 9211 and those SuperMicro babies I guess
he's planning on using, you'll have one quad-lane SAS2 cable to each
backplane/SAS expander, one in front and one in the back, giving a
theoretical 24Gbit/s (about 2.4GB/s) to each backplane. With a maximum
of 24 drives per backplane, this should probably suffice, since you'll
never get 150MB/s sustained from all drives at once.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to avoid
excessive use of idioms of foreign origin. In most cases, adequate and
relevant synonyms exist in Norwegian.
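As a rough sanity check on those numbers (a sketch only; it assumes
~600 MB/s of usable bandwidth per SAS2 lane after 8b/10b encoding, and
the 115-150 MB/s per-drive figures quoted above):

    # 45 drives per JBOD at roughly 115-150 MB/s sustained each:
    echo "JBOD aggregate, low:  $((45 * 115)) MB/s"   # 5175 MB/s ~= 5.2 GB/s
    echo "JBOD aggregate, high: $((45 * 150)) MB/s"   # 6750 MB/s ~= 6.8 GB/s
    # One x4 SAS2 wide port: 4 lanes x 6 Gbit/s, ~600 MB/s usable per lane:
    echo "x4 SAS2 wide port:    $((4 * 600)) MB/s"    # 2400 MB/s ~= 2.4 GB/s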
Re: [zfs-discuss] Large scale performance query
> Workloads:
>
> Mainly streaming compressed data. That is, pulling compressed data in
> a sequential manner; however, we could have multiple streams happening
> at once, making it somewhat random. We are hoping to have 5 clients
> pull 500Mbit sustained each.

That shouldn't be much of a problem with that number of drives. I have a
couple of smaller setups with 11x 7-drive raidz2, about 100TiB each, and
even they can handle a 2.5Gbps load.

> Considerations:
>
> The main reason RAIDZ3 was chosen was so we can distribute the parity
> across the JBOD enclosures. With this method, even if an entire JBOD
> enclosure is taken offline, the data is still accessible.

Sounds like a good idea to me.

> How to manage the physical locations of such a vast number of drives?
> I have read this (
> http://blogs.oracle.com/eschrock/entry/external_storage_enclosures_in_solaris
> ) and am hoping someone can shed some light on whether SES2 enclosure
> identification has worked for them? (The enclosures are SES2.)

Which enclosures will you be using? From the data you've posted, it
looks like SuperMicro, and AFAIK the ones we have don't support SES2.

> What kind of performance would you expect from this setup? I know we
> can multiply the base IOPS by 24, but what about max sequential
> read/write?

Parallel reads/writes from several clients will look like random I/O on
the server. If bandwidth is crucial, use RAID1+0.

Also, it looks to me like you're planning to fill up all the external
bays with data drives - where do you plan to put the root pool? If
you're looking at the SuperMicro SC847 line, there is indeed room for a
couple of 2.5" drives inside, but the chassis is screwed tightly
together and doesn't allow for opening during runtime. Also, those
drives are placed in a rather awkward slot.

If planning to use RAIDz, a couple of SSDs for the SLOG will help write
performance a lot, especially during scrub/resilver; see the sketch
below. For streaming, L2ARC won't be of much use, though.

Finally, a few spares won't hurt, even with redundancy levels as high as
RAIDz3.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to avoid
excessive use of idioms of foreign origin. In most cases, adequate and
relevant synonyms exist in Norwegian.
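To illustrate that SLOG suggestion, a minimal sketch; the pool name
"tank" and the device names are hypothetical placeholders, and the log
devices are mirrored so a single SLOG failure can't take writes down:

    # Add a mirrored SLOG to an existing pool (hypothetical device names):
    zpool add tank log mirror c4t0d0 c4t1d0
    # Verify that the log mirror shows up:
    zpool status tank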
Re: [zfs-discuss] Large scale performance query
On Sun, Jul 24, 2011 at 11:34 PM, Phil Harrison wrote:
> What kind of performance would you expect from this setup? I know we
> can multiply the base IOPS by 24, but what about max sequential
> read/write?

You should have a theoretical max close to 144x single-disk throughput.
Each raidz3 vdev has 6 "data drives" which can be read from
simultaneously, multiplied by your 24 vdevs. Of course, you'll hit your
controllers' limits well before that.

Even with a controller per JBOD, you'll be limited by the SAS
connection. The 7k3000 has throughput from 115 - 150 MB/s, meaning each
of your JBODs will be capable of 5.2 GB/sec - 6.8 GB/sec, roughly 10
times the bandwidth of a single SAS 6g connection. Use multipathing if
you can to increase the bandwidth to each JBOD.

Depending on the types of access that clients are performing, your cache
devices may not be any help. If the data is read multiple times by
multiple clients, then you'll see some benefit. If it's only being read
infrequently, or by one client, it probably won't help much at all.

That said, if your access is mostly sequential, then random access
latency shouldn't affect you too much, and you will still have more
bandwidth from your main storage pools than from the cache devices.

-B

--
Brandon High : bh...@freaks.com
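Working that estimate out explicitly (a sketch; the per-drive figures
are the 115-150 MB/s quoted above, and controller limits are ignored):

    # 24 raidz3 vdevs, each 9 drives = 6 data + 3 parity:
    echo "data spindles: $((24 * 6))"                      # 144
    echo "theoretical max, low:  $((24 * 6 * 115)) MB/s"   # 16560 MB/s
    echo "theoretical max, high: $((24 * 6 * 150)) MB/s"   # 21600 MB/s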
Re: [zfs-discuss] Large scale performance query
They don't go into too much detail on their setup, and they are not
running Solaris, but they do mention how their SATA cards see different
drives based on where they are placed.

They also have a second revision at
http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/
which talks about building their system with 135TB in a single 45-bay
4U box...

I am also interested in this kind of scale... Looking at the BackBlaze
box, I am thinking of building something like this, but not in one go...
So, anything you do find out in your build, keep us informed! :)

--Tiernan

On Mon, Jul 25, 2011 at 4:25 PM, Roberto Waltman wrote:
> Phil Harrison wrote:
>> Hi All,
>>
>> Hoping to gain some insight from some people who have done large
>> scale systems before? I'm hoping to get some performance estimates,
>> suggestions and/or general discussion/feedback.
>
> No personal experience, but you may find this useful:
>
> "Petabytes on a budget"
> http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/
>
> --
> Roberto Waltman

--
Tiernan O'Toole
blog.lotas-smartman.net
www.geekphotographer.com
www.tiernanotoole.ie
Re: [zfs-discuss] ZFS Mount Options
Tony MacDoodle wrote:
> I have a zfs pool called logs (about 200G). I would like to create 2
> volumes using this chunk of storage. However, they would have
> different mount points, i.e.
>
> 50G would be mounted as /oracle/logs
> 100G would be mounted as /session/logs
>
> is this possible?

Yes...

zfs create -o mountpoint=/oracle/logs logs/oracle
zfs create -o mountpoint=/session/logs logs/session

If you don't otherwise specify, the two filesystems will share the pool
without any constraints. If you wish to limit their max space...

zfs set quota=50g logs/oracle
zfs set quota=100g logs/session

and/or if you wish to reserve a minimum space...

zfs set reservation=50g logs/oracle
zfs set reservation=100g logs/session

> Do I have to use the legacy mount options?

You don't have to.

--
Andrew Gabriel
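A quick way to confirm the resulting layout (a sketch, using the pool
and dataset names from the example above):

    zfs list -o name,used,avail,quota,reservation,mountpoint -r logs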
[zfs-discuss] ZFS Mount Options
I have a zfs pool called logs (about 200G). I would like to create 2
volumes using this chunk of storage. However, they would have different
mount points, i.e.

50G would be mounted as /oracle/logs
100G would be mounted as /session/logs

is this possible? Do I have to use the legacy mount options?

Thanks
Re: [zfs-discuss] Large scale performance query
Phil Harrison wrote:
> Hi All,
>
> Hoping to gain some insight from some people who have done large scale
> systems before? I'm hoping to get some performance estimates,
> suggestions and/or general discussion/feedback.

No personal experience, but you may find this useful:

"Petabytes on a budget"
http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/

--
Roberto Waltman
Re: [zfs-discuss] Large scale performance query
Wow. If you ever finish this monster, I would really like to hear more
about the performance and how you connected everything. It could be
useful as a reference for anyone else building big stuff. *drool*
Re: [zfs-discuss] SSD vs "hybrid" drive - any advice?
"There is at least a common perception (misperception?) that devices cannot process TRIM requests while they are 100% busy processing other tasks." Just to confirm; SSD disks can do TRIM while processing other tasks? I heard that Illumos is working on TRIM support for ZFS and will release something soon. Anyone knows more? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Large scale performance query
Hi All,

Hoping to gain some insight from some people who have done large scale
systems before? I'm hoping to get some performance estimates,
suggestions and/or general discussion/feedback. I cannot discuss the
exact specifics of the purpose but will go into as much detail as I can.

Technical Specs:

216x 3TB 7k3000 HDDs
24x 9-drive RAIDZ3 vdevs
4x JBOD chassis (45 bay)
1x server (36 bay)
2x AMD 12-core CPUs
128GB ECC RAM
2x 480GB SSD cache
10Gbit NIC

Workloads:

Mainly streaming compressed data. That is, pulling compressed data in a
sequential manner; however, we could have multiple streams happening at
once, making it somewhat random. We are hoping to have 5 clients pull
500Mbit sustained each.

Considerations:

The main reason RAIDZ3 was chosen was so we can distribute the parity
across the JBOD enclosures. With this method, even if an entire JBOD
enclosure is taken offline, the data is still accessible.

Questions:

How to manage the physical locations of such a vast number of drives? I
have read this (
http://blogs.oracle.com/eschrock/entry/external_storage_enclosures_in_solaris
) and am hoping someone can shed some light on whether SES2 enclosure
identification has worked for them? (The enclosures are SES2.)

What kind of performance would you expect from this setup? I know we can
multiply the base IOPS by 24, but what about max sequential read/write?

Thanks,
Phil
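For a rough sense of the raw numbers in that spec (a sketch; raw
capacity before filesystem overhead, and aggregate demand assuming
500Mbit/s per client as stated above):

    # Capacity: 24 vdevs of 9x 3TB raidz3 (6 data + 3 parity per vdev):
    echo "raw capacity:    $((216 * 3)) TB"     # 648 TB
    echo "usable (pre-fs): $((24 * 6 * 3)) TB"  # 432 TB before overhead
    # Aggregate streaming demand: 5 clients at 500 Mbit/s each:
    echo "demand: $((5 * 500)) Mbit/s"          # 2500 Mbit/s ~= 312 MB/s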
Re: [zfs-discuss] SSD vs "hybrid" drive - any advice?
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Erik Trimble
>
> Honestly, I think TRIM isn't really useful for anyone.

I'm going to have to disagree. There are only two times when TRIM isn't
useful:

1) Your demand of the system is consistently so low that it never adds
up to anything meaningful... Basically, you always have free unused
blocks, so adding more unused blocks to the pile doesn't matter at all;
or you never bother to delete anything; or it's just a lightweight
server processing requests where network latency greatly outweighs any
disk latency, etc. In other words, your demand is very low.

or

2) Your demand of the system is consistently so high that even with
TRIM, the device would never be able to find any idle time to perform an
erase cycle on blocks marked for TRIM.

In case #2, it is at least theoretically possible for devices to become
smart enough to process the TRIM block erasures in parallel even while
there are other operations taking place simultaneously. I don't know if
device mfgrs implement things that way today. There is at least a common
perception (misperception?) that devices cannot process TRIM requests
while they are 100% busy processing other tasks.

Or your disk is always 100% full. I guess that makes 3 cases, but the
3rd one is esoteric.
Re: [zfs-discuss] Summary: Dedup memory and performance (again, again)
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Ian Collins
>
> Add to that: if running dedup, get plenty of RAM and cache.

Add plenty of RAM. And tweak your arc_meta_limit. You can at least get
dedup performance that's on the same order of magnitude as performance
without dedup.

Cache devices don't really help dedup very much, because each DDT entry
stored in ARC/L2ARC takes 376 bytes, and each reference to an L2ARC
entry requires 176 bytes of ARC. So in order to prevent an individual
DDT entry from being evicted to disk, you must either keep the 376 bytes
in ARC, or evict it to L2ARC and keep 176 bytes. This is a very small
payload. A good payload would be to evict a 128k block from ARC into
L2ARC, keeping only the 176 bytes in ARC.
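To put those byte counts in perspective, a rough sketch; the pool size
of 1TB of unique data at 128KB blocks is purely hypothetical, chosen for
illustration:

    # Hypothetical: 1 TB of unique data at 128KB average block size.
    blocks=$((1024 * 1024 * 1024 / 128))   # 1 TB / 128 KB = 8388608 blocks
    echo "DDT held in ARC:          $((blocks * 376 / 1024 / 1024)) MB"  # ~3008 MB
    echo "ARC refs if DDT in L2ARC: $((blocks * 176 / 1024 / 1024)) MB"  # ~1408 MB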
Re: [zfs-discuss] SSD vs "hybrid" drive - any advice?
Erik Trimble wrote:
> On 7/25/2011 3:32 AM, Orvar Korvar wrote:
>> How long have you been using a SSD? Do you see any performance
>> decrease? I mean, ZFS does not support TRIM, so I wonder about long
>> term effects...
>
> Frankly, for the kind of use that ZFS puts on a SSD, TRIM makes no
> impact whatsoever.
>
> TRIM is primarily useful for low-volume changes - that is, for a
> filesystem that generally has few deletes over time (i.e. the rate of
> change is low).
>
> Using a SSD as a ZIL or L2ARC device puts a very high write load on
> the device (even as an L2ARC, there is a considerably higher write
> load than "typical" filesystem use). SSDs in such a configuration
> can't really make use of TRIM, and depend on the internal SSD
> controller's block re-allocation algorithms to improve block layout.
>
> Now, if you're using the SSD as primary media (i.e. in place of a hard
> drive), there is a possibility that TRIM could help. I honestly can't
> be sure that it would help, however, as ZFS's Copy-on-Write nature
> means that it tends to write entire pages of blocks, rather than just
> small blocks. Which is fine from the SSD's standpoint.

Writing to an SSD is: clear + write + verify.

As the SSD cannot know that the rewritten blocks have been unused for a
while, it cannot handle the clear operation at a time when there is no
interest in the block; the TRIM command is needed to give this knowledge
to the SSD.

Jörg

--
EMail: jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       j...@cs.tu-berlin.de (uni)
       joerg.schill...@fokus.fraunhofer.de (work)
Blog:  http://schily.blogspot.com/
URL:   http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Re: [zfs-discuss] SSD vs "hybrid" drive - any advice?
On 25 July, 2011 - Erik Trimble sent me these 2,0K bytes:

> On 7/25/2011 3:32 AM, Orvar Korvar wrote:
>> How long have you been using a SSD? Do you see any performance
>> decrease? I mean, ZFS does not support TRIM, so I wonder about long
>> term effects...
>
> Frankly, for the kind of use that ZFS puts on a SSD, TRIM makes no
> impact whatsoever.
>
> TRIM is primarily useful for low-volume changes - that is, for a
> filesystem that generally has few deletes over time (i.e. the rate of
> change is low).
>
> Using a SSD as a ZIL or L2ARC device puts a very high write load on
> the device (even as an L2ARC, there is a considerably higher write
> load than "typical" filesystem use). SSDs in such a configuration
> can't really make use of TRIM, and depend on the internal SSD
> controller's block re-allocation algorithms to improve block layout.
>
> Now, if you're using the SSD as primary media (i.e. in place of a hard
> drive), there is a possibility that TRIM could help. I honestly can't
> be sure that it would help, however, as ZFS's Copy-on-Write nature
> means that it tends to write entire pages of blocks, rather than just
> small blocks. Which is fine from the SSD's standpoint.

You still need the flash erase cycle.

> On a related note: I've been using an OCZ Vertex 2 as my primary drive
> in a laptop which runs Windows XP (no TRIM support). I haven't noticed
> any drop-off in performance in the year it's been in service. I'm
> doing typical productivity laptop-ish things (no compiling, etc.), so
> it appears that the internal SSD controller is more than smart enough
> to compensate even without TRIM.
>
> Honestly, I think TRIM isn't really useful for anyone. It took too
> long to get pushed out to the OSes, and the SSD vendors seem to have
> just compensated by making a smarter controller able to do better
> reallocation. Which, to me, is the better idea, in any case.

Bullshit. I just got an OCZ Vertex 3, and the first fill was
450-500MB/s. Second and subsequent fills are at half that speed. I'm
quite confident that it's due to the flash erase cycle that's needed,
and if stuff can be TRIMmed (and thus flash erased as well), speed would
be regained.

Overwriting a previously used block requires a flash erase, and if that
can be done in the background when the timing is not critical, instead
of just before you can actually write the block you want, performance
will increase.

/Tomas
--
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
`- Sysadmin at {cs,acc}.umu.se
Re: [zfs-discuss] Zil on multiple usb keys
On 07/23/11 04:57, Michael DeMan wrote:
> Generally, performance is going to be pretty bad as well - USB sticks
> are not made to be written to rapidly. They are entirely different
> animals from SSDs. I would not be surprised (but would be curious to
> know, if you still move forward on this) if you find performance even
> worse when trying to do this.

Back in the snv_120-ish era I tried this experiment on both my pool and
on a friend's. In both cases we were serving NFS (he was also doing
CIFS), which was mostly reads but also had periods where 1-2G of data
was rapidly added (uploading photos or videos) over the network.

With both a USB "flash drive" and a SanDisk Extreme IV CF card in a
CF->IDE enclosure, performance did not improve; in fact, in the case of
the CF card, the enclosure was buggy, such that the changes we had to
make to the ata config actually made it slower.

I removed the separate log device from both of those pools (by manual
hacking with specially built zfs kernel modules, because slog removal
didn't exist back then).

--
Darren J Moffat
Re: [zfs-discuss] SSD vs "hybrid" drive - any advice?
On 7/25/2011 3:32 AM, Orvar Korvar wrote:
> How long have you been using a SSD? Do you see any performance
> decrease? I mean, ZFS does not support TRIM, so I wonder about long
> term effects...

Frankly, for the kind of use that ZFS puts on a SSD, TRIM makes no
impact whatsoever.

TRIM is primarily useful for low-volume changes - that is, for a
filesystem that generally has few deletes over time (i.e. the rate of
change is low).

Using a SSD as a ZIL or L2ARC device puts a very high write load on the
device (even as an L2ARC, there is a considerably higher write load than
"typical" filesystem use). SSDs in such a configuration can't really
make use of TRIM, and depend on the internal SSD controller's block
re-allocation algorithms to improve block layout.

Now, if you're using the SSD as primary media (i.e. in place of a hard
drive), there is a possibility that TRIM could help. I honestly can't be
sure that it would help, however, as ZFS's Copy-on-Write nature means
that it tends to write entire pages of blocks, rather than just small
blocks. Which is fine from the SSD's standpoint.

On a related note: I've been using an OCZ Vertex 2 as my primary drive
in a laptop which runs Windows XP (no TRIM support). I haven't noticed
any drop-off in performance in the year it's been in service. I'm doing
typical productivity laptop-ish things (no compiling, etc.), so it
appears that the internal SSD controller is more than smart enough to
compensate even without TRIM.

Honestly, I think TRIM isn't really useful for anyone. It took too long
to get pushed out to the OSes, and the SSD vendors seem to have just
compensated by making a smarter controller able to do better
reallocation. Which, to me, is the better idea, in any case.

--
Erik Trimble
Java Platform Group Infrastructure
Mailstop: usca22-317    Phone: x67195
Santa Clara, CA
Timezone: US/Pacific (UTC-0800)
Re: [zfs-discuss] SSD vs "hybrid" drive - any advice?
How long have you been using a SSD? Do you see any performance decrease?
I mean, ZFS does not support TRIM, so I wonder about long term
effects...