Re: [zfs-discuss] zfs send/receive - actual performance
On 25 mars 2010, at 22:00, Bruno Sousa bso...@epinfante.com wrote:

> Hi, Indeed the 3 disks per vdev (raidz2) seems a bad idea... but it's the system I have now. Regarding the performance... let's assume that a bonnie++ benchmark could go to 200 MB/s in. Is getting the same values (or near) in a zfs send / zfs receive just a matter of putting, let's say, a 10GbE card between both systems? I have the impression that benchmarks are always synthetic, therefore live/production environments behave quite differently. Again, it might be just me, but with a 1 Gb link, being able to replicate 2 servers with an average speed above 60 MB/s does seem quite good. However, like I said, I would like to see results from other guys...

Don't forget to factor in your transport mechanism. If you're using ssh to pipe the send/recv data, your overall speed may end up being CPU bound, since I think that ssh will be single threaded; even on a multicore system you'll only be able to consume one core, and here raw clock speed will make a difference.

Cheers, Erik
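For illustration, one common way to take ssh (and its CPU cost) out of the path is to stream the send over a buffered TCP pipe such as mbuffer, assuming it is installed on both ends; hostnames, port, and dataset names below are hypothetical:

  # on the receiving host:
  mbuffer -s 128k -m 1G -I 9090 | zfs receive tank/backup
  # on the sending host:
  zfs send tank/fs@snap | mbuffer -s 128k -m 1G -O recvhost:9090

This is only a sketch; it trades ssh's encryption for throughput, so it is suitable only on a trusted LAN.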
Re: [zfs-discuss] zfs send/receive - actual performance
Hi, I think that in this case the CPU is not the bottleneck, since I'm not using ssh. However my 1 Gb network link probably is the bottleneck.

Bruno

On 26-3-2010 9:25, Erik Ableson wrote:
> [...]
> Don't forget to factor in your transport mechanism. If you're using ssh to pipe the send/recv data your overall speed may end up being CPU bound since I think that ssh will be single threaded so even on a multicore system, you'll only be able to consume one core and here raw clock speed will make a difference. Cheers, Erik
Re: [zfs-discuss] zfs send/receive - actual performance
Hi, The jumbo frames in my case give me a boost of around 2 MB/s, so it's not that much. Now I will play with link aggregation and see how it goes, and of course I'm counting on incremental replication being slower... but since the amount of data would be much less, it will probably still deliver good performance. And what a relief to know that I'm not alone when I say that storage management is part science, part art and part voodoo magic ;)

Cheers, Bruno

On 25-3-2010 23:22, Ian Collins wrote:
> On 03/26/10 10:00 AM, Bruno Sousa wrote: [Boy, top-posting sure mucks up threads!]
>> [...] The possibility of getting the same values (or near) in a zfs send / zfs receive is just a matter of putting, let's say, a 10GbE card between both systems?
> Maybe, or a 2x1G LAG would be more cost effective (and easier to check!). The only way to know for sure is to measure. I managed to get slightly better transfers by enabling jumbo frames.
>> I have the impression that benchmarks are always synthetic, therefore live/production environments behave quite differently.
> Very true, especially in the black arts of storage management!
>> Again, it might be just me, but with a 1 Gb link being able to replicate 2 servers with an average speed above 60 MB/s does seem quite good.
> As I said, the results are typical for a 1G link. Don't forget you are measuring full copies; incremental replications may well be significantly slower.
> -- Ian.
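As a sketch of what the link-aggregation experiment might look like on OpenSolaris (interface names are hypothetical, and LACP must also be configured on the switch):

  dladm create-aggr -l e1000g0 -l e1000g1 aggr0
  dladm set-linkprop -p mtu=9000 aggr0    # jumbo frames on the aggregation

Note that a single TCP stream such as one zfs send will still ride one link of the aggregation, so this mostly helps when several replications run in parallel.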
Re: [zfs-discuss] RAIDZ2 configuration
> Using fewer than 4 disks in a raidz2 defeats the purpose of raidz2, as you will always be in a degraded mode.

Freddie, are you nuts? This is false. Sure, you can use raidz2 with 3 disks in it. But it does seem pointless to do that instead of a 3-way mirror.
Re: [zfs-discuss] RAIDZ2 configuration
> Coolio. Learn something new everyday. One more way that raidz is different from RAID5/6/etc.

Freddie, again, you're wrong. Yes, it's perfectly acceptable to create either raid-5 or raidz using 2 disks. It's not degraded, but it does seem pointless to do this instead of a mirror. Likewise, it's perfectly acceptable to create a raid-6 or raid-dp or raidz2 using 3 disks. It's not degraded, but it seems pointless to do this instead of a 3-way mirror. Since it's pointless, some hardware vendors may not implement it in their raid controllers; they might only give you the option of creating a mirror instead. But that doesn't mean it's an invalid raid configuration.

> So, is it just a standard that hardware/software RAID setups require 3 drives for a RAID5 array? And 4 drives for RAID6?

It is just standard not to create a silly 2-disk raid5 or raidz. But don't use the word "require". It is common practice to create raidz2 only with 4 disks or more, but again, don't use the word "require". Some people do in fact create these silly configurations just because they're unfamiliar with what it all means. Take Bruno's original post as an example, and the article he referenced on sun.com. How these things get started, I'll never know.
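To make the comparison concrete, both of these are accepted by zpool and give the same usable capacity (one disk's worth) with the same failure tolerance (any two disks); device names are hypothetical:

  zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0   # 3-disk raidz2
  zpool create tank mirror c1t0d0 c1t1d0 c1t2d0   # 3-way mirror

Of the two, the mirror is generally the better choice, since reads can be served from any side and resilvering is simpler.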
Re: [zfs-discuss] RAIDZ2 configuration
Just because most people are probably too lazy to click the link, I'll paste a phrase from that sun.com webpage below:

"Creating a single-parity RAID-Z pool is identical to creating a mirrored pool, except that the 'raidz' or 'raidz1' keyword is used instead of 'mirror'."

And: "zpool create tank raidz2 c1t0d0 c2t0d0 c3t0d0"

So ... Shame on you, Sun, for doing this to your poor unfortunate readers. It would be nice if the page were a wiki, or somehow able to have feedback submitted...

From: Bruno Sousa
Sent: Thursday, March 25, 2010 3:28 PM
Subject: Re: [zfs-discuss] RAIDZ2 configuration

Hmm... it might be completely wrong, but the idea of a raidz2 vdev with 3 disks came from reading http://docs.sun.com/app/docs/doc/819-5461/gcvjg?a=view . This particular page has the following example:

  # zpool create tank raidz2 c1t0d0 c2t0d0 c3t0d0
  # zpool status -v tank
    pool: tank
   state: ONLINE
   scrub: none requested
  config:

          NAME        STATE     READ WRITE CKSUM
          tank        ONLINE       0     0     0
            raidz2    ONLINE       0     0     0
              c1t0d0  ONLINE       0     0     0
              c2t0d0  ONLINE       0     0     0
              c3t0d0  ONLINE       0     0     0

So... what am I missing here? Just a bad example in the Sun documentation regarding zfs?

Bruno

On 25-3-2010 20:10, Freddie Cash wrote:
> On Thu, Mar 25, 2010 at 11:47 AM, Bruno Sousa bso...@epinfante.com wrote:
>> What do you mean by "Using fewer than 4 disks in a raidz2 defeats the purpose of raidz2, as you will always be in a degraded mode"? Does it mean that having 2 vdevs with 3 disks won't be redundant in the event of a drive failure?
> raidz1 is similar to raid5 in that it is single-parity, and requires a minimum of 3 drives (2 data + 1 parity). raidz2 is similar to raid6 in that it is double-parity, and requires a minimum of 4 drives (2 data + 2 parity). IOW, a raidz2 vdev made up of 3 drives will always be running in degraded mode (it's missing a drive).
> -- Freddie Cash fjwc...@gmail.com
Re: [zfs-discuss] ZFS where to go!
> OK, I have 3Ware looking into a driver for my cards (3ware 9500S-8) as I don't see an OpenSolaris driver for them. But this tells me that they do have a FreeBSD driver, so I could still use ZFS. What does everyone think about that? I bet it is not as mature as on OpenSolaris.

"Mature" is not the right term in this case. FreeBSD has been around much longer than OpenSolaris, and it's equally if not more mature. FreeBSD is probably somewhat less featureful, because their focus is heavily on the reliability and stability side rather than early adoption. Also it's less popular, so there is less package availability, and FreeBSD in general will be built using older versions of packages than what's in OpenSolaris. Both are good OSes. If you can use FreeBSD but OpenSolaris doesn't have the driver for your hardware, go for it.
Re: [zfs-discuss] ZFS where to go!
On 26.03.2010 12:46, Edward Ned Harvey wrote:
> [...] FreeBSD is probably somewhat less featureful. Because their focus is heavily on the reliability and stability side, rather than early adoption. Also it's less popular so there is less package availability.

Have you had a look at /usr/ports? ;) As of a few days ago (when I last updated ports): 21430. I know, strictly speaking ports isn't packages since things are compiled locally (but you can output the result into packages if you need to install on several systems).

> And FreeBSD in general will be built using older versions of packages than what's in OpenSolaris.

Where did you get that info? Of course, ZFS is a little older:

  NAME    PROPERTY  VALUE  SOURCE
  pollux  version   14     default

But for other packages FreeBSD is at least as cutting edge as (Open)Solaris.

> Both are good OSes. If you can use FreeBSD but OpenSolaris doesn't have the driver for your hardware, go for it.

Finally something we agree on. ;) FreeBSD also has a less restrictive license.

//Svein
Re: [zfs-discuss] ZFS backup configuration
> It seems like the zpool export will quiesce the drives and mark the pool as exported. This would be good if we wanted to move the pool at that time, but we are thinking of a disaster recovery scenario. It would be nice to export just the config so that if our controller dies, we can use zpool import on another box to get back up and running.

Correct, zpool export will offline your disks so you can remove them and bring them somewhere else. I don't think you need to do anything in preparation for possible server failure. Am I wrong about this? I believe once your first server is down, you just move your disks to another system, and then zpool import. I don't believe the export is necessary in order to do an import. You would only export if you wanted to disconnect while the system is still powered on. You just export to tell the running OS "I'm about to remove those disks, so don't freak out." But if there is no running OS, you don't worry about it. Again, I'm only 98% sure of the above, so it might be wise to test on a sandbox system.

One thing that is worth mentioning: if you have an HBA such as 3ware, or Perc, or whatever ... it might be impossible to move the disks to a different HBA, such as Perc or 3ware (swapped one for the other). If your original system is using a Perc 6/i, only move them to another system with a Perc 6/i (and if possible, ensure the controller is using the same rev of firmware). If you're using a simple unintelligent non-raid sas or sata controller, you should be good.
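A minimal sketch of that recovery path (pool name hypothetical):

  # on the replacement box, after attaching the disks:
  zpool import            # scans the devices and lists any importable pools
  zpool import -f tank    # -f is needed if the pool was never cleanly exported

ZFS keeps the pool configuration on the disks themselves, which is why no separate config export is required.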
Re: [zfs-discuss] zfs send and ARC
> In the "Thoughts on ZFS Pool Backup Strategies" thread it was stated that zfs send sends uncompressed data and uses the ARC. If zfs send sends uncompressed data which has already been compressed, this is not very efficient, and it would be *nice* to see it send the original compressed data (or an option to do it).

You've got 2 questions in your post. The one above first ... It's true that zfs send sends uncompressed data. So I've heard; I haven't tested it personally. I seem to remember there's some work to improve this, but it is not available yet, because it was easier to implement the uncompressed send, and that already is super-fast compared to all the alternatives.

> I thought I would ask a true-or-false type question, mainly for curiosity's sake. If zfs send uses the standard ARC cache (when something is not already in the ARC), I would expect this to hurt (to some degree??) the performance of the system. (i.e. I assume it has the effect of replacing current/useful data in the cache with not very useful/old data.)

And this is a separate question. I can't say first-hand what ZFS does, but I have an educated guess. I would say, for every block the zfs send needs to read ... if the block is in ARC or L2ARC, then it won't fetch again from disk. But it is not obliterating the ARC or L2ARC with old data, because it's smart enough to work at a lower level than a user-space process, and tell the kernel (or whatever) something like "I'm only reading this block once; don't bother caching it for my sake."
Re: [zfs-discuss] ZFS where to go!
On Fri, Mar 26, 2010 at 07:46:01AM -0400, Edward Ned Harvey wrote:
> And FreeBSD in general will be built using older versions of packages than what's in OpenSolaris. Both are good OSes. If you can use FreeBSD but OpenSolaris doesn't have the driver for your hardware, go for it.

While I use zfs with FreeBSD (a FreeNAS appliance with 4x SATA 1 TByte drives), it is trailing OpenSolaris by at least a year if not longer, and hence lacks many of the key features for which people pick zfs over other file systems. The performance, especially CIFS, is quite lacking. Purportedly (I have never seen the source, nor am I a developer), such crucial features are nontrivial to backport because FreeBSD doesn't practice layer separation. Whether this remains true in the future we'll see once the Oracle/Sun dust settles.

-- Eugen* Leitl http://leitl.org
Re: [zfs-discuss] ZFS where to go!
> While I use zfs with FreeBSD (FreeNAS appliance with 4x SATA 1 TByte drives) it is trailing OpenSolaris by at least a year if not longer and hence lacks many key features people pick zfs over other file systems. The performance, especially CIFS, is quite lacking. [...]

I'm not sure if it's a version thing, or something else ... I am running Solaris 10u6 (at least a year or two old) and the performance of that is not just fine ... it's super awesome. An important note, though, is that I'm using Samba and not the ZFS built-in CIFS.
Re: [zfs-discuss] zfs send and ARC
On Fri, March 26, 2010 07:06, Edward Ned Harvey wrote:
> [...] It's true that zfs send sends uncompressed data. So I've heard. I haven't tested it personally. I seem to remember there's some work to improve this, but not available yet. Because it was easier to implement the uncompressed send, and that already is super-fast compared to all the alternatives.

I don't know that it makes sense to. There are lots of existing filter packages that do compression; so if you want compression, just put them in your pipeline. That way you're not limited by what zfs send has implemented, either. When they implement bzip98 with a new compression technology breakthrough, you can just use it :-) .

-- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
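A sketch of such a pipeline (dataset names, host, and compressor choice are all illustrative):

  zfs send tank/data@snap | gzip -c | ssh backuphost 'gunzip -c | zfs receive backup/data'

Any stream compressor with a matching decompressor (gzip, bzip2, pigz, lzop, ...) can be swapped in without zfs itself knowing anything about it.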
Re: [zfs-discuss] RAIDZ2 configuration
On Fri, Mar 26 at 7:29, Edward Ned Harvey wrote:
>> Using fewer than 4 disks in a raidz2 defeats the purpose of raidz2, as you will always be in a degraded mode.
> Freddie, are you nuts? This is false. Sure you can use raidz2 with 3 disks in it. But it does seem pointless to do that instead of a 3-way mirror.

One thing about mirrors is that you can put each side of your mirror on a different controller, so that any single controller failure doesn't cause your pool to go down.

While controller failure rates are very low, using 16/24 or 14/21 drives for parity on a dataset seems crazy to me. I know disks can be unreliable, but they shouldn't be THAT unreliable. I'd think that spending fewer drives on hot redundancy and then spending some of the balance on an isolated warm/cold backup solution would be more cost effective.

http://blog.richardelling.com/2010/02/zfs-data-protection-comparison.html

Quoting from the summary: "at some point, the system design will be dominated by common failures and not the failure of independent disks."

Another thought is that if heavy seeking is more likely to lead to high temperature and/or drive failure, then reserving one or two slots for an SSD L2ARC might be a good idea. It'll take a lot of load off of your spindles if your data set fits or mostly fits within the L2ARC. You'd need a lot of RAM to make use of a large L2ARC though, just something to keep in mind. We have a 32GB X25-E as L2ARC, and though it's never more than ~5GB full with our workloads, most every file access saturates the wire (1.0 Gb/s ethernet) once the cache has warmed up, resulting in very little IO to our spindles.

--eric

-- Eric D. Mudama edmud...@mail.bounceswoosh.org
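For reference, adding an SSD as L2ARC is a one-liner (device name hypothetical):

  zpool add tank cache c4t0d0
  zpool iostat -v tank     # the cache device then shows up with its own stats

The L2ARC warms gradually after boot, which matches the "once the cache has warmed up" behaviour described above.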
Re: [zfs-discuss] ZFS hex dump diagrams?
On Fri, Mar 26 at 11:10, Sanjeev wrote:
> On Thu, Mar 25, 2010 at 02:45:12PM -0700, John Bonomi wrote:
>> I'm sorry if this is not the appropriate place to ask, but I'm a student and for an assignment I need to be able to show at the hex level how files and their attributes are stored and referenced in ZFS. Are there any resources available that will show me how this is done?
> You could try zdb.

Or just look at the source code.

-- Eric D. Mudama edmud...@mail.bounceswoosh.org
Re: [zfs-discuss] ZFS hex dump diagrams?
Hi, You might take a look at http://www.osdevcon.org/2008/files/osdevcon2008-max.pdf and http://www.osdevcon.org/2008/files/osdevcon2008-proceedings.pdf, starting at page 36. Or you might just use "od -x file" for the file part of your assignment. Have fun.

max

Eric D. Mudama wrote:
> [...] You could try zdb. Or just look at the source code.
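For a concrete starting point, zdb can dump object metadata and raw blocks directly; pool/dataset names and the offset below are purely hypothetical:

  zdb -dddd tank/home                 # dnode/object dumps, including block pointers
  zdb -R tank 0:400000:200            # raw read of vdev:offset:size (hex) from the pool
  od -x /tank/home/somefile | head    # hex view of the file contents themselves

Comparing the od output of a file with the blocks zdb reports for its object is one way to demonstrate the on-disk layout for the assignment.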
Re: [zfs-discuss] RAIDZ2 configuration
On Fri, March 26, 2010 07:38, Edward Ned Harvey wrote:
>> Coolio. Learn something new everyday. One more way that raidz is different from RAID5/6/etc.
> Freddie, again, you're wrong. Yes, it's perfectly acceptable to create either raid-5 or raidz using 2 disks. It's not degraded, but it does seem pointless to do this instead of a mirror.

I think the word you're looking for is "possible", not "acceptable".
Re: [zfs-discuss] zfs send and ARC
On Fri, March 26, 2010 09:46, David Dyer-Bennet wrote:
> I don't know that it makes sense to. There are lots of existing filter packages that do compression; so if you want compression, just put them in your pipeline. That way you're not limited by what zfs send has implemented, either. When they implement bzip98 with a new compression technology breakthrough, you can just use it :-) .

Actually a better example may be using parallel implementations of popular algorithms:

  http://www.zlib.net/pigz/
  http://www.google.com/search?q=parallel+bzip

Given the number of cores we have nowadays (especially the Niagara-based CPUs), we might as well use them. There are also better algorithms out there (some of which assume parallelism):

  http://en.wikipedia.org/wiki/Xz
  http://en.wikipedia.org/wiki/7z

If you're using OpenSSH, there are also some third-party patches that may help in performance:

  http://www.psc.edu/networking/projects/hpn-ssh/

However, if the data is already compressed (and/or deduped), there's no sense in doing it again. If ZFS does have to go to disk, might as well send the data as-is.
Re: [zfs-discuss] ZFS where to go!
On Fri, 26 Mar 2010, Edward Ned Harvey wrote:
> "Mature" is not the right term in this case. FreeBSD has been around much longer than opensolaris, and it's equally if not more mature. FreeBSD is probably somewhat less featureful. Because their focus is heavily on the reliability and stability side, rather than early adoption. Also it's less popular so there is less package availability. And FreeBSD in general will be built using older versions of packages than what's in OpenSolaris.

I am confused. What is the meaning of "package", and why would OpenSolaris be ahead of FreeBSD when it comes to packages? I am not sure what the meaning of "package" is, but the claim seems quite dubious to me.

To be sure, FreeBSD 8.0 is behind with zfs versions:

  % zpool upgrade
  This system is currently running ZFS pool version 13.

but of course this is continually being worked on, and the latest stuff (with dedup) is in the process of being ported for delivery in FreeBSD 9.0 (and possibly FreeBSD 8.X).

I think that the main advantage that Solaris ultimately has over FreeBSD when it comes to zfs is that Solaris provides an advanced fault management system and FreeBSD does not.

Bob

-- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
[zfs-discuss] ZFS and 4kb sector Drives (All new western digital GREEN Drives?)
Does zfs handle 4kb sectors properly or does it always assume 512b sectors? If it does, we could manually create a slice properly aligned and set zfs to use it...
Re: [zfs-discuss] ZFS and 4kb sector Drives (All new western digital GREEN Drives?)
Yes, it does.

Bottone, Frank wrote:
> Does zfs handle 4kb sectors properly or does it always assume 512b sectors? If it does, we could manually create a slice properly aligned and set zfs to use it…
Re: [zfs-discuss] ZFS and 4kb sector Drives (All new western digital GREEN Drives?)
On 26.03.2010 16:55, Bottone, Frank wrote:
> Does zfs handle 4kb sectors properly or does it always assume 512b sectors? If it does, we could manually create a slice properly aligned and set zfs to use it?

A real simple patch would be to attempt alignment with 4096 every time (since 4096 is a multiple of 512, there really wouldn't be a performance penalty here). This would mean that things are optimal on _ALL_ disks (and allow those of us using more advanced disk controllers to set the strip-size (strip size, not stripe size) to 4K as well).

//Svein
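Whether a given pool ended up aligned for 4 KB sectors can be checked after the fact; the ashift value ZFS chose for a vdev is visible in the cached pool configuration (pool name hypothetical, and the exact zdb output varies by build):

  zdb -C tank | grep ashift    # ashift: 9 means 512-byte blocks, ashift: 12 means 4 KB

Drives that report 512-byte emulated sectors will normally get ashift=9, which is where the misalignment worry with 4 KB "advanced format" drives comes from.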
Re: [zfs-discuss] ZFS and 4kb sector Drives (All new western digital GREEN Drives?)
Awesome! Just when I thought zfs couldn't get any better...

-Original Message-
From: larry@sun.com
Sent: Friday, March 26, 2010 11:58 AM
Subject: Re: [zfs-discuss] ZFS and 4kb sector Drives (All new western digital GREEN Drives?)

> Yes, it does.
[zfs-discuss] RAID10
Hi All,

I am looking at ZFS and I get that they call it RAIDZ, which is similar to RAID 5, but what about RAID 10? Isn't a RAID 10 setup better for data protection?

So if I have 8 x 1.5TB drives, wouldn't I:

- mirror drive 1 and 5
- mirror drive 2 and 6
- mirror drive 3 and 7
- mirror drive 4 and 8

Then stripe 1,2,3,4
Then stripe 5,6,7,8

How does one do this with ZFS?

-Jason
Re: [zfs-discuss] RAID10
And I should mention that I have a boot drive (500GB SATA) so I don't have to consider booting from the RAID, I just want to use it for storage.

- Original Message -
From: Slack-Moehrle mailingli...@mailnewsrss.com
Sent: Friday, March 26, 2010 11:39:35 AM
Subject: [zfs-discuss] RAID10

> [...] How does one do this with ZFS? -Jason
Re: [zfs-discuss] RAID10
Slack-Moehrle wrote:
> [...] So if I have 8 x 1.5tb drives, wouldn't I:
> - mirror drive 1 and 5
> - mirror drive 2 and 6
> - mirror drive 3 and 7
> - mirror drive 4 and 8
> Then stripe 1,2,3,4
> Then stripe 5,6,7,8
> How does one do this with ZFS?

You don't, because your description is insane. You mirror each pair, then stripe each mirror, not the drives in the mirror (not really a stripe in ZFS, but...):

  zpool create mypool mirror 1 5 mirror 2 6 mirror 3 7 mirror 4 8

replacing the numbers with the actual device names.
Re: [zfs-discuss] RAID10
On Fri, Mar 26, 2010 at 1:39 PM, Slack-Moehrle mailingli...@mailnewsrss.com wrote:
> [...] How does one do this with ZFS? -Jason

Just keep adding mirrored vdevs to the pool. It isn't exactly like a raid-10, as zfs doesn't do a typical raid-0 stripe, per se. It is the same basic concept as raid-10 though. You would be striping across all of the mirrored sets, not just a subset. So you would do:

  zpool create tank mirror drive1 drive2 mirror drive3 drive4 mirror drive5 drive6 mirror drive7 drive8

See here: http://www.stringliterals.com/?p=132

--Tim
Re: [zfs-discuss] RAID10
On Fri, 26 Mar 2010, Slack-Moehrle wrote:
> Hi All, I am looking at ZFS and I get that they call it RAIDZ which is similar to RAID 5, but what about RAID 10? Isn't a RAID 10 setup better for data protection?

I think so--at the expense of extra disks for a given amount of available storage.

> So if I have 8 x 1.5tb drives, wouldn't I:
> - mirror drive 1 and 5
> - mirror drive 2 and 6
> - mirror drive 3 and 7
> - mirror drive 4 and 8
> How does one do this with ZFS?

Try this:

  zpool create dpool mirror drive1 drive5 mirror drive2 drive6 \
    mirror drive3 drive7 mirror drive4 drive8

Isn't ZFS great?!

-- Rich Teer, Publisher, Vinylphile Magazine www.vinylphilemag.com
Re: [zfs-discuss] RAID10
On Fri, Mar 26, 2010 at 11:39 AM, Slack-Moehrle mailingli...@mailnewsrss.com wrote:
> [...] How does one do this with ZFS?

Overly-simplified, a ZFS pool is a RAID0 stripeset across all the member vdevs, which can be either mirrors (essentially RAID10), or raidz1 (essentially RAID50), or raidz2 (essentially RAID60), or raidz3 (essentially RAID70???). A pool with a single mirror vdev is just a RAID1. A pool with a single raidz1 vdev is just a RAID5. And so on. But, as you add vdevs to a pool, it becomes a stripeset across all the vdevs.

-- Freddie Cash fjwc...@gmail.com
Re: [zfs-discuss] RAID10
> So you would do:
> zpool create tank mirror drive1 drive2 mirror drive3 drive4 mirror drive5 drive6 mirror drive7 drive8
> See here: http://www.stringliterals.com/?p=132

So, effectively mirroring the drives, but the pool that is created is one giant pool of all of the mirrors?

I looked at http://en.wikipedia.org/wiki/Non-standard_RAID_levels#RAID-Z and they had a brief description of RAIDZ2. Can someone explain in terms of usable space RAIDZ vs RAIDZ2 vs RAIDZ3, with 8 x 1.5TB? I apologize for seeming dense; I am just confused about non-standard raid setups, they seem tricky.

-Jason
Re: [zfs-discuss] RAID10
On 26.03.2010 20:04, Slack-Moehrle wrote:
> [...] Can someone explain in terms of usable space RAIDZ vs RAIDZ2 vs RAIDZ3? With 8 x 1.5tb?

raidz eats one disk. Like RAID5
raidz2 digests another one. Like RAID6
raidz3 yet another one. Like ... hmm...

//Svein
Re: [zfs-discuss] RAID10
> Can someone explain in terms of usable space RAIDZ vs RAIDZ2 vs RAIDZ3? With 8 x 1.5tb?

RAIDZ = RAID5, so lose 1 drive (1.5TB).
RAIDZ2 = RAID6, so lose 2 drives (3TB).
RAIDZ3 = RAID7(?), so lose 3 drives (4.5TB).

What you lose in useable space, you gain in redundancy.

-m
Re: [zfs-discuss] RAID10
> raidz eats one disk. Like RAID5
> raidz2 digests another one. Like RAID6
> raidz3 yet another one.

So:

RAIDZ would be 8 x 1.5TB = 12TB - 1.5TB = 10.5TB
RAIDZ2 would be 8 x 1.5TB = 12TB - 3.0TB = 9.0TB
RAIDZ3 would be 8 x 1.5TB = 12TB - 4.5TB = 7.5TB

But is that not really the usable space for each, because of the mirroring? So do you not mirror drives with RAIDZ2 or RAIDZ3, because you would have nothing left for space?

-Jason
Re: [zfs-discuss] RAID10
On Fri, 26 Mar 2010, Freddie Cash wrote:
> Overly-simplified, a ZFS pool is a RAID0 stripeset across all the member vdevs, which can be

Except that ZFS does not support RAID0. I don't know why you guys persist with these absurd claims and continue to use wrong and misleading terminology. What you guys are effectively doing is calling a mule a horse because it has four legs, two ears, and a tail, like a donkey.

Bob

-- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
Re: [zfs-discuss] RAID10
Bob Friesenhahn wrote:
> Except that ZFS does not support RAID0. I don't know why you guys persist with these absurd claims and continue to use wrong and misleading terminology.

What is the main difference between RAID0 and striping (what zfs really does, I guess)?
Re: [zfs-discuss] RAID10
On Fri, Mar 26, 2010 at 12:25:54PM -0700, Malte Schirmacher wrote:
> What is the main difference between RAID0 and striping (what zfs really does, I guess)?

There's a difference in implementation, but, for your purposes of describing how the vdevs stripe, I'd say it's fair enough. :) Some folks are just a little sensitive about ZFS being compared to standard RAID is all, so watch your P's and Q's around here! ;)

Ray
Re: [zfs-discuss] RAID10
On Fri, Mar 26, 2010 at 12:21 PM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote:
> On Fri, 26 Mar 2010, Freddie Cash wrote:
>> Overly-simplified, a ZFS pool is a RAID0 stripeset across all the member vdevs, which can be
> Except that ZFS does not support RAID0.

Wow, what part of "overly simplified" did you not read, see, understand, or parse? You even quoted it.

> I don't know why you guys persist with these absurd claims and continue to use wrong and misleading terminology.

So, mister "I'm so much better than everyone because I know that ZFS doesn't use RAID0" who doesn't provide any actual useful info: how would you describe the way a ZFS pool stripes data across multiple vdevs, in such a way that someone coming from a RAID background can understand, without using fancy-shmancy terms that no one else has ever heard? (Especially considering how confused the OP was as to how even a RAID10 array works.) Where I come from, you start with what the person knows (RAID terminology), find ways to relate that to the new knowledge domain (basically a RAID0 stripeset), and then later build on that to explain all the fancy-shmancy terminology and nitty-gritty of how it works. We didn't all pop into the world full of all the knowledge of everything.

> What you guys are effectively doing is calling a mule a horse because it has four legs, two ears, and a tail, like a donkey.

For someone who's only ever seen, dealt with, and used horses, then (overly simplified) a mule is like a horse. Just as it is like a donkey. From there, you can go on to explain how a mule actually came to be, and what makes it different from a horse and a donkey. And what makes it better than either.

-- Freddie Cash fjwc...@gmail.com
Re: [zfs-discuss] ZFS and 4kb sector Drives (All new western digital GREEN Drives?)
On Mar 26, 2010, at 8:58 AM, Svein Skogen wrote:
> A real simple patch would be to attempt alignment with 4096 every time (since 4096 is a multiple of 512 there really wouldn't be a performance penalty here). This would mean that things are optimal on _ALL_ disks (and allow those of us using more advanced diskcontrollers to set the strip-size (strip size, not stripe size) to 4K as well).

Two thoughts:
1. the performance impact may not be very great, but I'm sure there are exceptions in the consumer-grade market
2. people will be disappointed with the reduced compressibility of their data

-- richard
Re: [zfs-discuss] ZFS size calculation. Again!
On Mar 25, 2010, at 7:25 PM, antst wrote:
> I have two storages, both on snv133. Both filled with 1TB drives.
> 1) stripe over two raidz vdevs, 7 disks in each. In total the available size is (7-1)*2=12TB
> 2) zfs pool over HW raid, also 12TB.
> Both storages keep the same data with minor differences. The first pool keeps 24 hourly snapshots + 7 daily snapshots. The second one (backup) keeps only daily snapshots, but for a longer period (2 weeks for now).

Good idea :-)

> But they report strangely different sizes which can't be explained by differences in snapshots, I believe.
> 1)
>   # zpool list export
>   NAME    SIZE   ALLOC  FREE   CAP  DEDUP  HEALTH  ALTROOT
>   export  12.6T  3.80T  8.82T  30%  1.00x  ONLINE  -
>   # zfs list export
>   NAME    USED   AVAIL  REFER  MOUNTPOINT
>   export  3.24T  7.35T  40.9K  /export
> 2)
>   # zpool list export
>   NAME    SIZE   ALLOC  FREE   CAP  DEDUP  HEALTH  ALTROOT
>   export  12.6T  3.19T  9.44T  25%  1.00x  ONLINE  -
>   # zfs list export
>   NAME    USED   AVAIL  REFER  MOUNTPOINT
>   export  3.19T  9.24T  25K    /export
> As we see, both pools have the same size according to zpool.

Correct.

> As we see, for the second storage the size reported by zpool list and the sum of used and avail in zfs list are in agreement.

Correct.

> But for the first one, 2TB is missing somehow; the sum of USED and AVAIL is 10.6 TB.

Correct. To understand this, please see the ZFS FAQ:
http://hub.opensolaris.org/bin/view/Community+Group+zfs/faq#HWhydoesntthespacethatisreportedbythezpoollistcommandandthezfslistcommandmatch

[richard pauses to look in awe at the aforementioned URL...]
-- richard

> Also, what makes me wonder a bit is that I would expect more space to be used on the backup pool (more daily snapshots). The zfs list difference could be explained if the amount taken by hourly snapshots is bigger than the amount taken by the extra 7 daily snapshots on the backup storage (the difference is 50GB, which is still pretty big, taking into account that the backup storage also holds an extra 10 gig of backup of rpool from the primary storage). But there is no way for that explanation to be valid for the difference in USED reported by zpool list. 600GB is much more than any possible difference coming from storing different snapshots, because our guys just don't produce that much data daily. Also I tried to look at how much space is referenced by hourly snapshots - no way to get even close to 600GB. What's wrong there? My main concern, though, is the difference between the zpool size and the sum of used+avail for zfs on the primary storage. 2TB is 2TB!
Re: [zfs-discuss] RAID10
On Fri, March 26, 2010 14:21, Bob Friesenhahn wrote:
> On Fri, 26 Mar 2010, Freddie Cash wrote:
>> Overly-simplified, a ZFS pool is a RAID0 stripeset across all the member vdevs, which can be
> Except that ZFS does not support RAID0. I don't know why you guys persist with these absurd claims and continue to use wrong and misleading terminology.

They're attempting to communicate with the OP, who made it pretty clear that he was comfortable with traditional RAID terms, and trying to understand ZFS.

> What you guys are effectively doing is calling a mule a horse because it has four legs, two ears, and a tail, like a donkey.

They're short-circuiting that discussion, and we can have it later if necessary. The differences you're emphasizing are important for implementation, and performance analysis, and even for designing the system at some levels, but they're not important to the initial understanding of the system. The question was essentially "Wait, I don't see RAID 10 here, and that's what I like. How do I do that?" I think the answer was responsive and not misleading enough to be dangerous; the differences can be explicated later. YMMV :-)

-- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Re: [zfs-discuss] RAID10
On Fri, March 26, 2010 14:25, Malte Schirmacher wrote:
> What is the main difference between RAID0 and striping (what zfs really does, I guess)?

RAID creates fixed, absolute patterns of spreading blocks, bytes, and bits around the various disks; ZFS does not, it makes on-the-fly decisions about where things should go at some levels. In RAID1, a block will go to the same physical place on each drive; in a ZFS mirror it won't, it'll just go *somewhere* on each drive. In the end, RAID produces a block device that you then run a filesystem on, whereas ZFS includes the filesystem (and other things, including block devices you can run other filesystems on).

-- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Re: [zfs-discuss] zfs send/receive - actual performance
On Mar 26, 2010, at 2:34 AM, Bruno Sousa wrote:
> Hi, The jumbo-frames in my case give me a boost of around 2 MB/s, so it's not that much.

That is about right. IIRC, the theoretical max is about a 4% improvement for an MTU of 8KB.

> Now I will play with link aggregation and see how it goes, and of course I'm counting on incremental replication being slower... but since the amount of data would be much less, it will probably still deliver good performance.

Probably won't help at all, because of the brain-dead way link aggregation has to work. See "Ordering of frames" at http://en.wikipedia.org/wiki/Link_Aggregation_Control_Protocol#Link_Aggregation_Control_Protocol

If you see the workload on the wire go through regular patterns of fast/slow response, then there are some additional tricks that can be applied to increase the overall throughput and smooth the jaggies. But that is fodder for another post... You can measure this with iostat using 15-second samples, or with tcpstat. tcpstat is a handy DTrace script often located at /opt/DTT/Bin/tcpstat.d

-- richard
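For anyone wanting to reproduce the measurement, something along these lines should work on OpenSolaris (the DTraceToolkit path above is only where it commonly lives):

  iostat -xnz 15             # 15-second samples of per-device throughput
  /opt/DTT/Bin/tcpstat.d     # per-interval TCP bytes in/out via DTrace

Watching both at once during a zfs send makes it fairly obvious whether the disks or the wire is the limiting factor.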
Re: [zfs-discuss] ZFS hex dump diagrams?
On Mar 25, 2010, at 2:45 PM, John Bonomi wrote:
> I'm sorry if this is not the appropriate place to ask, but I'm a student and for an assignment I need to be able to show at the hex level how files and their attributes are stored and referenced in ZFS. Are there any resources available that will show me how this is done?

IMHO the best place to start with this level of analysis is the ZFS on-disk specification doc:
http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/ondiskformat0822.pdf

It is getting long in the tooth and doesn't document recent features, but it is fundamentally correct.

-- richard
Re: [zfs-discuss] RAID10
It depends a bit on how you set up the drives really. You could make one raidz vdev of 8 drives, losing one of them for parity, or you could make two raidz vdevs of 4 drives each and lose two drives for parity (one for each vdev). You could also do one raidz2 vdev of 8 drives and lose two drives for parity, or two raidz2 vdevs of 4 drives each and lose four drives for parity (2 for each raidz2 vdev). That would give you a bit better redundancy than using 4 mirrors while giving you the same available storage space. The list goes on and on. There are a lot of different configurations you could use with 8 drives, but keep in mind once you add a vdev to your pool, you can't remove it. Personally, I would not choose to create one vdev of 8 disks, but that's just me. It is important to be aware that when and if you want to replace the 1.5TB disks with something bigger, you need to replace ALL the disks in the vdev to gain the extra space. So, if you wanted to go from 1.5TB to 2TB disks down the road, and you set up one raidz of 8 drives, you need to replace all 8 drives before you gain the additional space. If you do two raidz vdevs of 4 drives each, you need to replace 4 drives to gain additional space. If you use mirrors, you need to replace 2 drives. Or, you can add a new vdev of 2, 4, 8, or however many disks you want if you have the physical space to do so. I believe you can mix and match mirror vdevs and raidz vdevs within a zpool, but I don't think it's recommended to do so. The ZFS best practices guide has a lot of good information in it if you have not read it yet (google). You might have less usable drive space using mirrors, but you will gain a bit of performance, and it's a bit easier to expand your zpool when the time comes. A raidz (1,2,3) can give you more usable space, and can give you better or worse redundancy depending on how you set it up. There is a lot to consider. I hope I didn't cloud things up for you any further or misinform you on something (I'm a newb too, so don't take my word alone on anything). Hell, if you wanted to, you could also do one 8-way mirror that would give you an ignorant amount of redundancy at the cost of 7 drives worth of usable space. It all boils down to personal choice. You have to determine how much usable space, redundancy, performance, and ease of replacing drives mean to you and go from there. ZFS will do pretty much any configuration to suit your needs. eric -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
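To make the layouts above concrete, here is a hedged sketch using hypothetical device names (c0t0d0 ... c0t7d0 - substitute whatever format reports on your system):
# two raidz2 vdevs of 4 disks each: 4 disks of usable space, any two failures per vdev survivable
zpool create tank raidz2 c0t0d0 c0t1d0 c0t2d0 c0t3d0 raidz2 c0t4d0 c0t5d0 c0t6d0 c0t7d0
# four 2-way mirrors (the RAID10-like layout): 4 disks of usable space, one failure per pair survivable
zpool create tank mirror c0t0d0 c0t1d0 mirror c0t2d0 c0t3d0 mirror c0t4d0 c0t5d0 mirror c0t6d0 c0t7d0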
Re: [zfs-discuss] ZFS where to go!
On Mar 26, 2010, at 4:46 AM, Edward Ned Harvey wrote: What does everyone thing about that? I bet it is not as mature as on OpenSolaris. mature is not the right term in this case. FreeBSD has been around much longer than opensolaris, and it's equally if not more mature. Bill Joy might take offense to this statement. Both FreeBSD and Solaris trace their roots to the work done at Berkeley 30 years ago. Both have evolved in different ways at different rates. Since Solaris targets the enterprise market, I will claim that Solaris is proven in that space. OpenSolaris is just one of the next steps forward for Solaris. -- richard ZFS storage and performance consulting at http://www.RichardElling.com ZFS training on deduplication, NexentaStor, and NAS performance Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS hex dump diagrams?
Hi Richard, Richard Elling wrote: On Mar 25, 2010, at 2:45 PM, John Bonomi wrote: I'm sorry if this is not the appropriate place to ask, but I'm a student and for an assignment I need to be able to show at the hex level how files and their attributes are stored and referenced in ZFS. Are there any resources available that will show me how this is done? IMHO the best place to start with this level of analysis is the ZFS on-disk specification doc: http://hub.opensolaris.org/bin/download/Community+Group+zfs/docs/ondiskformat0822.pdf It is getting long in the tooth and doesn't document recent features, but it is fundamentally correct. I completely agree with this, but good luck getting a hex dump from that information. max ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
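For what it's worth, zdb can get part of the way there; a rough sketch, with the dataset and object number as placeholders (verbosity flags and output format vary between builds, see zdb(1M)):
# dump object metadata, including block pointers, for object 4 of a dataset
zdb -dddd tank/home 4
# zdb -R can then read a raw block from the pool given the DVA found in a block pointer,
# but its argument syntax differs between builds -- check the usage message first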
Re: [zfs-discuss] RAID10
On Fri, 26 Mar 2010, Malte Schirmacher wrote: Bob Friesenhahn wrote: Except that ZFS does not support RAID0. I don't know why you guys persist with these absurd claims and continue to use wrong and misleading terminology. What is the main difference between RAID0 and striping (what zfs really does, i guess?) Zfs only stripes within raidzN vdevs, and even then at the zfs record level and not using a RAID0 (fixed mapping on the LUN) approach. RAID0 and striping are similar concepts. When one stripes across an array of disks, one breaks up the written block (record), and writes parts of it across all of the disks in the stripe. This is usually done to increase sequential read/write performance but may also be used to assist with error recovery (which zfs does take advantage of). Zfs only writes whole records (e.g. 128K) to a vdev so that it does not stripe across vdevs. Within a vdev, it may stripe. The difference is pretty huge when one considers that zfs is able to support vdevs of different sizes and topologies, as well as ones added much more recently than when the pool was created. RAID0 and striping can't do that. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAID10
On Fri, 26 Mar 2010, David Dyer-Bennet wrote: The question was essentially Wait, I don't see RAID 10 here, and that's what I like. How do I do that? I think the answer was responsive and not misleading enough to be dangerous; the differences can be explicated later. Most of us choose a pool design and then copy all of our data to it. If one does not understand how the pool works, then a poor design may be selected, which can be difficult to extricate from later. That is why it is important to know that zfs writes full records to each vdev and does not stripe the blocks across vdevs as was suggested. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAID10
On Fri, 26 Mar 2010, Freddie Cash wrote: On Fri, Mar 26, 2010 at 12:21 PM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Fri, 26 Mar 2010, Freddie Cash wrote: Overly-simplified, a ZFS pool is a RAID0 stripeset across all the member vdevs, which can be Except that ZFS does not support RAID0. Wow, what part of overly simplified did you not read, see, understand, or parse? You even quoted it. Sorry to pick on your email in particular. Everyone here should consider it to be their personal duty to correct such statements. The distinctions may not seem important, but they are worth understanding since they can matter quite a bit to pool performance. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS RaidZ to RaidZ2
Hi, I have a couple of questions. I currently have a 4-disk RaidZ1 setup (4x2TB = RaidZ1, named tank) and want to move to a RaidZ2. My current plan is to set up 8x1.5TB in a RAIDZ2 and migrate the data from the tank vdev over. What's the best way to accomplish this with minimal disruption? I have seen the zfs send / receive commands, which seem to be what I should be using? The reason I'm not doing a simple copy is I have Xen Volumes as well which I'm not exactly sure how to copy over. ZFS list (snipping the rpool) yields tank tank/vm tank/vm/centos48 tank/vm/centos48/disk0 tank/vm/centos54 tank/vm/centos54/disk0 Basically I'm looking to replace a 4-disk raidz1 with an 8-disk raidz2 (from what I understand I can't simply add 4 more disks to the existing raidz1 and update it to raidz2) -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS where to go!
Richard, My challenge to you is that at least three vedors that I know of built their storage platforms on FreeBSD. One of them sells $4bn/year of product - petty sure that eclipses all (Open)Solaris-based storage ;) -marc On 3/26/10, Richard Elling richard.ell...@gmail.com wrote: On Mar 26, 2010, at 4:46 AM, Edward Ned Harvey wrote: What does everyone thing about that? I bet it is not as mature as on OpenSolaris. mature is not the right term in this case. FreeBSD has been around much longer than opensolaris, and it's equally if not more mature. Bill Joy might take offense to this statement. Both FreeBSD and Solaris trace their roots to the work done at Berkeley 30 years ago. Both have evolved in different ways at different rates. Since Solaris targets the enterprise market, I will claim that Solaris is proven in that space. OpenSolaris is just one of the next steps forward for Solaris. -- richard ZFS storage and performance consulting at http://www.RichardElling.com ZFS training on deduplication, NexentaStor, and NAS performance Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Sent from my mobile device ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] SSD As ARC
Hi I'm planning on setting up two RaidZ2 volumes in different pools for added flexibility in removing / resizing (from what I understand if they were in the same pool I can't remove them at all). I also have an SSD drive that I was going to use as Cache (L2ARC). How do I set this up to have two L2ARCs off one SSD (to service each pool)? Do I need to create two slices (50% of the SSD disk space) and assign one to each pool? Also I'm not expecting a lot of writes (primarily a file server) so I didn't think a ZIL would be a worthwhile investment. Any advice appreciated. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
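One possible approach, as a hedged sketch: partition the SSD into two slices with format, then add one slice to each pool as a cache device (the device and slice names below are hypothetical):
# attach one slice to each pool as L2ARC
zpool add pool1 cache c5t0d0s0
zpool add pool2 cache c5t0d0s1
# cache devices can be removed again later if you change your mind
zpool remove pool1 c5t0d0s0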
Re: [zfs-discuss] ZFS where to go!
On 26.03.2010 23:25, Marc Nicholas wrote: Richard, My challenge to you is that at least three vedors that I know of built their storage platforms on FreeBSD. One of them sells $4bn/year of product - petty sure that eclipses all (Open)Solaris-based storage ;) sarcasm alert Butbutbutbut! Solaris is more enterprise focused! /sarcasm alert Seriously. FreeBSD has a _VERY_ good track record (in all levels of business). This is not an attempt at belittling Solaris, nor the effort of Sun, but trying to claim FreeBSD not being enterprise-ready seems silly. //Svein -- Svein Skogen ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS RaidZ to RaidZ2
On 03/27/10 11:22 AM, Muhammed Syyid wrote: Hi I have a couple of questions I currently have a 4disk RaidZ1 setup and want to move to a RaidZ2 4x2TB = RaidZ1 (tank) My current plan is to setup 8x1.5TB in a RAIDZ2 and migrate the data from the tank vdev over. What's the best way to accomplish this with minimal disruption? I have seen the zfs send / receive commands which seem to be what I should be using? Yes, they are the only option if you wish to preserve your filesystem properties. You will end up with a clone of your original pool's filesystems on the new pool. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS RaidZ to RaidZ2
zfs send s...@oldpool | zfs receive newpool -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS where to go!
On 03/27/10 11:32 AM, Svein Skogen wrote: On 26.03.2010 23:25, Marc Nicholas wrote: Richard, My challenge to you is that at least three vedors that I know of built their storage platforms on FreeBSD. One of them sells $4bn/year of product - petty sure that eclipses all (Open)Solaris-based storage ;) sarcasm alert Butbutbutbut! Solaris is more enterprise focused! /sarcasm alert Seriously. FreeBSD has a _VERY_ good track record (in all levels of busness). This is not an attempt at belittling Solaris, nor the effort of Sun, but trying to claim FreeBSD not being enterprise-ready seems silly. Which is why no one on this thread has. //Svein - -- Please use a standard signature delimiter -- if you are going to tag on so much ASCII art and unnecessary PGP baggage! -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SSD As ARC
On Fri, March 26, 2010 17:26, Muhammed Syyid wrote: Hi I'm planning on setting up two RaidZ2 volumes in different pools for added flexibility in removing / resizing (from what I understand if they were in the same pool I can't remove them at all). What do you mean by remove? You cannot remove a vdev from a pool. You can, however, destroy the entire pool, thus essentially removing the vdev. You CAN replace the drives in a vdev, one at a time, with larger drives, and when you are done the extra space will be available to the pool, so for resizing purposes you can essentially grow a vdev in place, though not remove it or alter the number or type of drives. -- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/ Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/ Photos: http://dd-b.net/photography/gallery/ Dragaera: http://dragaera.info ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
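A hedged sketch of the replace-in-place growth described above, with hypothetical old and new device names; on builds that have it, the autoexpand pool property controls whether the extra space shows up automatically once the last disk is swapped:
# replace each disk in the vdev one at a time, waiting for the resilver to finish in between
zpool replace tank c1t0d0 c2t0d0
zpool status tank   # confirm the resilver completed before swapping the next disk
# once every disk in the vdev is the larger size:
zpool set autoexpand=on tank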
Re: [zfs-discuss] ZFS RaidZ to RaidZ2
On 03/27/10 11:33 AM, Richard Jahnel wrote: zfs send s...@oldpool | zfs receive newpool In the OP's case, a recursive send is in order. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
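A minimal sketch of such a recursive send, assuming the new pool is imported as newtank and using a hypothetical snapshot name; zfs send -R picks up the zvols (the Xen disk volumes) along with the filesystems and their properties:
zfs snapshot -r tank@migrate
zfs send -R tank@migrate | zfs receive -d newtank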
Re: [zfs-discuss] ZFS where to go!
On Mar 26, 2010, at 3:25 PM, Marc Nicholas wrote: Richard, My challenge to you is that at least three vedors that I know of built their storage platforms on FreeBSD. One of them sells $4bn/year of product - petty sure that eclipses all (Open)Solaris-based storage ;) FreeBSD 8 or FreeBSD 7.3? If neither, then the point is moot. -- richard ZFS storage and performance consulting at http://www.RichardElling.com ZFS training on deduplication, NexentaStor, and NAS performance Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: zfs send/receive - actual performance
On 03/27/10 09:39 AM, Richard Elling wrote: On Mar 26, 2010, at 2:34 AM, Bruno Sousa wrote: Hi, The jumbo-frames in my case give me a boost of around 2 mb/s, so it's not that much. That is about right. IIRC, the theoretical max is about 4% improvement, for MTU of 8KB. Now i will play with link aggregation and see how it goes, and of course i'm counting that incremental replication will be slower...but since the amount of data would be much less probably it will still deliver a good performance. Probably won't help at all because of the brain dead way link aggregation has to work. See Ordering of frames at http://en.wikipedia.org/wiki/Link_Aggregation_Control_Protocol#Link_Aggregation_Control_Protocol Arse, thanks for reminding me Richard! A single stream will only use one path in a LAG. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAID10
OK, so I made progress today. FreeBSD sees all of my drives, ZFS is acting correctly. Now for my confusion. RAIDz3 # zpool create datastore raidz3 da0 da1 da2 da3 da4 da5 da6 da7 Gives: 'raidz3' no such GEOM provider # I am looking at the best practices guide and I am confused about adding a hot spare. Won't that happen with the above command, or do I really just zpool create datastore raidz3 da0 da1 da2 da3 da4 da5 and then issue the hot spare command twice for da6 and da7? -Jason - Original Message - From: Slack-Moehrle mailingli...@mailnewsrss.com To: zfs-discuss@opensolaris.org Sent: Friday, March 26, 2010 12:13:58 PM Subject: Re: [zfs-discuss] RAID10 Can someone explain in terms of usable space RAIDZ vs RAIDZ2 vs RAIDZ3? With 8 x 1.5tb? I apologize for seeming dense, I just am confused about non-standard raid setups, they seem tricky. raidz eats one disk. Like RAID5 raidz2 digests another one. Like RAID6 raidz3 yet another one. Like ... h... So: RAIDZ would be 8 x 1.5tb = 12tb - 1.5tb = 10.5tb RAIDZ2 would be 8 x 1.5tb = 12tb - 3.0tb = 9.0tb RAIDZ3 would be 8 x 1.5tb = 12tb - 4.5tb = 7.5tb But is that really the usable space for each, given the mirroring? So do you not mirror drives with RAIDZ2 or RAIDZ3 because you would have nothing for space left? -Jason ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAID10
On Fri, Mar 26, 2010 at 6:29 PM, Slack-Moehrle mailingli...@mailnewsrss.com wrote: OK, so I made progress today. FreeBSD see's all of my drives, ZFS is acting correct. Now for me confusion. RAIDz3 # zpool create datastore raidz3 da0 da1 da2 da3 da4 da5 da6 da7 Gives: 'raidz3' no such GEOM providor # I am looking at the best practices guide and I am confused about adding a hot spare. Wont that happen with the above command or do I really just zpool create datastore raidz3 da0 da1 da2 da3 da4 da5 and then issue the hotspare command twice for da6 and da7? -Jason - Original Message - From: Slack-Moehrle mailingli...@mailnewsrss.com To: zfs-discuss@opensolaris.org Sent: Friday, March 26, 2010 12:13:58 PM Subject: Re: [zfs-discuss] RAID10 Can someone explain in terms of usable space RAIDZ vs RAIDZ2 vs RAIDZ3? With 8 x 1.5tb? I apologize for seeming dense, I just am confused about non-stardard raid setups, they seem tricky. raidz eats one disk. Like RAID5 raidz2 digests another one. Like RAID6 raidz3 yet another one. Like ... h... So: RAIDZ would be 8 x 1.5tb = 12tb - 1.5tb = 10.5tb RAIDZ2 would be 8 x 1.5tb = 12tb - 3.0tb = 9.0tb RAIDZ3 would be 8 x 1.5tb = 12tb - 4.5tb = 7.5tb But not really that usable space for each since the mirroring? So do you not mirror drives with RAIDZ2 or RAIDZ3 because you would have nothing for space left -Jason Triple parity did not get added until version 17. FreeBSD cannot do raidz3. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
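For a ZFS version without raidz3, a hedged sketch of the alternatives using the same device names as the original command:
# all eight disks in a single raidz2 vdev (two disks of parity, no spares)
zpool create datastore raidz2 da0 da1 da2 da3 da4 da5 da6 da7
# or a 6-disk raidz2 plus two hot spares
zpool create datastore raidz2 da0 da1 da2 da3 da4 da5 spare da6 da7
# spares can also be added to an existing pool later
zpool add datastore spare da6 da7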
Re: [zfs-discuss] ZFS where to go!
On Fri, Mar 26, 2010 at 5:42 PM, Richard Elling richard.ell...@gmail.comwrote: On Mar 26, 2010, at 3:25 PM, Marc Nicholas wrote: Richard, My challenge to you is that at least three vedors that I know of built their storage platforms on FreeBSD. One of them sells $4bn/year of product - petty sure that eclipses all (Open)Solaris-based storage ;) FreeBSD 8 or FreeBSD 7.3? If neither, then the point is moot. -- richard ZFS storage and performance consulting at http://www.RichardElling.com ZFS training on deduplication, NexentaStor, and NAS performance Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com Well that depends on exactly what you mean. There's several that are actively contributing and using code from both. Built on is all relative. Given the SMP improvement recently from all of the major players using BSD, if you're talking kernel code, I would say every single one of them has pulled code from the 7-branch, and likely the 8-branch as well. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RAID10
On Mar 26, 2010, at 23:37, David Dyer-Bennet d...@dd-b.net wrote: On Fri, March 26, 2010 14:25, Malte Schirmacher wrote: Bob Friesenhahn wrote: Except that ZFS does not support RAID0. I don't know why you guys persist with these absurd claims and continue to use wrong and misleading terminology. What is the main difference between RAID0 and striping (what zfs really does, i guess?) RAID creates fixed, absolute, patterns of spreading blocks, bytes, and bits around the various disks; ZFS does not, it makes on-the-fly decisions about where things should go at some levels. In RAID1, a block will go the same physical place on each drive; in a ZFS mirror it won't, it'll just go *somewhere* on each drive. This is not correct. In a ZFS mirror, a block will go to the same offset within the data area on both submirrors. But if you set up your mirrored slices starting at different offsets, you can arrange for blocks on submirrors to have different physical offsets ;-) In the end, RAID produces a block device that you then run a filesystem on, whereas ZFS includes the filesystem (and other things; including block devices you can run other filesystems on). -- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/ Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/ Photos: http://dd-b.net/photography/gallery/ Dragaera: http://dragaera.info ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS and 4kb sector Drives (All new western digital GREEN Drives?)
For the time being, the EARS series of drives actually present 512-byte sectors to the o/s through emulation in firmware. The drive I tested was WD20EARS (2TB WD Caviar Green Advanced Format drives): MDL: WD20EARS-00S81 DATE: 29 DEC 2009 DCM: HBRNHT2BB DCX: 6019S1W87 LBA: 3907029168 The LBA above is key - this is the number of sectors presented by the drive's firmware to the host o/s. Combinations of jumpers and running the WD alignment utility only appear to reorganise how the ECC is stored in 4k blocks physically on disk, but the drive still presents each 4K physical disk block as 8 x 512-byte logical blocks to the host. I have logged a support request with WD to see if they may be releasing firmware that will present the 4k blocks natively. As an individual user, I actually doubt that WD will ever respond. I can only hope that quite a few other people (hundreds or thousands) also log similar requests and that WD may release appropriate firmware. Would be grateful to hear of any others and their testing experiences with other series of Advanced Format drives from WD. The drive works perfectly on the 64-bit kernel, but not on 32-bit osol kernels. I purchased the drive just to test on the 32-bit kernel - mainly as there are quite a lot of SOHO NAS devices that may be able to use our Velitium Embedded Kit for OpenSolaris with drives larger than 1TB. It would be nice if the 32-bit osol kernel supported 48-bit LBA (similar to Linux; not sure if 32-bit BSD supports 48-bit LBA), then the drive would probably work - perhaps later in the year we will have time to work on a patch to support 48-bit LBA on the 32-bit osol kernels... Darren Mackay http://www.sikkra.com http://sourceforge.net/projects/velitium/ -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
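A quick sanity check of the figures above, as plain shell arithmetic: the reported LBA count times the 512-byte logical sector size comes out to roughly 2 TB, which is consistent with the drive emulating 512-byte sectors on top of 4 KB physical sectors:
# 3907029168 logical sectors x 512 bytes
echo $((3907029168 * 512))   # 2000398934016 bytes, i.e. about 2.0 TB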
[zfs-discuss] Mixed ZFS vdev in same pool.
I have a question about using mixed vdevs in the same zpool and what the community opinion is on the matter. Here is my setup: I have four 1TB drives and two 500GB drives. When I first set up ZFS I was under the assumption that it does not really care much how you add devices to the pool and it assumes you are thinking things through. But when I tried to create a pool (called group) with the four 1TB disks in raidz and the two 500GB disks in a mirror configuration in the same pool, ZFS complained and said if I wanted to do it I had to add a -f (which I assume stands for force). So was ZFS attempting to stop me from doing something generally considered bad? Some other questions I have (let's assume that this setup isn't that bad, or it is that bad and these questions will be why): If one 500GB disk dies (c10dX) in the mirror and I choose not to replace it, would I be able to migrate the files that are on the other mirror that still works over to the drives in the raidz configuration, assuming there is space? Would ZFS inform me which files are affected, like it does in other situations? In this configuration, how does Solaris/ZFS determine which vdev to place the current write operation's worth of data into? Are there any situations where data would, for some reason, not be protected against single disk failures? Would this configuration survive a two-disk failure if the disks are in separate vdevs?
jsm...@corax:~# zpool status group
  pool: group
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        group       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c7t0d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
            c8t0d0  ONLINE       0     0     0
            c8t1d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c10d0   ONLINE       0     0     0
            c10d1   ONLINE       0     0     0

errors: No known data errors

jsm...@corax:~# zfs list group
NAME    USED  AVAIL  REFER  MOUNTPOINT
group  94.4K  3.12T  23.7K  /group
This isn't for a production environment in some datacenter but nevertheless I would like to make the data as reasonably secure as possible while maximizing total storage space. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
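On the question of which vdev receives new writes: ZFS generally favours the vdev with the most free space, so most fresh data should land on the larger raidz1 vdev here. One way to watch it (a sketch, not a guarantee of behaviour) is the per-vdev statistics while copying data in:
# capacity and activity broken out per vdev, sampled every 5 seconds
zpool iostat -v group 5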