On Wed, 2007-06-27 at 12:03 -0700, Jef Pearlman wrote:
> > Jef Pearlman wrote:
> > > Absent that, I was considering using zfs and just having a single
> > > pool. My main question is this: what is the failure mode of zfs if
> > > one of those drives either fails completely or has errors? Do I
> > > permanently lose access to the entire pool? Can I attempt to read
> > > other data? Can I "zfs replace" the bad drive and get some level
> > > of data recovery? Otherwise, by pooling drives am I simply
> > > increasing the probability of a catastrophic data loss? I
> > > apologize if this is addressed elsewhere -- I've read a bunch
> > > about zfs, but not come across this particular answer.

Pooling devices in a non-redundant mode (i.e. without a raidz or mirror
vdev) increases your chance of losing data, just like every other RAID
system out there.
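A quick back-of-the-envelope sketch of why that is (Python; the 5%
per-drive annual failure probability is a made-up illustrative number,
not a measured one):

```python
# Probability that a non-redundant N-drive stripe loses data, assuming
# each drive fails independently with probability p over some period.
# In a stripe, losing ANY single drive loses the whole pool.
def stripe_loss_probability(n_drives: int, p_fail: float) -> float:
    # P(at least one drive fails) = 1 - P(no drive fails)
    return 1.0 - (1.0 - p_fail) ** n_drives

if __name__ == "__main__":
    # Hypothetical 5% per-drive annual failure probability:
    for n in (1, 2, 3, 4):
        print(f"{n} drive(s): {stripe_loss_probability(n, 0.05):.4f}")
```

Each drive added to a non-redundant stripe multiplies the exposure:
under those assumptions, four drives give roughly an 18.5% chance of
losing the whole pool in a year, versus 5% for a single drive.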
However, since ZFS doesn't do concatenation (it stripes), losing one
drive in a non-redundant stripe effectively corrupts the entire
dataset, as virtually all files will have some portion of their data on
the dead drive.

> > We generally recommend a single pool, as long as the use case
> > permits.
> >
> > But I think you are confused about what a zpool is. I suggest you
> > look at the examples or docs. A good overview is the slide show
> > http://www.opensolaris.org/os/community/zfs/docs/zfs_last.pdf
>
> Perhaps I'm not asking my question clearly. I've already experimented
> a fair amount with zfs, including creating and destroying a number of
> pools with and without redundancy, replacing vdevs, etc. Maybe asking
> by example will clarify what I'm looking for or where I've missed the
> boat. The key is that I want a grow-as-you-go heterogeneous set of
> disks in my pool:
>
> Let's say I start with a 40GB drive and a 60GB drive. I create a
> non-redundant pool (which will be 100GB). At some later point, I run
> across an unused 30GB drive, which I add to the pool. Now my pool is
> 130GB. At some point after that, the 40GB drive fails, either by
> producing read errors or by failing to spin up at all. What happens
> to my pool? Can I mount and access it at all (for the data not on or
> striped across the 40GB drive)? Can I "zfs replace" the 40GB drive
> with another drive and have it attempt to copy as much data over as
> it can? Or am I just out of luck? zfs seems like a great way to use
> old/unutilized drives to expand capacity, but sooner or later one of
> those drives will fail, and if it takes out the whole pool (which it
> might reasonably do), then it doesn't work out in the end.

Nope. Your zpool is a stripe. As mentioned above, losing one disk in a
stripe effectively destroys all data, just as with any other RAID
system.

> > > As a side-question, does anyone have a suggestion for an
> > > intelligent way to approach this goal?
> > > This is not mission-critical data, but I'd prefer not to make
> > > data loss _more_ probable. Perhaps some volume manager (like LVM
> > > on Linux) has appropriate features?
> >
> > ZFS: a mirrored pool will be the most performant and easiest to
> > manage, with better RAS than a raidz pool.
>
> The problem I've come across with using mirror or raidz for this
> setup is that (as far as I know) you can't add disks to mirror/raidz
> groups, and if you just add the disk to the pool, you end up in the
> same situation as above (with more space but no redundancy).
>
> Thanks for your help.
>
> -Jef

To answer the original question, you _have_ to create mirrors, which,
if you have odd-sized disks, will end up with unused space. An example:

Disk A: 20GB
Disk B: 30GB
Disk C: 40GB
Disk D: 60GB

Start with disks A & B:

    zpool create tank mirror A B

This results in a 20GB pool. Later, add disks C & D:

    zpool add tank mirror C D

This results in a 2-wide stripe of 2 mirrors, which means the pool has
a total capacity of 60GB (20GB from the A/B mirror, 40GB from the C/D
mirror). 10GB of the 30GB drive and 20GB of the 60GB drive are
currently unused. You can lose one drive from each pair (i.e. A and C,
A and D, B and C, or B and D) before any data loss.

If you had known about the drive sizes beforehand, then you could have
done something like this. Partition the drives as follows:

A: 1 20GB partition
B: 1 20GB & 1 10GB partition
C: 1 40GB partition
D: 1 40GB partition & 2 10GB partitions

Then you do:

    zpool create tank mirror Ap0 Bp0 mirror Cp0 Dp0 mirror Bp1 Dp1

and you get a total of 70GB of space. However, the performance of this
is going to be bad (as you frequently need to write to both partitions
on B & D, causing head seek), though you can still lose up to 2 drives
before experiencing data loss.
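The capacity arithmetic in the example above can be checked with a few
lines of Python. This is a sketch of the allocation rule only (each
mirror vdev contributes roughly the size of its smallest member, and a
pool striped across top-level vdevs sums those contributions, ignoring
metadata overhead); the vdev sizes come from the example:

```python
# Rough usable capacity of a pool striped across mirror vdevs:
# each mirror contributes the size (in GB) of its smallest member.
def pool_capacity_gb(mirror_vdevs):
    return sum(min(members) for members in mirror_vdevs)

# Whole-disk layout: mirror A(20) B(30) + mirror C(40) D(60)
whole_disks = [(20, 30), (40, 60)]
print(pool_capacity_gb(whole_disks))   # 60GB; 10GB of B and 20GB of D wasted

# Partitioned layout: Ap0/Bp0 (20/20) + Cp0/Dp0 (40/40) + Bp1/Dp1 (10/10)
partitioned = [(20, 20), (40, 40), (10, 10)]
print(pool_capacity_gb(partitioned))   # 70GB
```

The partitioned layout wins 10GB of capacity precisely because every
mirror pairs equal-sized slices, so no slice is ever the "large" half
of a lopsided mirror.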
--
Erik Trimble
Java System Support
Mailstop: usca14-102
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss