Torrey McMahon wrote:
Matthew Ahrens wrote:
The problem that this feature attempts to address is when you have some data that is more important (and thus needs a higher level of redundancy) than other data. Of course in some situations you can use multiple pools, but that is antithetical to ZFS's pooled storage model. (You have to divide up your storage, you'll end up with stranded storage and bandwidth, etc.)

Can you expand? I can think of some examples where using multiple pools - even on the same host - is quite useful given the current feature set of the product. Or are you only discussing the specific case where a host would want more reliability for a certain set of data than another? If that's the case, I'm still confused as to what failure cases would still allow you to retrieve your data if there is more than one copy in the fs or pool... but I'll gladly take some enlightenment. :)

(My apologies for the length of this response, I'll try to address most of the issues brought up recently...)

When I wrote this proposal, I was only seriously thinking about the case where you want different amounts of redundancy for different data. Perhaps because I failed to make this clear, discussion has concentrated on laptop reliability issues. It is true that there would be some benefit to using multiple copies on a single-disk (e.g., laptop) pool, but of course it would not protect against the most common failure mode (whole-disk failure).

One case where this feature would be useful is if you have a pool with no redundancy (i.e., no mirroring or raid-z), because most of the data in the pool is not very important. However, the pool may have a bunch of disks in it (say, four). The administrator or user may realize (perhaps later on) that some of their data really *is* important, and they would like some protection against losing it if a disk fails. They may not have the option of adding more disks to mirror all of their data (cost or physical space constraints may apply here). Their problem is solved by creating a new filesystem with copies=2 and putting the important data there. Now, if a disk fails, the data in the copies=2 filesystem will not be lost, while approximately 1/4 of the data in the other filesystems will be. (There is a small chance that some tiny fraction of the data in the copies=2 filesystem will still be lost, if we were forced to put both copies on the disk that failed.)
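
As a rough sketch (pool and device names invented, and assuming copies is exposed like any other filesystem property), the setup might look like:

   # zpool create tank c0t0d0 c0t1d0 c0t2d0 c0t3d0
   # zfs create -o copies=2 tank/important

Everything written under tank/important would then get two copies, spread across different disks where possible, while the rest of the pool remains single-copy.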

Another plausible use case is where you already have some level of redundancy: say, a Thumper (X4500) with its 48 disks configured into nine 5-wide single-parity raid-z groups (plus 3 spares). If a single disk fails, there will be no data loss; however, if two disks within the same raid-z group fail, data will be lost. Imagine that this probability of data loss is acceptable for most of the data stored here, but there is some extremely important data for which it is not. Rather than reconfiguring the entire pool for higher redundancy (say, double-parity raid-z) and less usable storage, you can simply create a filesystem with copies=2 within the raid-z storage pool. Because the two copies are placed in different raid-z groups whenever possible, losing both would require two groups to each lose two disks, so data within that filesystem will survive any three-disk failure.
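
Again, just as a hypothetical sketch (names invented), on the existing raid-z pool this would be nothing more than:

   # zfs create -o copies=2 tank/critical

or, for a filesystem that already exists,

   # zfs set copies=2 tank/critical

with the caveat that, like compression, the property would presumably only affect blocks written after it is set.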

I believe that these use cases, while not extremely common, do occur, and the extremely small engineering effort required to implement the feature (modulo the space accounting issues) seems justified. The fact that this feature does not solve all problems (e.g., it is not intended to be a replacement for mirroring) is not a downside; not all features need to be used in all situations :-)

The real problem with this proposal is the confusion surrounding disk space accounting with copies>1. While the same issues are present when using compression, people are understandably less upset when files take up less space than expected. Given the current lack of interest in this feature, the effort required to address the space accounting issue does not seem justified at this time.
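
To illustrate with made-up numbers (and assuming both copies are charged to the file, as the proposal implies): copying a 1GB file into a copies=2 filesystem consumes roughly 2GB of pool space, so du, quotas, and "zfs list" would all report about twice what the user expects, something like:

   # zfs set copies=2 tank/important
   # cp bigfile /tank/important/
   # du -h /tank/important/bigfile
   2.0G   /tank/important/bigfile

That "where did my space go?" surprise is the accounting confusion referred to above.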

--matt
