Mike Gerdts wrote:
Is there anything in the works to compress (or encrypt) existing data
after the fact?  For example, a special option to scrub that causes
the data to be re-written with the new properties could potentially do
this.

This is a long-term goal of ours, but with snapshots it is extremely nontrivial to do efficiently and without increasing the amount of space used.

> If so, this feature should subscribe to any generic framework
> provided by such an effort.

Yep, absolutely.

* Mirroring offers slightly better redundancy, because one disk from
   each mirror can fail without data loss.

Is this use of "slightly" based upon disk failure modes?  That is, when
disks fail do they tend to get isolated areas of badness compared to
complete loss?  I would suggest that complete loss should include
someone tripping over the power cord to the external array that houses
the disk.

I'm basing this "slightly better" call on a model of random, complete-disk failures. I know that this is only an approximation. With many mirrors, most (but not all) 2-disk failures can be tolerated. With copies=2, almost no 2-top-level-vdev failures will be tolerated, because it's likely that *some* block will have both its copies on those 2 disks. With mirrors, you can arrange to mirror across cabinets, not within them, which you can't do with copies.
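To make that model concrete, here's a toy simulation (my own illustration, not anything from the ZFS code) comparing three 2-way mirrors against copies=2 on six single-disk vdevs, counting which of the 15 possible 2-disk failures lose data:

```python
import random
from itertools import combinations

DISKS = 6
MIRROR_PAIRS = {(0, 1), (2, 3), (4, 5)}   # three 2-way mirrors
BLOCKS = 10_000                           # plenty of blocks to place

pairs = list(combinations(range(DISKS), 2))

# Mirrored pool: a 2-disk failure loses data only when both failed
# disks belong to the same mirror.
mirror_fatal = sum(1 for p in pairs if p in MIRROR_PAIRS)

# copies=2 on six single-disk vdevs: each block's two copies land on
# two different, randomly chosen disks.
placements = {tuple(sorted(random.sample(range(DISKS), 2)))
              for _ in range(BLOCKS)}
copies_fatal = sum(1 for p in pairs if p in placements)

print(f"mirrors:  {mirror_fatal}/{len(pairs)} two-disk failures lose data")
print(f"copies=2: {copies_fatal}/{len(pairs)} two-disk failures lose data")
```

With mirrors, only 3 of the 15 pairs (the ones inside a single mirror) are fatal; with enough blocks and random placement, essentially every pair ends up holding both copies of some block, so all 15 are fatal.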

It is important to note that the copies provided by this feature are in
addition to any redundancy provided by the pool configuration or the
underlying storage.  For example:

All of these examples seem to assume that there six disks.

Not really. There could be any number of mirror or raid-z groups (although, as noted, you need at least 'copies' groups to survive the maximum number of whole-disk failures).

* In a pool with 2-way mirrors, a filesystem with copies=1 (the default)
   will be stored with 2 * 1 = 2 copies.  The filesystem can tolerate any
   1 disk failing without data loss.
* In a pool with 2-way mirrors, a filesystem with copies=3
   will be stored with 2 * 3 = 6 copies.  The filesystem can tolerate any
   5 disks failing without data loss (assuming that there are at least
   ncopies=3 mirror groups).
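The arithmetic in those bullets can be sketched as follows (a best-case model I'm using for illustration, assuming every copy lands on a distinct disk):

```python
def total_copies(mirror_ways, copies):
    # Total copies of each block: pool redundancy (N-way mirror)
    # times the filesystem's copies property.
    return mirror_ways * copies

def tolerated_failures(mirror_ways, copies):
    # Best case, copies on distinct disks: every disk holding a
    # given block except one may fail without data loss.
    return mirror_ways * copies - 1

print(total_copies(2, 1), tolerated_failures(2, 1))  # 2 1
print(total_copies(2, 3), tolerated_failures(2, 3))  # 6 5
```

As the "slightly better" discussion above notes, this is an upper bound: which specific disk combinations are survivable depends on where the copies actually land.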

This one assumes best case scenario with 6 disks.  Suppose you had 4 x
72 GB and 2 x 36 GB disks.  You could end up with multiple copies on
the 72 GB disks.

Yes, all these examples assume that our "putting the copies on different disks when possible" actually worked out. It will almost certainly work out unless you have a small number of different-sized devices, or are running with very little free space. If you need hard guarantees, you need to use actual mirroring.

Any statement about physical location on the disk?  It would seem as
though locating two copies sequentially on the disk would not provide
nearly as much protection as having them fairly distant from each
other.

Yep, if the copies can't be stored on different disks, they will be stored spread-out on the same disk if possible (I think we aim for one on each quarter of the disk).
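The "spread-out on the same disk" idea amounts to something like this toy calculation (purely illustrative; this is not the actual ZFS allocator logic):

```python
def copy_offsets(disk_size, ncopies):
    # Divide the disk into ncopies equal regions and place one copy
    # at the start of each, maximizing physical separation so a
    # localized media failure is unlikely to hit both copies.
    region = disk_size // ncopies
    return [i * region for i in range(ncopies)]

print(copy_offsets(400, 2))  # [0, 200]
```

The real allocator works at the granularity of metaslabs and free space, so the actual offsets will vary; the point is only that the targets are far apart, not adjacent.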

--matt
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss