Mike Gerdts wrote:
Is there anything in the works to compress (or encrypt) existing data
after the fact?  For example, a special option to scrub that causes
the data to be re-written with the new properties could potentially do
this.

This is a long-term goal of ours, but with snapshots it is extremely nontrivial to do efficiently and without increasing the amount of space used.

> If so, this feature should subscribe to any generic framework
> provided by such an effort.

Yep, absolutely.

* Mirroring offers slightly better redundancy, because one disk from
   each mirror can fail without data loss.

Is this use of "slightly" based upon disk failure modes?  That is, when
disks fail do they tend to get isolated areas of badness compared to
complete loss?  I would suggest that complete loss should include
someone tripping over the power cord to the external array that houses
the disk.

I'm basing this "slightly better" call on a model of random, complete-disk failures. I know that this is only an approximation. With many mirrors, most (but not all) 2-disk failures can be tolerated. With copies=2, almost no 2-top-level-vdev failures will be tolerated, because it's likely that *some* block will have both its copies on those 2 disks. With mirrors, you can arrange to mirror across cabinets, not within them, which you can't do with copies.
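To make that model concrete, here's a toy simulation (my own illustration, not anything from the ZFS code) comparing three 2-way mirrors against copies=2 on six single-disk vdevs, counting which of the 15 possible 2-disk failures lose data:

```python
import random
from itertools import combinations

DISKS = 6
MIRROR_PAIRS = {(0, 1), (2, 3), (4, 5)}   # three 2-way mirrors
BLOCKS = 10_000                           # plenty of blocks to place

pairs = list(combinations(range(DISKS), 2))

# Mirrored pool: a 2-disk failure loses data only when both failed
# disks belong to the same mirror.
mirror_fatal = sum(1 for p in pairs if p in MIRROR_PAIRS)

# copies=2 on six single-disk vdevs: each block's two copies land on
# two different, randomly chosen disks.
placements = {tuple(sorted(random.sample(range(DISKS), 2)))
              for _ in range(BLOCKS)}
copies_fatal = sum(1 for p in pairs if p in placements)

print(f"mirrors:  {mirror_fatal}/{len(pairs)} two-disk failures lose data")
print(f"copies=2: {copies_fatal}/{len(pairs)} two-disk failures lose data")
```

With mirrors, only 3 of the 15 pairs (the ones inside a single mirror) are fatal; with enough blocks and random placement, essentially every pair ends up holding both copies of some block, so all 15 are fatal.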

It is important to note that the copies provided by this feature are in
addition to any redundancy provided by the pool configuration or the
underlying storage.  For example:

All of these examples seem to assume that there six disks.

Not really. There could be any number of mirror or raid-z groups (although, as noted, you need at least 'copies' groups to survive the maximum number of whole-disk failures).

* In a pool with 2-way mirrors, a filesystem with copies=1 (the default)
   will be stored with 2 * 1 = 2 copies.  The filesystem can tolerate any
   1 disk failing without data loss.
* In a pool with 2-way mirrors, a filesystem with copies=3
   will be stored with 2 * 3 = 6 copies.  The filesystem can tolerate any
   5 disks failing without data loss (assuming that there are at least
   ncopies=3 mirror groups).
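The arithmetic in those bullets can be sketched as follows (a best-case model I'm using for illustration, assuming every copy lands on a distinct disk):

```python
def total_copies(mirror_ways, copies):
    # Total copies of each block: pool redundancy (N-way mirror)
    # times the filesystem's copies property.
    return mirror_ways * copies

def tolerated_failures(mirror_ways, copies):
    # Best case, copies on distinct disks: every disk holding a
    # given block except one may fail without data loss.
    return mirror_ways * copies - 1

print(total_copies(2, 1), tolerated_failures(2, 1))  # 2 1
print(total_copies(2, 3), tolerated_failures(2, 3))  # 6 5
```

As the "slightly better" discussion above notes, this is an upper bound: which specific disk combinations are survivable depends on where the copies actually land.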

This one assumes best case scenario with 6 disks.  Suppose you had 4 x
72 GB and 2 x 36 GB disks.  You could end up with multiple copies on
the 72 GB disks.

Yes, all these examples assume that our "putting the copies on different disks when possible" actually worked out. It will almost certainly work out unless you have a small number of different-sized devices, or are running with very little free space. If you need hard guarantees, you need to use actual mirroring.

Any statement about physical location on the disk?  It would seem as
though locating two copies sequentially on the disk would not provide
nearly as much protection as having them fairly distant from each
other.

Yep, if the copies can't be stored on different disks, they will be stored spread-out on the same disk if possible (I think we aim for one on each quarter of the disk).
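The "spread-out on the same disk" idea amounts to something like this toy calculation (purely illustrative; this is not the actual ZFS allocator logic):

```python
def copy_offsets(disk_size, ncopies):
    # Divide the disk into ncopies equal regions and place one copy
    # at the start of each, maximizing physical separation so a
    # localized media failure is unlikely to hit both copies.
    region = disk_size // ncopies
    return [i * region for i in range(ncopies)]

print(copy_offsets(400, 2))  # [0, 200]
```

The real allocator works at the granularity of metaslabs and free space, so the actual offsets will vary; the point is only that the targets are far apart, not adjacent.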

--matt
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss