>>>>> "et" == Erik Trimble <[EMAIL PROTECTED]> writes:

    et> Dedup Advantages:

    et> (1) save space

(2) coalesce data that is frequently used by many nodes in a large
    cluster into a small nugget of common data which can fit into RAM
    or L2 fast disk (see the toy sketch after this list)

(3) back up non-ZFS filesystems that don't have snapshots and clones

(4) make offsite replication easier on the WAN
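
To make (1) and (2) concrete, here's a toy Python sketch (nothing to
do with the actual ZFS code; every name in it is invented) of why
content-hash dedup both saves space and shrinks the hot working set
down to something cacheable:

    import hashlib

    store = {}      # checksum -> block data (the one physical copy)
    refcount = {}   # checksum -> number of logical references

    def write_block(data: bytes) -> str:
        """Store a block; return its checksum, which acts as the
        block pointer."""
        key = hashlib.sha256(data).hexdigest()
        if key not in store:
            store[key] = data            # first copy: really write it
        refcount[key] = refcount.get(key, 0) + 1
        return key

    # 1000 logical writes of the same block occupy one slot in
    # `store`, so a single cached copy serves every node reading it.
    for _ in range(1000):
        ptr = write_block(b"common OS image block")
    assert len(store) == 1 and refcount[ptr] == 1000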

but, yeah, aside from imagining ahead to possible disastrous problems
with the final implementation, these use cases should probably be
compared carefully against existing large installations.

Firstly, dedup may be more tempting as a bulleted marketing feature
or a bloggable/banterable boasting point than it is valuable to real
people.

Secondly, the comparison may drive the implementation.  For example,
should dedup happen at write time, applying only to data written
after it's turned on (like recordsize or compression), to keep the
user interface simple and to avoid scrubs making pools uselessly
slow?  Or should it be scrub-like, so that already-written
filesystems can be thrown into the dedup bag and slowly squeezed, or
so that dedup can run slowly during the business day over data
written quickly at night (fast outside-business-hours backup)?
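
For illustration only, a toy Python sketch of the two designs
(again, invented names, nothing like the real ZFS write pipeline):

    import hashlib

    def checksum(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

    class Pool:
        def __init__(self, dedup_on_write: bool):
            self.dedup_on_write = dedup_on_write
            self.blocks = {}       # block id -> data
            self.dedup_table = {}  # checksum -> block id
            self.next_id = 0

        def write(self, data: bytes) -> int:
            """Write-time (inline) dedup: pay a table lookup on
            every write, but data written before dedup was enabled
            is never revisited, like recordsize or compression."""
            if self.dedup_on_write:
                key = checksum(data)
                if key in self.dedup_table:
                    return self.dedup_table[key]  # reuse existing block
                self.dedup_table[key] = self.next_id
            bid = self.next_id
            self.blocks[bid] = data
            self.next_id += 1
            return bid

        def dedup_pass(self) -> dict:
            """Scrub-like (offline) dedup: walk the already-written
            blocks slowly, e.g. during the business day, squeezing
            duplicates out of data that was written at full speed
            overnight."""
            seen, remap = {}, {}
            for bid, data in list(self.blocks.items()):
                key = checksum(data)
                if key in seen:
                    remap[bid] = seen[key]  # repoint refs, free bid
                    del self.blocks[bid]
                else:
                    seen[key] = bid
            return remap

The first design keeps the user interface simple (a per-dataset
on/off switch) at the cost of a table lookup on the write path; the
second can retrofit old data but carries a scrub-like I/O cost,
which is exactly the trade-off in question.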
