> 1. don't allow fletcher2,verify.  (we only allow checksum=fletcher2
> for backwards compatibility anyway)

I think you'll see this was called out explicitly in section B.1.

> 2. expose the dedup ratio as a pool property (like compression ratio  
> is a zfs property)

Deduplication efficacy is a bit more complex than compression, which is
why we present the raw data as pool properties. We can let our users
guide us with regard to adding further computation on those properties.

> 3. Are the valid values for the zpool dedupchecksum property  
> "sha256" and "sha256,verify"?

No, today, just 'sha256'.

> I'm wondering what the rationale is behind introducing the idea of a  
> pool-wide default for "zfs set X=on", and why we wouldn't extend  
> this to all ZFS properties of the form "zfs set X=on | off |  
> specific value" (namely, compression, checksum, and share*)?  It  
> seems simpler to continue using the existing zfs property  
> inheritance model, rather than introducing new pool-wide "what does  
> on mean" properties.

See Jeff's and George's follow-ups.

Adam

>
> --matt
>
> Adam Leventhal wrote:
>> I'm sponsoring the following fasttrack on behalf of Jeff Bonwick, and
>> the ZFS team. The binding is patch and the commitment level is
>> Committed. Apologies for the late notice, but if it is possible to
>> review this case at the 10/21/2009 meeting that would be much
>> appreciated. We believe the interfaces as defined are in keeping with
>> other ZFS interfaces. Please take particular note of the question at
>> the end of B.1 where we're unsure of the best path and hope the ARC
>> can provide guidance.
>> Thanks.
>> Adam
>> Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
>> This information is Copyright 2009 Sun Microsystems
>> 1. Introduction
>>    1.1. Project/Component Working Name:
>>       ZFS Deduplication Properties
>>    1.2. Name of Document Author/Supplier:
>>       Author:  Jeffrey Bonwick
>>    1.3  Date of This Document:
>>      19 October, 2009
>> 4. Technical Description
>> A. Background
>> Deduplication is a feature of modern storage platforms by which
>> varying mechanisms are employed to reduce the amount of total data
>> stored by eliminating and sharing common components. We are adding
>> deduplication to ZFS in order to further enable market penetration
>> with ZFS and the Sun Storage 7000 series.
>> The algorithm employed by ZFS deduplication uses checksum-based
>> comparison of blocks with optional verification (for example, with
>> non-cryptographically secure checksums). Deduplication is performed
>> across the entire ZFS storage pool; administrators can select whether
>> individual datasets have deduplication enabled or not. This is useful
>> in mixed-mode environments in which some datasets have highly
>> duplicated data (e.g. VMware images, VDI, home directories, or email
>> folders) and others are unique (e.g. databases).
>> With this case we propose the user interface for enabling
>> deduplication in ZFS.
>> B. Interface
>> B.1 zfs(1M)
>> The interface for enabling and disabling deduplication is simple and
>> straightforward, and follows the convention of other similar ZFS
>> settings. We simply add a new per-dataset property, dedup:
>>      zfs set dedup=<on | off | checksum>[,verify]
>>      zfs get dedup
>> The acceptable values for the dedup property are as follows:
>>      off (the default)
>>      on (see below)
>>      on,verify
>>      verify
>>      sha256
>>      sha256,verify
>>      fletcher4,verify
>>      fletcher2,verify
>> The dedup property can be set to any of the cryptographically strong
>> checksums supported by ZFS (today just sha256). In this mode we rely
>> on the checksum alone to ensure no data collisions. Alternatively,
>> the dedup property can be set to '<checksum>,verify', in which case
>> the given checksum is used for comparison and matching blocks are
>> then compared byte-for-byte to guard against collisions. This is
>> strictly relevant only for non-cryptographically secure checksums,
>> but we offer it as an option for customers who seek that reassurance.
>> The value of 'on' uses the zpool-wide default defined by the zpool
>> property dedupchecksum (see B.2.1).
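
The checksum-only versus 'checksum,verify' semantics described above can
be sketched roughly as follows. This is an illustrative sketch only, not
the actual ZFS write path: the names `dedup_match` and `table` are
hypothetical, and a real dedup table maps checksums to block pointers,
not block contents.

```python
import hashlib

def dedup_match(new_block: bytes, table: dict, verify: bool) -> bool:
    """Return True if new_block can share an existing stored block.

    Sketch of the dedup decision: match on a strong checksum, and in
    'verify' mode additionally byte-compare to rule out a collision.
    """
    digest = hashlib.sha256(new_block).digest()
    existing = table.get(digest)
    if existing is None:
        table[digest] = new_block  # first copy of this block: store it
        return False
    if verify:
        # '<checksum>,verify' mode: byte-compare before sharing
        return existing == new_block
    # checksum-only mode: trust the cryptographically strong checksum
    return True
```

With sha256 the byte-compare is redundant in practice, which is why
verify is positioned as reassurance for non-cryptographic checksums.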
>> As an explicit request for input from the ARC: our fletcher2
>> implementation has been shown to be suboptimal and results in a
>> large number of collisions (as a result, the default checksum has
>> been changed to fletcher4). Should 'fletcher2,verify' be permitted
>> as an option for consistency, or should we eliminate that option
>> since it would rarely be an attractive choice for users due to the
>> high number of hash collisions?
>> B.2 zpool(1M)
>> B.2.1 Mutable properties
>> Two new mutable pool-wide properties will be added:
>>      zpool set dedupchecksum=<cryptographically strong checksum>
>>      zpool set dedupditto=<number>
>> The first selects the pool-wide default to be used when a dataset's
>> dedup value is set to 'on' or 'on,verify'. The default value for
>> dedupchecksum is 'sha256'.
>> The second allows the administrator to select a threshold after which
>> two copies of a block are stored rather than one. For example, if
>> many duplicate blocks exist, deduplication would reduce that count to
>> just one; at some threshold, it becomes desirable to have multiple
>> copies to guard against the multiplied effects of the loss of a
>> single block. The default value is '100'.
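
The dedupditto policy amounts to a simple threshold test on a block's
reference count. A minimal sketch, assuming a hypothetical helper name
(`copies_for_block`); the real policy lives in the ZFS write path:

```python
def copies_for_block(refcount: int, dedupditto: int = 100) -> int:
    """How many physical copies of a deduplicated block to keep.

    Once a single physical block stands in for `dedupditto` or more
    logical references, losing it would affect all of them, so a
    second copy is kept as insurance.
    """
    return 2 if refcount >= dedupditto else 1
```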
>> B.2.2 Statistics
>> Two new read-only pool-wide properties will be added to track
>> deduplication efficacy:
>>      deduptotal      # the amount of deduplicated data on disk
>>      dedupinflated   # the amount deduplicated data would occupy
>>                      # had duplicates not been removed
>> With these two properties and the pool's size property one could
>> compute:
>>      dedup efficacy = dedupinflated / deduptotal
>>      dedup savings = dedupinflated - deduptotal
>>      dedup ratio = (size + dedupinflated) / (size + deduptotal)
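
The three derived figures above can be computed mechanically from the
raw properties. A minimal sketch (the helper name `dedup_stats` is
hypothetical; all inputs are byte counts as exposed by the proposed
size, deduptotal, and dedupinflated pool properties):

```python
def dedup_stats(size: int, deduptotal: int, dedupinflated: int) -> dict:
    """Derive efficacy, savings, and ratio from the raw pool
    properties, using the formulas given in B.2.2."""
    return {
        "efficacy": dedupinflated / deduptotal,
        "savings": dedupinflated - deduptotal,
        "ratio": (size + dedupinflated) / (size + deduptotal),
    }
```

For example, a pool with size=100G, deduptotal=10G, and
dedupinflated=50G has an efficacy of 5x over the dedup-enabled data
but a pool-wide ratio of only about 1.36x.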
>> Note that efficacy measures only data that was a candidate for
>> deduplication (i.e. on which the dedup dataset property was enabled)
>> whereas the ratio measures a similar value for all data regardless of
>> whether it was a candidate for deduplication.
>> The 'zpool status' command will be modified to present the size and
>> dedup ratio and efficacy for the given pool or pools:
>> # zpool status tank
>>   pool: tank
>>  state: ONLINE
>>   size: 464G
>>  dedup: 1.90x (total) / 5.41x (dedup enabled)
>>     ...
>> C. Man Page Changes
>> The zfs(1M) and zpool(1M) man pages will be modified to include the
>> descriptions above for the new properties as well as an overview of
>> the deduplication feature.
>> 6. Resources and Schedule
>>    6.4. Steering Committee requested information
>>      6.4.1. Consolidation C-team Name:
>>              OS/Net
>>    6.5. ARC review type: FastTrack
>>    6.6. ARC Exposure: open
>


--
Adam Leventhal, Fishworks                        http://blogs.sun.com/ahl
