Adam Leventhal wrote:
> B.1 zfs(1M)
>
> The interface for enabling and disabling deduplication is simple and
> straightforward, and follows the convention of other similar ZFS
> settings. We simply add a new per-dataset property, dedup:
>
>     zfs set dedup=<on | off | checksum>[,verify]
>     zfs get dedup

I'm happy with this.

> The acceptable values for the dedup property are as follows:
>
>     off (the default)
>     on (see below)
>     on,verify
>     verify
>     sha256
>     sha256,verify
>     fletcher4,verify
>     fletcher2,verify

Given that dedup allows specifying a checksum, does this mean that there
need not be any relationship between the checksum used for the block on
disk (i.e. the one stored in blkptr_t) and the one used for dedup?

Is this valid:

    zfs set checksum=fletcher4 tank
    zfs set dedup=sha256

If so, what is stored on disk in the blkptr_t? I assume a fletcher4
checksum is stored there. Where is the sha256 checksum stored then?
In the DDT?

Does this mean that deduplication is not using the blkptr checksum at
all, even if the blkptr checksum and the dedup checksum are the same?

When in the ZIO pipeline is the checksum specified with the dedup
property calculated? I'm assuming it is in zio_write_bp_init(), after
compression and after encryption, so that it is computed on the block
exactly as it will be written to disk.

Can gang blocks be deduplicated?

> The dedup property can be set to any of the cryptographically strong
> checksums supported by ZFS (today just sha256). In this mode we rely
> on the checksum alone to ensure no data collisions. Alternatively the
> dedup property can be set to '<checksum>,verify' in which the given
> checksum is used for comparison, the blocks are compared to ensure
> against collisions. This is strictly relevant only for non-
> cryptographically secure checksums but we offer it as an option for
> customers who seek that reassurance. The value of 'on' uses the zpool-
> wide default defined by the zpool property dedupchecksum (see B.2.1).

Glad that you do offer verify as a choice.

It would be very useful to provide some sort of log output for the cases
where verify found a collision, i.e. the checksums matched but the
verify step said the blocks were different. This is not useful to end
users, so it could be a DTrace SDT probe or only in a DEBUG kernel.
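To make the request concrete, here is a minimal Python sketch of the
behaviour I have in mind. It is not ZFS code: DedupTable,
weak_checksum and the log message are all hypothetical names invented
for illustration, and the toy checksum is a deliberately
collision-prone stand-in, not fletcher2 or fletcher4. It only shows a
checksum match followed by a byte-for-byte compare, with a log event
when the two disagree.

```python
import hashlib
import logging

log = logging.getLogger("dedup_sketch")


def weak_checksum(data: bytes) -> bytes:
    """Toy checksum (byte sum); collides easily. NOT fletcher2/fletcher4."""
    return (sum(data) & 0xFFFFFFFF).to_bytes(4, "big")


def sha256_checksum(data: bytes) -> bytes:
    """Cryptographically strong checksum via the standard library."""
    return hashlib.sha256(data).digest()


class DedupTable:
    """Hypothetical in-memory stand-in for a dedup table.

    Maps checksum -> [stored block, reference count].
    """

    def __init__(self, checksum, verify):
        self.checksum = checksum
        self.verify = verify
        self.entries = {}
        self.collisions = 0  # verify mismatches seen so far

    def write(self, block: bytes) -> bool:
        """Return True if the block was deduplicated against an entry."""
        key = self.checksum(block)
        entry = self.entries.get(key)
        if entry is None:
            # First time this checksum is seen: store the block.
            self.entries[key] = [block, 1]
            return False
        if self.verify and entry[0] != block:
            # Checksums matched but contents differ: this is the event
            # worth logging. The sketch simply declines to dedup here.
            self.collisions += 1
            log.warning("dedup verify mismatch, checksum=%s", key.hex())
            return False
        # Treated as a duplicate: bump the reference count only.
        entry[1] += 1
        return True
```

With verify enabled, writing b"ab" and then b"ba" through the toy
checksum triggers the logged mismatch. With verify disabled, the second
block would silently be treated as a duplicate of the first, which is
exactly the case the verify option guards against.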
If this ever shows a "hit" when dedup=sha256,verify is set, it will make
ZFS famous for finding a collision in SHA256.

> As an explicit request for input from the ARC, our fletcher2 implementation
> has been shown to be suboptimal and results in a large number of
> collisions (as a result, the default checksum has been changed to
> fletcher4). Should 'fletcher2,verify' be permitted as an option for
> consistency or should we eliminate that option since it would rarely
> be an attractive choice for users due to the high number of hash
> collisions.

I don't think fletcher2,verify should be provided.

> B.2 zpool(1M)
>
> B.2.1 Mutable properties
>
> Two new mutable pool-wide properties will be added:
>
>     zpool set dedupchecksum=<cryptographically strong checksum>

Why is this needed when we don't have a pool-level property for the
default checksum, compression or encryption (or any other property that
is inherited and has an "on | off | ..." style of setting)?

Under what circumstances would this be needed rather than setting dedup
to the required value?

Is this a new precedent, that all properties with an "on" value should
have a pool-level property to determine what "on" is?

>     zpool set dedupditto=<number>
> The second allows the administrator to select a threshold after which
> 2 copies of a block are stored rather than 1. For example, if many
> duplicate blocks exist deduplication would reduce that count to just 1;
> at some threshold, it becomes desirable to have multiple copies to
> guard against the multiplied effects of the loss of a single block.
> The default value is '100'.

I think I understand that this needs to be pool-wide because this is a
SPA-level concept, not a dataset-level one.

Is it actually necessary to expose this tunable?

Given there is already a per-dataset copies property, how does this
interact with that?

--
Darren J Moffat