Garrett D'Amore wrote:
> Nicolas Williams wrote:
>> On Tue, Oct 20, 2009 at 10:51:29AM +0100, Darren J Moffat wrote:
>>  
>>> Glad that you do offer verify as a choice.   It would be very useful 
>>> to provide some sort of log output for the cases where verify found a 
>>> collision - i.e. the checksum hashes matched but the verify said they 
>>> were different.  Not useful to end users so it could be a DTrace SDT 
>>> or only in a DEBUG kernel.  If this ever shows up as a "hit" with 
>>> dedup=sha256,verify it will make ZFS famous for finding collisions in 
>>> SHA256.
>>>     
>>
>> A collision log for debug purposes would be nice.  But collision stats
>> should be provided in any case because such stats can be useful to
>> estimating the usefulness of a hash function for this purpose (how
>> much time is spent computing hashes vs. how much time is spent verifying
>> blocks, and how many blocks do collide).
>>
>> Collision stats could also be a useful way to build confidence in a hash
>> function ("look! 0 collisions for SHA-3 candidate X on a 1PB pool with
>> random and real data!").  Of course, a SHA-3 candidate must build
>> confidence by surviving known cryptanalysis techniques + any new ones
>> that cryptographers throw at it, and no collisions in 1PB hardly
>> constitutes proof, but N>0 collisions in 1PB would likely be
>> worrisome and indicative that additional analysis is needed.  Yes, a
>> flight of fancy, maybe just eye candy ("ah, SHA-256 seems to be working
>> as advertised"), but if so, it'd be cheap eye candy.
>>
>> % zpool get ddhashcolls,ddcollrate rpool
>> NAME   PROPERTY       VALUE                     SOURCE
>> rpool  ddhashcolls    5                         -
>> rpool  ddcollrate     .0135                     -
>> %
>> (One prop would count total collisions ever seen, the other would be a
>> ratio of the first to the pool size.)
>>   
> 
> How about just a kstat where it can be located easily for debug, without 
> polluting normal zfs properties?

kstats don't persist across reboots or pool export/import.

But I agree with Adam that this is a future nice-to-have feature that is 
more about debugging than run-time stats, not a requirement for dedup's 
first integration.
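
To make the idea concrete, here is a minimal sketch of what "verify plus a
collision counter" could look like. It is a toy in Python, not ZFS code:
the DedupTable class, its fields, and collision_rate() are all hypothetical
names made up for illustration, and the rate here is collisions per stored
block, which is only one possible reading of the ratio proposed above.

```python
import hashlib


class DedupTable:
    """Toy dedup table (hypothetical, not the ZFS implementation)."""

    def __init__(self, checksum, verify=True):
        self.checksum = checksum  # function: bytes -> hashable key
        self.verify = verify      # compare contents on a checksum match?
        self.blocks = {}          # key -> list of distinct stored blocks
        self.dedup_hits = 0       # writes satisfied by an existing block
        self.collisions = 0       # checksum matched, contents differed

    def write(self, data: bytes) -> None:
        key = self.checksum(data)
        candidates = self.blocks.setdefault(key, [])
        if not candidates:
            candidates.append(data)  # first block with this checksum
            return
        if not self.verify:
            self.dedup_hits += 1     # trust the checksum blindly
            return
        if data in candidates:
            self.dedup_hits += 1     # verified: genuinely the same block
            return
        self.collisions += 1         # same checksum, different contents
        candidates.append(data)

    def collision_rate(self) -> float:
        """Collisions per stored block (an assumed definition)."""
        stored = sum(len(v) for v in self.blocks.values())
        return self.collisions / stored if stored else 0.0


def sha256_key(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()
```

With a deliberately weak checksum such as len, writing b"aa" and then b"bb"
counts one collision, while writing b"aa" again counts a dedup hit. With
sha256_key the collision counter would be expected to stay at zero.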

-- 
Darren J Moffat
