Nicolas Williams wrote:
> On Tue, Oct 20, 2009 at 10:51:29AM +0100, Darren J Moffat wrote:
>
>> Glad that you do offer verify as a choice. It would be very useful to
>> provide some sort of log output for the cases where verify found a
>> collision, i.e. the checksum hashes matched but the verify said they
>> were different. Not useful to end users, so it could be a DTrace SDT
>> probe or only in a DEBUG kernel. If this ever shows a "hit" when
>> dedup=sha256,verify is set, it will make ZFS famous for finding
>> collisions in SHA256.
>>
>
> A collision log for debug purposes would be nice. But collision stats
> should be provided in any case, because such stats can be useful for
> estimating the usefulness of a hash function for this purpose (how
> much time is spent computing hashes vs. how much time is spent verifying
> blocks, and how many blocks do collide).
>
> Collision stats could also be a useful way to build confidence in a hash
> function ("look! 0 collisions for SHA-3 candidate X on a 1PB pool with
> random and real data!"). Of course, a SHA-3 candidate must build
> confidence by surviving known cryptanalysis techniques + any new ones
> that cryptographers throw at it, and no collisions in 1PB hardly
> constitutes proof, but N>0 collisions in 1PB would likely be
> worrisome and indicative that additional analysis is needed. Yes, a
> flight of fancy, maybe just eye candy ("ah, SHA-256 seems to be working
> as advertised"), but if so, it'd be cheap eye candy.
>
> % zpool get ddhashcolls,ddcollrate rpool
> NAME   PROPERTY     VALUE  SOURCE
> rpool  ddhashcolls  5      -
> rpool  ddcollrate   .0135  -
> %
>
> (One prop would count total collisions ever seen, the other would be a
> ratio of the first and the pool size.)
>
How about just a kstat, where it can be located easily for debug,
without polluting normal zfs properties?

  - Garrett

> Nico
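
For anyone who wants to play with the idea outside the kernel, here is a rough user-space sketch of the two counters being discussed: a total count of hash collisions caught by verify, and a rate derived from it. None of this is ZFS code. The `DedupTable` class, its field names, and the choice to compute the rate per byte written (standing in for "pool size") are all assumptions made for illustration only.

```python
import hashlib


class DedupTable:
    """Toy dedup table with verify-on-match and collision counters.

    Hypothetical illustration only; names and layout are not from ZFS.
    """

    def __init__(self, hash_fn=None):
        # hash_fn is pluggable so that a deliberately weak hash can be
        # substituted to exercise the collision path.
        self.hash_fn = hash_fn or (lambda b: hashlib.sha256(b).digest())
        self.blocks = {}          # checksum -> distinct blocks sharing it
        self.bytes_written = 0    # assumed stand-in for "pool size"
        self.dedup_hits = 0       # checksum matched and verify agreed
        self.hash_collisions = 0  # checksum matched but verify disagreed

    def write(self, block: bytes) -> None:
        self.bytes_written += len(block)
        candidates = self.blocks.setdefault(self.hash_fn(block), [])
        if block in candidates:
            # Verify step: a byte-for-byte match means a real duplicate.
            self.dedup_hits += 1
            return
        if candidates:
            # Same checksum, different contents: a genuine collision.
            self.hash_collisions += 1
        candidates.append(block)

    def collision_rate(self) -> float:
        # Assumed definition: collisions per byte written.
        if self.bytes_written == 0:
            return 0.0
        return self.hash_collisions / self.bytes_written


if __name__ == "__main__":
    # A one-byte truncated SHA-256 collides quickly, which makes the
    # counters visibly move; full SHA-256 should leave them at zero.
    table = DedupTable(lambda b: hashlib.sha256(b).digest()[:1])
    for i in range(1000):
        table.write(i.to_bytes(4, "big"))
    print(table.hash_collisions, table.collision_rate())
```

Whether such counters end up exposed as pool properties or as a kstat is a separate question; the sketch only shows that the bookkeeping itself is a couple of increments on the verify path.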