I think that dedup has a variety of use cases that are all very dependent on your workload. The approach you have here seems to be a quite reasonable one.

I did not see it in the code, but it would be great to be able to collect statistics on how effective your hash is, along with counters for the extra I/O imposed.

Also very useful would be a paranoid mode: when you see a hash match (dedup candidate), fall back to a byte-by-byte compare to verify that the blocks really are identical. Keeping stats on how often the hash produces a false match would be quite interesting as well :)
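Roughly what I have in mind, as a standalone sketch (the hash, struct names, and counters here are all made up for illustration; the real code would use the dedup hash it already computes):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical counters for the paranoid-mode stats mentioned above. */
struct dedup_stats {
	uint64_t candidates;    /* blocks whose hashes matched */
	uint64_t false_matches; /* hashes matched but bytes differed */
};

/* Deliberately weak toy hash (order-insensitive byte sum) standing in
 * for the real dedup hash, so false matches are easy to demonstrate. */
static uint32_t toy_hash(const uint8_t *buf, size_t len)
{
	uint32_t h = 0;
	for (size_t i = 0; i < len; i++)
		h += buf[i];
	return h;
}

/* Paranoid check: only report a duplicate when the hashes match AND a
 * byte-by-byte compare confirms the blocks are identical. */
static int blocks_are_duplicates(const uint8_t *a, const uint8_t *b,
				 size_t len, struct dedup_stats *st)
{
	if (toy_hash(a, len) != toy_hash(b, len))
		return 0;	/* not even a candidate */
	st->candidates++;
	if (memcmp(a, b, len) != 0) {
		st->false_matches++;	/* hash collision on differing data */
		return 0;
	}
	return 1;
}
```

The ratio of false_matches to candidates is exactly the "how often is this a false collision" number; exposing both via sysfs or an ioctl would make it cheap to watch in production.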

Ric

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
