On Mon, Jun 03, 2019 at 04:58:46PM +0200, Johannes Thumshirn wrote:
> This patchset add support for adding new checksum types in BTRFS.

V4 looks good to me, with a few minor fixups added to the topic branch,
including the sha256 patch. As noted, this may not be merged and for now
serves testing purposes.

> Currently BTRFS only supports CRC32C as data and metadata checksum, which is
> good if you only want to detect errors due to data corruption in hardware.
>
> But CRC32C isn't able cover other use-cases like de-duplication or
> cryptographically save data integrity guarantees.
>
> The following properties made SHA-256 interesting for these use-cases:
> - Still considered cryptographically sound
> - Reasonably well understood by the security industry
> - Result fits into the 32Byte/256Bit we have for the checksum in the on-disk
>   format
> - Small enough collision space to make it feasible for data de-duplication
> - Fast enough to calculate and offloadable to crypto hardware via the kernel's
>   crypto_shash framework.

Regarding hw offload, David pointed out that the ahash API would need to
be used, and that turned out to be infeasible with the current btrfs code.
I think the only hw-based improvements left are the ones based on CPU
instructions (crc32c, SSE, AVX), but that's sufficient.

I also think software implementations of the checksum(s) are going to be
used in most cases, which kind of makes SHA-3 less appealing to us, as its
main point was 'excellent efficiency in hardware implementations' (quoting
the NIST announcement [1]).

As has been suggested, BLAKE2 is up for consideration; we only need the
kernel module, which I'll provide for testing purposes. And the more I know
about it, the more I like it, so we might have a winner, but the selection
is still open.

> The patchset also provides mechanisms for plumbing in different hash
> algorithms relatively easy.
>
> This is an intermediate submission, as a) mkfs.btrfs support is still missing
> and

We'll need that one. Briefly checking the progs sources, the same cleanups
will be needed there too.

> b) David requested to have three hash algorithms, where 1 is crc32c, one
> cryptographically secure and one in between.

Let me summarize the current status: for the strong hash we have SHA256 and
BLAKE2. For the fast hash, xxhash and murmur3 have been suggested. Let me
add XXH3 and xxh128 for now (they're not finalized yet).
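
To make the three-tier idea a bit more concrete, here is a rough userspace
sketch of how a table of candidate checksum types could look. This is only
an illustration, not the code from the patchset: the enum, the struct and
both helper names are made up for this mail, and the choice of xxhash64 and
a 256-bit BLAKE2b as table entries is an assumption, since the selection is
still open. The one hard constraint taken from the cover letter is that a
digest has to fit into the 32 bytes reserved on disk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The on-disk checksum field is 32 bytes (256 bits), per the cover letter. */
#define CSUM_FIELD_SIZE 32

/* Hypothetical type IDs; only the algorithm names come from this thread. */
enum csum_type {
	CSUM_CRC32C,		/* current default */
	CSUM_XXHASH64,		/* assumed candidate for the fast hash */
	CSUM_SHA256,		/* strong hash candidate */
	CSUM_BLAKE2B_256,	/* strong hash candidate, 256-bit output assumed */
	CSUM_TYPE_COUNT
};

/* Hypothetical descriptor: algorithm name and digest size in bytes. */
struct csum_desc {
	const char *name;
	size_t digest_size;
};

static const struct csum_desc csum_table[CSUM_TYPE_COUNT] = {
	[CSUM_CRC32C]      = { "crc32c",      4 },
	[CSUM_XXHASH64]    = { "xxhash64",    8 },
	[CSUM_SHA256]      = { "sha256",      32 },
	[CSUM_BLAKE2B_256] = { "blake2b-256", 32 },
};

/* Return the descriptor for @type, or NULL if the type is unknown. */
static const struct csum_desc *csum_lookup(int type)
{
	if (type < 0 || type >= CSUM_TYPE_COUNT)
		return NULL;
	/* Every digest must fit into the on-disk checksum field. */
	assert(csum_table[type].digest_size <= CSUM_FIELD_SIZE);
	return &csum_table[type];
}

/* Map a name (as an mkfs option might pass it) to a type, or -1. */
static int csum_type_by_name(const char *name)
{
	for (int i = 0; i < CSUM_TYPE_COUNT; i++)
		if (strcmp(csum_table[i].name, name) == 0)
			return i;
	return -1;
}
```

With something of this shape, adding another candidate would mean one new
enum value and one table line, and a caller such as mkfs would go through
csum_type_by_name() and then csum_lookup() to learn the digest size.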