This is v3 of the multi-data-unit skcipher request series, addressing review feedback from Mikulas Patocka on v2.

v2: https://lore.kernel.org/linux-crypto/[email protected]/
v1: https://lore.kernel.org/linux-crypto/[email protected]/

The series adds a per-tfm "data unit size" to the skcipher API so a
caller can submit several data units in one crypto request, mirroring
the data_unit_size concept already exposed by struct blk_crypto_config
for inline encryption hardware. The first user is dm-crypt, which today
issues one skcipher request per sector and so pays a per-sector cost in
request allocation, callback dispatch, completion handling, and
scatterlist setup.

Proof-of-concept performance numbers from the RFC reply [1]: +19%
throughput / -40% CPU on a single-core arm64 system with a hardware
XTS-AES-256 accelerator running fio 4 KiB sequential writes through
dm-crypt, when an out-of-tree arm64 xts driver advertises the new flag.
This series itself does not include arch enablement.

[1] https://lore.kernel.org/linux-crypto/[email protected]/

Changes since v2
----------------

Patch 4 (dm-crypt) only. Patches 1-3 are unchanged from v2.

- Replace integer division with the equivalent shift, and tighten the
  size sanity check from "is total < sector_size?" to "is total a
  multiple of sector_size?". Reject unaligned residues explicitly
  instead of silently truncating them. The local n_sectors variable
  used only for a now-redundant != 0 check was dropped;
  crypt_convert()'s outer while-loop already guarantees
  iter_in.bi_size > 0 on entry. (Mikulas)

- Drop `min(iter_in.bi_size, iter_out.bi_size)` in favour of using
  iter_in.bi_size directly, with a WARN_ON_ONCE() to flag any future
  violation of the "iter_in and iter_out describe equally-sized
  payloads" invariant maintained by crypt_convert_init(). Replaces a
  silent mask of a real bug with an explicit warning. (Mikulas)

Changes since v1
----------------

Patch 4 only. Addressed Mikulas's review of v1:

- Multi-DU scatterlist allocation uses
  GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN.

- On scatterlist allocation failure, return -EAGAIN.
  crypt_convert() handles -EAGAIN by clearing its local multi_du flag
  and re-entering the per-sector path for the rest of this
  crypt_convert() invocation. The per-tfm data_unit_size on the cipher
  remains set, so subsequent bios (which start a fresh crypt_convert()
  and re-read cipher_flags) get to try multi-DU again once memory
  pressure eases. This gives forward progress under total memory
  exhaustion: the per-sector path uses only cc->req_pool (a mempool
  with reservoir set up at table-load time) and the inline
  dmreq->sg_in[]/sg_out[] arrays, never doing any allocation that
  could fail.

- Walk the bio with __bio_for_each_bvec instead of
  __bio_for_each_segment for folio-friendly SG construction.

Design overview (unchanged from v1)
-----------------------------------

* Patch 1 adds an `unsigned int data_unit_size` field to
  `struct crypto_skcipher` (per-tfm: invariant for the consumer's
  lifetime, set once via `crypto_skcipher_set_data_unit_size()`), plus
  a capability flag CRYPTO_ALG_SKCIPHER_MULTI_DATA_UNIT in `cra_flags`
  (type-specific high-byte range, mirroring the
  CRYPTO_AHASH_ALG_BLOCK_ONLY precedent). `crypto_skcipher_encrypt()`
  and `crypto_skcipher_decrypt()` validate that `cryptlen` is a
  positive multiple of `data_unit_size`. The setter rejects
  sub-blocksize values; algorithm registration rejects the flag for
  algorithms with `ivsize != 16`. Also exposes
  `skcipher_walk_data_units()` in <crypto/internal/skcipher.h> as a
  default per-DU dispatcher for drivers that don't want to roll their
  own.

* Patch 2 lets the generic `xts(...)` template advertise the flag when
  the inner cipher is synchronous.

* Patch 3 extends `testmgr` with a self-comparison test that fires
  automatically for every alg advertising the flag.

* Patch 4 enables the multi-DU path in dm-crypt automatically when all
  of the following hold at table load: skcipher (not aead),
  tfms_count == 1, IV mode is plain or plain64, no per-sector
  iv_gen_ops->post() hook, no dm-integrity stacking, and the
  underlying cipher advertises the capability.

This series intentionally does NOT add the capability flag to any arch
crypto driver. Arch maintainers can opt in independently in follow-up
patches.

Verification
------------

A formal regression protocol is included in the project tree
(.claude/regression-protocol.md, .claude/run-regression.sh). The v3
reference run reports 12/12 cases PASS:

- x86 + arm64 build clean (with and without out-of-tree arch
  enablement).
- checkpatch.pl --strict: clean on all 4 patches.
- testmgr self-comparison: PASS for any algorithm advertising the flag
  (verified end-to-end against an out-of-tree arm64/x86 xts driver
  during regression).
- dm-crypt activation gating: plain/plain64 enabled, essiv:sha256 /
  plain64be fall back.
- dm-crypt round-trip plain64: PASS with multi-DU active.
- dm-crypt round-trip essiv:sha256 (per-sector path on multi-DU
  kernel): PASS.
- dm-crypt low-memory (mem=128M): PASS, no OOM kill.
- Byte-equivalence: 256 MB of ciphertext written through the multi-DU
  path is bit-identical to ciphertext written through the per-sector
  path on an unpatched axboe/for-next baseline (sha256
  4913910b1aa6f8859fcb8f4adec20230274993a3ade8f4dd0140a323dc43efc0).
  The on-disk format is unchanged.
- arm64 functional (activation + round-trip) under qemu-aarch64: PASS.

The OOM-fallback path (multi-DU helper returns -EAGAIN, caller reverts
to per-sector) is verified by inspection: the fallback is two lines in
crypt_convert(), the per-sector path uses only the existing mempool
reserve and the inline dmreq SG arrays (no allocation that could fail),
and there is no shared state between the two paths that could deadlock.
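
For readers who want to reason about the fallback without opening the
patch, here is a minimal userspace model of the control flow described
above. It is a sketch, not the dm-crypt code: struct convert_model and
the model_*() functions are hypothetical names invented for this
illustration, bytes_left stands in for iter_in.bi_size, and the -EINVAL
returned for an unaligned payload is an assumption (the cover letter
only says such residues are rejected).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state of one crypt_convert() call. */
struct convert_model {
	unsigned int bytes_left;          /* models iter_in.bi_size */
	unsigned int sector_shift;        /* log2(sector_size) */
	bool alloc_fails;                 /* simulate SG allocation failure */
	unsigned int multi_du_requests;   /* requests sent via multi-DU path */
	unsigned int per_sector_requests; /* requests sent via per-sector path */
};

/* Models the multi-DU helper: one request covers the whole payload. */
static int model_convert_multi_du(struct convert_model *m)
{
	unsigned int sector_size = 1u << m->sector_shift;

	/* Reject unaligned residues instead of truncating (errno assumed). */
	if (m->bytes_left & (sector_size - 1))
		return -EINVAL;

	/* A failed scatterlist allocation asks the caller to fall back. */
	if (m->alloc_fails)
		return -EAGAIN;

	m->multi_du_requests++;
	m->bytes_left = 0;
	return 0;
}

/* Models the per-sector path, which performs no fallible allocation. */
static int model_convert_one_sector(struct convert_model *m)
{
	unsigned int sector_size = 1u << m->sector_shift;

	if (m->bytes_left < sector_size)
		return -EINVAL;

	m->per_sector_requests++;
	m->bytes_left -= sector_size;
	return 0;
}

/* Models the outer loop: multi_du is a local flag, cleared on -EAGAIN. */
static int model_crypt_convert(struct convert_model *m, bool multi_du)
{
	assert(m->sector_shift < 32);

	while (m->bytes_left > 0) {
		int r;

		if (multi_du) {
			r = model_convert_multi_du(m);
			if (r == -EAGAIN) {
				/* The fallback: per-sector for the rest of this call. */
				multi_du = false;
				continue;
			}
		} else {
			r = model_convert_one_sector(m);
		}

		if (r)
			return r;
	}
	return 0;
}
```

Because multi_du is passed by value, clearing it affects only the
current call; the next call starts with whatever the caller passes in,
which mirrors the statement that subsequent bios get to try multi-DU
again.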

Leonid Ravich (4):
  crypto: skcipher - add per-tfm data_unit_size for batched requests
  crypto: xts - support multiple data units per request in template
  crypto: testmgr - exercise multi-data-unit path for skcipher
  dm crypt: batch all sectors of a bio per crypto request

 crypto/skcipher.c                  | 120 ++++++++++++
 crypto/testmgr.c                   | 129 +++++++++++++
 crypto/xts.c                       |  25 ++-
 drivers/md/dm-crypt.c              | 281 ++++++++++++++++++++++++++++-
 include/crypto/internal/skcipher.h |  34 ++++
 include/crypto/skcipher.h          |  85 +++++++++
 6 files changed, 665 insertions(+), 9 deletions(-)

base-commit: a8cafdf8c949f17c92eca0045532e88ac0dac30d
--
2.47.3
