On Fri, Oct 31, 2025 at 10:34:36AM +0100, Christoph Hellwig wrote:
> The current code in blk_crypto_fallback_encrypt_bio is inefficient and
> prone to deadlocks under memory pressure: It first walks the passed in
> plaintext bio to see how much of it can fit into a single encrypted
> bio using up to BIO_MAX_VEC PAGE_SIZE segments, and then allocates a
> plaintext clone that fits the size, only to allocate another bio for
> the ciphertext later. While the plaintext clone uses a bioset to avoid
> deadlocks when allocations could fail, the ciphertext one uses bio_kmalloc
> which is a no-go in the file system I/O path.
>
> Switch blk_crypto_fallback_encrypt_bio to walk the source plaintext bio
> while consuming bi_iter without cloning it, and instead allocate a
> ciphertext bio at the beginning and whenever we fill up the previous
> one. The existing bio_set for the plaintext clones is reused for the
> ciphertext bios to remove the deadlock risk.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> block/blk-crypto-fallback.c | 162 ++++++++++++++----------------------
> 1 file changed, 63 insertions(+), 99 deletions(-)
>
> diff --git a/block/blk-crypto-fallback.c b/block/blk-crypto-fallback.c
> index 86b27f96051a..1f58010fb437 100644
> --- a/block/blk-crypto-fallback.c
> +++ b/block/blk-crypto-fallback.c
> @@ -152,35 +152,26 @@ static void blk_crypto_fallback_encrypt_endio(struct bio *enc_bio)
>
> src_bio->bi_status = enc_bio->bi_status;
There can now be multiple enc_bios completing for the same src_bio, so
this needs something like:
	if (enc_bio->bi_status)
		cmpxchg(&src_bio->bi_status, 0, enc_bio->bi_status);
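To show the first-error-wins behavior in isolation, here is a minimal
userspace sketch using C11 atomics in place of the kernel's cmpxchg().
The names status_t, struct src_state and record_status() are hypothetical
stand-ins for blk_status_t, the source bio and the endio logic; this is
an illustration, not the proposed kernel code.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for blk_status_t; 0 means success. */
typedef unsigned char status_t;

/* Hypothetical stand-in for the source bio's bi_status field. */
struct src_state {
	_Atomic status_t status;
};

/*
 * Record the status of one completed encrypted bio.  A success (0) never
 * overwrites anything, and an error is stored only if no error has been
 * recorded yet, so the first error wins even with concurrent completions.
 */
static void record_status(struct src_state *src, status_t enc_status)
{
	status_t expected = 0;

	if (enc_status)
		atomic_compare_exchange_strong(&src->status, &expected,
					       enc_status);
}
```

With a plain assignment, a later successful completion could clear an
error recorded by an earlier one; the compare-and-exchange avoids that.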
> -static struct bio *blk_crypto_fallback_clone_bio(struct bio *bio_src)
> +static struct bio *blk_crypto_alloc_enc_bio(struct bio *bio_src,
> + unsigned int nr_segs)
> {
> - unsigned int nr_segs = bio_segments(bio_src);
> - struct bvec_iter iter;
> - struct bio_vec bv;
> struct bio *bio;
>
> - bio = bio_kmalloc(nr_segs, GFP_NOIO);
> - if (!bio)
> - return NULL;
> - bio_init_inline(bio, bio_src->bi_bdev, nr_segs, bio_src->bi_opf);
> + bio = bio_alloc_bioset(bio_src->bi_bdev, nr_segs, bio_src->bi_opf,
> + GFP_NOIO, &crypto_bio_split);
Rename crypto_bio_split => enc_bio_set?
> @@ -257,34 +222,22 @@ static void blk_crypto_dun_to_iv(const u64 dun[BLK_CRYPTO_DUN_ARRAY_SIZE],
> */
> static bool blk_crypto_fallback_encrypt_bio(struct bio **bio_ptr)
> {
I don't think this patch makes sense by itself, since it leaves the
bio_ptr argument that is used to return a single enc_bio. It does get
updated later in the series, but it seems that additional change to how
this function is called should go earlier in the series.
> + /* Encrypt each page in the origin bio */
Maybe origin => source, so that consistent terminology is used.
> + if (++enc_idx == enc_bio->bi_max_vecs) {
> + /*
> + * Each encrypted bio will call bio_endio in the
> + * completion handler, so ensure the remaining count
> + * matches the number of submitted bios.
> + */
> + bio_inc_remaining(src_bio);
> + submit_bio(enc_bio);
The above comment is a bit confusing and could be made clearer. When we
get here for the first time for example, we increment remaining from 1
to 2. It doesn't match the number of bios submitted so far, but rather
is one more than it. The extra one pairs with the submit_bio() outside
the loop. Maybe consider the following:
	/*
	 * For each additional encrypted bio submitted,
	 * increment the source bio's remaining count. Each
	 * encrypted bio's completion handler calls bio_endio on
	 * the source bio, so this keeps the source bio from
	 * completing until the last encrypted bio does.
	 */
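The counting can be checked with a small userspace model. This is only a
sketch: struct src_model and its helpers are hypothetical stand-ins for
the source bio's remaining count, bio_inc_remaining() and bio_endio(),
and the model assumes the count starts at 1 as described above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the source bio's completion state. */
struct src_model {
	atomic_int remaining;	/* starts at 1, like a freshly set up bio */
	bool completed;
};

static void src_init(struct src_model *src)
{
	atomic_init(&src->remaining, 1);
	src->completed = false;
}

/* Models bio_inc_remaining() on the source bio. */
static void src_inc_remaining(struct src_model *src)
{
	atomic_fetch_add(&src->remaining, 1);
}

/* Models bio_endio() on the source bio: completes only on the last drop. */
static void src_endio(struct src_model *src)
{
	if (atomic_fetch_sub(&src->remaining, 1) == 1)
		src->completed = true;
}

/*
 * Models submitting nr_enc_bios encrypted bios: every one except the last
 * is submitted inside the loop and bumps the count; the last one is
 * submitted outside the loop and pairs with the initial count of 1.
 * Returns the remaining count after all submissions.
 */
static int submit_enc_bios(struct src_model *src, int nr_enc_bios)
{
	int i;

	for (i = 0; i < nr_enc_bios - 1; i++)
		src_inc_remaining(src);
	return atomic_load(&src->remaining);
}
```

After submitting three encrypted bios the count is 3, and the source
completes only when the third src_endio() call drops it to zero.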
> +out_ioerror:
> + while (enc_idx > 0)
> + mempool_free(enc_bio->bi_io_vec[enc_idx--].bv_page,
> + blk_crypto_bounce_page_pool);
> + bio_put(enc_bio);
> + src_bio->bi_status = BLK_STS_IOERR;
This error path doesn't seem correct at all. It would need to free the
full set of pages in enc_bio, not just the ones initialized so far. It
would also need to use cmpxchg() to correctly set an error on the
src_bio considering that blk_crypto_fallback_encrypt_endio() may be trying
to do it concurrently too, and then call bio_endio() on it.
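Roughly, the shape I have in mind looks like the following userspace
model. It is a sketch only: the struct names, MODEL_MAX_VECS,
MODEL_STS_IOERR and fail_enc_bio() are all hypothetical, malloc()/free()
stand in for the bounce page mempool, and it assumes every slot of the
encrypted bio already holds a page when the error is hit.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_MAX_VECS	4	/* hypothetical number of page slots */
#define MODEL_STS_IOERR	10	/* hypothetical non-zero error status */

/* Hypothetical model of an encrypted bio that owns bounce pages. */
struct enc_bio_model {
	void *pages[MODEL_MAX_VECS];
	unsigned int max_vecs;
};

/* Hypothetical model of the source bio's status and completion state. */
struct src_bio_model {
	_Atomic unsigned char status;
	atomic_int remaining;
	bool completed;
};

/*
 * Error path: free every page owned by the encrypted bio (slot 0
 * included), record the error only if none is set yet, then drop the
 * source bio's remaining count as bio_endio() would.
 * Returns the number of pages freed.
 */
static unsigned int fail_enc_bio(struct enc_bio_model *enc,
				 struct src_bio_model *src)
{
	unsigned char expected = 0;
	unsigned int freed = 0;
	unsigned int i;

	for (i = 0; i < enc->max_vecs; i++) {
		if (enc->pages[i]) {
			free(enc->pages[i]);
			enc->pages[i] = NULL;
			freed++;
		}
	}

	atomic_compare_exchange_strong(&src->status, &expected,
				       MODEL_STS_IOERR);
	if (atomic_fetch_sub(&src->remaining, 1) == 1)
		src->completed = true;
	return freed;
}
```

Note that the loop runs over all slots from index 0 upward, so no page is
skipped, and an error already recorded by a concurrent completion is left
in place.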
(It's annoying that encryption errors need to be handled at all. When I
eventually convert this to use lib/crypto/, the encryption functions are
just going to return void. But for now this is using the traditional
API, which can fail, so technically errors need to be handled...)
- Eric