On Fri, Oct 31, 2025 at 10:34:36AM +0100, Christoph Hellwig wrote:
> The current code in blk_crypto_fallback_encrypt_bio is inefficient and
> prone to deadlocks under memory pressure: It first walks the passed-in
> plaintext bio to see how much of it can fit into a single encrypted
> bio using up to BIO_MAX_VECS PAGE_SIZE segments, and then allocates a
> plaintext clone of that size, only to allocate another bio for
> the ciphertext later.  While the plaintext clone uses a bioset to avoid
> deadlocks when allocations can fail, the ciphertext one uses bio_kmalloc,
> which is a no-go in the file system I/O path.
> 
> Switch blk_crypto_fallback_encrypt_bio to walk the source plaintext bio
> while consuming bi_iter without cloning it, and instead allocate a
> ciphertext bio at the beginning and whenever we fill up the previous
> one.  The existing bio_set for the plaintext clones is reused for the
> ciphertext bios to remove the deadlock risk.
> 
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
>  block/blk-crypto-fallback.c | 162 ++++++++++++++----------------------
>  1 file changed, 63 insertions(+), 99 deletions(-)
> 
> diff --git a/block/blk-crypto-fallback.c b/block/blk-crypto-fallback.c
> index 86b27f96051a..1f58010fb437 100644
> --- a/block/blk-crypto-fallback.c
> +++ b/block/blk-crypto-fallback.c
> @@ -152,35 +152,26 @@ static void blk_crypto_fallback_encrypt_endio(struct bio *enc_bio)
>  
>       src_bio->bi_status = enc_bio->bi_status;

There can now be multiple enc_bios completing for the same src_bio, so
this needs something like:

        if (enc_bio->bi_status)
                cmpxchg(&src_bio->bi_status, 0, enc_bio->bi_status);
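
For illustration, the whole handler might then look roughly like the
sketch below.  Just a sketch: it assumes enc_bio->bi_private still
points at the source bio, that the bounce pages are still freed here as
in the patch, and that bio_put() is now the right way to free enc_bio
since it comes from the bioset:

        static void blk_crypto_fallback_encrypt_endio(struct bio *enc_bio)
        {
                struct bio *src_bio = enc_bio->bi_private;
                int i;

                /* Free the bounce pages backing this encrypted bio. */
                for (i = 0; i < enc_bio->bi_vcnt; i++)
                        mempool_free(enc_bio->bi_io_vec[i].bv_page,
                                     blk_crypto_bounce_page_pool);

                /* Only record the first error; don't clobber a prior one. */
                if (enc_bio->bi_status)
                        cmpxchg(&src_bio->bi_status, 0, enc_bio->bi_status);

                bio_put(enc_bio);
                bio_endio(src_bio);
        }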

> -static struct bio *blk_crypto_fallback_clone_bio(struct bio *bio_src)
> +static struct bio *blk_crypto_alloc_enc_bio(struct bio *bio_src,
> +             unsigned int nr_segs)
>  {
> -     unsigned int nr_segs = bio_segments(bio_src);
> -     struct bvec_iter iter;
> -     struct bio_vec bv;
>       struct bio *bio;
>  
> -     bio = bio_kmalloc(nr_segs, GFP_NOIO);
> -     if (!bio)
> -             return NULL;
> -     bio_init_inline(bio, bio_src->bi_bdev, nr_segs, bio_src->bi_opf);
> +     bio = bio_alloc_bioset(bio_src->bi_bdev, nr_segs, bio_src->bi_opf,
> +                     GFP_NOIO, &crypto_bio_split);

Rename crypto_bio_split => enc_bio_set?

> @@ -257,34 +222,22 @@ static void blk_crypto_dun_to_iv(const u64 dun[BLK_CRYPTO_DUN_ARRAY_SIZE],
>   */
>  static bool blk_crypto_fallback_encrypt_bio(struct bio **bio_ptr)
>  {

I don't think this patch makes sense by itself, since it leaves behind
the bio_ptr argument, which is used to return a single enc_bio.  It
does get updated later in the series, but it seems that the change to
how this function is called should go earlier in the series.

> +     /* Encrypt each page in the origin bio */

Maybe origin => source, so that consistent terminology is used.

> +             if (++enc_idx == enc_bio->bi_max_vecs) {
> +                     /*
> +                      * Each encrypted bio will call bio_endio in the
> +                      * completion handler, so ensure the remaining count
> +                      * matches the number of submitted bios.
> +                      */
> +                     bio_inc_remaining(src_bio);
> +                     submit_bio(enc_bio);

The above comment is a bit confusing and could be made clearer.  When
we get here for the first time, for example, we increment remaining
from 1 to 2.  That doesn't match the number of bios submitted so far,
but rather is one more than it.  The extra one pairs with the
submit_bio() outside the loop.  Maybe consider the following:

                        /*
                         * For each additional encrypted bio submitted,
                         * increment the source bio's remaining count.  Each
                         * encrypted bio's completion handler calls bio_endio on
                         * the source bio, so this keeps the source bio from
                         * completing until the last encrypted bio does.
                         */
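
That also makes the accounting explicit: src_bio's remaining count
starts at 1, each bio_inc_remaining() in the loop accounts for one
enc_bio submitted early, and the initial count of 1 pairs with the
final submit_bio() after the loop.  Roughly (control flow only, with
the refill of enc_bio elided):

        if (++enc_idx == enc_bio->bi_max_vecs) {
                bio_inc_remaining(src_bio);     /* one more enc_bio in flight */
                submit_bio(enc_bio);
                /* ... allocate a fresh enc_bio and reset enc_idx ... */
        }

        /* after the loop: pairs with src_bio's initial remaining count */
        submit_bio(enc_bio);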

> +out_ioerror:
> +     while (enc_idx > 0)
> +             mempool_free(enc_bio->bi_io_vec[enc_idx--].bv_page,
> +                          blk_crypto_bounce_page_pool);
> +     bio_put(enc_bio);
> +     src_bio->bi_status = BLK_STS_IOERR;

This error path doesn't seem correct at all.  It would need to free the
full set of pages in enc_bio, not just the ones initialized so far.  It
would also need to use cmpxchg() to correctly set an error on the
src_bio, considering that blk_crypto_fallback_encrypt_endio() may be
trying to do it concurrently too, and then call bio_endio() on it.
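
Something like the following fragment, perhaps (a sketch only: it
assumes bi_vcnt reflects the pages actually added to enc_bio, that i is
an available loop variable, and that the function still returns false
on error):

        out_ioerror:
                /* Free every bounce page in enc_bio, including index 0. */
                for (i = 0; i < enc_bio->bi_vcnt; i++)
                        mempool_free(enc_bio->bi_io_vec[i].bv_page,
                                     blk_crypto_bounce_page_pool);
                bio_put(enc_bio);
                /* Don't clobber an error already set by a completion. */
                cmpxchg(&src_bio->bi_status, 0, BLK_STS_IOERR);
                bio_endio(src_bio);
                return false;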

(It's annoying that encryption errors need to be handled at all.  When I
eventually convert this to use lib/crypto/, the encryption functions are
just going to return void.  But for now this is using the traditional
API, which can fail, so technically errors need to be handled...)

- Eric
