On 10/25/23 12:13, Mikulas Patocka wrote:
> So, I forward this to memory management maintainers.

Hi,

> What do you think? - We have a problem that if dm-crypt allocates pages 
> with order > 3 (PAGE_ALLOC_COSTLY_ORDER), the system occasionally freezes 
> waiting in writeback.

Hmm, so I've checked the backtraces provided and none seems to show the
actual allocating task doing the crypt_alloc_buffer(), do we know what it's
doing? Is it blocked or perhaps spinning in some infinite loop?

> dm-crypt allocates the pages with GFP_NOWAIT | __GFP_HIGHMEM | 
> __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN | __GFP_COMP, so it 
> shouldn't put any pressure on the system. If the allocations fails, it 
> falls back to smaller order allocation or to mempool as a last resort.

Right. I noticed it may also fallback to __GFP_DIRECT_RECLAIM but then it's
only order-0.

> When the freeze happens, there is "349264kB" free memory - so the system 
> is not short on memory.

Yeah, it may still be fragmented, although in [1] the sysqr show memory
report suggests it's not, pages do exist up to MAX_ORDER. Weird.

[1] https://lore.kernel.org/all/ZTiJ3CO8w0jauOzW@mail-itl/

> Should we restrict the dm-crypt allocation size to 
> PAGE_ALLOC_COSTLY_ORDER? Or is it a bug somewhere in memory management 
> system that needs to be fixes there?

Haven't found any. However I'd like to point out some things I noticed in
crypt_alloc_buffer(), although they are probably not related.

> static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned int 
> size)
> {
>       struct crypt_config *cc = io->cc;
>       struct bio *clone;
>       unsigned int nr_iovecs = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
>       gfp_t gfp_mask = GFP_NOWAIT | __GFP_HIGHMEM;
>       unsigned int remaining_size;
>       unsigned int order = MAX_ORDER - 1;
> 
> retry:
>       if (unlikely(gfp_mask & __GFP_DIRECT_RECLAIM))
>               mutex_lock(&cc->bio_alloc_lock);

What if we end up in "goto retry" more than once? I don't see a matching
unlock. Yeah, very unlikely to happen that order-0 in page allocator which
includes __GFP_DIRECT_RECLAIM would fail, but not impossible, and also I see
crypt_page_alloc() for the mempool can fail for another reason, due to a
counter being too high. Looks dangerous.

> 
>       clone = bio_alloc_bioset(cc->dev->bdev, nr_iovecs, io->base_bio->bi_opf,
>                                GFP_NOIO, &cc->bs);
>       clone->bi_private = io;
>       clone->bi_end_io = crypt_endio;
> 
>       remaining_size = size;
> 
>       while (remaining_size) {
>               struct page *pages;
>               unsigned size_to_add;
>               unsigned remaining_order = __fls((remaining_size + PAGE_SIZE - 
> 1) >> PAGE_SHIFT);

Tip: you can use get_order(remaining_size) here.

>               order = min(order, remaining_order);
> 
>               while (order > 0) {

Is this intentionally > 0 and not >= 0? We could still succeed avoiding
mempool with order-0...

>                       pages = alloc_pages(gfp_mask
>                               | __GFP_NOMEMALLOC | __GFP_NORETRY | 
> __GFP_NOWARN | __GFP_COMP,
>                               order);
>                       if (likely(pages != NULL))
>                               goto have_pages;
>                       order--;
>               }
> 
>               pages = mempool_alloc(&cc->page_pool, gfp_mask);
>               if (!pages) {
>                       crypt_free_buffer_pages(cc, clone);
>                       bio_put(clone);
>                       gfp_mask |= __GFP_DIRECT_RECLAIM;
>                       order = 0;
>                       goto retry;
>               }
> 
> have_pages:
>               size_to_add = min((unsigned)PAGE_SIZE << order, remaining_size);
>               __bio_add_page(clone, pages, size_to_add, 0);
>               remaining_size -= size_to_add;
>       }
> 
>       /* Allocate space for integrity tags */
>       if (dm_crypt_integrity_io_alloc(io, clone)) {
>               crypt_free_buffer_pages(cc, clone);
>               bio_put(clone);
>               clone = NULL;
>       }
> 
>       if (unlikely(gfp_mask & __GFP_DIRECT_RECLAIM))
>               mutex_unlock(&cc->bio_alloc_lock);
> 
>       return clone;
> }


Reply via email to