Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
On Sun, 2017-08-13 at 22:47 -0400, Mikulas Patocka wrote: > > On Wed, 9 Aug 2017, h...@lst.de wrote: > > > Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0 > > "block: Fix __blkdev_issue_zeroout loop" fix the issue for you? > > > > -- > > dm-devel mailing list > > dm-de...@redhat.com > > https://www.redhat.com/mailman/listinfo/dm-devel > > I think that patch is incorrect. sector_t may be a 32-bit type and > nr_sects << 9 may overflow. > > static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects) > { >sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1; > >return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES); > } > > Mikulas Mikulas, Does the follwing patch fix the problem ? From 947b3cf41e759b2b23f684e215e651d0c8037f88 Mon Sep 17 00:00:00 2001 From: Damien Le MoalDate: Mon, 14 Aug 2017 13:01:16 +0900 Subject: [PATCH] block: Fix __blkdev_sectors_to_bio_pages() On 32bit systems where sector_t is a 32bits type, the calculation of bytes may overflow. Use the u64 type for the local calculation to avoid overflows. Signed-off-by: Damien Le Moal --- block/blk-lib.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block/blk-lib.c b/block/blk-lib.c index 3fe0aec90597..ccf22dba21f0 100644 --- a/block/blk-lib.c +++ b/block/blk-lib.c @@ -269,9 +269,9 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev, */ static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects) { - sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1; + u64 bytes = ((u64)nr_sects << 9) + PAGE_SIZE - 1; - return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES); + return min(bytes >> PAGE_SHIFT, (u64)BIO_MAX_PAGES); } /** -- 2.13.4 -- Damien Le Moal Western Digital
Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
On Wed, 9 Aug 2017, h...@lst.de wrote: > Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0 > "block: Fix __blkdev_issue_zeroout loop" fix the issue for you? > > -- > dm-devel mailing list > dm-de...@redhat.com > https://www.redhat.com/mailman/listinfo/dm-devel I think that patch is incorrect. sector_t may be a 32-bit type and nr_sects << 9 may overflow. static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects) { sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1; return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES); } Mikulas
Re: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
Never mind, I guess I was wrong, coz bio_add_page() is always called with "offset" = 0 here, so: offset == bv->bv_offset + bv->bv_len is never matched. Hence, if (bio->bi_vcnt >= bio->bi_max_vecs) return 0; will trigger the if-break. So maybe this is after all just a dm-crypt issue on handling multiple bios created with next_bio (a bio chain)? On 10 August 2017 at 07:27, Tom Yanwrote: > I haven't really tested but I was aware of the commit before I send my > last email. It doesn't seem relevant to be honest, because it doesn't > change the fact that the inner loop wil only end if the whole request > has been looped over. So still one big bio. > > There are a few things that seem suspicious to me. First of all, the > inner loop has an if-break at its end that seem to practically do > nothing, especially in terms of making the inner loop end when the bio > reach its expected size (BIO_MAX_PAGES?). > > bi_size = bio_add_page(bio, ZERO_PAGE(0), sz, 0); > > If you take a look at bio_add_page(): > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/bio.c?h=v4.13-rc4#n820 > > it will basically always return "len", which is "sz" here, unchanged. > > if (bi_size < sz) > break; > > So this pretty much never happens, with two exceptions: > > if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED))) > return 0; > > which is most likely irrelevant in this case, > > if (bio->bi_vcnt >= bio->bi_max_vecs) > return 0; > > which seem to matter in this case. However: > > if (page == bv->bv_page && > offset == bv->bv_offset + bv->bv_len) { > bv->bv_len += len; > goto done; > } > ... > if (bio->bi_vcnt >= bio->bi_max_vecs) > return 0; > ... > bio->bi_vcnt++; > done: > bio->bi_iter.bi_size += len; > return len; > > So if the first condition in the above quote is matched, the exception > would never happen either. > > On 9 August 2017 at 21:38, h...@lst.de wrote: >> Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0 >> "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?
Re: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
I haven't really tested but I was aware of the commit before I send my last email. It doesn't seem relevant to be honest, because it doesn't change the fact that the inner loop wil only end if the whole request has been looped over. So still one big bio. There are a few things that seem suspicious to me. First of all, the inner loop has an if-break at its end that seem to practically do nothing, especially in terms of making the inner loop end when the bio reach its expected size (BIO_MAX_PAGES?). bi_size = bio_add_page(bio, ZERO_PAGE(0), sz, 0); If you take a look at bio_add_page(): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/bio.c?h=v4.13-rc4#n820 it will basically always return "len", which is "sz" here, unchanged. if (bi_size < sz) break; So this pretty much never happens, with two exceptions: if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED))) return 0; which is most likely irrelevant in this case, if (bio->bi_vcnt >= bio->bi_max_vecs) return 0; which seem to matter in this case. However: if (page == bv->bv_page && offset == bv->bv_offset + bv->bv_len) { bv->bv_len += len; goto done; } ... if (bio->bi_vcnt >= bio->bi_max_vecs) return 0; ... bio->bi_vcnt++; done: bio->bi_iter.bi_size += len; return len; So if the first condition in the above quote is matched, the exception would never happen either. On 9 August 2017 at 21:38, h...@lst.dewrote: > Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0 > "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?
Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
On 08/09/2017 03:38 PM, h...@lst.de wrote: Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0 "block: Fix __blkdev_issue_zeroout loop" fix the issue for you? I crashed the 4.13-rc4 with the test above (blkdiscard -z on dm-crypt dev). Unfortunately I hadn't have my VM configured properly so the only info I got before it crashed was basically OOM. I'm going to play with it tomorrow and will let you know. O.
Re: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0 "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?