Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic

2017-08-13 Thread Damien Le Moal
On Sun, 2017-08-13 at 22:47 -0400, Mikulas Patocka wrote:
> 
> On Wed, 9 Aug 2017, h...@lst.de wrote:
> 
> > Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
> > "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?
> > 
> > --
> > dm-devel mailing list
> > dm-de...@redhat.com
> > https://www.redhat.com/mailman/listinfo/dm-devel
> 
> I think that patch is incorrect. sector_t may be a 32-bit type and 
> nr_sects << 9 may overflow.
> 
> static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
> {
>sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
> 
>return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
> }
> 
> Mikulas

Mikulas,

Does the follwing patch fix the problem ?

From 947b3cf41e759b2b23f684e215e651d0c8037f88 Mon Sep 17 00:00:00 2001
From: Damien Le Moal 
Date: Mon, 14 Aug 2017 13:01:16 +0900
Subject: [PATCH] block: Fix __blkdev_sectors_to_bio_pages()

On 32bit systems where sector_t is a 32bits type, the calculation of
bytes may overflow. Use the u64 type for the local calculation to avoid
overflows.

Signed-off-by: Damien Le Moal 
---
 block/blk-lib.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 3fe0aec90597..ccf22dba21f0 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -269,9 +269,9 @@ static int __blkdev_issue_write_zeroes(struct block_device
*bdev,
  */
 static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
 {
-   sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
+   u64 bytes = ((u64)nr_sects << 9) + PAGE_SIZE - 1;
 
-   return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
+   return min(bytes >> PAGE_SHIFT, (u64)BIO_MAX_PAGES);
 }
 
 /**
-- 
2.13.4


-- 
Damien Le Moal
Western Digital

Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic

2017-08-13 Thread Mikulas Patocka


On Wed, 9 Aug 2017, h...@lst.de wrote:

> Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
> "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?
> 
> --
> dm-devel mailing list
> dm-de...@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel

I think that patch is incorrect. sector_t may be a 32-bit type and 
nr_sects << 9 may overflow.

static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
{
   sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;

   return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
}

Mikulas


Re: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic

2017-08-10 Thread Tom Yan
Never mind, I guess I was wrong, coz bio_add_page() is always called
with "offset" = 0 here, so:

offset == bv->bv_offset + bv->bv_len

is never matched. Hence,

if (bio->bi_vcnt >= bio->bi_max_vecs)
return 0;

will trigger the if-break.

So maybe this is after all just a dm-crypt issue on handling multiple
bios created with next_bio (a bio chain)?

On 10 August 2017 at 07:27, Tom Yan  wrote:
> I haven't really tested but I was aware of the commit before I send my
> last email. It doesn't seem relevant to be honest, because it doesn't
> change the fact that the inner loop wil only end if the whole request
> has been looped over. So still one big bio.
>
> There are a few things that seem suspicious to me. First of all, the
> inner loop has an if-break at its end that seem to practically do
> nothing, especially in terms of making the inner loop end when the bio
> reach its expected size (BIO_MAX_PAGES?).
>
> bi_size = bio_add_page(bio, ZERO_PAGE(0), sz, 0);
>
> If you take a look at bio_add_page():
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/bio.c?h=v4.13-rc4#n820
>
> it will basically always return "len", which is "sz" here, unchanged.
>
> if (bi_size < sz)
> break;
>
> So this pretty much never happens, with two exceptions:
>
> if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
> return 0;
>
> which is most likely irrelevant in this case,
>
> if (bio->bi_vcnt >= bio->bi_max_vecs)
> return 0;
>
> which seem to matter in this case. However:
>
> if (page == bv->bv_page &&
> offset == bv->bv_offset + bv->bv_len) {
> bv->bv_len += len;
> goto done;
> }
> ...
> if (bio->bi_vcnt >= bio->bi_max_vecs)
> return 0;
> ...
> bio->bi_vcnt++;
> done:
> bio->bi_iter.bi_size += len;
> return len;
>
> So if the first condition in the above quote is matched, the exception
> would never happen either.
>
> On 9 August 2017 at 21:38, h...@lst.de  wrote:
>> Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
>> "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?


Re: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic

2017-08-09 Thread Tom Yan
I haven't really tested but I was aware of the commit before I send my
last email. It doesn't seem relevant to be honest, because it doesn't
change the fact that the inner loop wil only end if the whole request
has been looped over. So still one big bio.

There are a few things that seem suspicious to me. First of all, the
inner loop has an if-break at its end that seem to practically do
nothing, especially in terms of making the inner loop end when the bio
reach its expected size (BIO_MAX_PAGES?).

bi_size = bio_add_page(bio, ZERO_PAGE(0), sz, 0);

If you take a look at bio_add_page():
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/bio.c?h=v4.13-rc4#n820

it will basically always return "len", which is "sz" here, unchanged.

if (bi_size < sz)
break;

So this pretty much never happens, with two exceptions:

if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
return 0;

which is most likely irrelevant in this case,

if (bio->bi_vcnt >= bio->bi_max_vecs)
return 0;

which seem to matter in this case. However:

if (page == bv->bv_page &&
offset == bv->bv_offset + bv->bv_len) {
bv->bv_len += len;
goto done;
}
...
if (bio->bi_vcnt >= bio->bi_max_vecs)
return 0;
...
bio->bi_vcnt++;
done:
bio->bi_iter.bi_size += len;
return len;

So if the first condition in the above quote is matched, the exception
would never happen either.

On 9 August 2017 at 21:38, h...@lst.de  wrote:
> Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
> "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?


Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic

2017-08-09 Thread Ondrej Kozina

On 08/09/2017 03:38 PM, h...@lst.de wrote:

Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
"block: Fix __blkdev_issue_zeroout loop" fix the issue for you?


I crashed the 4.13-rc4 with the test above (blkdiscard -z on dm-crypt 
dev). Unfortunately I hadn't have my VM configured properly so the only 
info I got before it crashed was basically OOM. I'm going to play with 
it tomorrow and will let you know.


O.


Re: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic

2017-08-09 Thread h...@lst.de
Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
"block: Fix __blkdev_issue_zeroout loop" fix the issue for you?