Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic

2017-08-13 Thread Damien Le Moal
On Sun, 2017-08-13 at 22:47 -0400, Mikulas Patocka wrote:
> 
> On Wed, 9 Aug 2017, h...@lst.de wrote:
> 
> > Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
> > "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?
> > 
> > --
> > dm-devel mailing list
> > dm-devel@redhat.com
> > https://www.redhat.com/mailman/listinfo/dm-devel
> 
> I think that patch is incorrect. sector_t may be a 32-bit type and 
> nr_sects << 9 may overflow.
> 
> static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
> {
>sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
> 
>return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
> }
> 
> Mikulas

Mikulas,

Does the follwing patch fix the problem ?

>From 947b3cf41e759b2b23f684e215e651d0c8037f88 Mon Sep 17 00:00:00 2001
From: Damien Le Moal 
Date: Mon, 14 Aug 2017 13:01:16 +0900
Subject: [PATCH] block: Fix __blkdev_sectors_to_bio_pages()

On 32bit systems where sector_t is a 32bits type, the calculation of
bytes may overflow. Use the u64 type for the local calculation to avoid
overflows.

Signed-off-by: Damien Le Moal 
---
 block/blk-lib.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 3fe0aec90597..ccf22dba21f0 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -269,9 +269,9 @@ static int __blkdev_issue_write_zeroes(struct block_device
*bdev,
  */
 static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
 {
-   sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;
+   u64 bytes = ((u64)nr_sects << 9) + PAGE_SIZE - 1;
 
-   return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
+   return min(bytes >> PAGE_SHIFT, (u64)BIO_MAX_PAGES);
 }
 
 /**
-- 
2.13.4


-- 
Damien Le Moal
Western Digital

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic

2017-08-13 Thread Mikulas Patocka


On Wed, 9 Aug 2017, h...@lst.de wrote:

> Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
> "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel

I think that patch is incorrect. sector_t may be a 32-bit type and 
nr_sects << 9 may overflow.

static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects)
{
   sector_t bytes = (nr_sects << 9) + PAGE_SIZE - 1;

   return min(bytes >> PAGE_SHIFT, (sector_t)BIO_MAX_PAGES);
}

Mikulas

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic

2017-08-10 Thread Tom Yan
Never mind, I guess I was wrong, coz bio_add_page() is always called
with "offset" = 0 here, so:

offset == bv->bv_offset + bv->bv_len

is never matched. Hence,

if (bio->bi_vcnt >= bio->bi_max_vecs)
return 0;

will trigger the if-break.

So maybe this is after all just a dm-crypt issue on handling multiple
bios created with next_bio (a bio chain)?

On 10 August 2017 at 07:27, Tom Yan  wrote:
> I haven't really tested but I was aware of the commit before I send my
> last email. It doesn't seem relevant to be honest, because it doesn't
> change the fact that the inner loop wil only end if the whole request
> has been looped over. So still one big bio.
>
> There are a few things that seem suspicious to me. First of all, the
> inner loop has an if-break at its end that seem to practically do
> nothing, especially in terms of making the inner loop end when the bio
> reach its expected size (BIO_MAX_PAGES?).
>
> bi_size = bio_add_page(bio, ZERO_PAGE(0), sz, 0);
>
> If you take a look at bio_add_page():
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/bio.c?h=v4.13-rc4#n820
>
> it will basically always return "len", which is "sz" here, unchanged.
>
> if (bi_size < sz)
> break;
>
> So this pretty much never happens, with two exceptions:
>
> if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
> return 0;
>
> which is most likely irrelevant in this case,
>
> if (bio->bi_vcnt >= bio->bi_max_vecs)
> return 0;
>
> which seem to matter in this case. However:
>
> if (page == bv->bv_page &&
> offset == bv->bv_offset + bv->bv_len) {
> bv->bv_len += len;
> goto done;
> }
> ...
> if (bio->bi_vcnt >= bio->bi_max_vecs)
> return 0;
> ...
> bio->bi_vcnt++;
> done:
> bio->bi_iter.bi_size += len;
> return len;
>
> So if the first condition in the above quote is matched, the exception
> would never happen either.
>
> On 9 August 2017 at 21:38, h...@lst.de  wrote:
>> Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
>> "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic

2017-08-10 Thread Tom Yan
I haven't really tested but I was aware of the commit before I send my
last email. It doesn't seem relevant to be honest, because it doesn't
change the fact that the inner loop wil only end if the whole request
has been looped over. So still one big bio.

There are a few things that seem suspicious to me. First of all, the
inner loop has an if-break at its end that seem to practically do
nothing, especially in terms of making the inner loop end when the bio
reach its expected size (BIO_MAX_PAGES?).

bi_size = bio_add_page(bio, ZERO_PAGE(0), sz, 0);

If you take a look at bio_add_page():
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/bio.c?h=v4.13-rc4#n820

it will basically always return "len", which is "sz" here, unchanged.

if (bi_size < sz)
break;

So this pretty much never happens, with two exceptions:

if (WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED)))
return 0;

which is most likely irrelevant in this case,

if (bio->bi_vcnt >= bio->bi_max_vecs)
return 0;

which seem to matter in this case. However:

if (page == bv->bv_page &&
offset == bv->bv_offset + bv->bv_len) {
bv->bv_len += len;
goto done;
}
...
if (bio->bi_vcnt >= bio->bi_max_vecs)
return 0;
...
bio->bi_vcnt++;
done:
bio->bi_iter.bi_size += len;
return len;

So if the first condition in the above quote is matched, the exception
would never happen either.

On 9 August 2017 at 21:38, h...@lst.de  wrote:
> Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
> "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic

2017-08-09 Thread Ondrej Kozina

On 08/09/2017 03:38 PM, h...@lst.de wrote:

Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0
"block: Fix __blkdev_issue_zeroout loop" fix the issue for you?


I crashed the 4.13-rc4 with the test above (blkdiscard -z on dm-crypt 
dev). Unfortunately I hadn't have my VM configured properly so the only 
info I got before it crashed was basically OOM. I'm going to play with 
it tomorrow and will let you know.


O.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic

2017-08-08 Thread Tom Yan
Hi again,

I think I have sort of identified the problem. It appears to me that both the 
block layer and dm-crypt is defected on handling this.

First of all, check out the "fallback" zero out implementation, which is used 
in this case, here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/blk-lib.c?h=v4.12#n309

>From the outer loop, it seem to imply that this should be done in multiple 
>"bio"s, if the request (original "nr_sects") is larger than BIO_MAX_PAGES:

...
while (nr_sects != 0) {
bio = next_bio(bio, min(nr_sects, (sector_t)BIO_MAX_PAGES),
gfp_mask);
...
}
...

However, there is a inner loop:

...
while (nr_sects != 0) {
sz = min((sector_t) PAGE_SIZE >> 9 , nr_sects);
bi_size = bio_add_page(bio, ZERO_PAGE(0), sz << 9, 0);
nr_sects -= bi_size >> 9;
sector += bi_size >> 9;
if (bi_size < (sz << 9))
break;
}
...

which apparently would loop over the whole request on its own, making the outer 
loop a bogus one.

The request ends up being done in a single (huge) bio. When the bio is passed 
on to dm-crypt, it appears that dm-crypt will not split the bio either when it 
allocates buffer for conversion/encryption:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/md/dm-crypt.c?h=v4.12#n1694

which leads to possible enormous uptake of memory, causing OOM / kernel panic.

There seems to be some measure that is suppose to split large bio though:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/md/dm-crypt.c?h=v4.12#n2787

Apparently it is called before kcryptd_crypt_write_convert() / 
crypt_alloc_buffer(). However, I don't really parse dm_accept_partial_bio() (or 
the comment about it) so I don't really know what it actually does or how it 
does it. Neither can I see it helps in reality anyway.

Here is another test case that shows the problem:
https://ptpb.pw/BWWo.png
https://ptpb.pw/aRTE.png

Regards,
Tom Yan


From: Tom Yan
Sent: Monday, August 7, 2017 9:58 AM
To: dm-devel@redhat.com
Subject: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
    
Hi all,

When I do the following:

cryptsetup open /dev/sdX[Y] rand --type plain --key-file /dev/random;
blkdiscard -z /dev/mapper/rand

Some OOM killings and a kernel panic occur. Here is a screenshot:
https://gitlab.com/cryptsetup/cryptsetup/uploads/207ffdada52f3172f54a014c67159625/DSC_0129.JPGhttps://gitlab.com/cryptsetup/cryptsetup/uploads/207ffdada52f3172f54a014c67159625/DSC_0129.JPG
 


Similar issue is NOT found in doing `cat /dev/zero > /dev/mapper/rand`, or 
`blkdiscard -z /dev/sdX[Y]` (without opening a dm-crypt on it), on the same 
hardware and configuration.

Note that `blkdiscard -z` does not trigger the BLKDISCARD ioctl but the 
BLKZEROOUT ioctl. And the devices I have been testing on are USB devices that 
does NOT support WRITE SAME or UNMAP.

Regards,
Tom Yan

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic

2017-08-08 Thread Tom Yan
(Sorry for the spam. Having hard time moving the thread from
outlook.com to gmail.com for linux-block. Please reply to this, thank
you.)

Hi again,

I think I have sort of identified the problem. It appears to me that
both the block layer and dm-crypt is defected on handling this.

First of all, check out the "fallback" zero out implementation, which
is used in this case, here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/blk-lib.c?h=v4.12#n309

>From the outer loop, it seem to imply that this should be done in
multiple "bio"s, if the request (original "nr_sects") is larger than
BIO_MAX_PAGES:

...
while (nr_sects != 0) {
bio = next_bio(bio, min(nr_sects, (sector_t)BIO_MAX_PAGES),
gfp_mask);
...
}
...

However, there is a inner loop:

...
while (nr_sects != 0) {
sz = min((sector_t) PAGE_SIZE >> 9 , nr_sects);
bi_size = bio_add_page(bio, ZERO_PAGE(0), sz << 9, 0);
nr_sects -= bi_size >> 9;
sector += bi_size >> 9;
if (bi_size < (sz << 9))
break;
}
...

which apparently would loop over the whole request on its own, making
the outer loop a bogus one.

The request ends up being done in a single (huge) bio. When the bio is
passed on to dm-crypt, it appears that dm-crypt will not split the bio
either when it allocates buffer for conversion/encryption:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/md/dm-crypt.c?h=v4.12#n1694

which leads to possible enormous uptake of memory, causing OOM / kernel panic.

There seems to be some measure that is suppose to split large bio though:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/md/dm-crypt.c?h=v4.12#n2787

Apparently it is called before kcryptd_crypt_write_convert() /
crypt_alloc_buffer(). However, I don't really parse
dm_accept_partial_bio() (or the comment about it) so I don't really
know what it actually does or how it does it. Neither can I see it
helps in reality anyway.

Here is another test case that shows the problem:
https://ptpb.pw/BWWo.png
https://ptpb.pw/aRTE.png

Regards,
Tom Yan


> From: Tom Yan
> Sent: Monday, August 7, 2017 9:58 AM
> To: dm-devel@redhat.com
> Subject: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
>
> Hi all,
>
> When I do the following:
>
> cryptsetup open /dev/sdX[Y] rand --type plain --key-file /dev/random;
> blkdiscard -z /dev/mapper/rand
>
> Some OOM killings and a kernel panic occur. Here is a screenshot:
> https://gitlab.com/cryptsetup/cryptsetup/uploads/207ffdada52f3172f54a014c67159625/DSC_0129.JPG
>
> Similar issue is NOT found in doing `cat /dev/zero > /dev/mapper/rand`, or 
> `blkdiscard -z /dev/sdX[Y]` (without opening a dm-crypt on it), on the same 
> hardware and configuration.
>
> Note that `blkdiscard -z` does not trigger the BLKDISCARD ioctl but the 
> BLKZEROOUT ioctl. And the devices I have been testing on are USB devices that 
> does NOT support WRITE SAME or UNMAP.
>
> Regards,
> Tom Yan

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic

2017-08-08 Thread Tom Yan
Hi again,

I think I have sort of identified the problem. It appears to me that both t=
he block layer and dm-crypt is defected on handling this.

First of all, check out the "fallback" zero out implementation, which is us=
ed in this case, here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/blo=
ck/blk-lib.c?h=3Dv4.12#n309

>From the outer loop, it seem to imply that this should be done in multiple =
"bio"s, if the request (original "nr_sects") is larger than BIO_MAX_PAGES:

...
while (nr_sects !=3D 0) {
bio =3D next_bio(bio, min(nr_sects, (sector_t)BIO_MAX_PAGES),
gfp_mask);
...
}
...

However, there is a inner loop:

...
while (nr_sects !=3D 0) {
sz =3D min((sector_t) PAGE_SIZE >> 9 , nr_sects);
bi_size =3D bio_add_page(bio, ZERO_PAGE(0), sz << 9, 0);
nr_sects -=3D bi_size >> 9;
sector +=3D bi_size >> 9;
if (bi_size < (sz << 9))
break;
}
...

which apparently would loop over the whole request on its own, making the o=
uter loop a bogus one.

The request ends up being done in a single (huge) bio. When the bio is pass=
ed on to dm-crypt, it appears that dm-crypt will not split the bio either w=
hen it allocates buffer for conversion/encryption:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/dri=
vers/md/dm-crypt.c?h=3Dv4.12#n1694

which leads to possible enormous uptake of memory, causing OOM / kernel pan=
ic.

There seems to be some measure that is suppose to split large bio though:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/dri=
vers/md/dm-crypt.c?h=3Dv4.12#n2787

Apparently it is called before kcryptd_crypt_write_convert() / crypt_alloc_=
buffer(). However, I don't really parse dm_accept_partial_bio() (or the com=
ment about it) so I don't really know what it actually does or how it does =
it. Neither can I see it helps in reality anyway.

Here is another test case that shows the problem:
https://ptpb.pw/BWWo.png
https://ptpb.pw/aRTE.png

Regards,
Tom Yan


> From: Tom Yan
> Sent: Monday, August 7, 2017 9:58 AM
> To: dm-devel@redhat.com
> Subject: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
>
> Hi all,
>
> When I do the following:
>
> cryptsetup open /dev/sdX[Y] rand --type plain --key-file /dev/random;
> blkdiscard -z /dev/mapper/rand
>
> Some OOM killings and a kernel panic occur. Here is a screenshot:
> https://gitlab.com/cryptsetup/cryptsetup/uploads/207ffdada52f3172f54a014c67=
> 159625/DSC_0129.JPGhttps://gitlab.com/cryptsetup/cryptsetup/uploads/207ffda=
> da52f3172f54a014c67159625/DSC_0129.JPG=20
>
>
> Similar issue is NOT found in doing `cat /dev/zero > /dev/mapper/rand`, or =
> `blkdiscard -z /dev/sdX[Y]` (without opening a dm-crypt on it), on the same=
>  hardware and configuration.
>
> Note that `blkdiscard -z` does not trigger the BLKDISCARD ioctl but the BLK=
> ZEROOUT ioctl. And the devices I have been testing on are USB devices that =
> does NOT support WRITE SAME or UNMAP.
>
> Regards,
> Tom Yan

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel