Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-22 Thread Anton Altaparmakov
Hi Linus,

On 22 Sep 2014, at 16:33, Linus Torvalds  wrote:
> On Mon, Sep 22, 2014 at 8:29 AM, Anton Altaparmakov  wrote:
>> 
>> You could do "block & ~(sector_t)(size - 1)" instead of "(sector_t)index << 
>> sizebits" if you prefer but not sure that is an improvement!
> 
> No, it would be even worse. Something like
> 
>  block & ~(sector_t)((size >> 9) - 1)
> 
> because block is the sector number (ie 512-byte) and size is in bytes.

Oops, sorry.  But I think you got it wrong, too as you are ignoring the 
PAGE_SIZE - as was I but it is what we need to align to in addition to the 
problem of "size" being in bytes.  So I think the correct mask is actually 
based on sizebits which reflects the number of blocks per page thus:

block & ~(sector_t)((1 << sizebits) - 1)

In any case the shift is the lesser evil I think as it is at least obviously 
correct whilst getting the right mask has taken us a few iterations of 
correcting each other! (-:

PS. Thank you for taking my patch and correcting the misleading description!

Best regards,

Anton

>   Linus

-- 
Anton Altaparmakov  (replace at with @)
University of Cambridge Information Services, Roger Needham Building
7 JJ Thomson Avenue, Cambridge, CB3 0RB, UK

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-22 Thread Linus Torvalds
On Mon, Sep 22, 2014 at 8:29 AM, Anton Altaparmakov  wrote:
>
> You could do "block & ~(sector_t)(size - 1)" instead of "(sector_t)index << 
> sizebits" if you prefer but not sure that is an improvement!

No, it would be even worse. Something like

  block & ~(sector_t)((size >> 9) - 1)

because block is the sector number (ie 512-byte) and size is in bytes.

   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-22 Thread Anton Altaparmakov
Hi Linus,

On 22 Sep 2014, at 16:18, Linus Torvalds  wrote:
> On Sun, Sep 21, 2014 at 5:53 PM, Anton Altaparmakov  wrote:
>> 
>> This patch fixes this issue by type casting "index" to sector_t before
>> doing the left shift.
> 
> Ugh. Does the simpler patch to just pass in "block" work as well?

That doesn't work because nothing rounds down "block" to the first block in the 
page and init_page_buffers() requires "block" to be the first block in the page.

The shift right followed by shift left achieves the "rounding down" required.

You could do "block & ~(sector_t)(size - 1)" instead of "(sector_t)index << 
sizebits" if you prefer but not sure that is an improvement!

Best regards,

Anton

>Linus
> 


-- 
Anton Altaparmakov  (replace at with @)
University of Cambridge Information Services, Roger Needham Building
7 JJ Thomson Avenue, Cambridge, CB3 0RB, UK

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-22 Thread Linus Torvalds
On Mon, Sep 22, 2014 at 8:18 AM, Linus Torvalds
 wrote:
>
> Ugh. Does the simpler patch to just pass in "block" work as well?

Ugh, never mind, no it does not. It doesn't truncate the sectors to
the beginning of a page boundary.

Will take the patch as-is (will edit the commit message that seems a
bit inaccurate).

 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-22 Thread Linus Torvalds
On Sun, Sep 21, 2014 at 5:53 PM, Anton Altaparmakov  wrote:
>
> This patch fixes this issue by type casting "index" to sector_t before
> doing the left shift.

Ugh. Does the simpler patch to just pass in "block" work as well?

Linus
 fs/buffer.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 8f05111bbb8b..f32d6a3cff38 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1021,8 +1021,7 @@ grow_dev_page(struct block_device *bdev, sector_t block,
if (page_has_buffers(page)) {
bh = page_buffers(page);
if (bh->b_size == size) {
-   end_block = init_page_buffers(page, bdev,
-   index << sizebits, size);
+   end_block = init_page_buffers(page, bdev, block, size);
goto done;
}
if (!try_to_free_buffers(page))
@@ -1043,7 +1042,7 @@ grow_dev_page(struct block_device *bdev, sector_t block,
 */
spin_lock(>i_mapping->private_lock);
link_dev_buffers(page, bh);
-   end_block = init_page_buffers(page, bdev, index << sizebits, size);
+   end_block = init_page_buffers(page, bdev, block, size);
spin_unlock(>i_mapping->private_lock);
 done:
ret = (block < end_block) ? 1 : -ENXIO;


Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-22 Thread Anton Altaparmakov
Hi,

On 22 Sep 2014, at 11:36, Hugh Dickins  wrote:
> On Mon, 22 Sep 2014, Anton Altaparmakov wrote:
>> On 22 Sep 2014, at 05:43, Hugh Dickins  wrote:
>>> On Mon, 22 Sep 2014, Anton Altaparmakov wrote:
 Any code that uses __getblk() and thus bread(), breadahead(), sb_bread(), 
 sb_breadahead(), sb_getblk(), and calls it using a 64-bit block on a 
 32-bit arch (where "long" is 32-bit) causes an inifinite loop in 
 __getblk_slow() with an infinite stream of errors logged to dmesg like 
 this:
 
 __find_get_block_slow() failed. block=6740375944, b_blocknr=2445408648
 b_state=0x0020, b_size=512
 device sda1 blocksize: 512
 
 Note how in hex block is 0x191C1F988 and b_blocknr is 0x91C1F988 i.e. the 
 top 32-bits are missing (in this case the 0x1 at the top).
 
 This is because grow_dev_page() was broken in commit 676ce6d5ca30: "block: 
 replace __getblk_slow misfix by grow_dev_page fix" by Hugh Dickins so that 
 it now has a 32-bit overflow due to shifting the block value to the right 
 so it fits in 32-bits and storing the result in pgoff_t variable which is 
 32-bit and then passing the pgoff_t variable left-shifted as the block 
 number which causes the top bits to get lost as the pgoff_t is not type 
 cast to sector_t / 64-bit before the shift.
 
 This patch fixes this issue by type casting "index" to sector_t before 
 doing the left shift.
 
 Note this is not a theoretical bug but has been seen in the field on a 
 4TiB hard drive with logical sector size 512 bytes.
 
 This patch has been verified to fix the infinite loop problem on 3.17-rc5 
 kernel using a 4TB disk image mounted using "-o loop".  Without this patch 
 doing a "find /nt" where /nt is an NTFS volume causes the inifinite loop 
 100% reproducibly whilst with the patch it works fine as expected.
 
 Signed-off-by: Anton Altaparmakov 
 Cc: sta...@vger.kernel.org # 3.6 3.8 3.10 3.12 3.14 3.16
>>> 
>>> Acked-by: Hugh Dickins 
>>> 
>>> Yes indeed, that's bad, and entirely my fault (though it took me a while
>>> to see how the "block = index << sizebits" which was there before was okay,
>> 
>> Ouch.  I failed to see that (I admit I did not pay too much attention to the 
>> original code - I just used "git blame" to find out which commit put that 
>> code in place - my bad).  It was never ok!  I went back to 2.6.12-rc2 
>> (original commit to git Linus' kernel repository) and that is also affected. 
>>  That implies this is the first time anyone has used a 4TiB disk with 512 
>> byte sectors with NTFS where Windows had previously created a file/directory 
>> with an attribute list attribute in it.  -  That is the only metadata type 
>> we use sb_bread() for and the disk image from the customer does contain it 
>> and I verified that is where the inifinite loop comes from.
> 
> Ow, that line needs some truncating itself :)
> 
> You generously interpret my words as seeing that for myself.
> No, thank you for following up: I had persuaded myself when writing
> before, that index would be promoted from pgoff_t to sector_t before
> shifting in the original assignment, but not when I passed it as arg.
> 
> I've resorted to writing a proggy to check, and it looks like I was
> earlier confused, and you are right, and that code was "always" wrong.
> 
> Not much use of 4TiB disks on 32-bit kernels I suppose.

Actually on embedded devices like broadband routers/home network NAS boxes/PVR 
boxes, etc it is very common to have 32-bit (on ARM/MIPS or more recently also 
x86) and more and more people are plugging in larger disks to store their 
music/videos/movies, etc...  So this is something that is on the up and going 
to become an increasing problem as large disks get cheaper and cheaper.

>> So it appears sb_bread() and friends have always been broken on 32-bit 
>> architectures when accessing blocks outside the range 2^32 - 1 and it 
>> appears google finds lots of occurrences of such infinite loops being 
>> reported but the fixes have always been to not make the calls in the first 
>> place.  No-one seems to have recognized that there is a genuine problem here 
>> before.
>> 
>> Still my patch stands correct and should be applied to all kernel versions 
>> that have your patch and older kernels should have an equivalent patch 
>> applied to fix them, too for people who run very old kernels.
> 
> Yes, though I'm now uncertain whether your patch is a bugfix or an
> enhancement.  Definitely a bugfix, given the infinite loops.  But the
> oldest code ("index = block >> sizebits; block = index << sizebits;")
> is so clearly trying to truncate block, that I wonder what departing
> from that might be letting us in for.

Having looked at it, I think I understand the original code and intention - it 
is not truncating the top bits, it is truncating the bottom bits, i.e. it is 
rounding down so that "block" becomes the first block 

Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-22 Thread Hugh Dickins
On Mon, 22 Sep 2014, Anton Altaparmakov wrote:

> Hi Hugh,
> 
> On 22 Sep 2014, at 05:43, Hugh Dickins  wrote:
> > On Mon, 22 Sep 2014, Anton Altaparmakov wrote:
> >> Any code that uses __getblk() and thus bread(), breadahead(), sb_bread(), 
> >> sb_breadahead(), sb_getblk(), and calls it using a 64-bit block on a 
> >> 32-bit arch (where "long" is 32-bit) causes an inifinite loop in 
> >> __getblk_slow() with an infinite stream of errors logged to dmesg like 
> >> this:
> >> 
> >> __find_get_block_slow() failed. block=6740375944, b_blocknr=2445408648
> >> b_state=0x0020, b_size=512
> >> device sda1 blocksize: 512
> >> 
> >> Note how in hex block is 0x191C1F988 and b_blocknr is 0x91C1F988 i.e. the 
> >> top 32-bits are missing (in this case the 0x1 at the top).
> >> 
> >> This is because grow_dev_page() was broken in commit 676ce6d5ca30: "block: 
> >> replace __getblk_slow misfix by grow_dev_page fix" by Hugh Dickins so that 
> >> it now has a 32-bit overflow due to shifting the block value to the right 
> >> so it fits in 32-bits and storing the result in pgoff_t variable which is 
> >> 32-bit and then passing the pgoff_t variable left-shifted as the block 
> >> number which causes the top bits to get lost as the pgoff_t is not type 
> >> cast to sector_t / 64-bit before the shift.
> >> 
> >> This patch fixes this issue by type casting "index" to sector_t before 
> >> doing the left shift.
> >> 
> >> Note this is not a theoretical bug but has been seen in the field on a 
> >> 4TiB hard drive with logical sector size 512 bytes.
> >> 
> >> This patch has been verified to fix the infinite loop problem on 3.17-rc5 
> >> kernel using a 4TB disk image mounted using "-o loop".  Without this patch 
> >> doing a "find /nt" where /nt is an NTFS volume causes the inifinite loop 
> >> 100% reproducibly whilst with the patch it works fine as expected.
> >> 
> >> Signed-off-by: Anton Altaparmakov 
> >> Cc: sta...@vger.kernel.org # 3.6 3.8 3.10 3.12 3.14 3.16
> > 
> > Acked-by: Hugh Dickins 
> > 
> > Yes indeed, that's bad, and entirely my fault (though it took me a while
> > to see how the "block = index << sizebits" which was there before was okay,
> 
> Ouch.  I failed to see that (I admit I did not pay too much attention to the 
> original code - I just used "git blame" to find out which commit put that 
> code in place - my bad).  It was never ok!  I went back to 2.6.12-rc2 
> (original commit to git Linus' kernel repository) and that is also affected.  
> That implies this is the first time anyone has used a 4TiB disk with 512 byte 
> sectors with NTFS where Windows had previously created a file/directory with 
> an attribute list attribute in it.  -  That is the only metadata type we use 
> sb_bread() for and the disk image from the customer does contain it and I 
> verified that is where the inifinite loop comes from.

Ow, that line needs some truncating itself :)

You generously interpret my words as seeing that for myself.
No, thank you for following up: I had persuaded myself when writing
before, that index would be promoted from pgoff_t to sector_t before
shifting in the original assignment, but not when I passed it as arg.

I've resorted to writing a proggy to check, and it looks like I was
earlier confused, and you are right, and that code was "always" wrong.

Not much use of 4TiB disks on 32-bit kernels I suppose.

> 
> So it appears sb_bread() and friends have always been broken on 32-bit 
> architectures when accessing blocks outside the range 2^32 - 1 and it appears 
> google finds lots of occurrences of such infinite loops being reported but 
> the fixes have always been to not make the calls in the first place.  No-one 
> seems to have recognized that there is a genuine problem here before.
> 
> Still my patch stands correct and should be applied to all kernel versions 
> that have your patch and older kernels should have an equivalent patch 
> applied to fix them, too for people who run very old kernels.

Yes, though I'm now uncertain whether your patch is a bugfix or an
enhancement.  Definitely a bugfix, given the infinite loops.  But the
oldest code ("index = block >> sizebits; block = index << sizebits;")
is so clearly trying to truncate block, that I wonder what departing
from that might be letting us in for.

I see Andrew did 2.6.19's intervening e5657933863f ("grow_buffers()
infinite loop fix"): let's hope that he has a clearer head in the
morning than I have now - there's a chance that he might think it
safer to extend that check to exclude your case, than take your patch.

I hope not, that would not please you; but right now I'm ruling
myself out of grasping the issues here!

Hugh

> 
> > but my passing "index << sizebits" as arg to function very much not okay).
> > Thank you, Anton.
> 
> You are welcome.
> 
> > But I see my commit was marked for stable 3.0 3.2 3.4 3.5: so your fix
> > is needed in 3.2 and 3.4 longterm also (the others now beyond life).
> 
> You are quite right.  It 

Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-22 Thread Anton Altaparmakov
Hi Hugh,

On 22 Sep 2014, at 05:43, Hugh Dickins  wrote:
> On Mon, 22 Sep 2014, Anton Altaparmakov wrote:
>> Any code that uses __getblk() and thus bread(), breadahead(), sb_bread(), 
>> sb_breadahead(), sb_getblk(), and calls it using a 64-bit block on a 
>> 32-bit arch (where "long" is 32-bit) causes an inifinite loop in 
>> __getblk_slow() with an infinite stream of errors logged to dmesg like 
>> this:
>> 
>> __find_get_block_slow() failed. block=6740375944, b_blocknr=2445408648
>> b_state=0x0020, b_size=512
>> device sda1 blocksize: 512
>> 
>> Note how in hex block is 0x191C1F988 and b_blocknr is 0x91C1F988 i.e. the 
>> top 32-bits are missing (in this case the 0x1 at the top).
>> 
>> This is because grow_dev_page() was broken in commit 676ce6d5ca30: "block: 
>> replace __getblk_slow misfix by grow_dev_page fix" by Hugh Dickins so that 
>> it now has a 32-bit overflow due to shifting the block value to the right 
>> so it fits in 32-bits and storing the result in pgoff_t variable which is 
>> 32-bit and then passing the pgoff_t variable left-shifted as the block 
>> number which causes the top bits to get lost as the pgoff_t is not type 
>> cast to sector_t / 64-bit before the shift.
>> 
>> This patch fixes this issue by type casting "index" to sector_t before 
>> doing the left shift.
>> 
>> Note this is not a theoretical bug but has been seen in the field on a 
>> 4TiB hard drive with logical sector size 512 bytes.
>> 
>> This patch has been verified to fix the infinite loop problem on 3.17-rc5 
>> kernel using a 4TB disk image mounted using "-o loop".  Without this patch 
>> doing a "find /nt" where /nt is an NTFS volume causes the inifinite loop 
>> 100% reproducibly whilst with the patch it works fine as expected.
>> 
>> Signed-off-by: Anton Altaparmakov 
>> Cc: sta...@vger.kernel.org # 3.6 3.8 3.10 3.12 3.14 3.16
> 
> Acked-by: Hugh Dickins 
> 
> Yes indeed, that's bad, and entirely my fault (though it took me a while
> to see how the "block = index << sizebits" which was there before was okay,

Ouch.  I failed to see that (I admit I did not pay too much attention to the 
original code - I just used "git blame" to find out which commit put that code 
in place - my bad).  It was never ok!  I went back to 2.6.12-rc2 (original 
commit to git Linus' kernel repository) and that is also affected.  That 
implies this is the first time anyone has used a 4TiB disk with 512 byte 
sectors with NTFS where Windows had previously created a file/directory with an 
attribute list attribute in it.  -  That is the only metadata type we use 
sb_bread() for and the disk image from the customer does contain it and I 
verified that is where the inifinite loop comes from.

So it appears sb_bread() and friends have always been broken on 32-bit 
architectures when accessing blocks outside the range 2^32 - 1 and it appears 
google finds lots of occurrences of such infinite loops being reported but the 
fixes have always been to not make the calls in the first place.  No-one seems 
to have recognized that there is a genuine problem here before.

Still my patch stands correct and should be applied to all kernel versions that 
have your patch and older kernels should have an equivalent patch applied to 
fix them, too for people who run very old kernels.

> but my passing "index << sizebits" as arg to function very much not okay).
> Thank you, Anton.

You are welcome.

> But I see my commit was marked for stable 3.0 3.2 3.4 3.5: so your fix
> is needed in 3.2 and 3.4 longterm also (the others now beyond life).

You are quite right.  It needs to go back to all those kernels, too.  Thank you!

Best regards,

Anton

> Hugh
> 
>> ---
>> 
>> Linus, can you please apply this?  Alternatively, Andrew, can you please 
>> pick this up and send it to Linus?
>> 
>> It would be good it it can be applied for 3.17 as well as to all stable 
>> kernels >= 3.6 as they are all broken!
>> 
>> Thanks a lot in advance!
>> 
>> Best regards,
>> 
>>  Anton
>> -- 
>> Anton Altaparmakov  (replace at with @)
>> Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
>> Linux NTFS maintainer, http://www.linux-ntfs.org/
>> 
>> diff --git a/fs/buffer.c b/fs/buffer.c
>> index 8f05111..3588a80 100644
>> --- a/fs/buffer.c
>> +++ b/fs/buffer.c
>> @@ -1022,7 +1022,8 @@ grow_dev_page(struct block_device *bdev, sector_t 
>> block,
>>  bh = page_buffers(page);
>>  if (bh->b_size == size) {
>>  end_block = init_page_buffers(page, bdev,
>> -index << sizebits, size);
>> +(sector_t)index << sizebits,
>> +size);
>>  goto done;
>>  }
>>  if (!try_to_free_buffers(page))
>> @@ -1043,7 +1044,8 @@ grow_dev_page(struct block_device *bdev, sector_t 
>> block,
>>   */
>>  spin_lock(>i_mapping->private_lock);
>>  

Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-22 Thread Anton Altaparmakov
Hi Hugh,

On 22 Sep 2014, at 05:43, Hugh Dickins hu...@google.com wrote:
 On Mon, 22 Sep 2014, Anton Altaparmakov wrote:
 Any code that uses __getblk() and thus bread(), breadahead(), sb_bread(), 
 sb_breadahead(), sb_getblk(), and calls it using a 64-bit block on a 
 32-bit arch (where long is 32-bit) causes an inifinite loop in 
 __getblk_slow() with an infinite stream of errors logged to dmesg like 
 this:
 
 __find_get_block_slow() failed. block=6740375944, b_blocknr=2445408648
 b_state=0x0020, b_size=512
 device sda1 blocksize: 512
 
 Note how in hex block is 0x191C1F988 and b_blocknr is 0x91C1F988 i.e. the 
 top 32-bits are missing (in this case the 0x1 at the top).
 
 This is because grow_dev_page() was broken in commit 676ce6d5ca30: block: 
 replace __getblk_slow misfix by grow_dev_page fix by Hugh Dickins so that 
 it now has a 32-bit overflow due to shifting the block value to the right 
 so it fits in 32-bits and storing the result in pgoff_t variable which is 
 32-bit and then passing the pgoff_t variable left-shifted as the block 
 number which causes the top bits to get lost as the pgoff_t is not type 
 cast to sector_t / 64-bit before the shift.
 
 This patch fixes this issue by type casting index to sector_t before 
 doing the left shift.
 
 Note this is not a theoretical bug but has been seen in the field on a 
 4TiB hard drive with logical sector size 512 bytes.
 
 This patch has been verified to fix the infinite loop problem on 3.17-rc5 
 kernel using a 4TB disk image mounted using -o loop.  Without this patch 
 doing a find /nt where /nt is an NTFS volume causes the inifinite loop 
 100% reproducibly whilst with the patch it works fine as expected.
 
 Signed-off-by: Anton Altaparmakov ai...@cantab.net
 Cc: sta...@vger.kernel.org # 3.6 3.8 3.10 3.12 3.14 3.16
 
 Acked-by: Hugh Dickins hu...@google.com
 
 Yes indeed, that's bad, and entirely my fault (though it took me a while
 to see how the block = index  sizebits which was there before was okay,

Ouch.  I failed to see that (I admit I did not pay too much attention to the 
original code - I just used git blame to find out which commit put that code 
in place - my bad).  It was never ok!  I went back to 2.6.12-rc2 (original 
commit to git Linus' kernel repository) and that is also affected.  That 
implies this is the first time anyone has used a 4TiB disk with 512 byte 
sectors with NTFS where Windows had previously created a file/directory with an 
attribute list attribute in it.  -  That is the only metadata type we use 
sb_bread() for and the disk image from the customer does contain it and I 
verified that is where the inifinite loop comes from.

So it appears sb_bread() and friends have always been broken on 32-bit 
architectures when accessing blocks outside the range 2^32 - 1 and it appears 
google finds lots of occurrences of such infinite loops being reported but the 
fixes have always been to not make the calls in the first place.  No-one seems 
to have recognized that there is a genuine problem here before.

Still my patch stands correct and should be applied to all kernel versions that 
have your patch and older kernels should have an equivalent patch applied to 
fix them, too for people who run very old kernels.

 but my passing index  sizebits as arg to function very much not okay).
 Thank you, Anton.

You are welcome.

 But I see my commit was marked for stable 3.0 3.2 3.4 3.5: so your fix
 is needed in 3.2 and 3.4 longterm also (the others now beyond life).

You are quite right.  It needs to go back to all those kernels, too.  Thank you!

Best regards,

Anton

 Hugh
 
 ---
 
 Linus, can you please apply this?  Alternatively, Andrew, can you please 
 pick this up and send it to Linus?
 
 It would be good it it can be applied for 3.17 as well as to all stable 
 kernels = 3.6 as they are all broken!
 
 Thanks a lot in advance!
 
 Best regards,
 
  Anton
 -- 
 Anton Altaparmakov aia21 at cam.ac.uk (replace at with @)
 Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
 Linux NTFS maintainer, http://www.linux-ntfs.org/
 
 diff --git a/fs/buffer.c b/fs/buffer.c
 index 8f05111..3588a80 100644
 --- a/fs/buffer.c
 +++ b/fs/buffer.c
 @@ -1022,7 +1022,8 @@ grow_dev_page(struct block_device *bdev, sector_t 
 block,
  bh = page_buffers(page);
  if (bh-b_size == size) {
  end_block = init_page_buffers(page, bdev,
 -index  sizebits, size);
 +(sector_t)index  sizebits,
 +size);
  goto done;
  }
  if (!try_to_free_buffers(page))
 @@ -1043,7 +1044,8 @@ grow_dev_page(struct block_device *bdev, sector_t 
 block,
   */
  spin_lock(inode-i_mapping-private_lock);
  link_dev_buffers(page, bh);
 -end_block = init_page_buffers(page, bdev, index  sizebits, size);
 +end_block = 

Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-22 Thread Hugh Dickins
On Mon, 22 Sep 2014, Anton Altaparmakov wrote:

 Hi Hugh,
 
 On 22 Sep 2014, at 05:43, Hugh Dickins hu...@google.com wrote:
  On Mon, 22 Sep 2014, Anton Altaparmakov wrote:
  Any code that uses __getblk() and thus bread(), breadahead(), sb_bread(), 
  sb_breadahead(), sb_getblk(), and calls it using a 64-bit block on a 
  32-bit arch (where long is 32-bit) causes an inifinite loop in 
  __getblk_slow() with an infinite stream of errors logged to dmesg like 
  this:
  
  __find_get_block_slow() failed. block=6740375944, b_blocknr=2445408648
  b_state=0x0020, b_size=512
  device sda1 blocksize: 512
  
  Note how in hex block is 0x191C1F988 and b_blocknr is 0x91C1F988 i.e. the 
  top 32-bits are missing (in this case the 0x1 at the top).
  
  This is because grow_dev_page() was broken in commit 676ce6d5ca30: block: 
  replace __getblk_slow misfix by grow_dev_page fix by Hugh Dickins so that 
  it now has a 32-bit overflow due to shifting the block value to the right 
  so it fits in 32-bits and storing the result in pgoff_t variable which is 
  32-bit and then passing the pgoff_t variable left-shifted as the block 
  number which causes the top bits to get lost as the pgoff_t is not type 
  cast to sector_t / 64-bit before the shift.
  
  This patch fixes this issue by type casting index to sector_t before 
  doing the left shift.
  
  Note this is not a theoretical bug but has been seen in the field on a 
  4TiB hard drive with logical sector size 512 bytes.
  
  This patch has been verified to fix the infinite loop problem on 3.17-rc5 
  kernel using a 4TB disk image mounted using -o loop.  Without this patch 
  doing a find /nt where /nt is an NTFS volume causes the inifinite loop 
  100% reproducibly whilst with the patch it works fine as expected.
  
  Signed-off-by: Anton Altaparmakov ai...@cantab.net
  Cc: sta...@vger.kernel.org # 3.6 3.8 3.10 3.12 3.14 3.16
  
  Acked-by: Hugh Dickins hu...@google.com
  
  Yes indeed, that's bad, and entirely my fault (though it took me a while
  to see how the block = index  sizebits which was there before was okay,
 
 Ouch.  I failed to see that (I admit I did not pay too much attention to the 
 original code - I just used git blame to find out which commit put that 
 code in place - my bad).  It was never ok!  I went back to 2.6.12-rc2 
 (original commit to git Linus' kernel repository) and that is also affected.  
 That implies this is the first time anyone has used a 4TiB disk with 512 byte 
 sectors with NTFS where Windows had previously created a file/directory with 
 an attribute list attribute in it.  -  That is the only metadata type we use 
 sb_bread() for and the disk image from the customer does contain it and I 
 verified that is where the inifinite loop comes from.

Ow, that line needs some truncating itself :)

You generously interpret my words as seeing that for myself.
No, thank you for following up: I had persuaded myself when writing
before, that index would be promoted from pgoff_t to sector_t before
shifting in the original assignment, but not when I passed it as arg.

I've resorted to writing a proggy to check, and it looks like I was
earlier confused, and you are right, and that code was always wrong.

Not much use of 4TiB disks on 32-bit kernels I suppose.

 
 So it appears sb_bread() and friends have always been broken on 32-bit 
 architectures when accessing blocks outside the range 2^32 - 1 and it appears 
 google finds lots of occurrences of such infinite loops being reported but 
 the fixes have always been to not make the calls in the first place.  No-one 
 seems to have recognized that there is a genuine problem here before.
 
 Still my patch stands correct and should be applied to all kernel versions 
 that have your patch and older kernels should have an equivalent patch 
 applied to fix them, too for people who run very old kernels.

Yes, though I'm now uncertain whether your patch is a bugfix or an
enhancement.  Definitely a bugfix, given the infinite loops.  But the
oldest code (index = block  sizebits; block = index  sizebits;)
is so clearly trying to truncate block, that I wonder what departing
from that might be letting us in for.

I see Andrew did 2.6.19's intervening e5657933863f (grow_buffers()
infinite loop fix): let's hope that he has a clearer head in the
morning than I have now - there's a chance that he might think it
safer to extend that check to exclude your case, than take your patch.

I hope not, that would not please you; but right now I'm ruling
myself out of grasping the issues here!

Hugh

 
  but my passing index  sizebits as arg to function very much not okay).
  Thank you, Anton.
 
 You are welcome.
 
  But I see my commit was marked for stable 3.0 3.2 3.4 3.5: so your fix
  is needed in 3.2 and 3.4 longterm also (the others now beyond life).
 
 You are quite right.  It needs to go back to all those kernels, too.  Thank 
 you!
 
 Best regards,
 
   Anton
 
  Hugh
  
  ---
  
  Linus, can you please 

Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-22 Thread Anton Altaparmakov
Hi,

On 22 Sep 2014, at 11:36, Hugh Dickins hu...@google.com wrote:
 On Mon, 22 Sep 2014, Anton Altaparmakov wrote:
 On 22 Sep 2014, at 05:43, Hugh Dickins hu...@google.com wrote:
 On Mon, 22 Sep 2014, Anton Altaparmakov wrote:
 Any code that uses __getblk() and thus bread(), breadahead(), sb_bread(), 
 sb_breadahead(), sb_getblk(), and calls it using a 64-bit block on a 
 32-bit arch (where long is 32-bit) causes an inifinite loop in 
 __getblk_slow() with an infinite stream of errors logged to dmesg like 
 this:
 
 __find_get_block_slow() failed. block=6740375944, b_blocknr=2445408648
 b_state=0x0020, b_size=512
 device sda1 blocksize: 512
 
 Note how in hex block is 0x191C1F988 and b_blocknr is 0x91C1F988 i.e. the 
 top 32-bits are missing (in this case the 0x1 at the top).
 
 This is because grow_dev_page() was broken in commit 676ce6d5ca30: block: 
 replace __getblk_slow misfix by grow_dev_page fix by Hugh Dickins so that 
 it now has a 32-bit overflow due to shifting the block value to the right 
 so it fits in 32-bits and storing the result in pgoff_t variable which is 
 32-bit and then passing the pgoff_t variable left-shifted as the block 
 number which causes the top bits to get lost as the pgoff_t is not type 
 cast to sector_t / 64-bit before the shift.
 
 This patch fixes this issue by type casting index to sector_t before 
 doing the left shift.
 
 Note this is not a theoretical bug but has been seen in the field on a 
 4TiB hard drive with logical sector size 512 bytes.
 
 This patch has been verified to fix the infinite loop problem on 3.17-rc5 
 kernel using a 4TB disk image mounted using -o loop.  Without this patch 
 doing a find /nt where /nt is an NTFS volume causes the inifinite loop 
 100% reproducibly whilst with the patch it works fine as expected.
 
 Signed-off-by: Anton Altaparmakov ai...@cantab.net
 Cc: sta...@vger.kernel.org # 3.6 3.8 3.10 3.12 3.14 3.16
 
 Acked-by: Hugh Dickins hu...@google.com
 
 Yes indeed, that's bad, and entirely my fault (though it took me a while
 to see how the block = index  sizebits which was there before was okay,
 
 Ouch.  I failed to see that (I admit I did not pay too much attention to the 
 original code - I just used git blame to find out which commit put that 
 code in place - my bad).  It was never ok!  I went back to 2.6.12-rc2 
 (original commit to git Linus' kernel repository) and that is also affected. 
  That implies this is the first time anyone has used a 4TiB disk with 512 
 byte sectors with NTFS where Windows had previously created a file/directory 
 with an attribute list attribute in it.  -  That is the only metadata type 
 we use sb_bread() for and the disk image from the customer does contain it 
 and I verified that is where the inifinite loop comes from.
 
 Ow, that line needs some truncating itself :)
 
 You generously interpret my words as seeing that for myself.
 No, thank you for following up: I had persuaded myself when writing
 before, that index would be promoted from pgoff_t to sector_t before
 shifting in the original assignment, but not when I passed it as arg.
 
 I've resorted to writing a proggy to check, and it looks like I was
 earlier confused, and you are right, and that code was always wrong.
 
 Not much use of 4TiB disks on 32-bit kernels I suppose.

Actually on embedded devices like broadband routers/home network NAS boxes/PVR 
boxes, etc it is very common to have 32-bit (on ARM/MIPS or more recently also 
x86) and more and more people are plugging in larger disks to store their 
music/videos/movies, etc...  So this is something that is on the up and going 
to become an increasing problem as large disks get cheaper and cheaper.

 So it appears sb_bread() and friends have always been broken on 32-bit 
 architectures when accessing blocks outside the range 2^32 - 1 and it 
 appears google finds lots of occurrences of such infinite loops being 
 reported but the fixes have always been to not make the calls in the first 
 place.  No-one seems to have recognized that there is a genuine problem here 
 before.
 
 Still my patch stands correct and should be applied to all kernel versions 
 that have your patch and older kernels should have an equivalent patch 
 applied to fix them, too for people who run very old kernels.
 
 Yes, though I'm now uncertain whether your patch is a bugfix or an
 enhancement.  Definitely a bugfix, given the infinite loops.  But the
 oldest code (index = block  sizebits; block = index  sizebits;)
 is so clearly trying to truncate block, that I wonder what departing
 from that might be letting us in for.

Having looked at it, I think I understand the original code and intention - it 
is not truncating the top bits, it is truncating the bottom bits, i.e. it is 
rounding down so that block becomes the first block in the page as that is 
what is needed to be passed into grow_dev_page().  And the shift right followed 
by shift left makes those bottom bits fall off.

A perfectly valid 

Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-22 Thread Linus Torvalds
On Sun, Sep 21, 2014 at 5:53 PM, Anton Altaparmakov ai...@cam.ac.uk wrote:

 This patch fixes this issue by type casting index to sector_t before
 doing the left shift.

Ugh. Does the simpler patch to just pass in block work as well?

Linus
 fs/buffer.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 8f05111bbb8b..f32d6a3cff38 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1021,8 +1021,7 @@ grow_dev_page(struct block_device *bdev, sector_t block,
if (page_has_buffers(page)) {
bh = page_buffers(page);
if (bh-b_size == size) {
-   end_block = init_page_buffers(page, bdev,
-   index  sizebits, size);
+   end_block = init_page_buffers(page, bdev, block, size);
goto done;
}
if (!try_to_free_buffers(page))
@@ -1043,7 +1042,7 @@ grow_dev_page(struct block_device *bdev, sector_t block,
 */
spin_lock(inode-i_mapping-private_lock);
link_dev_buffers(page, bh);
-   end_block = init_page_buffers(page, bdev, index  sizebits, size);
+   end_block = init_page_buffers(page, bdev, block, size);
spin_unlock(inode-i_mapping-private_lock);
 done:
ret = (block  end_block) ? 1 : -ENXIO;


Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-22 Thread Linus Torvalds
On Mon, Sep 22, 2014 at 8:18 AM, Linus Torvalds
torva...@linux-foundation.org wrote:

 Ugh. Does the simpler patch to just pass in block work as well?

Ugh, never mind, no it does not. It doesn't truncate the sectors to
the beginning of a page boundary.

Will take the patch as-is (will edit the commit message that seems a
bit inaccurate).

 Linus
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-22 Thread Anton Altaparmakov
Hi Linus,

On 22 Sep 2014, at 16:18, Linus Torvalds torva...@linux-foundation.org wrote:
 On Sun, Sep 21, 2014 at 5:53 PM, Anton Altaparmakov ai...@cam.ac.uk wrote:
 
 This patch fixes this issue by type casting index to sector_t before
 doing the left shift.
 
 Ugh. Does the simpler patch to just pass in block work as well?

That doesn't work because nothing rounds down block to the first block in the 
page and init_page_buffers() requires block to be the first block in the page.

The shift right followed by shift left achieves the rounding down required.

You could do block  ~(sector_t)(size - 1) instead of (sector_t)index  
sizebits if you prefer but not sure that is an improvement!

Best regards,

Anton

Linus
 patch.diff


-- 
Anton Altaparmakov aia21 at cam.ac.uk (replace at with @)
University of Cambridge Information Services, Roger Needham Building
7 JJ Thomson Avenue, Cambridge, CB3 0RB, UK

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-22 Thread Linus Torvalds
On Mon, Sep 22, 2014 at 8:29 AM, Anton Altaparmakov ai...@cam.ac.uk wrote:

 You could do block  ~(sector_t)(size - 1) instead of (sector_t)index  
 sizebits if you prefer but not sure that is an improvement!

No, it would be even worse. Something like

  block  ~(sector_t)((size  9) - 1)

because block is the sector number (ie 512-byte) and size is in bytes.

   Linus
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-22 Thread Anton Altaparmakov
Hi Linus,

On 22 Sep 2014, at 16:33, Linus Torvalds torva...@linux-foundation.org wrote:
 On Mon, Sep 22, 2014 at 8:29 AM, Anton Altaparmakov ai...@cam.ac.uk wrote:
 
 You could do block  ~(sector_t)(size - 1) instead of (sector_t)index  
 sizebits if you prefer but not sure that is an improvement!
 
 No, it would be even worse. Something like
 
  block  ~(sector_t)((size  9) - 1)
 
 because block is the sector number (ie 512-byte) and size is in bytes.

Oops, sorry.  But I think you got it wrong, too as you are ignoring the 
PAGE_SIZE - as was I but it is what we need to align to in addition to the 
problem of size being in bytes.  So I think the correct mask is actually 
based on sizebits which reflects the number of blocks per page thus:

block  ~(sector_t)((1  sizebits) - 1)

In any case the shift is the lesser evil I think as it is at least obviously 
correct whilst getting the right mask has taken us a few iterations of 
correcting each other! (-:

PS. Thank you for taking my patch and correcting the misleading description!

Best regards,

Anton

   Linus

-- 
Anton Altaparmakov aia21 at cam.ac.uk (replace at with @)
University of Cambridge Information Services, Roger Needham Building
7 JJ Thomson Avenue, Cambridge, CB3 0RB, UK

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-21 Thread Hugh Dickins
On Mon, 22 Sep 2014, Anton Altaparmakov wrote:

> Any code that uses __getblk() and thus bread(), breadahead(), sb_bread(), 
> sb_breadahead(), sb_getblk(), and calls it using a 64-bit block on a 
> 32-bit arch (where "long" is 32-bit) causes an inifinite loop in 
> __getblk_slow() with an infinite stream of errors logged to dmesg like 
> this:
> 
> __find_get_block_slow() failed. block=6740375944, b_blocknr=2445408648
> b_state=0x0020, b_size=512
> device sda1 blocksize: 512
> 
> Note how in hex block is 0x191C1F988 and b_blocknr is 0x91C1F988 i.e. the 
> top 32-bits are missing (in this case the 0x1 at the top).
> 
> This is because grow_dev_page() was broken in commit 676ce6d5ca30: "block: 
> replace __getblk_slow misfix by grow_dev_page fix" by Hugh Dickins so that 
> it now has a 32-bit overflow due to shifting the block value to the right 
> so it fits in 32-bits and storing the result in pgoff_t variable which is 
> 32-bit and then passing the pgoff_t variable left-shifted as the block 
> number which causes the top bits to get lost as the pgoff_t is not type 
> cast to sector_t / 64-bit before the shift.
> 
> This patch fixes this issue by type casting "index" to sector_t before 
> doing the left shift.
> 
> Note this is not a theoretical bug but has been seen in the field on a 
> 4TiB hard drive with logical sector size 512 bytes.
> 
> This patch has been verified to fix the infinite loop problem on 3.17-rc5 
> kernel using a 4TB disk image mounted using "-o loop".  Without this patch 
> doing a "find /nt" where /nt is an NTFS volume causes the inifinite loop 
> 100% reproducibly whilst with the patch it works fine as expected.
> 
> Signed-off-by: Anton Altaparmakov 
> Cc: sta...@vger.kernel.org # 3.6 3.8 3.10 3.12 3.14 3.16

Acked-by: Hugh Dickins 

Yes indeed, that's bad, and entirely my fault (though it took me a while
to see how the "block = index << sizebits" which was there before was okay,
but my passing "index << sizebits" as arg to function very much not okay).
Thank you, Anton.

But I see my commit was marked for stable 3.0 3.2 3.4 3.5: so your fix
is needed in 3.2 and 3.4 longterm also (the others now beyond life).

Hugh

> ---
> 
> Linus, can you please apply this?  Alternatively, Andrew, can you please 
> pick this up and send it to Linus?
> 
> It would be good it it can be applied for 3.17 as well as to all stable 
> kernels >= 3.6 as they are all broken!
> 
> Thanks a lot in advance!
> 
> Best regards,
> 
>   Anton
> -- 
> Anton Altaparmakov  (replace at with @)
> Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
> Linux NTFS maintainer, http://www.linux-ntfs.org/
> 
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 8f05111..3588a80 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -1022,7 +1022,8 @@ grow_dev_page(struct block_device *bdev, sector_t block,
>   bh = page_buffers(page);
>   if (bh->b_size == size) {
>   end_block = init_page_buffers(page, bdev,
> - index << sizebits, size);
> + (sector_t)index << sizebits,
> + size);
>   goto done;
>   }
>   if (!try_to_free_buffers(page))
> @@ -1043,7 +1044,8 @@ grow_dev_page(struct block_device *bdev, sector_t block,
>*/
>   spin_lock(>i_mapping->private_lock);
>   link_dev_buffers(page, bh);
> - end_block = init_page_buffers(page, bdev, index << sizebits, size);
> + end_block = init_page_buffers(page, bdev, (sector_t)index << sizebits,
> + size);
>   spin_unlock(>i_mapping->private_lock);
>  done:
>   ret = (block < end_block) ? 1 : -ENXIO;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-21 Thread Anton Altaparmakov
Any code that uses __getblk() and thus bread(), breadahead(), sb_bread(), 
sb_breadahead(), sb_getblk(), and calls it using a 64-bit block on a 
32-bit arch (where "long" is 32-bit) causes an inifinite loop in 
__getblk_slow() with an infinite stream of errors logged to dmesg like 
this:

__find_get_block_slow() failed. block=6740375944, b_blocknr=2445408648
b_state=0x0020, b_size=512
device sda1 blocksize: 512

Note how in hex block is 0x191C1F988 and b_blocknr is 0x91C1F988 i.e. the 
top 32-bits are missing (in this case the 0x1 at the top).

This is because grow_dev_page() was broken in commit 676ce6d5ca30: "block: 
replace __getblk_slow misfix by grow_dev_page fix" by Hugh Dickins so that 
it now has a 32-bit overflow due to shifting the block value to the right 
so it fits in 32-bits and storing the result in pgoff_t variable which is 
32-bit and then passing the pgoff_t variable left-shifted as the block 
number which causes the top bits to get lost as the pgoff_t is not type 
cast to sector_t / 64-bit before the shift.

This patch fixes this issue by type casting "index" to sector_t before 
doing the left shift.

Note this is not a theoretical bug but has been seen in the field on a 
4TiB hard drive with logical sector size 512 bytes.

This patch has been verified to fix the infinite loop problem on 3.17-rc5 
kernel using a 4TB disk image mounted using "-o loop".  Without this patch 
doing a "find /nt" where /nt is an NTFS volume causes the inifinite loop 
100% reproducibly whilst with the patch it works fine as expected.

Signed-off-by: Anton Altaparmakov 
Cc: sta...@vger.kernel.org # 3.6 3.8 3.10 3.12 3.14 3.16
---

Linus, can you please apply this?  Alternatively, Andrew, can you please 
pick this up and send it to Linus?

It would be good it it can be applied for 3.17 as well as to all stable 
kernels >= 3.6 as they are all broken!

Thanks a lot in advance!

Best regards,

Anton
-- 
Anton Altaparmakov  (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/

diff --git a/fs/buffer.c b/fs/buffer.c
index 8f05111..3588a80 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1022,7 +1022,8 @@ grow_dev_page(struct block_device *bdev, sector_t block,
bh = page_buffers(page);
if (bh->b_size == size) {
end_block = init_page_buffers(page, bdev,
-   index << sizebits, size);
+   (sector_t)index << sizebits,
+   size);
goto done;
}
if (!try_to_free_buffers(page))
@@ -1043,7 +1044,8 @@ grow_dev_page(struct block_device *bdev, sector_t block,
 */
spin_lock(>i_mapping->private_lock);
link_dev_buffers(page, bh);
-   end_block = init_page_buffers(page, bdev, index << sizebits, size);
+   end_block = init_page_buffers(page, bdev, (sector_t)index << sizebits,
+   size);
spin_unlock(>i_mapping->private_lock);
 done:
ret = (block < end_block) ? 1 : -ENXIO;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-21 Thread Anton Altaparmakov
Any code that uses __getblk() and thus bread(), breadahead(), sb_bread(), 
sb_breadahead(), sb_getblk(), and calls it using a 64-bit block on a 
32-bit arch (where long is 32-bit) causes an inifinite loop in 
__getblk_slow() with an infinite stream of errors logged to dmesg like 
this:

__find_get_block_slow() failed. block=6740375944, b_blocknr=2445408648
b_state=0x0020, b_size=512
device sda1 blocksize: 512

Note how in hex block is 0x191C1F988 and b_blocknr is 0x91C1F988 i.e. the 
top 32-bits are missing (in this case the 0x1 at the top).

This is because grow_dev_page() was broken in commit 676ce6d5ca30: block: 
replace __getblk_slow misfix by grow_dev_page fix by Hugh Dickins so that 
it now has a 32-bit overflow due to shifting the block value to the right 
so it fits in 32-bits and storing the result in pgoff_t variable which is 
32-bit and then passing the pgoff_t variable left-shifted as the block 
number which causes the top bits to get lost as the pgoff_t is not type 
cast to sector_t / 64-bit before the shift.

This patch fixes this issue by type casting index to sector_t before 
doing the left shift.

Note this is not a theoretical bug but has been seen in the field on a 
4TiB hard drive with logical sector size 512 bytes.

This patch has been verified to fix the infinite loop problem on 3.17-rc5 
kernel using a 4TB disk image mounted using -o loop.  Without this patch 
doing a find /nt where /nt is an NTFS volume causes the inifinite loop 
100% reproducibly whilst with the patch it works fine as expected.

Signed-off-by: Anton Altaparmakov ai...@cantab.net
Cc: sta...@vger.kernel.org # 3.6 3.8 3.10 3.12 3.14 3.16
---

Linus, can you please apply this?  Alternatively, Andrew, can you please 
pick this up and send it to Linus?

It would be good it it can be applied for 3.17 as well as to all stable 
kernels = 3.6 as they are all broken!

Thanks a lot in advance!

Best regards,

Anton
-- 
Anton Altaparmakov aia21 at cam.ac.uk (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/

diff --git a/fs/buffer.c b/fs/buffer.c
index 8f05111..3588a80 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1022,7 +1022,8 @@ grow_dev_page(struct block_device *bdev, sector_t block,
bh = page_buffers(page);
if (bh-b_size == size) {
end_block = init_page_buffers(page, bdev,
-   index  sizebits, size);
+   (sector_t)index  sizebits,
+   size);
goto done;
}
if (!try_to_free_buffers(page))
@@ -1043,7 +1044,8 @@ grow_dev_page(struct block_device *bdev, sector_t block,
 */
spin_lock(inode-i_mapping-private_lock);
link_dev_buffers(page, bh);
-   end_block = init_page_buffers(page, bdev, index  sizebits, size);
+   end_block = init_page_buffers(page, bdev, (sector_t)index  sizebits,
+   size);
spin_unlock(inode-i_mapping-private_lock);
 done:
ret = (block  end_block) ? 1 : -ENXIO;
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Fix nasty 32-bit overflow bug in buffer i/o code.

2014-09-21 Thread Hugh Dickins
On Mon, 22 Sep 2014, Anton Altaparmakov wrote:

 Any code that uses __getblk() and thus bread(), breadahead(), sb_bread(), 
 sb_breadahead(), sb_getblk(), and calls it using a 64-bit block on a 
 32-bit arch (where long is 32-bit) causes an inifinite loop in 
 __getblk_slow() with an infinite stream of errors logged to dmesg like 
 this:
 
 __find_get_block_slow() failed. block=6740375944, b_blocknr=2445408648
 b_state=0x0020, b_size=512
 device sda1 blocksize: 512
 
 Note how in hex block is 0x191C1F988 and b_blocknr is 0x91C1F988 i.e. the 
 top 32-bits are missing (in this case the 0x1 at the top).
 
 This is because grow_dev_page() was broken in commit 676ce6d5ca30: block: 
 replace __getblk_slow misfix by grow_dev_page fix by Hugh Dickins so that 
 it now has a 32-bit overflow due to shifting the block value to the right 
 so it fits in 32-bits and storing the result in pgoff_t variable which is 
 32-bit and then passing the pgoff_t variable left-shifted as the block 
 number which causes the top bits to get lost as the pgoff_t is not type 
 cast to sector_t / 64-bit before the shift.
 
 This patch fixes this issue by type casting index to sector_t before 
 doing the left shift.
 
 Note this is not a theoretical bug but has been seen in the field on a 
 4TiB hard drive with logical sector size 512 bytes.
 
 This patch has been verified to fix the infinite loop problem on 3.17-rc5 
 kernel using a 4TB disk image mounted using -o loop.  Without this patch 
 doing a find /nt where /nt is an NTFS volume causes the inifinite loop 
 100% reproducibly whilst with the patch it works fine as expected.
 
 Signed-off-by: Anton Altaparmakov ai...@cantab.net
 Cc: sta...@vger.kernel.org # 3.6 3.8 3.10 3.12 3.14 3.16

Acked-by: Hugh Dickins hu...@google.com

Yes indeed, that's bad, and entirely my fault (though it took me a while
to see how the block = index  sizebits which was there before was okay,
but my passing index  sizebits as arg to function very much not okay).
Thank you, Anton.

But I see my commit was marked for stable 3.0 3.2 3.4 3.5: so your fix
is needed in 3.2 and 3.4 longterm also (the others now beyond life).

Hugh

 ---
 
 Linus, can you please apply this?  Alternatively, Andrew, can you please 
 pick this up and send it to Linus?
 
 It would be good it it can be applied for 3.17 as well as to all stable 
 kernels = 3.6 as they are all broken!
 
 Thanks a lot in advance!
 
 Best regards,
 
   Anton
 -- 
 Anton Altaparmakov aia21 at cam.ac.uk (replace at with @)
 Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
 Linux NTFS maintainer, http://www.linux-ntfs.org/
 
 diff --git a/fs/buffer.c b/fs/buffer.c
 index 8f05111..3588a80 100644
 --- a/fs/buffer.c
 +++ b/fs/buffer.c
 @@ -1022,7 +1022,8 @@ grow_dev_page(struct block_device *bdev, sector_t block,
   bh = page_buffers(page);
   if (bh-b_size == size) {
   end_block = init_page_buffers(page, bdev,
 - index  sizebits, size);
 + (sector_t)index  sizebits,
 + size);
   goto done;
   }
   if (!try_to_free_buffers(page))
 @@ -1043,7 +1044,8 @@ grow_dev_page(struct block_device *bdev, sector_t block,
*/
   spin_lock(inode-i_mapping-private_lock);
   link_dev_buffers(page, bh);
 - end_block = init_page_buffers(page, bdev, index  sizebits, size);
 + end_block = init_page_buffers(page, bdev, (sector_t)index  sizebits,
 + size);
   spin_unlock(inode-i_mapping-private_lock);
  done:
   ret = (block  end_block) ? 1 : -ENXIO;
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/