Re: BUG: failure at fs/btrfs/ctree.h:337/btrfs_chunk_item_size()!

2015-12-04 Thread Qu Wenruo



On 12/04/2015 09:12 PM, David Sterba wrote:

On Fri, Dec 04, 2015 at 09:21:59AM +0800, Qu Wenruo wrote:

We do have the alignment check in kernel, but it's in the early phase
where we don't know if nodesize is reliable and print only a warning.


This can be enhanced by the following method:


At minimum, we can promote the 4k alignment checks in
btrfs_check_super_valid from a warning to an error. The blocks must be
4k aligned, regardless of sectorsize or nodesize.


1) Check sectorsize first
 Only several sector size is valid for current btrfs:
 4K, 8K, 16K, 32K, 64K
 Just five numbers, quite easy to check.


The sectorsize must be PAGE_SIZE at the moment. This will change with
Chandan's patchset though.


PAGE_SIZE would be good enough.




 Or if anyone is going to extend supported sectorsize, we can change
 the check to if the number is power of 2 starting from 4K.

2) Check nodesize/leafsize then
 It should be aligned to sectorsize.


This particular check is missing but is implicit because of the
sectorsize == PAGE_SIZE restriction.


But still need to check nodesize/leafsize validation against sectorsize.
Current btrfs is already using large nodesize by default.

For example, 20K nodesize can pass 4K page size check but still wrong.

(And I'm also wrong in previous mail, it's not only aligned to 
sectorisze, but also need to be power of 2)





 And nodesize must match with leafsize.
 Currently, it's done out of check_super_valid(), we can integrate it.


Yeah it's done, then I don't see why we should add it agian.


Just want to move it to check_super_valid(), as it's better to put 
validation check codes together, and that's why we have check_super_valid().





3) Check all super root bytenr against *sectorsize*
 Yeah, not nodesize.
 As some old bad convert will cause metadata extent unaligned to
 nodesize(just before my convert rework patch), but only aligned to
 sectorsize.
 So only check alignment of sectorsize.


While the real check should be against the sectorsize, at the moment I
think it's covered by the 4k checks anyway. I understand why we can't
use the nodesize.



4K is good enough for x86 family but can't find all problem for 64K page 
size like PPC64 or AArch64.


So it's still better to change the check at least to page size even we 
don't have subpage size support yet.



So, if we do the warning -> error, we're fine for now. Some of the
checks you suggest would be good to merge when the subpage blocksize
patchset is merged.


Right, more accurate check is only needed after subpage patchset.

Thanks,
Qu


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: BUG: failure at fs/btrfs/ctree.h:337/btrfs_chunk_item_size()!

2015-12-04 Thread David Sterba
On Fri, Dec 04, 2015 at 09:21:59AM +0800, Qu Wenruo wrote:
> > We do have the alignment check in kernel, but it's in the early phase
> > where we don't know if nodesize is reliable and print only a warning.
> >
> This can be enhanced by the following method:

At minimum, we can promote the 4k alignment checks in
btrfs_check_super_valid from a warning to an error. The blocks must be
4k aligned, regardless of sectorsize or nodesize.

> 1) Check sectorsize first
> Only several sector size is valid for current btrfs:
> 4K, 8K, 16K, 32K, 64K
> Just five numbers, quite easy to check.

The sectorsize must be PAGE_SIZE at the moment. This will change with
Chandan's patchset though.

> Or if anyone is going to extend supported sectorsize, we can change
> the check to if the number is power of 2 starting from 4K.
> 
> 2) Check nodesize/leafsize then
> It should be aligned to sectorsize.

This particular check is missing but is implicit because of the
sectorsize == PAGE_SIZE restriction.

> And nodesize must match with leafsize.
> Currently, it's done out of check_super_valid(), we can integrate it.

Yeah it's done, then I don't see why we should add it agian.

> 3) Check all super root bytenr against *sectorsize*
> Yeah, not nodesize.
> As some old bad convert will cause metadata extent unaligned to
> nodesize(just before my convert rework patch), but only aligned to
> sectorsize.
> So only check alignment of sectorsize.

While the real check should be against the sectorsize, at the moment I
think it's covered by the 4k checks anyway. I understand why we can't
use the nodesize.

So, if we do the warning -> error, we're fine for now. Some of the
checks you suggest would be good to merge when the subpage blocksize
patchset is merged.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: BUG: failure at fs/btrfs/ctree.h:337/btrfs_chunk_item_size()!

2015-12-03 Thread Qu Wenruo



David Sterba wrote on 2015/12/03 18:47 +0100:

On Tue, Dec 01, 2015 at 08:50:57AM +0800, Qu Wenruo wrote:

Btrfsck already gives quite good clue on the problem:

ERROR: chunk_root block unaligned: 4294967168
ERROR: chunk_root block unaligned: 4294967168

Chunk root in superblock is not aligned, and the superblock is not valid.

It would be better to copy all these btrfs-progs enhancement to kernel.


We do have the alignment check in kernel, but it's in the early phase
where we don't know if nodesize is reliable and print only a warning.



This can be enhanced by the following method:

1) Check sectorsize first
   Only several sector size is valid for current btrfs:
   4K, 8K, 16K, 32K, 64K
   Just five numbers, quite easy to check.
   Or if anyone is going to extend supported sectorsize, we can change
   the check to if the number is power of 2 starting from 4K.

2) Check nodesize/leafsize then
   It should be aligned to sectorsize.
   And nodesize must match with leafsize.
   Currently, it's done out of check_super_valid(), we can integrate it.

3) Check all super root bytenr against *sectorsize*
   Yeah, not nodesize.
   As some old bad convert will cause metadata extent unaligned to
   nodesize(just before my convert rework patch), but only aligned to
   sectorsize.
   So only check alignment of sectorsize.

If you are OK with this idea, I can make the patch soon.

Thanks,
Qu




--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: BUG: failure at fs/btrfs/ctree.h:337/btrfs_chunk_item_size()!

2015-12-03 Thread David Sterba
On Mon, Nov 30, 2015 at 11:05:20PM +0100, Vegard Nossum wrote:
> With your patch and a new image, I run into a second issue (which is 
> probably unrelated):

Yes, it's unrelated. As pointed out by Qu, the alignment is wrong and
more strict checks should catch it.

> Should I start a new thread? I've attached the new image. Thanks,

Yes please, the bugs from fuzzers are usually hitting random bugs.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: BUG: failure at fs/btrfs/ctree.h:337/btrfs_chunk_item_size()!

2015-12-03 Thread David Sterba
On Tue, Dec 01, 2015 at 08:50:57AM +0800, Qu Wenruo wrote:
> Btrfsck already gives quite good clue on the problem:
> 
> ERROR: chunk_root block unaligned: 4294967168
> ERROR: chunk_root block unaligned: 4294967168
> 
> Chunk root in superblock is not aligned, and the superblock is not valid.
> 
> It would be better to copy all these btrfs-progs enhancement to kernel.

We do have the alignment check in kernel, but it's in the early phase
where we don't know if nodesize is reliable and print only a warning.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: BUG: failure at fs/btrfs/ctree.h:337/btrfs_chunk_item_size()!

2015-11-30 Thread Qu Wenruo



Vegard Nossum wrote on 2015/11/30 23:05 +0100:

On 11/30/2015 05:34 PM, David Sterba wrote:

On Mon, Nov 30, 2015 at 02:48:51PM +0100, David Sterba wrote:

On Sun, Nov 15, 2015 at 07:21:17PM +0100, Vegard Nossum wrote:

With the attached btrfs image, I get the following splat when mounting:

"""
# mount -o loop -t btrfs ./btrfs.0 /mnt/0/
BTRFS: device fsid 9006933e-2a9a-44f0-917f-514252aeec2c devid 1 transid
7 /dev/loop0
BTRFS info (device loop0): disk space caching is enabled
BUG: failure at fs/btrfs/ctree.h:337/btrfs_chunk_item_size()!


The fix, worked for me on the provided image:
https://patchwork.kernel.org/patch/7728051/

I'll add the attached image to btrfs-progs testsuite as it triggers
crashes in other tools.



Thanks, that seems to fix the problem.

With your patch and a new image, I run into a second issue (which is
probably unrelated):

BTRFS critical (device loop0): unable to find logical 4294963200 len
40966cc7e != de8
BUG: failure at fs/btrfs/inode.c:1805/btrfs_merge_bio_hook()!
Kernel panic - not syncing: BUG! errors
CPU: 0 PID: 913 Comm: mount Not tainted 4.2.5 #1cc7e devid 1 transid 7
/dev/loop0
Stack: dev_item UUID does not match fsid:
de80ced1-18ac-490c-9afb-cf0a7d66cc7e != de8
   e0147430 60075412 600bd457 603f5080
   606dfc7a 605e2b14 e0147440 605e595fors
   e0147560 605e291e e00cb2e8 1000
  Call Trace:
   [<60029f3b>] show_stack+0xdb/0x1a0
   [<605e595f>] dump_stack+0x2a/0x2c
   [<605e291e>] panic+0x137/0x2ac
   [<602b8d0c>] btrfs_merge_bio_hook+0xfc/0x100
   [<602e3923>] submit_extent_page+0x223/0x330
   [<602e51a6>] __do_readpage+0x476/0xb60
   [<602e59bf>] __extent_read_full_page+0x12f/0x140
   [<602e8264>] read_extent_buffer_pages+0x1e4/0x410
   [<602a9d46>] btree_read_extent_buffer_pages.constprop.24+0x106/0x1a0
   [<602aa67e>] read_tree_block+0x5e/0xa0
   [<602b0064>] open_ctree+0x1814/0x2db0
   [<60273a3b>] btrfs_mount+0xf3b/0x1010
   [<601054d3>] mount_fs+0x33/0x210
   [<60124b94>] vfs_kern_mount+0x74/0x170
   [<60272df3>] btrfs_mount+0x2f3/0x1010
   [<601054d3>] mount_fs+0x33/0x210
   [<60124b94>] vfs_kern_mount+0x74/0x170
   [<601264dc>] do_mount+0x26c/0xf30
   [<6012768b>] SyS_mount+0xab/0x120

Should I start a new thread? I've attached the new image. Thanks,


Vegard



Glad to see btrfsck is much robust than kernel now. :)

Btrfsck already gives quite good clue on the problem:

ERROR: chunk_root block unaligned: 4294967168
ERROR: chunk_root block unaligned: 4294967168

Chunk root in superblock is not aligned, and the superblock is not valid.

It would be better to copy all these btrfs-progs enhancement to kernel.

Thanks,
Qu


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: BUG: failure at fs/btrfs/ctree.h:337/btrfs_chunk_item_size()!

2015-11-30 Thread Vegard Nossum

On 11/30/2015 05:34 PM, David Sterba wrote:

On Mon, Nov 30, 2015 at 02:48:51PM +0100, David Sterba wrote:

On Sun, Nov 15, 2015 at 07:21:17PM +0100, Vegard Nossum wrote:

With the attached btrfs image, I get the following splat when mounting:

"""
# mount -o loop -t btrfs ./btrfs.0 /mnt/0/
BTRFS: device fsid 9006933e-2a9a-44f0-917f-514252aeec2c devid 1 transid
7 /dev/loop0
BTRFS info (device loop0): disk space caching is enabled
BUG: failure at fs/btrfs/ctree.h:337/btrfs_chunk_item_size()!


The fix, worked for me on the provided image:
https://patchwork.kernel.org/patch/7728051/

I'll add the attached image to btrfs-progs testsuite as it triggers
crashes in other tools.



Thanks, that seems to fix the problem.

With your patch and a new image, I run into a second issue (which is 
probably unrelated):


BTRFS critical (device loop0): unable to find logical 4294963200 len 
40966cc7e != de8

BUG: failure at fs/btrfs/inode.c:1805/btrfs_merge_bio_hook()!
Kernel panic - not syncing: BUG! errors
CPU: 0 PID: 913 Comm: mount Not tainted 4.2.5 #1cc7e devid 1 transid 7 
/dev/loop0
Stack: dev_item UUID does not match fsid: 
de80ced1-18ac-490c-9afb-cf0a7d66cc7e != de8

  e0147430 60075412 600bd457 603f5080
  606dfc7a 605e2b14 e0147440 605e595fors
  e0147560 605e291e e00cb2e8 1000
 Call Trace:
  [<60029f3b>] show_stack+0xdb/0x1a0
  [<605e595f>] dump_stack+0x2a/0x2c
  [<605e291e>] panic+0x137/0x2ac
  [<602b8d0c>] btrfs_merge_bio_hook+0xfc/0x100
  [<602e3923>] submit_extent_page+0x223/0x330
  [<602e51a6>] __do_readpage+0x476/0xb60
  [<602e59bf>] __extent_read_full_page+0x12f/0x140
  [<602e8264>] read_extent_buffer_pages+0x1e4/0x410
  [<602a9d46>] btree_read_extent_buffer_pages.constprop.24+0x106/0x1a0
  [<602aa67e>] read_tree_block+0x5e/0xa0
  [<602b0064>] open_ctree+0x1814/0x2db0
  [<60273a3b>] btrfs_mount+0xf3b/0x1010
  [<601054d3>] mount_fs+0x33/0x210
  [<60124b94>] vfs_kern_mount+0x74/0x170
  [<60272df3>] btrfs_mount+0x2f3/0x1010
  [<601054d3>] mount_fs+0x33/0x210
  [<60124b94>] vfs_kern_mount+0x74/0x170
  [<601264dc>] do_mount+0x26c/0xf30
  [<6012768b>] SyS_mount+0xab/0x120

Should I start a new thread? I've attached the new image. Thanks,


Vegard


btrfs.1.bz2
Description: application/bzip


Re: BUG: failure at fs/btrfs/ctree.h:337/btrfs_chunk_item_size()!

2015-11-30 Thread David Sterba
On Mon, Nov 30, 2015 at 02:48:51PM +0100, David Sterba wrote:
> On Sun, Nov 15, 2015 at 07:21:17PM +0100, Vegard Nossum wrote:
> > With the attached btrfs image, I get the following splat when mounting:
> > 
> > """
> > # mount -o loop -t btrfs ./btrfs.0 /mnt/0/
> > BTRFS: device fsid 9006933e-2a9a-44f0-917f-514252aeec2c devid 1 transid 
> > 7 /dev/loop0
> > BTRFS info (device loop0): disk space caching is enabled
> > BUG: failure at fs/btrfs/ctree.h:337/btrfs_chunk_item_size()!

The fix, worked for me on the provided image:
https://patchwork.kernel.org/patch/7728051/

I'll add the attached image to btrfs-progs testsuite as it triggers
crashes in other tools.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: BUG: failure at fs/btrfs/ctree.h:337/btrfs_chunk_item_size()!

2015-11-30 Thread David Sterba
On Sun, Nov 15, 2015 at 07:21:17PM +0100, Vegard Nossum wrote:
> With the attached btrfs image, I get the following splat when mounting:
> 
> """
> # mount -o loop -t btrfs ./btrfs.0 /mnt/0/
> BTRFS: device fsid 9006933e-2a9a-44f0-917f-514252aeec2c devid 1 transid 
> 7 /dev/loop0
> BTRFS info (device loop0): disk space caching is enabled
> BUG: failure at fs/btrfs/ctree.h:337/btrfs_chunk_item_size()!

Looks like a fuzzed image, it blows when num_stripes is 0

 335 static inline unsigned long btrfs_chunk_item_size(int num_stripes)
 336 {
 337 BUG_ON(num_stripes == 0);
 338 return sizeof(struct btrfs_chunk) +
 339 sizeof(struct btrfs_stripe) * (num_stripes - 1);
 340 }

called from btrfs_read_sys_array. The check for num_stripes == 0 seems to be
missing from kernel code, 'btrfs check' detects that and does not crash.

The userspace code has some more validations so I'll port them to kernel
and this should fix the mount crash.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html