Re: [Bugme-new] [Bug 29302] New: Null pointer dereference with large max_sectors_kb
On 02/21/2011 04:08 PM, Chris Mason wrote: Excerpts from Tomas Zvala's message of 2011-02-20 06:06:42 -0500: On 18.2.2011 21:18, Chris Mason wrote: Ok, so it isn't part of the open devices code that prints errors, my guess is we're failing to read a good super. Could you please mkfs.btrfs /dev/xxx, sync, then btrfsck /dev/xxx, I want to make sure things are really getting written. Here's a patch that makes sure we have a good bdev after scanning, hopefully it will let us debug things without your box going boom. -chris Hello Chris, I patched the kernel, recompiled and installed, but I don't see the message in the kernel log. I forgot to mention before, that max_hw_sectors_kb and thus max_sectors_kb is 4096. ok, more below. Linux stg-top 2.6.37stg #7 SMP Sun Feb 20 10:48:57 CET 2011 x86_64 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz GenuineIntel GNU/Linux With default max_sectors_kb = 512 mkfs, sync, btrfsck going through fine first mount is fine [112005]stg-top ~ # cat /sys/block/sdb/queue/max_sectors_kb 512 [112012]stg-top ~ # mkfs.btrfs /dev/sdb WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL WARNING! - see http://btrfs.wiki.kernel.org before using fs created label (null) on /dev/sdb nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB Btrfs Btrfs v0.19 [112045]stg-top ~ # sync [112046]stg-top ~ # btrfsck /dev/sdb found 28672 bytes used err is 0 total csum bytes: 0 total tree bytes: 28672 total fs tree bytes: 8192 btree space waste bytes: 23875 file data blocks allocated: 0 referenced 0 Btrfs Btrfs v0.19 [112050]stg-top ~ # mount /dev/sdb /mnt/ [112054]stg-top ~ # df /mnt/ Filesystem 1K-blocks Used Available Use% Mounted on /dev/sdb 292968576056 2927559936 1% /mnt [112057]stg-top ~ # umount /mnt With max_sectors_kb = 4096 mkfs, sync, btrfsck is fine second mount BUGs out [112100]stg-top ~ # echo 4096 /sys/block/sdb/queue/max_sectors_kb [112107]stg-top ~ # mkfs.btrfs /dev/sdb WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL WARNING! - see http://btrfs.wiki.kernel.org before using fs created label (null) on /dev/sdb nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB Btrfs Btrfs v0.19 [112113]stg-top ~ # sync [112117]stg-top ~ # btrfsck /dev/sdb found 28672 bytes used err is 0 total csum bytes: 0 total tree bytes: 28672 total fs tree bytes: 8192 btree space waste bytes: 23875 file data blocks allocated: 0 referenced 0 Btrfs Btrfs v0.19 [112123]stg-top ~ # mount /dev/sdb /mnt/ Killed At this point, your box is broken. You have to reboot before anything new is valid (sorry). Could you please: Set max_sectors_kb to 4096 mkfs.btrfs /dev/xxx sync echo 3 /proc/sys/vm/drop_caches btrfsck /dev/xxx Don't try and mount, clearly my patch isn't saving us. -chris Hello Chris, I'm sorry it took so long, I've been a bit busy lately ... [132926]stg-bot linux-2.6.37-scst # cat /sys/block/sdb/queue/max_hw_sectors_kb /sys/block/sdb/queue/max_sectors_kb [132932]stg-bot linux-2.6.37-scst # mkfs.btrfs /dev/sdb WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL WARNING! - see http://btrfs.wiki.kernel.org before using fs created label (null) on /dev/sdb nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB Btrfs Btrfs v0.19 [132945]stg-bot linux-2.6.37-scst # sync [132948]stg-bot linux-2.6.37-scst # echo 3 /proc/sys/vm/drop_caches [132958]stg-bot linux-2.6.37-scst # btrfsc btrfsck btrfsctl [132958]stg-bot linux-2.6.37-scst # btrfsck /dev/sdb found 28672 bytes used err is 0 total csum bytes: 0 total tree bytes: 28672 total fs tree bytes: 8192 btree space waste bytes: 23875 file data blocks allocated: 0 referenced 0 Btrfs Btrfs v0.19 And still cannot mount. Tomas -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bugme-new] [Bug 29302] New: Null pointer dereference with large max_sectors_kb
On 18.2.2011 21:18, Chris Mason wrote: Ok, so it isn't part of the open devices code that prints errors, my guess is we're failing to read a good super. Could you please mkfs.btrfs /dev/xxx, sync, then btrfsck /dev/xxx, I want to make sure things are really getting written. Here's a patch that makes sure we have a good bdev after scanning, hopefully it will let us debug things without your box going boom. -chris Hello Chris, I patched the kernel, recompiled and installed, but I don't see the message in the kernel log. I forgot to mention before, that max_hw_sectors_kb and thus max_sectors_kb is 4096. Linux stg-top 2.6.37stg #7 SMP Sun Feb 20 10:48:57 CET 2011 x86_64 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz GenuineIntel GNU/Linux With default max_sectors_kb = 512 mkfs, sync, btrfsck going through fine first mount is fine [112005]stg-top ~ # cat /sys/block/sdb/queue/max_sectors_kb 512 [112012]stg-top ~ # mkfs.btrfs /dev/sdb WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL WARNING! - see http://btrfs.wiki.kernel.org before using fs created label (null) on /dev/sdb nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB Btrfs Btrfs v0.19 [112045]stg-top ~ # sync [112046]stg-top ~ # btrfsck /dev/sdb found 28672 bytes used err is 0 total csum bytes: 0 total tree bytes: 28672 total fs tree bytes: 8192 btree space waste bytes: 23875 file data blocks allocated: 0 referenced 0 Btrfs Btrfs v0.19 [112050]stg-top ~ # mount /dev/sdb /mnt/ [112054]stg-top ~ # df /mnt/ Filesystem 1K-blocks Used Available Use% Mounted on /dev/sdb 292968576056 2927559936 1% /mnt [112057]stg-top ~ # umount /mnt With max_sectors_kb = 4096 mkfs, sync, btrfsck is fine second mount BUGs out [112100]stg-top ~ # echo 4096 /sys/block/sdb/queue/max_sectors_kb [112107]stg-top ~ # mkfs.btrfs /dev/sdb WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL WARNING! - see http://btrfs.wiki.kernel.org before using fs created label (null) on /dev/sdb nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB Btrfs Btrfs v0.19 [112113]stg-top ~ # sync [112117]stg-top ~ # btrfsck /dev/sdb found 28672 bytes used err is 0 total csum bytes: 0 total tree bytes: 28672 total fs tree bytes: 8192 btree space waste bytes: 23875 file data blocks allocated: 0 referenced 0 Btrfs Btrfs v0.19 [112123]stg-top ~ # mount /dev/sdb /mnt/ Killed Back to max_sectors_kb = 512 mkfs, can't sync, because that'd block, btrfsck is fine third mount fails [112126]stg-top ~ # echo 512 /sys/block/sdb/queue/max_sectors_kb [112141]stg-top ~ # mkfs.btrfs /dev/sdb WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL WARNING! - see http://btrfs.wiki.kernel.org before using fs created label (null) on /dev/sdb nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB Btrfs Btrfs v0.19 [112144]stg-top ~ # btrfs btrfs btrfs-bcp btrfs-convert btrfs-image btrfs-show btrfs-vol btrfsckbtrfsctl btrfstune [112144]stg-top ~ # btrfsck /dev/sdb found 28672 bytes used err is 0 total csum bytes: 0 total tree bytes: 28672 total fs tree bytes: 8192 btree space waste bytes: 23875 file data blocks allocated: 0 referenced 0 Btrfs Btrfs v0.19 [112150]stg-top ~ # mount /dev/sdb /mnt mount: wrong fs type, bad option, bad superblock on /dev/sdb, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so Run sync in background (it just sits there) and try to mount again (no other commands in meantime) [112155]stg-top ~ # sync [1] 15800 [112202]stg-top ~ # mount /dev/sdb /mnt [112411]stg-top ~ # df /mnt/ Filesystem 1K-blocks Used Available Use% Mounted on /dev/sdb 292968576056 2927559936 1% /mnt [112456]stg-top ~ # (dmesg after 1st mount) [ 122.085837] device fsid 8d440bbadcd5b357-9e8f05f44e926498 devid 1 transid 7 /dev/sdb (dmesg after 2nd mount is pretty much the same as before - no mention of the patch) (dmesg after 3rd mount) [ 183.535484] device fsid 945db83dbf98e60-a27fadce8932f892 devid 1 transid 7 /dev/sdb [ 183.536785] btrfs bad fsid on block 20971520 [ 183.537041] btrfs bad fsid on block 20971520 [ 183.537268] btrfs bad fsid on block 20971520 [ 183.537469] btrfs: failed to read chunk root on sdb [ 183.545654] btrfs: open_ctree failed (dmesg after 4th mount) [ 192.481985] device fsid 945db83dbf98e60-a27fadce8932f892 devid 1 transid 7 /dev/sdb Tomas -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bugme-new] [Bug 29302] New: Null pointer dereference with large max_sectors_kb
On Thu, Feb 17, 2011 at 03:37:19PM -0800, Andrew Morton wrote: [ 605.109630] BUG: unable to handle kernel NULL pointer dereference at 0010 [ 605.109928] IP: [81100a7a] bio_add_page+0xa/0x40 disassembly of the crashpoint obtained from Code: bytes below: 0: 48 8b 4f 10 mov0x10(%rdi),%rcx 4: 48 8b 89 98 00 00 00mov0x98(%rcx),%rcx b: 48 8b b9 f0 01 00 00mov0x1f0(%rcx),%rdi 12: 89 d1 mov%edx,%ecx line 0, accessing some struct member at offset 0x10 struct bio { sector_tbi_sector; /* device address in 512 byte sectors */ struct bio *bi_next; /* request queue link */ struct block_device *bi_bdev; ^ so this is zero in bio_add_page: int bio_add_page(struct bio *bio, struct page *page, unsigned int len, unsigned int offset) { struct request_queue *q = bdev_get_queue(bio-bi_bdev); return __bio_add_page(q, bio, page, len, offset, queue_max_sectors(q)); } static inline struct request_queue *bdev_get_queue(struct block_device *bdev) { return bdev-bd_disk-queue; } just to verify following asm instructions (lines 4 and b): 4: dereference something at offset 0x98, roughly matches bd_disk b: then dereference something at 0x1f0, rougly matches queue So, bio-bi_bdev == NULL, going backwards in call stack to __extent_read_full_page(), where the bdev is set: 2050 bdev = em-bdev; em = get_extent(...), indirect call of btree_get_extent() passed from btree_read_extent_buffer_pages() to read_extent_buffer_pages() 165 em-bdev = BTRFS_I(inode)-root-fs_info-fs_devices-latest_bdev; (or at line 149, but I do not think that there could be any succesful lookup 147 em = lookup_extent_mapping(em_tree, start, len); be it either way, the assigned value is exactly the same ) Now let's hunt why latest_bdev is NULL. Startin in btrfs_mount, where fs_devices is being filled in. fs/btrfs/volumes.c:__btrfs_open_devices: 579 struct block_device *latest_bdev = NULL; ... 583 u64 latest_transid = 0; ... 590 list_for_each_entry(device, head, dev_list) { 591 if (device-bdev) 592 continue; 593 if (!device-name) 594 continue; ... 618 device-generation = btrfs_super_generation(disk_super); 619 if (!latest_transid || device-generation latest_transid) { 620 latest_devid = devid; 621 latest_transid = device-generation; 622 latest_bdev = bdev; 623 } ... 653 } ... 660 fs_devices-latest_bdev = latest_bdev; Line 591 or 593 will not stop the loop, device is not filled yet. Could be, that any of the devices from 'head' list does not satisfy condition on line 619, but it cannot be the first device, as !latest_transid would work. There is only one device, /dev/sdb, latest_bdev is set and later on set on line 660. /me sees no more options Meanwhile I've tried it myself and the error does not happen here, with head at 795abaf1e4e185 (.38-rc4-178-g795abaf). I'll try it with latest -rc5. [ 605.110089] PGD 277d70067 PUD 277e0a067 PMD 0 [ 605.110247] Oops: [#1] SMP [ 605.110394] last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map [ 605.110686] CPU 6 [ 605.110698] Modules linked in: ip6table_filter ip6_tables nf_nat_tftp nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_conntrack_amanda nf_conntrack_tftp nf_conntrack_sip nf_conntrack_proto_sctp nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp xt_physdev xt_hashlimit nfs ib_iser libiscsi scsi_transport_iscsi ib_ucm ib_ipoib rdma_ucm rdma_cm ib_cm iw_cm ib_sa ib_addr ib_uverbs ib_umad mlx4_ib ib_mthca ib_mad ib_core i7core_edac edac_core mlx4_core iTCO_wdt iTCO_vendor_support [ 605.112285] [ 605.112419] Pid: 1, comm: mount Not tainted 2.6.37stg #6 X8DTU/X8DTU [ 605.112586] RIP: 0010:[81100a7a] [81100a7a] bio_add_page+0xa/0x40 [ 605.112879] RSP: :8801833b39b8 EFLAGS: 00010296 [ 605.113035] RAX: RBX: RCX: [ 605.113207] RDX: 1000 RSI: ea000c3cd200 RDI: [ 605.113382] RBP: 8801833b3ba0 R08: R09: [ 605.113554] R10: R11: R12: [ 605.113723] R13: R14: a000 R15: 88024a19ab98 [ 605.113895] FS: 7fbcfd971740() GS:880339c8() knlGS: [
Re: [Bugme-new] [Bug 29302] New: Null pointer dereference with large max_sectors_kb
Excerpts from David Sterba's message of 2011-02-18 12:43:15 -0500: [ great analysis, thanks so much ] Could be, that any of the devices from 'head' list does not satisfy condition on line 619, but it cannot be the first device, as !latest_transid would work. There is only one device, /dev/sdb, latest_bdev is set and later on set on line 660. /me sees no more options Meanwhile I've tried it myself and the error does not happen here, with head at 795abaf1e4e185 (.38-rc4-178-g795abaf). I'll try it with latest -rc5. [ 605.110089] PGD 277d70067 PUD 277e0a067 PMD 0 [ 605.110247] Oops: [#1] SMP Are there any more kernel messages involved before the oops starts? It really feels like we're failing to open the block device somehow inside the btrfs scanning code. -chris -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bugme-new] [Bug 29302] New: Null pointer dereference with large max_sectors_kb
On Fri, 18 Feb 2011 14:16:12 -0500 Chris Mason chris.ma...@oracle.com wrote: Are there any more kernel messages involved before the oops starts? The full dmesg is in bugzilla. https://bugzilla.kernel.org/show_bug.cgi?id=29302 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bugme-new] [Bug 29302] New: Null pointer dereference with large max_sectors_kb
Excerpts from Andrew Morton's message of 2011-02-18 14:56:03 -0500: On Fri, 18 Feb 2011 14:16:12 -0500 Chris Mason chris.ma...@oracle.com wrote: Are there any more kernel messages involved before the oops starts? The full dmesg is in bugzilla. https://bugzilla.kernel.org/show_bug.cgi?id=29302 Ok, so it isn't part of the open devices code that prints errors, my guess is we're failing to read a good super. Could you please mkfs.btrfs /dev/xxx, sync, then btrfsck /dev/xxx, I want to make sure things are really getting written. Here's a patch that makes sure we have a good bdev after scanning, hopefully it will let us debug things without your box going boom. -chris diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index d39a989..d8f7ee0 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -757,6 +757,10 @@ static struct dentry *btrfs_mount(struct file_system_type *fs_type, int flags, tree_root-fs_info = fs_info; bdev = fs_devices-latest_bdev; + if (!bdev) { + printk(KERN_CRIT btrfs read devices failed on %s\n, dev_name); + goto error_close_devices; + } s = sget(fs_type, btrfs_test_super, btrfs_set_super, tree_root); if (IS_ERR(s)) goto error_s; -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bugme-new] [Bug 29302] New: Null pointer dereference with large max_sectors_kb
(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface). On Thu, 17 Feb 2011 13:20:20 GMT bugzilla-dae...@bugzilla.kernel.org wrote: https://bugzilla.kernel.org/show_bug.cgi?id=29302 Summary: Null pointer dereference with large max_sectors_kb Product: IO/Storage Version: 2.5 Kernel Version: 2.6.36 - 2.6.38-rc5 Platform: All OS/Version: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Block Layer AssignedTo: ax...@kernel.dk ReportedBy: f...@murder.cz Regression: No Created an attachment (id=48132) -- (https://bugzilla.kernel.org/attachment.cgi?id=48132) dmesg with error displayed. Hello, I'm not really sure I identified the problem product/component correctly, but this seems most appropriate. [140517]stg-bot ~ # echo 256 /sys/block/sdb/queue/max_sectors_kb [140523]stg-bot ~ # mkfs.btrfs /dev/sdb WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL WARNING! - see http://btrfs.wiki.kernel.org before using fs created label (null) on /dev/sdb nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB Btrfs Btrfs v0.19 [140532]stg-bot ~ # mount /dev/sdb /mnt [140540]stg-bot ~ # umount /mnt [140543]stg-bot ~ # cat /sys/block/sdb/queue/max_hw_sectors_kb /sys/block/sdb/queue/max_sectors_kb [140710]stg-bot ~ # mkfs.btrfs /dev/sdb WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL WARNING! - see http://btrfs.wiki.kernel.org before using fs created label (null) on /dev/sdb nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB Btrfs Btrfs v0.19 [140713]stg-bot ~ # mount /dev/sdb /mnt Killed [140715]stg-bot ~ # Now there is a bug in dmesg (output attached) and another attempt to mount the device kind of freezes it. The mount blocks, sync blocks, but i can read/write the device using dd. And if I, instead of trying to mount again, zero out first 1MB, mkfs.btrfs and mount, I get the bug again. Freeze again on second mount attempt after that. This happens on 2.6.36 and 2.6.37 with scst patches, 2.6.37 vanilla and on 2.6.38-rc5 it just freezes the first time I try to mount. No outuput in dmesg. The hardware is Dual Xeon E5620, 12GB ram, Areca 1880 with 3 arrays (testing on 3TB raid10 and 10TB raid6), SuperMicro X8DTU-F. If I left out any important info, please let me know ;). A btrfs bug, I suspect. [ 605.109630] BUG: unable to handle kernel NULL pointer dereference at 0010 [ 605.109928] IP: [81100a7a] bio_add_page+0xa/0x40 [ 605.110089] PGD 277d70067 PUD 277e0a067 PMD 0 [ 605.110247] Oops: [#1] SMP [ 605.110394] last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map [ 605.110686] CPU 6 [ 605.110698] Modules linked in: ip6table_filter ip6_tables nf_nat_tftp nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_conntrack_amanda nf_conntrack_tftp nf_conntrack_sip nf_conntrack_proto_sctp nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp xt_physdev xt_hashlimit nfs ib_iser libiscsi scsi_transport_iscsi ib_ucm ib_ipoib rdma_ucm rdma_cm ib_cm iw_cm ib_sa ib_addr ib_uverbs ib_umad mlx4_ib ib_mthca ib_mad ib_core i7core_edac edac_core mlx4_core iTCO_wdt iTCO_vendor_support [ 605.112285] [ 605.112419] Pid: 1, comm: mount Not tainted 2.6.37stg #6 X8DTU/X8DTU [ 605.112586] RIP: 0010:[81100a7a] [81100a7a] bio_add_page+0xa/0x40 [ 605.112879] RSP: :8801833b39b8 EFLAGS: 00010296 [ 605.113035] RAX: RBX: RCX: [ 605.113207] RDX: 1000 RSI: ea000c3cd200 RDI: [ 605.113382] RBP: 8801833b3ba0 R08: R09: [ 605.113554] R10: R11: R12: [ 605.113723] R13: R14: a000 R15: 88024a19ab98 [ 605.113895] FS: 7fbcfd971740() GS:880339c8() knlGS: [ 605.114188] CS: 0010 DS: ES: CR0: 80050033 [ 605.114352] CR2: 0010 CR3: 0001c17d5000 CR4: 06e0 [ 605.114525] DR0: DR1: DR2: [ 605.114695] DR3: DR6: 0ff0 DR7: 0400 [ 605.114864] Process mount (pid: 1, threadinfo 8801833b2000, task 8801b8b48cf0) [ 605.115157] Stack: [ 605.115290] 81251384 0140 ea000c3cd200 [ 605.115590] 4a19ab88 8801b966f380 1000 [ 605.115884] 81255810 0002 0140 [ 605.116180] Call Trace: [