Hi,

It took me a couple of days, because I needed to patch my kernel first
and then issue a rebalance, which ran for more than two days.
Nevertheless, the rebalance succeeded without any "kernel BUG" messages,
so apparently your patch works!

I noticed that at first, the messages were like this:

[79329.526490] btrfs: found 1939 extents
[79375.950834] btrfs: found 1939 extents
[79376.083599] btrfs: relocating block group 352220872704 flags 1
[80052.940435] btrfs: found 3786 extents
[80108.439657] btrfs: found 3786 extents
[80112.325548] btrfs: relocating block group 351147130880 flags 1

This is just what I saw during previous balance runs. Then, all of a sudden,
the messages changed to:

[104178.827594] btrfs allocation failed flags 1, wanted 2013265920
[104178.827599] space_info has 4271198208 free, is not full
[104178.827602] space_info total=214748364800, used=210440957952, pinned=0, reserved=36208640, may_use=3168993280, readonly=0
[104178.827606] block group 1107296256 has 5368709120 bytes, 5368582144 used 0 pinned 0 reserved
[104178.827610] entry offset 1778384896, bytes 86016, bitmap yes
[104178.827612] entry offset 1855827968, bytes 20480, bitmap no
[104178.827614] entry offset 1855852544, bytes 20480, bitmap no
[104178.827617] block group has cluster?: no
[104178.827618] 0 blocks of free space at or bigger than bytes is
[104178.827621] block group 8623489024 has 5368709120 bytes, 5368705024 used 0 pinned 0 reserved
[104178.827624] entry offset 8891924480, bytes 4096, bitmap yes
[104178.827626] block group has cluster?: no
[104178.827628] 0 blocks of free space at or bigger than bytes is
[104178.827631] block group 17213423616 has 5368709120 bytes, 5368709120 used 0 pinned 0 reserved
[104178.827634] block group has cluster?: no

And so on.
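
The numbers in the excerpt above can be cross-checked with a few lines of
Python. This is only a rough sketch: the free-space formula (total minus
used, pinned, reserved and readonly) is an assumption, chosen because it
reproduces the 4271198208 figure in the log, and only the three block groups
quoted above are included. The variable names are made up for illustration.

```python
# Values copied from the log excerpt above.
wanted = 2013265920  # size of the allocation that failed

space_info = {
    "total": 214748364800,
    "used": 210440957952,
    "pinned": 0,
    "reserved": 36208640,
    "may_use": 3168993280,
    "readonly": 0,
}

# Assumed formula: may_use is left out because that is what matches the log.
space_free = (space_info["total"] - space_info["used"] - space_info["pinned"]
              - space_info["reserved"] - space_info["readonly"])

# Block group start -> size, bytes used, and the free-space entries listed.
block_groups = {
    1107296256: {"size": 5368709120, "used": 5368582144,
                 "entries": [86016, 20480, 20480]},
    8623489024: {"size": 5368709120, "used": 5368705024,
                 "entries": [4096]},
    17213423616: {"size": 5368709120, "used": 5368709120,
                  "entries": []},
}

# Free bytes per block group, and the largest single free entry quoted.
group_free = {start: bg["size"] - bg["used"]
              for start, bg in block_groups.items()}
largest_entry = max(
    (size for bg in block_groups.values() for size in bg["entries"]),
    default=0,
)

print("space_info free:", space_free)
for start, free in group_free.items():
    print("block group", start, "free:", free)
print("largest free entry:", largest_entry, "wanted:", wanted)
```

With these values, space_info has more free bytes in total than the
allocation wanted, but none of the quoted block groups has a free entry
anywhere near that size.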

Does this indicate an error of any sort, or is this expected behaviour?

Kind regards,

Erik.


On 01/21/2011 10:19 AM, Yan, Zheng wrote:
> please try patch attached below, Thanks.
> 
> ---
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index b37d723..49d6b13 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -1158,6 +1158,7 @@ static int clone_backref_node(struct btrfs_trans_handle *trans,
>       new_node->bytenr = dest->node->start;
>       new_node->level = node->level;
>       new_node->lowest = node->lowest;
> +     new_node->checked = 1;
>       new_node->root = dest;
> 
>       if (!node->lowest) {
> ---
> 
> 
> On Fri, Jan 21, 2011 at 4:50 PM, Erik Logtenberg <e...@logtenberg.eu> wrote:
>> Hi,
>>
>> I think I hit the same bug again:
>>
>> [291835.724344] ------------[ cut here ]------------
>> [291835.724376] kernel BUG at fs/btrfs/relocation.c:836!
>> [291835.724401] invalid opcode: 0000 [#1] SMP
>> [291835.724424] last sysfs file:
>> /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
>> [291835.724461] CPU 0
>> [291835.724472] Modules linked in: uvcvideo snd_usb_audio
>> snd_usbmidi_lib videodev v4l1_compat snd_rawmidi v4l2_compat_ioctl32
>> btrfs zlib_deflate libcrc32c sha256_generic cryptd aes_x86_64
>> aes_generic cbc dm_crypt tun ebtable_nat ebtables ipt_MASQUERADE
>> iptable_nat nf_nat bridge stp llc nfsd lockd nfs_acl auth_rpcgss
>> exportfs nls_utf8 cifs fscache sunrpc cpufreq_ondemand acpi_cpufreq
>> freq_table mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
>> ip6table_filter ip6_tables ipv6 kvm_intel kvm dummy uinput
>> snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_seq
>> snd_seq_device e1000e snd_pcm snd_timer i2c_i801 snd shpchp iTCO_wdt
>> iTCO_vendor_support soundcore dell_wmi sparse_keymap snd_page_alloc
>> serio_raw joydev wmi dcdbas microcode usb_storage uas raid1 pata_acpi
>> ata_generic radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last
>> unloaded: scsi_wait_scan]
>> [291835.725002]
>> [291835.725013] Pid: 27386, comm: btrfs Tainted: G          I
>> 2.6.37-2.fc15.x86_64 #1
>> [291835.725062] RIP: 0010:[<ffffffffa0565237>]  [<ffffffffa0565237>]
>> build_backref_tree+0x473/0xd6d [btrfs]
>> [291835.725126] RSP: 0018:ffff8800373bf9c8  EFLAGS: 00010246
>> [291835.725152] RAX: ffff8801367d5100 RBX: ffff88020b110880 RCX:
>> 0000000000000040
>> [291835.725186] RDX: 0000000000000030 RSI: 0000006dd08d3000 RDI:
>> ffff880100069820
>> [291835.725219] RBP: ffff8800373bfaf8 R08: 0000000000008050 R09:
>> ffff8800373bf980
>> [291835.725253] R10: ffff8800373bf918 R11: ffff88020b110880 R12:
>> ffff8801367d5100
>> [291835.725254] R13: ffff88012c0a24c0 R14: ffff88021e2013f0 R15:
>> ffff88021e201cf0
>> [291835.725254] FS:  00007fcb1a6cc760(0000) GS:ffff8800bfa00000(0000)
>> knlGS:0000000000000000
>> [291835.725254] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [291835.725254] CR2: 0000000002feeeb8 CR3: 00000001c2943000 CR4:
>> 00000000000426e0
>> [291835.725254] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [291835.725254] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
>> 0000000000000400
>> [291835.725254] Process btrfs (pid: 27386, threadinfo ffff8800373be000,
>> task ffff88022452ae40)
>> [291835.725254] Stack:
>> [291835.725254]  ffffea0004b5a470 ffffea0000000000 ffff8800373bf9f8
>> ffff8800373bfaa8
>> [291835.725254]  0000000000000000 ffff88005faafbb0 ffff880100069808
>> ffff880100069d78
>> [291835.725254]  ffff88012c0a2aa0 ffff880100069820 ffff88020b1108c0
>> ffff880100069d80
>> [291835.725254] Call Trace:
>> [291835.725254]  [<ffffffffa0565c91>] relocate_tree_blocks+0x160/0x478
>> [btrfs]
>> [291835.725254]  [<ffffffffa056463d>] ? add_tree_block+0x11e/0x13e [btrfs]
>> [291835.725254]  [<ffffffffa0566b45>] relocate_block_group+0x1e3/0x490
>> [btrfs]
>> [291835.725254]  [<ffffffff8103edb9>] ? should_resched+0xe/0x2e
>> [291835.725254]  [<ffffffffa0566f39>]
>> btrfs_relocate_block_group+0x147/0x28a [btrfs]
>> [291835.725254]  [<ffffffffa054e52a>]
>> btrfs_relocate_chunk.clone.40+0x61/0x4ab [btrfs]
>> [291835.725254]  [<ffffffffa05152d4>] ? btrfs_item_key+0x1e/0x20 [btrfs]
>> [291835.725254]  [<ffffffffa05152f0>] ? btrfs_item_key_to_cpu+0x1a/0x36
>> [btrfs]
>> [291835.725254]  [<ffffffffa054c2a8>] ? read_extent_buffer+0xc3/0xe3 [btrfs]
>> [291835.725254]  [<ffffffffa05154e6>] ?
>> btrfs_header_nritems.clone.12+0x17/0x1c [btrfs]
>> [291835.725254]  [<ffffffffa054cff6>] ? btrfs_item_key_to_cpu+0x2a/0x46
>> [btrfs]
>> [291835.725254]  [<ffffffffa055045e>] btrfs_balance+0x1a3/0x1f0 [btrfs]
>> [291835.725254]  [<ffffffff8112bce5>] ? do_filp_open+0x226/0x5c8
>> [291835.725254]  [<ffffffffa0556773>] btrfs_ioctl+0x641/0x846 [btrfs]
>> [291835.725254]  [<ffffffff811f3ed1>] ? file_has_perm+0xa5/0xc7
>> [291835.725254]  [<ffffffff8112e091>] do_vfs_ioctl+0x4b1/0x4f2
>> [291835.725254]  [<ffffffff8112e128>] sys_ioctl+0x56/0x7a
>> [291835.725254]  [<ffffffff8100acc2>] system_call_fastpath+0x16/0x1b
>> [291835.725254] Code: 48 8b 45 89 49 8d 7d 10 48 8d 75 b0 49 89 44 24 18
>> 8a 43 70 ff c0 41 88 44 24 70 e8 f7 c3 ff ff eb 17 f6 40 71 10 49 89 c4
>> 75 02 <0f> 0b 49 8d 45 10 49 89 45 10 49 89 45 18 48 8b b5 20 ff ff ff
>> [291835.725254] RIP  [<ffffffffa0565237>] build_backref_tree+0x473/0xd6d
>> [btrfs]
>> [291835.725254]  RSP <ffff8800373bf9c8>
>> [291835.738971] ---[ end trace a7919e7f17c0a727 ]---
>>
>>
>> It is really difficult to reproduce this bug. This time, I was balancing
>> a 300GB volume, which was almost finished by the time it crashed. It had
>> been running for 2 days straight, and survived a complete backup run,
>> with 5 simultaneous rsyncs running on it. Last night when the rsyncs
>> kicked in, it crashed within half an hour though.
>>
>> I will now try downgrading to 2.6.36 as per Zheng Yan's suggestion.
>>
>> Thanks,
>>
>> Erik.
>>
>>
>> On 17-1-2011 15:31, Erik Logtenberg wrote:
>>> Hi,
>>>
>>> Please find attached the error log, for future reference.
>>>
>>> Forgot to mention:
>>> I could still use the system after this error, so it was not completely
>>> fatal in that regard. All active processes (mostly rsync) were hanging
>>> in state D though, so I couldn't kill them anymore. The FS could not be
>>> unmounted either, so I still had to reboot.
>>>
>>> Thanks,
>>>
>>> Erik.
>>>
>>>
>>> On 01/17/2011 03:14 PM, Erik Logtenberg wrote:
>>>> Hi,
>>>>
>>>> btrfs balance results in:
>>>>
>>>> http://pastebin.com/v5j0809M
>>>>
>>>> My system: fully up-to-date Fedora 14 with rawhide kernel to make btrfs
>>>> balance do useful stuff to my free space:
>>>>
>>>> kernel-2.6.37-2.fc15.x86_64
>>>> btrfs-progs-0.19-12.fc14.x86_64
>>>>
>>>> The filesystem had 0 bytes free where it should have had 45G, so on
>>>> darkling's advice I ran btrfs balance on the fs while doing heavy I/O
>>>> (re-running 5 backup jobs that had failed due to ENOSPC).
>>>> Up until the crash, btrfs balance did recover a couple of gigabytes of
>>>> free space though, so that part of the plan worked just fine.
>>>>
>>>> Thanks,
>>>>
>>>> Erik.
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>>> the body of a message to majord...@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>