On Wed, Aug 04, 2010 at 07:21:00PM +0800, Yan, Zheng wrote: > > We're seeing this too, since upgrading from 2.6.33.2 + merged old git btrfs > > unstable HEAD to plain 2.6.35. > > > > [sr...@backup01:.../.rmagic]# rm * > > rm: cannot remove `WEEKLY_bar3d.png': No space left on device > > rm: cannot remove `WEEKLY.html': No space left on device > > rm: cannot remove `YEARLY_bar3d.png': No space left on device > > rm: cannot remove `YEARLY.html': No space left on device >... > > Aug ?3 18:44:44 backup01 kernel: ------------[ cut here ]------------ > > Aug ?3 18:44:44 backup01 kernel: WARNING: at fs/btrfs/extent-tree.c:3441 > > btrfs_block_rsv_check+0x151/0x180() >... > > These warning is because btrfs in 2.6.35 reserves more metadata space > for internal use > than older kernel. Your FS is too full, btrfs can't reserve enough > metadata space.
Hello! Is it possible that 2.6.33.2 btrfs has mucked up the on-disk stuff in a way that causes 2.6.35 to be unhappy? The file system in question was reported to be 85% full, according to "df". In the meantime, we've been having some other problems on 2.6.35; for example, rsync has been trying to append a block to a file for the past 5 days. The file system is reported as 45% full: [sr...@backup01:/root]# df -Pt btrfs /backup/bu000/vol05/ Filesystem 1024-blocks Used Available Capacity Mounted on /dev/mapper/bu000-vol05 3221225472 1429529324 1791696148 45% /backup/bu000/vol05 [sr...@backup01:/root]# btrfs files df /backup/bu000/vol05 Data: total=1.57TB, used=1.31TB Metadata: total=15.51GB, used=10.48GB System: total=12.00MB, used=192.00KB At some point today, the kernel also spat this out: BUG: soft lockup - CPU#3 stuck for 61s! [rsync:21903] Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler aoe bnx2 CPU 3 Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler aoe bnx2 Pid: 21903, comm: rsync Tainted: G W 2.6.35-hw #2 0NK937/PowerEdge 1950 RIP: 0010:[<ffffffff81101a2d>] [<ffffffff81101a2d>] iput+0x5d/0x70 RSP: 0018:ffff8802c14abb48 EFLAGS: 00000246 RAX: 0000000000000000 RBX: ffff8802c14abb58 RCX: 0000000000000003 RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff88007c075980 RBP: ffffffff8100a84e R08: 0000000000000001 R09: 8000000000000000 R10: 0000000000000002 R11: 0000000000000000 R12: ffffffffffffff66 R13: ffffffff81af04e0 R14: 0000000000000000 R15: 7fffffffffffffff FS: 00007fd13bbb06e0(0000) GS:ffff880001cc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000002f5a108 CR3: 00000001eb94a000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process rsync (pid: 21903, threadinfo ffff8802c14aa000, task ffff880080b04b00) Stack: ffff88007c075888 ffff88007c0757b0 ffff8802c14abb98 ffffffff812d7439 <0> ffffffff81664cde 0000000000000001 0000000004d80000 0000000000004000 <0> ffff88042a708178 ffff88042a708000 ffff8802c14abc08 ffffffff812c599c Call Trace: [<ffffffff812d7439>] ? btrfs_start_one_delalloc_inode+0x129/0x160 [<ffffffff81664cde>] ? _raw_spin_lock+0xe/0x20 [<ffffffff812c599c>] ? shrink_delalloc+0x8c/0x130 [<ffffffff812c5f39>] ? btrfs_delalloc_reserve_metadata+0x189/0x190 [<ffffffff8110180e>] ? file_update_time+0x11e/0x180 [<ffffffff812c5f83>] ? btrfs_delalloc_reserve_space+0x43/0x60 [<ffffffff812e2a98>] ? btrfs_file_aio_write+0x508/0x970 [<ffffffff8100a84e>] ? apic_timer_interrupt+0xe/0x20 [<ffffffff810ec1b1>] ? do_sync_write+0xd1/0x120 [<ffffffff810fc767>] ? poll_select_copy_remaining+0xf7/0x140 [<ffffffff810ecd2b>] ? vfs_write+0xcb/0x1a0 [<ffffffff810ecef0>] ? sys_write+0x50/0x90 [<ffffffff81009f02>] ? system_call_fastpath+0x16/0x1b Code: 00 01 00 00 48 c7 c2 a0 2c 10 81 48 8b 40 30 48 85 c0 74 12 48 8b 50 20 48 c7 c0 a0 2c 10 81 48 85 d2 48 0 Call Trace: [<ffffffff812d7439>] ? btrfs_start_one_delalloc_inode+0x129/0x160 [<ffffffff81664cde>] ? _raw_spin_lock+0xe/0x20 [<ffffffff812c599c>] ? shrink_delalloc+0x8c/0x130 [<ffffffff812c5f39>] ? btrfs_delalloc_reserve_metadata+0x189/0x190 [<ffffffff8110180e>] ? file_update_time+0x11e/0x180 [<ffffffff812c5f83>] ? btrfs_delalloc_reserve_space+0x43/0x60 [<ffffffff812e2a98>] ? btrfs_file_aio_write+0x508/0x970 [<ffffffff8100a84e>] ? apic_timer_interrupt+0xe/0x20 [<ffffffff810ec1b1>] ? do_sync_write+0xd1/0x120 [<ffffffff810fc767>] ? poll_select_copy_remaining+0xf7/0x140 [<ffffffff810ecd2b>] ? vfs_write+0xcb/0x1a0 [<ffffffff810ecef0>] ? sys_write+0x50/0x90 [<ffffffff81009f02>] ? system_call_fastpath+0x16/0x1b [sr...@backup01:/root]# ls -l /proc/21903/fd/1 lrwx------ 1 root root 64 2010-08-09 18:21 /proc/21903/fd/1 -> /backup/bu000/vol05/vg005_web11_backup/2010-08-04-17-00/64/54/.../customer file.mov.aYX4Js [sr...@backup01:/root]# ls -lL /proc/21903/fd/1 -rw------- 1 root root 977797120 2010-08-04 20:39 /proc/21903/fd/1 [sr...@backup01:/root]# ps auxw|grep rsync root 21903 73.2 0.0 12912 192 ? R Aug04 5177:08 rsync -aHq --numeric-ids --exclude-from=/etc/backups/backup.exclude --delete --delete-excluded /storage/vg005/web11/64/54/ /backup/bu000/vol05/vg005_web11_backup/2010-08-04-17-00/64/54 In other words, the rsync has made no progress for 5 days (or at least the mtime hasn't changed since then). "perf top" still shows output like this, showing that btrfs is trying to btrfs_find_space_cluster all of the time: samples pcnt function DSO _______ _____ ______________________________ _________________ 2127.00 11.9% find_next_bit [kernel] 1914.00 10.7% find_next_zero_bit [kernel] 1580.00 8.8% schedule [kernel] 1340.00 7.5% btrfs_find_space_cluster [kernel] 1238.00 6.9% _raw_spin_lock_irqsave [kernel] 1017.00 5.7% _raw_spin_lock [kernel] 662.00 3.7% sched_clock_local [kernel] 615.00 3.4% native_read_tsc [kernel] 590.00 3.3% _raw_spin_lock_irq [kernel] 468.00 2.6% _raw_spin_unlock_irqrestore [kernel] 405.00 2.3% schedule_timeout [kernel] 338.00 1.9% native_sched_clock [kernel] 329.00 1.8% update_curr [kernel] 323.00 1.8% lock_timer_base [kernel] 297.00 1.7% btrfs_start_one_delalloc_inode [kernel] 285.00 1.6% pick_next_task_fair [kernel] 267.00 1.5% try_to_del_timer_sync [kernel] 248.00 1.4% sched_clock_cpu [kernel] 245.00 1.4% deflate_fast [kernel] So, is it possible that the older btrfs code left things in a way that wouldn't have happened if we had started with 3.6.35 to begin with? In the case of the 45% full file system, it seems it should be able to allocate more space without issue. Simon- -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html