[PATCH V2] Btrfs: find_free_extent: Do not erroneously skip LOOP_CACHING_WAIT state
When executing generic/001 in a loop on a ppc64 machine (with both sectorsize
and nodesize set to 64k), the following call trace is observed,

WARNING: at /root/repos/linux/fs/btrfs/locking.c:253
Modules linked in:
CPU: 2 PID: 8353 Comm: umount Not tainted 4.3.0-rc5-13676-ga5e681d #54
task: c000f2b1f560 ti: c000f6008000 task.ti: c000f6008000
NIP: c0520c88 LR: c04a3b34 CTR:
REGS: c000f600a820 TRAP: 0700 Not tainted (4.3.0-rc5-13676-ga5e681d)
MSR: 800102029032 CR: 2884 XER:
CFAR: c04a3b30 SOFTE: 1
GPR00: c04a3b34 c000f600aaa0 c108ac00 c000f5a808c0
GPR04: c000f600ae60 0005
GPR08: 20a1 0001 c000f2b1f560 0030
GPR12: 84842882 cfdc0900 c000f600ae60 c000f070b800
GPR16: c000f3c8a000 0049
GPR20: 0001 0001 c000f5aa01f8
GPR24: 0f83e0f83e0f83e1 c000f5a808c0 c000f3c8d000 c000
GPR28: c000f600ae74 0001 c000f3c8d000 c000f5a808c0
NIP [c0520c88] .btrfs_tree_lock+0x48/0x2a0
LR [c04a3b34] .btrfs_lock_root_node+0x44/0x80
Call Trace:
[c000f600aaa0] [c000f600ab80] 0xc000f600ab80 (unreliable)
[c000f600ab80] [c04a3b34] .btrfs_lock_root_node+0x44/0x80
[c000f600ac00] [c04a99dc] .btrfs_search_slot+0xa8c/0xc00
[c000f600ad40] [c04ab878] .btrfs_insert_empty_items+0x98/0x120
[c000f600adf0] [c050da44] .btrfs_finish_chunk_alloc+0x1d4/0x620
[c000f600af20] [c04be854] .btrfs_create_pending_block_groups+0x1d4/0x2c0
[c000f600b020] [c04bf188] .do_chunk_alloc+0x3c8/0x420
[c000f600b100] [c04c27cc] .find_free_extent+0xbfc/0x1030
[c000f600b260] [c04c2ce8] .btrfs_reserve_extent+0xe8/0x250
[c000f600b330] [c04c2f90] .btrfs_alloc_tree_block+0x140/0x590
[c000f600b440] [c04a47b4] .__btrfs_cow_block+0x124/0x780
[c000f600b530] [c04a4fc0] .btrfs_cow_block+0xf0/0x250
[c000f600b5e0] [c04a917c] .btrfs_search_slot+0x22c/0xc00
[c000f600b720] [c050aa40] .btrfs_remove_chunk+0x1b0/0x9f0
[c000f600b850] [c04c4e04] .btrfs_delete_unused_bgs+0x434/0x570
[c000f600b950] [c04d3cb8] .close_ctree+0x2e8/0x3b0
[c000f600ba20] [c049d178] .btrfs_put_super+0x18/0x30
[c000f600ba90] [c0243cd4] .generic_shutdown_super+0xa4/0x1a0
[c000f600bb10] [c02441d8] .kill_anon_super+0x18/0x30
[c000f600bb90] [c049c898] .btrfs_kill_super+0x18/0xc0
[c000f600bc10] [c02444f8] .deactivate_locked_super+0x98/0xe0
[c000f600bc90] [c0269f94] .cleanup_mnt+0x54/0xa0
[c000f600bd10] [c00bd744] .task_work_run+0xc4/0x100
[c000f600bdb0] [c0016334] .do_notify_resume+0x74/0x80
[c000f600be30] [c00098b8] .ret_from_except_lite+0x64/0x68
Instruction dump:
fba1ffe8 fbc1fff0 fbe1fff8 7c791b78 f8010010 f821ff21 e94d0290 81030040
812a04e8 7d094a78 7d290034 5529d97e <0b09> 3b40 3be30050 3bc3004c

The above call trace is seen even on x86_64; albeit very rarely and that too
with nodesize set to 64k and with nospace_cache mount option being used.

The reason for the above call trace is,
btrfs_remove_chunk
  check_system_chunk
    Allocate chunk if required
  For each physical stripe on underlying device,
    btrfs_free_dev_extent
      ...
      Take lock on Device tree's root node
      btrfs_cow_block("dev tree's root node");
        btrfs_reserve_extent
          find_free_extent
            index = BTRFS_RAID_DUP;
            have_caching_bg = false;

            When in LOOP_CACHING_NOWAIT state, Assume we find a block group
            which is being cached; Hence have_caching_bg is set to true

            When repeating the search for the next RAID index, we set
            have_caching_bg to false.

Hence right after completing the LOOP_CACHING_NOWAIT state, we incorrectly
skip LOOP_CACHING_WAIT state and move to LOOP_ALLOC_CHUNK state where we
allocate a chunk and try to add entries corresponding to the chunk's physical
stripe into the device tree. When doing so the task deadlocks itself waiting
for the blocking lock on the root node of the device tree.

This commit fixes the issue by introducing a new local variable to help
indicate as to whether a block group of any RAID type is being cached.

Signed-off-by: Chandan Rajendra
---
Changelog:
v1->v2: Honor 80 column restriction.

 fs/btrfs/extent-tree.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index f50c7c2..99a8e57 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -7029,6 +7029,7 @@ static noinline int find_free_extent(struct btrfs_root *orig_root,
 	bool failed_alloc = false;
 	bool use_c
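The flag handling described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the kernel code: the enum values reuse the state names from the message, while the two functions, the `caching` array and the name `any_caching_bg` are hypothetical and exist only for illustration. The first function resets the flag for every RAID index, so only the last index decides which state follows LOOP_CACHING_NOWAIT; the second keeps a second flag that is never reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum loop_stage {
	LOOP_CACHING_NOWAIT,
	LOOP_CACHING_WAIT,
	LOOP_ALLOC_CHUNK,
};

/*
 * caching[i] is true when the (hypothetical) block group scanned for
 * RAID index i is still being cached.  Both functions return the state
 * entered once LOOP_CACHING_NOWAIT has gone through every index.
 */
static enum loop_stage stage_after_nowait_buggy(const bool *caching,
						size_t nr_raid)
{
	bool have_caching_bg = false;
	size_t index;

	assert(caching != NULL);
	for (index = 0; index < nr_raid; index++) {
		/* Reset on every pass: earlier indexes are forgotten. */
		have_caching_bg = false;
		if (caching[index])
			have_caching_bg = true;
	}
	return have_caching_bg ? LOOP_CACHING_WAIT : LOOP_ALLOC_CHUNK;
}

static enum loop_stage stage_after_nowait_fixed(const bool *caching,
						size_t nr_raid)
{
	bool any_caching_bg = false;	/* hypothetical name, never reset */
	size_t index;

	assert(caching != NULL);
	for (index = 0; index < nr_raid; index++) {
		bool have_caching_bg = false;

		if (caching[index])
			have_caching_bg = true;
		if (have_caching_bg)
			any_caching_bg = true;
	}
	return any_caching_bg ? LOOP_CACHING_WAIT : LOOP_ALLOC_CHUNK;
}
```

With a caching block group at the first index only, the first variant goes straight to LOOP_ALLOC_CHUNK while the second one enters LOOP_CACHING_WAIT.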
[PATCH] btrfs-progs: show-super: Add option to print superblock at given bytenr
Add '-s ' option to show superblock at given bytenr.

This is very useful to debug non-standard btrfs, like debuging the
1st stage btrfs of btrfs-convert.

Signed-off-by: Qu Wenruo
---
 Documentation/btrfs-show-super.asciidoc | 5 +
 btrfs-show-super.c | 7 ++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/btrfs-show-super.asciidoc b/Documentation/btrfs-show-super.asciidoc
index 1646be3..3480a3d 100644
--- a/Documentation/btrfs-show-super.asciidoc
+++ b/Documentation/btrfs-show-super.asciidoc
@@ -40,6 +40,11 @@ If several '-i ' are given, only the last one is valid.
 Attempt to print the superblock even if no superblock magic is found. May end
 badly.
 
+-s ::
+Specifiy the superblock bytenr.
++
+Used for debug purpose. Disable '-f' option.
+
 EXIT STATUS
 ---
 *btrfs-show-super* will return 0 if no error happened.
diff --git a/btrfs-show-super.c b/btrfs-show-super.c
index 27414c8..7b499e4 100644
--- a/btrfs-show-super.c
+++ b/btrfs-show-super.c
@@ -48,6 +48,7 @@ static void print_usage(void)
 	fprintf(stderr, "\t-a : print information of all superblocks\n");
 	fprintf(stderr, "\t-i : specify which mirror to print out\n");
 	fprintf(stderr, "\t-F : attempt to dump superblocks with bad magic\n");
+	fprintf(stderr, "\t-s : specify the superblock bytenr\n");
 	fprintf(stderr, "%s\n", PACKAGE_STRING);
 }
 
@@ -63,7 +64,7 @@ int main(int argc, char **argv)
 	u64 arg;
 	u64 sb_bytenr = btrfs_sb_offset(0);
 
-	while ((opt = getopt(argc, argv, "fFai:")) != -1) {
+	while ((opt = getopt(argc, argv, "fFai:s:")) != -1) {
 		switch (opt) {
 		case 'i':
 			arg = arg_strtou64(optarg);
@@ -86,6 +87,10 @@ int main(int argc, char **argv)
 		case 'F':
 			force = 1;
 			break;
+		case 's':
+			sb_bytenr = arg_strtou64(optarg);
+			all = 0;
+			break;
 		default:
 			print_usage();
 			exit(1);
-- 
2.6.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
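For readers who want to try the option handling outside of btrfs-progs, here is a minimal standalone sketch of the same getopt() pattern. It is an illustration under assumptions, not the btrfs-progs source: `struct show_super_opts`, `parse_u64()` and `parse_opts()` are hypothetical names, and `parse_u64()` is only a stand-in for `arg_strtou64()`, whose exact behaviour is not shown in the patch. As in the hunk above, '-s' stores the bytenr and clears `all`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical container for the flags the tool keeps in local variables. */
struct show_super_opts {
	uint64_t sb_bytenr;
	int all;
	int full;
	int force;
};

/* Strict unsigned parse (stand-in helper): 0 on success, -1 on error. */
static int parse_u64(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long v;

	assert(out != NULL);
	if (s == NULL || *s == '\0' || *s == '-')
		return -1;
	errno = 0;
	v = strtoull(s, &end, 0);	/* base 0: accepts decimal, 0x..., 0... */
	if (errno != 0 || *end != '\0')
		return -1;
	*out = (uint64_t)v;
	return 0;
}

/* Parse -a, -f, -F and -s <bytenr>; returns 0 on success, -1 on bad usage. */
static int parse_opts(int argc, char **argv, struct show_super_opts *o)
{
	int opt;

	optind = 1;	/* restart scanning so the function can be called again */
	opterr = 0;	/* let the caller report errors */
	while ((opt = getopt(argc, argv, "fFas:")) != -1) {
		switch (opt) {
		case 'a':
			o->all = 1;
			break;
		case 'f':
			o->full = 1;
			break;
		case 'F':
			o->force = 1;
			break;
		case 's':
			if (parse_u64(optarg, &o->sb_bytenr) != 0)
				return -1;
			o->all = 0;	/* mirrors "all = 0" in the patch */
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

Because the option string ends in "s:", getopt() requires an argument for '-s' and hands it over in `optarg`.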
Process is blocked for more than 120 seconds
Hi everyone, I have noticed the following in the log. The system continues to run, but I am not sure for how long it will be stable. # uname -a Linux Debian 4.2.3-2~bpo8+1 (2015-10-20) i686 GNU/Linux # mount | grep /var /dev/sdd2 on /var type btrfs (rw,noatime,compress=lzo,space_cache,subvolid=258,subvol=/var) > [Mon Nov 2 06:35:57 2015] INFO: task nscd:859 blocked for more than 120 > seconds. > [Mon Nov 2 06:35:57 2015] Not tainted 4.2.0-0.bpo.1-686-pae #1 > [Mon Nov 2 06:35:57 2015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [Mon Nov 2 06:35:57 2015] nscdD f1c7dd20 0 859 1 > 0x > [Mon Nov 2 06:35:57 2015] f1c7dd40 00200082 f79de900 f1c7dd20 c10bc119 > ffe0 f3aec740 00200246 > [Mon Nov 2 06:35:57 2015] f74ea800 f79e3f40 f77fb800 f1c7e000 f6b381dc > f6b38000 f1c7dd4c c14f1fdb > [Mon Nov 2 06:35:57 2015] d5553960 f1c7dd70 f867672f f77fb800 > c1099250 d0a4be08 d9755e68 > [Mon Nov 2 06:35:57 2015] Call Trace: > [Mon Nov 2 06:35:57 2015] [] ? del_timer_sync+0x49/0x50 > [Mon Nov 2 06:35:57 2015] [] ? schedule+0x2b/0x80 > [Mon Nov 2 06:35:57 2015] [] ? > wait_current_trans.isra.21+0x8f/0xf0 [btrfs] > [Mon Nov 2 06:35:57 2015] [] ? wait_woken+0x80/0x80 > [Mon Nov 2 06:35:57 2015] [] ? start_transaction+0x3d0/0x5d0 > [btrfs] > [Mon Nov 2 06:35:57 2015] [] ? > btrfs_delalloc_reserve_metadata+0x32d/0x580 [btrfs] > [Mon Nov 2 06:35:57 2015] [] ? btrfs_dirty_inode+0xb0/0xb0 [btrfs] > [Mon Nov 2 06:35:57 2015] [] ? btrfs_join_transaction+0x23/0x30 > [btrfs] > [Mon Nov 2 06:35:57 2015] [] ? btrfs_dirty_inode+0x39/0xb0 [btrfs] > [Mon Nov 2 06:35:57 2015] [] ? btrfs_dirty_inode+0xb0/0xb0 [btrfs] > [Mon Nov 2 06:35:57 2015] [] ? file_update_time+0x7e/0xc0 > [Mon Nov 2 06:35:57 2015] [] ? btrfs_page_mkwrite+0x80/0x3c0 > [btrfs] > [Mon Nov 2 06:35:57 2015] [] ? hrtimer_cancel+0x19/0x20 > [Mon Nov 2 06:35:57 2015] [] ? futex_wait+0x1e1/0x270 > [Mon Nov 2 06:35:57 2015] [] ? do_page_mkwrite+0x38/0x90 > [Mon Nov 2 06:35:57 2015] [] ? 
do_wp_page+0x2e2/0x6d0 > [Mon Nov 2 06:35:57 2015] [] ? futex_wake+0x71/0x140 > [Mon Nov 2 06:35:57 2015] [] ? kmap_atomic_prot+0xe7/0x110 > [Mon Nov 2 06:35:57 2015] [] ? handle_mm_fault+0xd59/0x14d0 > [Mon Nov 2 06:35:57 2015] [] ? __do_page_fault+0x18c/0x480 > [Mon Nov 2 06:35:57 2015] [] ? __do_page_fault+0x480/0x480 > [Mon Nov 2 06:35:57 2015] [] ? error_code+0x67/0x6c > [Mon Nov 2 06:35:57 2015] INFO: task nscd:864 blocked for more than 120 > seconds. > [Mon Nov 2 06:35:57 2015] Not tainted 4.2.0-0.bpo.1-686-pae #1 > [Mon Nov 2 06:35:57 2015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [Mon Nov 2 06:35:57 2015] nscdD f1c87f5c 0 864 1 > 0x > [Mon Nov 2 06:35:57 2015] f1c87ef4 00200082 f1c87f80 f1c87f5c 03e7 > f1c87ee4 f3aec740 ac76c560 > [Mon Nov 2 06:35:57 2015] f74ea800 f79e3f40 f3c7b040 f1c88000 f3c7b040 > 0001 f1c87f00 c14f1fdb > [Mon Nov 2 06:35:57 2015] f3aec77c f1c87f38 c14f4265 f1c87f1c f3aec780 > f3aec788 0125 > [Mon Nov 2 06:35:57 2015] Call Trace: > [Mon Nov 2 06:35:57 2015] [] ? schedule+0x2b/0x80 > [Mon Nov 2 06:35:57 2015] [] ? rwsem_down_write_failed+0x185/0x280 > [Mon Nov 2 06:35:57 2015] [] ? > call_rwsem_down_write_failed+0x6/0x8 > [Mon Nov 2 06:35:57 2015] [] ? down_write+0x25/0x40 > [Mon Nov 2 06:35:57 2015] [] ? vm_mmap_pgoff+0x4a/0xa0 > [Mon Nov 2 06:35:57 2015] [] ? SyS_fstat64+0x28/0x30 > [Mon Nov 2 06:35:57 2015] [] ? SyS_mmap_pgoff+0x110/0x210 > [Mon Nov 2 06:35:57 2015] [] ? sysenter_do_call+0x12/0x12 > [Mon Nov 2 06:35:57 2015] INFO: task nmbd:1330 blocked for more than 120 > seconds. > [Mon Nov 2 06:35:57 2015] Not tainted 4.2.0-0.bpo.1-686-pae #1 > [Mon Nov 2 06:35:57 2015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. 
> [Mon Nov 2 06:35:57 2015] nmbdD 0 1330 1 > 0x > [Mon Nov 2 06:35:57 2015] ef44bd74 00200086 > f3984900 > [Mon Nov 2 06:35:57 2015] f69e1800 f79e3f40 f3a7a800 ef44c000 d17255a0 > d17255a0 ef44bd80 c14f1fdb > [Mon Nov 2 06:35:57 2015] d1725600 ef44bdc8 f86961b5 000d3fff > 1000 000d3000 > [Mon Nov 2 06:35:57 2015] Call Trace: > [Mon Nov 2 06:35:57 2015] [] ? schedule+0x2b/0x80 > [Mon Nov 2 06:35:57 2015] [] ? > btrfs_start_ordered_extent+0xd5/0x100 [btrfs] > [Mon Nov 2 06:35:57 2015] [] ? wait_woken+0x80/0x80 > [Mon Nov 2 06:35:57 2015] [] ? > lock_and_cleanup_extent_if_need+0x134/0x260 [btrfs] > [Mon Nov 2 06:35:57 2015] [] ? prepare_pages+0xc6/0x150 [btrfs] > [Mon Nov 2 06:35:57 2015] [] ? __btrfs_buffered_write+0x17a/0x5e0 > [btrfs] > [Mon Nov 2 06:35:57 2015] [] ? __alloc_pages_nodemask+0x133/0x880 > [Mon Nov 2 06:35:57 2015] [] ? btrfs_file_write_iter+0x1e5/0x550 > [btrfs] > [Mon Nov 2 06:35:5
[PATCH] Btrfs: fix hole punching when using the no-holes feature
From: Filipe Manana

When we are using the no-holes feature, if we punch a hole into a file
range that already contains a hole which overlaps the range we are passing
to fallocate(), we end up removing the extent map that represents the
existing hole without adding a new one. This happens because with the
no-holes feature we do not have explicit extent items to represent holes
and therefore the call to __btrfs_drop_extents(), made from
btrfs_punch_hole(), returns an end offset to the variable drop_end that
is smaller than the end of the range passed to fallocate(), while it
drops all existing extent maps in that range.

Normally having a missing extent map is not a problem, for example for
a readpages() operation we just end up building the extent map by
looking at the fs/subvol tree for a matching extent item (or a lack of
one for implicit holes). However for an fsync that uses the fast path,
which needs to look at the list of modified extent maps, this means
the fsync will not record information about the complete hole we had
before the fallocate() call into the log tree, resulting in a file with
content/layout that matches neither what we had before nor what we had
after the hole punch operation.

The following test case for fstests reproduces the issue. It fails
without this change because we get a file with a different digest after
the fsync log replay and also with a different extent/hole layout.

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"
  tmp=/tmp/$$
  status=1	# failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
     _cleanup_flakey
     rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter
  . ./common/punch
  . ./common/dmflakey

  # real QA test starts here
  _need_to_be_root
  _supported_fs generic
  _supported_os Linux
  _require_scratch
  _require_xfs_io_command "fpunch"
  _require_xfs_io_command "fiemap"
  _require_dm_target flakey
  _require_metadata_journaling $SCRATCH_DEV

  # This test was motivated by an issue found in btrfs when the btrfs
  # no-holes feature is enabled (introduced in kernel 3.14). So enable
  # the feature if the fs being tested is btrfs.
  if [ $FSTYP == "btrfs" ]; then
      _require_btrfs_fs_feature "no_holes"
      _require_btrfs_mkfs_feature "no-holes"
      MKFS_OPTIONS="$MKFS_OPTIONS -O no-holes"
  fi

  rm -f $seqres.full

  _scratch_mkfs >>$seqres.full 2>&1
  _init_flakey
  _mount_flakey

  # Create our test file with some data and then fsync it.
  # We do the fsync only to make sure the last fsync we do in this test
  # triggers the fast code path of btrfs' fsync implementation, a
  # condition necessary to trigger the bug btrfs had.
  $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 128K" \
                  -c "fsync" \
                  $SCRATCH_MNT/foobar | _filter_xfs_io

  # Now punch a hole against the range [96K, 128K[.
  $XFS_IO_PROG -c "fpunch 96K 32K" $SCRATCH_MNT/foobar

  # Punch another hole against a range that overlaps the previous range
  # and ends beyond eof.
  $XFS_IO_PROG -c "fpunch 64K 128K" $SCRATCH_MNT/foobar

  # Punch another hole against a range that overlaps the first range
  # ([96K, 128K[) and ends at eof.
  $XFS_IO_PROG -c "fpunch 32K 96K" $SCRATCH_MNT/foobar

  # Fsync our file. We want to verify that, after a power failure and
  # mounting the filesystem again, the file content reflects all the hole
  # punch operations.
  $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar

  echo "File digest before power failure:"
  md5sum $SCRATCH_MNT/foobar | _filter_scratch

  echo "Fiemap before power failure:"
  $XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foobar | _filter_fiemap

  # Silently drop all writes and umount to simulate a crash/power failure.
  _load_flakey_table $FLAKEY_DROP_WRITES
  _unmount_flakey

  # Allow writes again, mount to trigger log replay and validate file
  # contents.
  _load_flakey_table $FLAKEY_ALLOW_WRITES
  _mount_flakey

  echo "File digest after log replay:"
  # Must match the same digest we got before the power failure.
  md5sum $SCRATCH_MNT/foobar | _filter_scratch

  echo "Fiemap after log replay:"
  # Must match the same extent listing we got before the power failure.
  $XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foobar | _filter_fiemap

  _unmount_flakey

  status=0
  exit

Signed-off-by: Filipe Manana
---
 fs/btrfs/file.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 381be79..0c48d94 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2489,6 +2489,19 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
 	trans->block_rsv = &root->fs_info->trans_block_rsv;
 	/*
+	 * If we are using the NO_HOLES feature we might have had already an
+	 * hole that overlaps a part of the region [lockstart, lockend] and
+	 * ends at (or beyond) lockend. Since we have no file extent
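To make the expected result of the three fpunch calls explicit, the layout can be worked out with a small model. This is a sketch under assumptions rather than anything taken from the test's golden output: it tracks the 128K file in 4K blocks (an assumed granularity), and `punch()` and `replay_test_punches()` are hypothetical helpers. It also assumes that a punch reaching past EOF is clipped at EOF, since hole punching does not change the file size.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KIB		1024ULL
#define BLOCK_SIZE	(4 * KIB)	/* assumed granularity of the model */
#define FILE_SIZE	(128 * KIB)	/* size written by the pwrite step */
#define NBLOCKS		(FILE_SIZE / BLOCK_SIZE)

/* Mark [offset, offset + len) as a hole, clipping the range at EOF. */
static void punch(unsigned char *is_data, uint64_t nblocks,
		  uint64_t offset, uint64_t len)
{
	uint64_t b, end;

	assert(offset % BLOCK_SIZE == 0 && len % BLOCK_SIZE == 0);
	end = (offset + len) / BLOCK_SIZE;
	if (end > nblocks)
		end = nblocks;
	for (b = offset / BLOCK_SIZE; b < end; b++)
		is_data[b] = 0;
}

/*
 * Replay the three fpunch calls from the test on a fully written file
 * and return the number of bytes of data left at the start of the file.
 */
static uint64_t replay_test_punches(unsigned char *is_data)
{
	uint64_t b = 0;

	memset(is_data, 1, NBLOCKS);
	punch(is_data, NBLOCKS, 96 * KIB, 32 * KIB);	/* [96K, 128K[ */
	punch(is_data, NBLOCKS, 64 * KIB, 128 * KIB);	/* ends beyond eof */
	punch(is_data, NBLOCKS, 32 * KIB, 96 * KIB);	/* ends at eof */
	while (b < NBLOCKS && is_data[b])
		b++;
	return b * BLOCK_SIZE;
}
```

In this model the file ends up with data in [0, 32K[ and a single hole covering [32K, 128K[, which is the layout the fiemap output should report both before the simulated power failure and after log replay.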
[PATCH] fstests: generic test for fsync after hole punching
From: Filipe Manana Test that a file fsync works after punching a hole for the same file range multiple times, and that after log/journal replay the file's content and layout are correct. This test is motivated by a bug found in btrfs, which is fixed by the following linux kernel patch: "Btrfs: fix hole punching when using the no-holes feature" Signed-off-by: Filipe Manana --- tests/generic/110 | 123 ++ tests/generic/110.out | 13 ++ tests/generic/group | 1 + 3 files changed, 137 insertions(+) create mode 100755 tests/generic/110 create mode 100644 tests/generic/110.out diff --git a/tests/generic/110 b/tests/generic/110 new file mode 100755 index 000..1e3daac --- /dev/null +++ b/tests/generic/110 @@ -0,0 +1,123 @@ +#! /bin/bash +# FSQA Test No. 110 +# +# Test that a file fsync works after punching a hole for the same file range +# multiple times and that after log/journal replay the file's content is +# correct. +# +# This test is motivated by a bug found in btrfs. +# +#--- +# +# Copyright (C) 2015 SUSE Linux Products GmbH. All Rights Reserved. +# Author: Filipe Manana +# +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License as +# published by the Free Software Foundation. +# +# This program is distributed in the hope that it would be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write the Free Software Foundation, +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +#--- +# + +seq=`basename $0` +seqres=$RESULT_DIR/$seq +echo "QA output created by $seq" +tmp=/tmp/$$ +status=1 # failure is the default! 
+trap "_cleanup; exit \$status" 0 1 2 3 15 + +_cleanup() +{ + _cleanup_flakey + cd / + rm -f $tmp.* +} + +# get standard environment, filters and checks +. ./common/rc +. ./common/filter +. ./common/punch +. ./common/dmflakey + +# real QA test starts here +_need_to_be_root +_supported_fs generic +_supported_os Linux +_require_scratch +_require_xfs_io_command "fpunch" +_require_xfs_io_command "fiemap" +_require_dm_target flakey +_require_metadata_journaling $SCRATCH_DEV + +# This test was motivated by an issue found in btrfs when the btrfs no-holes +# feature is enabled (introduced in kernel 3.14). So enable the feature if the +# fs being tested is btrfs. +if [ $FSTYP == "btrfs" ]; then + _require_btrfs_fs_feature "no_holes" + _require_btrfs_mkfs_feature "no-holes" + MKFS_OPTIONS="$MKFS_OPTIONS -O no-holes" +fi + +rm -f $seqres.full + +_scratch_mkfs >>$seqres.full 2>&1 +_init_flakey +_mount_flakey + +# Create out test file with some data and then fsync it. +# We do the fsync only to make sure the last fsync we do in this test triggers +# the fast code path of btrfs' fsync implementation, a condition necessary to +# trigger the bug btrfs had. +$XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 128K" \ + -c "fsync" \ + $SCRATCH_MNT/foobar | _filter_xfs_io + +# Now punch a hole against the range [96K, 128K[. +$XFS_IO_PROG -c "fpunch 96K 32K" $SCRATCH_MNT/foobar + +# Punch another hole against a range that overlaps the previous range and ends +# beyond eof. +$XFS_IO_PROG -c "fpunch 64K 128K" $SCRATCH_MNT/foobar + +# Punch another hole against a range that overlaps the first range ([96K, 128K[) +# and ends at eof. +$XFS_IO_PROG -c "fpunch 32K 96K" $SCRATCH_MNT/foobar + +# Fsync our file. We want to verify that, after a power failure and mounting the +# filesystem again, the file content reflects all the hole punch operations. 
+$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar + +echo "File digest before power failure:" +md5sum $SCRATCH_MNT/foobar | _filter_scratch + +echo "Fiemap before power failure:" +$XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foobar | _filter_fiemap + +# Silently drop all writes and unmount to simulate a crash/power failure. +_load_flakey_table $FLAKEY_DROP_WRITES +_unmount_flakey + +# Allow writes again, mount to trigger log replay and validate file contents. +_load_flakey_table $FLAKEY_ALLOW_WRITES +_mount_flakey + +echo "File digest after log replay:" +# Must match the same digest we got before the power failure. +md5sum $SCRATCH_MNT/foobar | _filter_scratch + +echo "Fiemap after log replay:" +# Must match the same extent listing we got before the power failure. +$XFS_IO_PROG -c "fiemap -v" $SCRATCH_MNT/foobar | _filter_fiemap + +_unmount_flakey + +status=0 +exit diff --git a/tests/generic/110.out b/tests/generic/110.out new file mode 100644 index 000..ba016c8 --- /dev/null +++
Re: trying to balance, filesystem keeps going read-only.
On 2015-11-01 09:33, Ken Long wrote:
> I get a similar read-only status when I try to remove the drive from the
> array..
>
> Too bad the utility's function can not be slowed down.. to avoid
> triggering this error... ?
>
Actually, there are a couple of ways you could do this. The most reliable
way to do it (and arguably the only correct way) is to use the blkio cgroup
to put bandwidth or IOPS limits on the process. For authoritative info about
how to do this, check Documentation/cgroups/blkio-controller.txt in the
Linux source tree.

If the issue really is the device not responding soon enough, you may also
try increasing the device timeout the kernel uses. A udev rule like the
following will increase the timeout for all ATA/SCSI/USB devices to 150
seconds (2.5 minutes, which is reasonable for most non-enterprise devices).
The rule says SCSI devices, but all ATA and USB devices get routed through
the SCSI subsystem anyway unless you're using really old and deprecated
drivers:

    DRIVER=="sd", SUBSYSTEM=="scsi", ENV{DEVTYPE}=="scsi_device", ATTR{timeout}="150"
Re: [PATCH] btrfs-progs: show-super: Add option to print superblock at given bytenr
On Mon, Nov 02, 2015 at 04:34:19PM +0800, Qu Wenruo wrote:
> Add '-s ' option to show superblock at given bytenr.
>
> This is very useful to debug non-standard btrfs, like debuging the
> 1st stage btrfs of btrfs-convert.
>
> Signed-off-by: Qu Wenruo

Applied, thanks.
4.2.5 forced read-only -ENOSPC w/ free space
During an rsync, 20TB unallocated space. Currently, no snapshots.

Should I try 4.1.12, or 4.3?

dmesg:
[122014.436612] BTRFS: error (device sde) in btrfs_run_delayed_refs:2781: errno=-28 No space left
[122014.436615] BTRFS info (device sde): forced readonly
[122014.436624] BTRFS: error (device sde) in btrfs_run_delayed_refs:2781: errno=-28 No space left
[122014.436725] WARNING: CPU: 13 PID: 8025 at fs/btrfs/extent-tree.c:2781 btrfs_run_delayed_refs+0x97/0x195 [btrfs]()
[122014.436741] BTRFS: error (device sde) in __btrfs_prealloc_file_range:9636: errno=-28 No space left
[122014.436772] BTRFS: error (device sde) in btrfs_start_dirty_block_groups:3461: errno=-28 No space left
[122014.436777] BTRFS warning (device sde): Skipping commit of aborted transaction.
[122014.436780] BTRFS: error (device sde) in cleanup_transaction:1710: errno=-5 IO failure
[122014.436959] BTRFS: Transaction aborted (error -28)
[122014.436961] Modules linked in: ipmi_si mpt2sas raid_class scsi_transport_sas dell_rbu nfsv3 nfsv4 nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc ext4 crc16 jbd2 ext2 coretemp joydev crct10dif_pclmul sha256_generic psmouse serio_raw hmac drbg aesni_intel iTCO_wdt ipmi_devintf iTCO_vendor_support dcdbas evdev aes_x86_64 glue_helper lrw gf128mul ablk_helper pcspkr cryptd lpc_ich mfd_core i7core_edac edac_core ipmi_msghandler acpi_power_meter button processor thermal_sys loop ext3 mbcache jbd btrfs xor raid6_pq hid_generic usbhid hid sg sd_mod crc32c_intel uhci_hcd ehci_pci ehci_hcd megaraid_sas ixgbe mdio ptp usbcore pps_core usb_common scsi_mod bnx2 [last unloaded: ipmi_si]
[122014.437405] CPU: 13 PID: 8025 Comm: kworker/u66:13 Tainted: G I 4.2.5 #1
[122014.437519] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
[122014.437552] 0009 813af77a 880006ab7d08
[122014.437606] 810421bb 3dac a01ee526 880342782f30
[122014.437660] ffe4 880100ea33b0 8803218ae800 880342782e08
[122014.437714] Call Trace:
[122014.437743] [] ? dump_stack+0x40/0x50
[122014.437773] [] ? warn_slowpath_common+0x98/0xb0
[122014.437817] [] ? btrfs_run_delayed_refs+0x97/0x195 [btrfs]
[122014.437863] [] ? warn_slowpath_fmt+0x45/0x4a
[122014.437906] [] ? btrfs_run_delayed_refs+0x97/0x195 [btrfs]
[122014.437965] [] ? delayed_ref_async_start+0x33/0x71 [btrfs]
[122014.438029] [] ? normal_work_helper+0xc3/0x1fa [btrfs]
[122014.438063] [] ? process_one_work+0x159/0x286
[122014.438093] [] ? worker_thread+0x1d9/0x280
[122014.438123] [] ? rescuer_thread+0x27a/0x27a
[122014.438152] [] ? kthread+0xab/0xb3
[122014.438180] [] ? kthread_parkme+0x16/0x16
[122014.438211] [] ? ret_from_fork+0x3f/0x70
[122014.438240] [] ? kthread_parkme+0x16/0x16
[122014.438268] ---[ end trace 1c8deab18b734f90 ]---
[122014.438296] BTRFS: error (device sde) in btrfs_run_delayed_refs:2781: errno=-28 No space left

btrfs file usage /mirror
Overall:
    Device size:                 140.07TiB
    Device allocated:            119.96TiB
    Device unallocated:           20.11TiB
    Device missing:                  0.00B
    Used:                        117.54TiB
    Free (estimated):             22.53TiB      (min: 12.47TiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB      (used: 0.00B)

Data,single: Size:119.66TiB, Used:117.24TiB
   /dev/sdb       24.91TiB
   /dev/sdc       24.91TiB
   /dev/sdd       34.92TiB
   /dev/sde       34.92TiB

Metadata,RAID10: Size:151.00GiB, Used:149.88GiB
   /dev/sdb       37.75GiB
   /dev/sdc       37.75GiB
   /dev/sdd       37.75GiB
   /dev/sde       37.75GiB

System,RAID10: Size:64.00MiB, Used:15.75MiB
   /dev/sdb       16.00MiB
   /dev/sdc       16.00MiB
   /dev/sdd       16.00MiB
   /dev/sde       16.00MiB

Unallocated:
   /dev/sdb        5.06TiB
   /dev/sdc        5.06TiB
   /dev/sdd        5.06TiB
   /dev/sde        5.06TiB
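As a reading aid for the usage summary, the "Free (estimated)" and "min" figures are consistent with simple arithmetic on the other numbers in the report. The sketch below is only an observation about these figures, not a statement of how btrfs-progs computes them: the struct and function names are hypothetical, sizes are kept in hundredths of a TiB to avoid floating point, and halving the unallocated space for the "min" value is an assumption chosen because it reproduces the reported 12.47TiB.

```c
#include <assert.h>

/* All sizes are in hundredths of a TiB, copied from the usage output. */
struct usage_summary {
	long unallocated;	/* "Device unallocated": 20.11TiB -> 2011 */
	long data_size;		/* "Data,single: Size": 119.66TiB -> 11966 */
	long data_used;		/* "Data,single: Used": 117.24TiB -> 11724 */
};

/* Space still free inside already allocated data chunks. */
static long data_slack(const struct usage_summary *u)
{
	assert(u->data_size >= u->data_used);
	return u->data_size - u->data_used;
}

/* Slack plus all unallocated space, for a data ratio of 1.00. */
static long free_estimated(const struct usage_summary *u)
{
	return data_slack(u) + u->unallocated;
}

/* Assumption: the "min" figure counts only half of the unallocated space. */
static long free_min(const struct usage_summary *u)
{
	return data_slack(u) + u->unallocated / 2;
}
```

With the reported values this gives 2.42TiB of slack in the data chunks, 22.53TiB estimated free and 12.47TiB minimum. The same output also shows the metadata chunks at 149.88GiB used out of 151.00GiB.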
Btrfs progs pre-release 4.3-rc1
Hi,

the kernel 4.3 was released yesterday, the btrfs-progs will follow at the end
of this week. I've tagged an rc1 from the current devel branch.

There are lots of small invisible changes and one change in the defaults:

* mkfs: mixed mode is not forced anymore for devices smaller than 1 GiB

I've updated manual pages for mkfs, balance, btrfstune, convert and inspect,
I'd be glad if somebody could proofread them. Otherwise bugfixes and
small-sized patches are still welcome.

Tarballs: https://www.kernel.org/pub/linux/kernel/people/kdave/btrfs-progs/
Git: git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git
Re: [PATCH V2] Btrfs: find_free_extent: Do not erroneously skip LOOP_CACHING_WAIT state
On 11/02/2015 03:29 AM, Chandan Rajendra wrote:
> When executing generic/001 in a loop on a ppc64 machine (with both sectorsize
> and nodesize set to 64k), the following call trace is observed,

[...]
Re: [PATCH V2] Btrfs: find_free_extent: Do not erroneously skip LOOP_CACHING_WAIT state
On Mon, Nov 02, 2015 at 01:59:46PM +0530, Chandan Rajendra wrote:
> When executing generic/001 in a loop on a ppc64 machine (with both sectorsize
> and nodesize set to 64k), the following call trace is observed,

Thanks Chandan, I hit this same trace on x86-64 with 16K nodes.

-chris
Re: random i/o error without error in dmesg
On 2015-10-28 09:44, Szalma László wrote:
> Ok, I had a chance to try some things.
>
> 1.: the error
> md5sum xyz
> md5sum: xyz: Input/output error
> (no errors in dmesg)
>
> 2.:
> mount -o remount,ro /mnt/x
> (could not do, it is used)
> mysql stop && mount -o remount,ro /mnt/x
> problem persists: io error.
> mount -o remount,rw /mnt/x
> still io error
> umount /mnt/x
> mount /mnt/x
> NO io error, md5sum works!
>
> The umount/mount ALWAYS solved the problem for me, mount -o remount,ro
> was tried for the first time, but it was not enough. Reboot was not
> needed. (kernel 4.2.4)
>
> László Szalma

Unfortunately the problem with kernel 4.3.0 still exists.

László Szalma
Re: [PATCH/RFC] make btrfs subvol mounts appear in /proc/mounts
On Wed, Oct 28, 2015 at 07:25:10AM +0900, Neil Brown wrote:
>
> If you create a subvolume in btrfs and access it (by name) without
> mounting it, then the subvolume looks like a separate mount to some
> extent, returning a different st_dev to stat(), but it doesn't look like
> a separate mount in that it isn't listed in /proc/mounts. This
> inconsistency can confuse tools.
>
> This patch causes these subvolumes to become separate mounts by using
> the VFS' automount functionality, much like NFS uses automount when it
> discovered mountpoints on the server.
>
> The VFS currently makes it impossible to auto-mount a directory on to itself
> (i.e. a bind mount). For NFS this isn't a problem as a new superblock
> is created for the child filesystem so there are two separate dentries
> (and inodes) for the one directory: one in the parent filesystem, one in
> the child (note that the two superblocks share a common connection to
> the server so there is still a lot of commonality).
>
> BTRFS has chosen instead to use a single superblock for all subvolumes.

Naive question: was there a reason for that choice?

--b.

> This results in a single dentry for the subvol-root. A dentry which
> must be auto-mounted on itself.
Re: [PATCH/RFC] make btrfs subvol mounts appear in /proc/mounts
On Mon, Nov 02, 2015 at 03:50:12PM -0500, J. Bruce Fields wrote:
> On Wed, Oct 28, 2015 at 07:25:10AM +0900, Neil Brown wrote:
> >
> > If you create a subvolume in btrfs and access it (by name) without
> > mounting it, then the subvolume looks like a separate mount to some
> > extent, returning a different st_dev to stat(), but it doesn't look like
> > a separate mount in that it isn't listed in /proc/mounts. This
> > inconsistency can confuse tools.
> >
> > This patch causes these subvolumes to become separate mounts by using
> > the VFS' automount functionality, much like NFS uses automount when it
> > discovered mountpoints on the server.
> >
> > The VFS currently makes it impossible to auto-mount a directory on to itself
> > (i.e. a bind mount). For NFS this isn't a problem as a new superblock
> > is created for the child filesystem so there are two separate dentries
> > (and inodes) for the one directory: one in the parent filesystem, one in
> > the child (note that the two superblocks share a common connection to
> > the server so there is still a lot of commonality).
> >
> > BTRFS has chosen instead to use a single superblock for all subvolumes.
>
> Naive question: was there a reason for that choice?

They are really all part of the same FS, the single super better fits.
Or said another way, it felt like there would be dramatically more duct
tape around supers-per-subvolume than there was abusing st_dev.

Neil's patch came up after I told him a few of us had tried to do the
same thing and failed to find clean vfs changes to make it possible...he
took it as a challenge. Now I have to remember what it was about our
past attempts that I didn't like.

I'll test this and queue for 4.5 if it all works out, thanks Neil!

-chris
[PATCH V2] btrfs: Print Warning only if ENOSPC_DEBUG is enabled
Don't call WARN_ON for ENOSPC error unless ENOSPC_DEBUG is enabled.

Signed-off-by: Ashish Samant
---
 fs/btrfs/delayed-inode.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index a2ae427..b86cfd9 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -652,9 +652,13 @@ static int btrfs_delayed_inode_reserve_metadata(
 			goto out;
 
 		ret = btrfs_block_rsv_migrate(src_rsv, dst_rsv, num_bytes);
-		if (!WARN_ON(ret))
+		if (!ret)
 			goto out;
 
+		if (btrfs_test_opt(root, ENOSPC_DEBUG))
+			WARN(1, KERN_DEBUG
+				"btrfs: block rsv migrate returned %d\n", ret);
+
 		/*
 		 * Ok this is a problem, let's just steal from the global rsv
 		 * since this really shouldn't happen that often.
-- 
1.8.3.2
Re: [PATCH] fstests: generic test for fsync after hole punching
On Mon, Nov 02, 2015 at 12:32:57PM +, fdman...@kernel.org wrote:
> From: Filipe Manana
>
> Test that a file fsync works after punching a hole for the same file
> range multiple times, and that after log/journal replay the file's
> content and layout are correct.
>
> This test is motivated by a bug found in btrfs, which is fixed by
> the following linux kernel patch:
>
> "Btrfs: fix hole punching when using the no-holes feature"

> +# This test was motivated by an issue found in btrfs when the btrfs no-holes
> +# feature is enabled (introduced in kernel 3.14). So enable the feature if the
> +# fs being tested is btrfs.
> +if [ $FSTYP == "btrfs" ]; then
> +	_require_btrfs_fs_feature "no_holes"
> +	_require_btrfs_mkfs_feature "no-holes"
> +	MKFS_OPTIONS="$MKFS_OPTIONS -O no-holes"
> +fi

This sort of transparent filesystem option should be tested by
executing the entire test suite with it enabled:

# MKFS_OPTIONS="-O no-holes" ./check -g auto

rather than only enabling for just this test.

> +# Silently drop all writes and unmount to simulate a crash/power failure.
> +_load_flakey_table $FLAKEY_DROP_WRITES
> +_unmount_flakey
> +
> +# Allow writes again, mount to trigger log replay and validate file contents.
> +_load_flakey_table $FLAKEY_ALLOW_WRITES
> +_mount_flakey

This is repeated often enough across many tests that a helper like:

# Silently drop all writes and unmount/remount to simulate a
# crash/power failure.
_flakey_drop_and_remount()
{
	_load_flakey_table $FLAKEY_DROP_WRITES
	_unmount_flakey

	_load_flakey_table $FLAKEY_ALLOW_WRITES
	_mount_flakey
}

is appropriate. Doesn't need to be in this patch, though.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
Re: Btrfs progs pre-release 4.3-rc1
David Sterba posted on Mon, 02 Nov 2015 16:14:53 +0100 as excerpted:

> the kernel 4.3 was released yesterday, the btrfs-progs will follow at
> the end of this week. I've tagged an rc1 from current devel branch.
> There are a lots of small invisible changes and one change in the
> defaults:
>
> * mkfs: mixed mode is not forced anymore for devices smaller than 1 GiB

It says one change in the /defaults/, but then it says mixed mode isn't
/forced/ anymore under a GiB.

Which is it, a change in the /defaults/, under a gig now defaults to
separate data/metadata, or same /defaults/, but now there's a way to
overrule them and do separate data/metadata under a gig, so while mixed
remains the default, it's no longer /forced/?

If the /defaults/ changed, is mixed mode still /recommended/ for small
filesystems?

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
Re: [PATCH/RFC] make btrfs subvol mounts appear in /proc/mounts
On Tue, Nov 03 2015, Chris Mason wrote:

> On Mon, Nov 02, 2015 at 03:50:12PM -0500, J. Bruce Fields wrote:
>> On Wed, Oct 28, 2015 at 07:25:10AM +0900, Neil Brown wrote:
>> >
>> > If you create a subvolume in btrfs and access it (by name) without
>> > mounting it, then the subvolume looks like a separate mount to some
>> > extent, returning a different st_dev to stat(), but it doesn't look like
>> > a separate mount in that it isn't listed in /proc/mounts. This
>> > inconsistency can confuse tools.
>> >
>> > This patch causes these subvolumes to become separate mounts by using
>> > the VFS' automount functionality, much like NFS uses automount when it
>> > discovered mountpoints on the server.
>> >
>> > The VFS currently makes it impossible to auto-mount a directory on to itself
>> > (i.e. a bind mount). For NFS this isn't a problem as a new superblock
>> > is created for the child filesystem so there are two separate dentries
>> > (and inodes) for the one directory: one in the parent filesystem, one in
>> > the child (note that the two superblocks share a common connection to
>> > the server so there is still a lot of commonality).
>> >
>> > BTRFS has chosen instead to use a single superblock for all subvolumes.
>>
>> Naive question: was there a reason for that choice?
>
> They are really all part of the same FS, the single super better fits.
> Or said another way, it felt like there would be dramatically more duct
> tape around supers-per-subvolume than there was abusing st_dev.
>
> Neil's patch came up after I told him a few of us had tried to do the
> same thing and failed to find clean vfs changes to make it possible...he
> took it as a challenge. Now I have to remember what it was about our
> past attempts that I didn't like.
>
> I'll test this and queue for 4.5 if it all works out, thanks Neil!

I'd rather resend with proper documentation updates and s-o-b before it
gets queued if that is OK.
So once you are happy, please let me know and I'll do it "properly".

Thanks,
NeilBrown
Re: "free_raid_bio" crash on RAID6
Hi

No, I never figured this out... After a while of waiting for answers I
just started over and took the data from my backup.

> Did you try removing the bad drive and did the system keep crashing anyway?
As you can see in my first mail the drive was already removed when
this error started to happen ("some devices missing"). ;)

Regards,
Tobias

2015-10-18 16:14 GMT+02:00 Philip Seeger:
> Hi Tobias
>
> On 07/20/2015 06:20 PM, Tobias Holst wrote:
>>
>> My btrfs-RAID6 seems to be broken again :(
>>
>> When reading from it I get several of these:
>> [ 176.349943] BTRFS info (device dm-4): csum failed ino 1287707
>> extent 21274957705216 csum 2830458701 wanted 426660650 mirror 2
>>
>> then followed by a "free_raid_bio"-crash:
>>
>> [ 176.349961] [ cut here ]
>> [ 176.349981] WARNING: CPU: 6 PID: 110 at
>> /home/kernel/COD/linux/fs/btrfs/raid56.c:831
>> __free_raid_bio+0xfc/0x130 [btrfs]()
>> ...
>
>
> It's been 3 months now, have you ever figured this out? Do you know if the
> bug has been identified and fixed or have you filed a bugzilla report?
>
>> One drive is broken, so at the moment it is mounted with "-O
>> defaults,ro,degraded,recovery,compress=lzo,space_cache,subvol=raid".
>
>
> Did you try removing the bad drive and did the system keep crashing anyway?
>
>
>
> Philip
[PATCH] btrfs-progs: print-tree: Output stripe dev uuid
Add output for dev uuid for print_chunk().

Quite useful to debug temporary btrfs in btrfs-convert.

Signed-off-by: Qu Wenruo
---
 print-tree.c | 8
 1 file changed, 8 insertions(+)

diff --git a/print-tree.c b/print-tree.c
index 7ddf400..4d4c3a2 100644
--- a/print-tree.c
+++ b/print-tree.c
@@ -231,9 +231,17 @@ void print_chunk(struct extent_buffer *eb, struct btrfs_chunk *chunk)
 	printf("\t\ttype %s num_stripes %d\n",
 		chunk_flags_str, num_stripes);
 	for (i = 0 ; i < num_stripes ; i++) {
+		unsigned char dev_uuid[BTRFS_UUID_SIZE];
+		char str_dev_uuid[BTRFS_UUID_UNPARSED_SIZE];
+
+		read_extent_buffer(eb, dev_uuid,
+			(unsigned long)btrfs_stripe_dev_uuid_nr(chunk, i),
+			BTRFS_UUID_SIZE);
+		uuid_unparse(dev_uuid, str_dev_uuid);
 		printf("\t\t\tstripe %d devid %llu offset %llu\n", i,
 		      (unsigned long long)btrfs_stripe_devid_nr(eb, chunk, i),
 		      (unsigned long long)btrfs_stripe_offset_nr(eb, chunk, i));
+		printf("\t\t\tdev uuid: %s\n", str_dev_uuid);
 	}
 }
 
-- 
2.6.2
[PATCH 1/2] btrfs-progs: mkfs: Round device size down to sectorsize
When running the following commands in a VM whose disks are created by
qemu-img create -f raw 11 2.6G:
 # mkfs.btrfs -f /dev/vdd /dev/vde /dev/vdf
 # btrfs-show-super /dev/vdd /dev/vde /dev/vdf | grep dev_item.total_bytes
 dev_item.total_bytes	2791727104
 dev_item.total_bytes	2791729152
 dev_item.total_bytes	2791729152

We can see that the first device's size is a little smaller than the
others, and this makes xfstests btrfs/011 fail.

Reason:
 The first device's size is rounded down to sectorsize in make_btrfs(),
 but the other devices' sizes are not.

Fix:
 Round down the remaining devices' sizes in btrfs_add_to_fsid().

Reported-by: Qu Wenruo
Signed-off-by: Zhao Lei
---
 utils.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/utils.c b/utils.c
index d17291a..b7752df 100644
--- a/utils.c
+++ b/utils.c
@@ -736,6 +736,8 @@ int btrfs_add_to_fsid(struct btrfs_trans_handle *trans,
 	u64 num_devs;
 	int ret;
 
+	device_total_bytes = (device_total_bytes / sectorsize) * sectorsize;
+
 	device = kzalloc(sizeof(*device), GFP_NOFS);
 	if (!device)
 		goto err_nomem;
-- 
1.8.5.1
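
The size difference reported above can be reproduced with the same integer arithmetic the patch adds. This is a minimal userspace sketch: the helper name round_down_to_sector() is made up for illustration, and the 4096-byte sectorsize is an assumption, since the message does not state which sectorsize was used. Under that assumption, the unrounded size 2791729152 rounds down to 2791727104, which matches the first device's total_bytes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round size down to the nearest multiple of sectorsize, using the same
 * divide-then-multiply form as the patch. Hypothetical helper name.
 */
uint64_t round_down_to_sector(uint64_t size, uint32_t sectorsize)
{
	assert(sectorsize != 0);	/* avoid division by zero */
	return (size / sectorsize) * sectorsize;
}
```

Calling round_down_to_sector(2791729152ULL, 4096) drops the trailing 2048 bytes; a size that is already sector-aligned is returned unchanged.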
[PATCH 2/2] btrfs-progs: Rename variables in btrfs_add_to_fsid
There are two total_bytes in btrfs_add_to_fsid(): the local variable
total_bytes means fs_total_bytes, and device->total_bytes means the
device's total_bytes.

The device's total_bytes passed as an argument is named block_count in
the current code.

This patch renames:
 total_bytes -> fs_total_bytes
 block_count -> device_total_bytes
to make the code more readable.

Signed-off-by: Zhao Lei
---
 utils.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/utils.c b/utils.c
index b7752df..999af43 100644
--- a/utils.c
+++ b/utils.c
@@ -724,7 +724,7 @@ static int zero_dev_clamped(int fd, off_t start, ssize_t len, u64 dev_size)
 
 int btrfs_add_to_fsid(struct btrfs_trans_handle *trans,
 		      struct btrfs_root *root, int fd, char *path,
-		      u64 block_count, u32 io_width, u32 io_align,
+		      u64 device_total_bytes, u32 io_width, u32 io_align,
 		      u32 sectorsize)
 {
 	struct btrfs_super_block *disk_super;
@@ -732,7 +732,7 @@ int btrfs_add_to_fsid(struct btrfs_trans_handle *trans,
 	struct btrfs_device *device;
 	struct btrfs_dev_item *dev_item;
 	char *buf = NULL;
-	u64 total_bytes;
+	u64 fs_total_bytes;
 	u64 num_devs;
 	int ret;
 
@@ -757,7 +757,7 @@ int btrfs_add_to_fsid(struct btrfs_trans_handle *trans,
 	device->sector_size = sectorsize;
 	device->fd = fd;
 	device->writeable = 1;
-	device->total_bytes = block_count;
+	device->total_bytes = device_total_bytes;
 	device->bytes_used = 0;
 	device->total_ios = 0;
 	device->dev_root = root->fs_info->dev_root;
@@ -768,8 +768,8 @@ int btrfs_add_to_fsid(struct btrfs_trans_handle *trans,
 	ret = btrfs_add_device(trans, root, device);
 	BUG_ON(ret);
 
-	total_bytes = btrfs_super_total_bytes(super) + block_count;
-	btrfs_set_super_total_bytes(super, total_bytes);
+	fs_total_bytes = btrfs_super_total_bytes(super) + device_total_bytes;
+	btrfs_set_super_total_bytes(super, fs_total_bytes);
 
 	num_devs = btrfs_super_num_devices(super) + 1;
 	btrfs_set_super_num_devices(super, num_devs);
-- 
1.8.5.1