Re: [PATCH] generic/311: Disable dmesg check

2017-02-22 Thread Anand Jain



Hi Chandan,


This bug is easily recreated when executing the test on Btrfs with
subpage-blocksize patchset applied. I haven't been able to test the recently
rebased subpage-blocksize patchset yet.

Coming back to the issue ... The problem exists because the test code uses
dm-flakey. Josef had suggested that using dm-log-writes instead of dm-flakey
should fix the problem. I will work on this and post a patch soon.


 generic/311 uses the drop_writes feature, so it appears that EIO
 shouldn't be reported:
_load_flakey_table $FLAKEY_DROP_WRITES $lockfs

 Looks like there is a regression in dm-flakey, which was fixed by:
 ---
 dm flakey: fix reads to be issued if drop_writes configured

 v4.8-rc3 commit 99f3c90d0d ("dm flakey: error READ bios during the
 down_interval") overlooked the 'drop_writes' feature, which is meant to
 allow reads to be issued rather than errored, during the down_interval.

 Fixes: 99f3c90d0d ("dm flakey: error READ bios during the
 down_interval")
 ---

 That also explains why I couldn't reproduce this at mainline.
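
The behaviour described in the quoted commit message can be modelled in a few lines of C. This is only a simplified sketch, not the dm-flakey source: the enum, struct, and function names are hypothetical, and it assumes that a target with no feature configured errors all I/O during the down interval (other features such as corrupt_bio_byte are ignored).

```c
#include <stdbool.h>

/* Hypothetical model of a flakey target's policy; not kernel code. */
enum bio_dir { BIO_READ, BIO_WRITE };
enum bio_action { BIO_PASS, BIO_DROP, BIO_ERROR };

struct flakey_model {
	bool in_down_interval;	/* device is currently in its "down" phase */
	bool drop_writes;	/* the drop_writes feature is configured */
};

/* Decide what happens to one bio under the model above. */
static enum bio_action flakey_model_map(const struct flakey_model *fm,
					enum bio_dir dir)
{
	if (!fm->in_down_interval)
		return BIO_PASS;

	/* With drop_writes, writes are silently dropped and reads still
	 * go through; erroring the reads is the regression. */
	if (fm->drop_writes)
		return dir == BIO_WRITE ? BIO_DROP : BIO_PASS;

	/* Assumption: without any feature, all I/O fails. */
	return BIO_ERROR;
}
```

Under this model a read during the down interval with drop_writes set maps to BIO_PASS, which is why a test that loads the table with $FLAKEY_DROP_WRITES should not see EIO on a kernel that has the fix.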


Thanks, Anand
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH/fstest] btrfs/139: correctly receive clones to mounted subvol

2017-02-22 Thread Benedikt Morbach
clone needs to resolve the paths of the involved subvolumes in the target
fs from their UUIDs. When doing so it might need to strip the prefix
that is mounted as the root of the fs from those paths.

It didn't do so correctly when processing the source of "clone" commands

This is a regression test for
btrfs-progs: receive: handle root subvol path in clone
---

fstest for the patch mentioned above (see also 
https://mail-archive.com/linux-btrfs@vger.kernel.org/msg61848.html)

I thought I'd send this to linux-btrfs first as I'm not sure what the policy is
wrt submitting fstests for bugs that aren't fixed upstream yet.

It's my first fstest and even though the other tests were helpful and it seems
to work, I still assume I've made a couple of mistakes here and there.

But it might be helpful to verify the issue the mentioned patch aims to solve.


 tests/btrfs/139 | 199 
 tests/btrfs/139.out |  16 +
 tests/btrfs/group   |   1 +
 3 files changed, 216 insertions(+)
 create mode 100755 tests/btrfs/139
 create mode 100644 tests/btrfs/139.out

diff --git a/tests/btrfs/139 b/tests/btrfs/139
new file mode 100755
index ..f814eef0
--- /dev/null
+++ b/tests/btrfs/139
@@ -0,0 +1,199 @@
+#! /bin/bash
+# FS QA Test No. btrfs/139
+#
+# Test that an incremental send operation works when in both snapshots there are
+# two directory inodes that have the same number but different generations and
+# have an entry with the same name that corresponds to different inodes in each
+# snapshot.
+#
+#---
+# Copyright (C) 2017 Benedikt Morbach. All Rights Reserved.
+# Author: Benedikt Morbach 
+# based on 'btrfs/135', which is:
+# Copyright (C) 2017 Synology Inc. All Rights Reserved.
+# Author: Robbie Ko 
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+cd /
+rm -fr $send_files_dir
+rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+_require_test
+_require_scratch
+_require_cloner
+_require_fssum
+
+send_files_dir=$TEST_DIR/btrfs-test-$seq
+
+rm -f $seqres.full
+rm -fr $send_files_dir
+mkdir $send_files_dir
+
+_scratch_mkfs >>$seqres.full 2>&1
+_scratch_mount
+
+BLOCK_SIZE=$(_get_block_size $SCRATCH_MNT)
+
+# create source fs
+
+$BTRFS_UTIL_PROG subvolume create $SCRATCH_MNT/foo | _filter_scratch
+$BTRFS_UTIL_PROG subvolume create $SCRATCH_MNT/bar | _filter_scratch
+$BTRFS_UTIL_PROG subvolume create $SCRATCH_MNT/baz | _filter_scratch
+$BTRFS_UTIL_PROG subvolume create $SCRATCH_MNT/snap | _filter_scratch
+
+$XFS_IO_PROG -s -f -c "pwrite -S 0xaa -b $((32 * $BLOCK_SIZE)) 0 $((32 * $BLOCK_SIZE))" \
+ $SCRATCH_MNT/foo/file_a | _filter_xfs_io_blocks_modified
+$XFS_IO_PROG -s -f -c "pwrite -S 0xbb -b $((32 * $BLOCK_SIZE)) 0 $((32 * $BLOCK_SIZE))" \
+ $SCRATCH_MNT/bar/file_b | _filter_xfs_io_blocks_modified
+
+$CLONER_PROG $SCRATCH_MNT/{foo/file_a,baz/file_a}
+$CLONER_PROG $SCRATCH_MNT/{bar/file_b,baz/file_b}
+
+# Filesystem looks like:
+#
+# .
+# |--- foo/
+# |   |--- file_a
+# |--- bar/
+# |   |--- file_b
+# |--- baz/
+# |   |--- file_a   (clone of "foo/file_a")
+# |   |--- file_b   (clone of "bar/file_b")
+# |--- snap/
+#
+
+# create snapshots and send streams
+
+$BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT/foo \
+$SCRATCH_MNT/snap/foo.0 > /dev/null
+
+$BTRFS_UTIL_PROG send $SCRATCH_MNT/snap/foo.0   \
+-f $send_files_dir/foo.0.snap   \
+2>&1 1>/dev/null | _filter_scratch
+
+$BTRFS_UTIL_PROG subvolume snapshot -r  \
+$SCRATCH_MNT/bar $SCRATCH_MNT/snap/bar.0\
+> /dev/null
+
+$BTRFS_UTIL_PROG send $SCRATCH_MNT/snap/bar.0   \
+-f $send_files_dir/bar.0.snap   \
+2>&1 1>/dev/null | _filter_scratch
+
+$CLONER_PROG $SCRATCH_MNT/foo/file_a{,.clone}
+rm $SCRATCH_MNT/foo/file_a
+
+$BTRFS_UTIL_PROG subvolume snapshot -r  \
+

[PATCH 1/2] btrfs-progs: receive: better error reporting for snapshots

2017-02-22 Thread Benedikt Morbach
Two fixes:

1)

Check that the parent subvol actually is reachable via our root path.
The previous code wouldn't catch

parent subvol: foo/bar
root path: bar   (i.e. mounted with -o subvol=bar)

where the parent isn't reachable from the root path.
(but the original "strstr(parent, root_path) == NULL" check still doesn't hold)

Also check for the slash after "root_path", i.e. throw an error on

parent subvol: foobar
root path: foo

2)

If the parent subvol is the one that is mounted we obviously can't
receive into it, as it has to be read-only by definition.

We'd get a rather cryptic:

At subvol /tmp/test/dest.snap
At snapshot dest.snap
ERROR: creating snapshot / -> dest.snap failed: Invalid cross-device link

(not sure what it says if "/" isn't even a btrfs)

But with this we get

At subvol /tmp/test/dest.snap
At snapshot dest.snap
ERROR: creating snapshot . -> dest.snap failed: Read-only file system

which is both more helpful and more correct.

Signed-off-by: Benedikt Morbach 
---
 cmds-receive.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/cmds-receive.c b/cmds-receive.c
index 166d37d..790218c 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -314,8 +314,8 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid,
sub_len = strlen(parent_subvol->path);
 
/* First make sure the parent subvol is actually in our path */
-   if (sub_len < root_len ||
-   strstr(parent_subvol->path, rctx->full_root_path) == NULL) {
+   if (strstr(parent_subvol->path, rctx->full_root_path) != parent_subvol->path ||
+   sub_len > root_len && parent_subvol->path[root_len] != '/') {
error(
"parent subvol is not reachable from inside the root subvol");
ret = -ENOENT;
@@ -323,7 +323,7 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid,
}
 
if (sub_len == root_len) {
-   parent_subvol->path[0] = '/';
+   parent_subvol->path[0] = '.';
parent_subvol->path[1] = '\0';
} else {
/*
-- 
2.11.1
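
The reachability rule from fix 1) can be tried out in isolation. The sketch below is not the code in cmds-receive.c: subvol_reachable() is a hypothetical helper name, and it uses strncmp() for the prefix test where the patch compares the result of strstr() against the start of the string. The intent is the same: the root path has to be a prefix of the parent path, and that prefix has to end on a path component boundary.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if subvol_path is root_path itself
 * or lies below it, i.e. root_path is a prefix that ends exactly at a
 * '/' or at the end of subvol_path.
 */
static bool subvol_reachable(const char *subvol_path, const char *root_path)
{
	size_t root_len = strlen(root_path);
	size_t sub_len = strlen(subvol_path);

	/* root_path must match at the very start, not somewhere inside */
	if (strncmp(subvol_path, root_path, root_len) != 0)
		return false;

	/* reject "foobar" for root "foo": the prefix must end a component */
	return sub_len == root_len || subvol_path[root_len] == '/';
}
```

With this helper, the two cases from the commit message ("foo/bar" with root "bar", and "foobar" with root "foo") are both rejected, while "bar/foo" with root "bar" and the exact match "foo" with root "foo" are accepted.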



[PATCH 2/2] btrfs-progs: receive: handle root subvol path in clone

2017-02-22 Thread Benedikt Morbach
testcase:
# ro subvol /src/parent
# rw subvol /src/foo
clone /src/parent/file /src/foo/file
subvol snapshot -r /src/foo /src/foo.snap

# generates a "clone parent/file -> foo.snap/file" send command
send -p /src/parent /src/foo.snap

# target fs:
#dest/
#|--- parent/...
# mounted with -o subvol=dest, such that "parent" is at /parent
receive 

result:
ERROR: cannot open dest/parent/file: No such file or directory

expected:
"dest/" gets stripped from the clone source path to get the actual
path in the target fs, if reachable from the mount point/chroot.

This is exactly what process_snapshot does, which gets called on
_every_ incremental receive and which I'm quite certain is correct in
doing so.

Signed-off-by: Benedikt Morbach 
---

Hi,

I first tried fixing this ages ago with [1], which was met with some scepticism.
While that patch wasn't 100% correct, I believe this one is, and as mentioned it
does exactly the same thing as process_snapshot, because that has the exact same
problem.

An fstest to reproduce this will follow shortly.

[1] https://patchwork.kernel.org/patch/9177155/

 cmds-receive.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/cmds-receive.c b/cmds-receive.c
index 790218c..01345a4 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -782,7 +782,24 @@ static int process_clone(const char *path, u64 offset, u64 len,
r->subvol_parent_name);
}
}*/
-   subvol_path = strdup(si->path);
+
+   /* strip the subvolume that we are receiving to from the start of subvol_path */
+   if (rctx->full_root_path) {
+   size_t root_len = strlen(rctx->full_root_path);
+   size_t sub_len = strlen(si->path);
+
+   if (sub_len > root_len &&
+   strstr(si->path, rctx->full_root_path) == si->path &&
+   si->path[root_len] == '/') {
+   subvol_path = strdup(si->path + root_len + 1);
+   } else {
+   error("clone: source subvol path %s unreachable from %s",
+   si->path, rctx->full_root_path);
+   goto out;
+   }
+   } else {
+   subvol_path = strdup(si->path);
+   }
}
 
ret = path_cat_out(full_clone_path, subvol_path, clone_path);
-- 
2.11.1
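
The prefix stripping that the patch adds to process_clone can be sketched as a standalone function. This is an illustration under assumptions rather than the patch itself: strip_root_prefix() is a hypothetical name, it uses strncmp() instead of comparing the strstr() result with the start of the string, and it returns a pointer into the input where the patch makes a copy with strdup().

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: if subvol_path starts with "root_path/", return
 * a pointer to the part after that prefix; otherwise return NULL to
 * signal that the subvolume is not reachable below root_path.
 */
static const char *strip_root_prefix(const char *subvol_path,
				     const char *root_path)
{
	size_t root_len = strlen(root_path);
	size_t sub_len = strlen(subvol_path);

	if (sub_len > root_len &&
	    strncmp(subvol_path, root_path, root_len) == 0 &&
	    subvol_path[root_len] == '/')
		return subvol_path + root_len + 1;

	return NULL;
}
```

For the testcase in the commit message, stripping the root "dest" from "dest/parent" yields "parent", which is the path under which the clone source is visible in a filesystem mounted with -o subvol=dest.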



Re: FS gives kernel UPS on attempt to create snapshot and after running balance it's unmountable.

2017-02-22 Thread Tomasz Kusmierz
Looking through the log for old messages, I can see that there were
earlier kernel problems with the extent tree while trying to create
snapshots:


Jan 23 05:00:02 server kernel: #011#011tree block backref root 7
Jan 23 05:00:02 server kernel: #011item 108 key (12288467451904 169 0)
itemoff 12677 itemsize 33
Jan 23 05:00:02 server kernel: #011#011extent refs 1 gen 144462 flags 2
Jan 23 05:00:02 server kernel: #011#011tree block backref root 7
Jan 23 05:00:02 server kernel: #011item 109 key (12288467468288 169 0)
itemoff 12644 itemsize 33
Jan 23 05:00:02 server kernel: #011#011extent refs 1 gen 144462 flags 2
Jan 23 05:00:02 server kernel: #011#011tree block backref root 7
Jan 23 05:00:02 server kernel: #011item 110 key (12288467484672 169 0)
itemoff 12611 itemsize 33
Jan 23 05:00:02 server kernel: #011#011extent refs 1 gen 144462 flags 2
Jan 23 05:00:02 server kernel: #011#011tree block backref root 7
Jan 23 05:00:02 server kernel: #011item 111 key (12288467501056 169 0)
itemoff 12578 itemsize 33
Jan 23 05:00:02 server kernel: #011#011extent refs 1 gen 144462 flags 2
Jan 23 05:00:02 server kernel: #011#011tree block backref root 7
Jan 23 05:00:02 server kernel: #011item 112 key (12288467533824 169 0)
itemoff 12545 itemsize 33
Jan 23 05:00:02 server kernel: #011#011extent refs 1 gen 144462 flags 2
Jan 23 05:00:02 server kernel: #011#011tree block backref root 7
Jan 23 05:00:02 server kernel: BTRFS error (device sdc): unable to
find ref byte nr 12288404504576 parent 0 root 258  owner 2 offset 0
Jan 23 05:00:02 server kernel: [ cut here ]
Jan 23 05:00:02 server kernel: WARNING: CPU: 8 PID: 28064 at
fs/btrfs/extent-tree.c:6951 __btrfs_free_extent.isra.69+0xbca/0xca0
[btrfs]
Jan 23 05:00:02 server kernel: Modules linked in: xt_nat veth
xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype
iptable_filter nf_nat nf_conntrack ipmi_devintf ext4 jbd2 mbcache
iTCO_wdt gpio_ich iTCO_vendor_support coretemp kvm_intel kvm irqbypass
intel_cstate input_leds pcspkr hpilo hpwdt lpc_ich mfd_core ioatdma
i7core_edac edac_core ses enclosure ipmi_si ipmi_msghandler sg
acpi_power_meter pcc_cpufreq shpchp acpi_cpufreq nfsd auth_rpcgss
nfs_acl lockd grace sunrpc ip_tables btrfs xor raid6_pq sd_mod amdkfd
amd_iommu_v2 radeon crc32c_intel drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops ttm serio_raw drm ahci libahci
libata fjes mpt3sas raid_class scsi_transport_sas igb ptp pps_core dca
i2c_algo_bit
Jan 23 05:00:02 server kernel: CPU: 8 PID: 28064 Comm: btrfs Tainted:
GW I 4.8.7-1.el7.elrepo.x86_64 #1
Jan 23 05:00:02 server kernel: Hardware name: HP ProLiant SE326M1   ,
BIOS R02 12/07/2010
Jan 23 05:00:02 server kernel: 0286 77bb5259
8802bbf1f778 8135406c
Jan 23 05:00:02 server kernel: 8802bbf1f7c8 
8802bbf1f7b8 810817b1
Jan 23 05:00:02 server kernel: 1b270002 8806f612
0b2d1dfc4000 fffe
Jan 23 05:00:02 server kernel: Call Trace:
Jan 23 05:00:02 server kernel: [] dump_stack+0x63/0x87
Jan 23 05:00:02 server kernel: [] __warn+0xd1/0xf0
Jan 23 05:00:02 server kernel: [] warn_slowpath_fmt+0x5f/0x80
Jan 23 05:00:02 server kernel: []
__btrfs_free_extent.isra.69+0xbca/0xca0 [btrfs]
Jan 23 05:00:02 server kernel: []
__btrfs_run_delayed_refs.constprop.78+0xa11/0x1250 [btrfs]
Jan 23 05:00:02 server kernel: []
btrfs_run_delayed_refs+0x8e/0x2c0 [btrfs]
Jan 23 05:00:02 server kernel: []
create_pending_snapshot.isra.26+0x5cd/0xdd0 [btrfs]
Jan 23 05:00:02 server kernel: []
create_pending_snapshots+0x78/0xa0 [btrfs]
Jan 23 05:00:02 server kernel: []
btrfs_commit_transaction+0x435/0xa70 [btrfs]
Jan 23 05:00:02 server kernel: []
btrfs_mksubvol.isra.39+0x513/0x520 [btrfs]
Jan 23 05:00:02 server kernel: [] ?
prepare_to_wait_event+0xf0/0xf0
Jan 23 05:00:02 server kernel: []
btrfs_ioctl_snap_create_transid+0x18f/0x1a0 [btrfs]
Jan 23 05:00:02 server kernel: []
btrfs_ioctl_snap_create_v2+0x125/0x180 [btrfs]
Jan 23 05:00:02 server kernel: []
btrfs_ioctl+0x6b3/0x21e0 [btrfs]
Jan 23 05:00:02 server kernel: [] ?
mem_cgroup_commit_charge+0x85/0x100
Jan 23 05:00:02 server kernel: [] ?
page_add_new_anon_rmap+0x89/0xc0
Jan 23 05:00:02 server kernel: [] ?
lru_cache_add_active_or_unevictable+0x35/0xb0
Jan 23 05:00:02 server kernel: [] ?
handle_mm_fault+0xed0/0x1240
Jan 23 05:00:02 server kernel: [] do_vfs_ioctl+0xa7/0x5f0
Jan 23 05:00:02 server kernel: [] ?
__audit_syscall_entry+0xaf/0x100
Jan 23 05:00:02 server kernel: [] ?
syscall_trace_enter+0x1dd/0x2c0
Jan 23 05:00:02 server kernel: [] SyS_ioctl+0x79/0x90
Jan 23 05:00:02 server kernel: [] do_syscall_64+0x67/0x160
Jan 23 05:00:02 server kernel: []
entry_SYSCALL64_slow_path+0x25/0x25
Jan 23 05:00:02 server kernel: ---[ end trace eb863872ca3491b1 ]---
Jan 23 05:00:02 server kernel: BTRFS: error (device sdc) in
__btrfs_free_extent:6951: errno=-2 No such entry
Jan 23 05:00:02 server kernel: BTRFS info (device sdc): 

[PATCH] btrfs-progs: send-dump: add missing newlines

2017-02-22 Thread Benedikt Morbach
make sure to include newlines after commands that have only one
argument, such as 'unlink' or 'mkfile'

changes

unlink  ./baz.0/file_autimes  ./baz.0/  
  atime=2017-02-22T11:59:16+0100 mtime=2017-02-22T11:59:16+0100 
ctime=2017-02-22T11:59:16+0100
truncate./baz.0/file_a  size=131072
chmod   ./baz.0/file_a  mode=644
utimes  ./baz.0/file_a  
atime=2017-02-22T11:59:16+0100 mtime=2017-02-22T11:59:16+0100 
ctime=2017-02-22T11:59:16+0100
mkfile  ./baz.0/o258-11-0rename  ./baz.0/o258-11-0  
 dest=./baz.0/file_b
utimes  ./baz.0/
atime=2017-02-22T11:59:16+0100 mtime=2017-02-22T11:59:16+0100 
ctime=2017-02-22T11:59:16+0100

to

unlink  ./baz.0/file_a
utimes  ./baz.0/
atime=2017-02-22T11:59:16+0100 mtime=2017-02-22T11:59:16+0100 
ctime=2017-02-22T11:59:16+0100
truncate./baz.0/file_a  size=131072
chmod   ./baz.0/file_a  mode=644
utimes  ./baz.0/file_a  
atime=2017-02-22T11:59:16+0100 mtime=2017-02-22T11:59:16+0100 
ctime=2017-02-22T11:59:16+0100
mkfile  ./baz.0/o258-11-0
rename  ./baz.0/o258-11-0   dest=./baz.0/file_b
utimes  ./baz.0/
atime=2017-02-22T11:59:16+0100 mtime=2017-02-22T11:59:16+0100 
ctime=2017-02-22T11:59:16+0100

Signed-off-by: Benedikt Morbach 
---
 send-dump.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/send-dump.c b/send-dump.c
index 4c44246..67f7977 100644
--- a/send-dump.c
+++ b/send-dump.c
@@ -112,8 +112,10 @@ static int __print_dump(int subvol, void *user, const char *path,
/* Unified header */
printf("%-16s", title);
ret = print_path_escaped(out_path);
-   if (!fmt)
+   if (!fmt) {
+   putchar('\n');
return 0;
+   }
/* Short paths ale aligned to 32 chars */
while (ret++ < 32)
putchar(' ');
-- 
2.11.1
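
The effect of the change can be reproduced with a small formatter that writes into a buffer instead of stdout. This is a simplified stand-in for __print_dump, not the btrfs-progs code: format_dump_line() and its parameters are hypothetical, path escaping is left out, and the extra argument is assumed to be an already formatted string.

```c
#include <stdio.h>

/*
 * Hypothetical formatter: writes one dump line into buf. The title is
 * padded to 16 columns. If there is no extra argument, the line ends
 * right after the path; otherwise the path is padded to 32 columns and
 * followed by the extra text. Every line ends in '\n' in both cases.
 * Returns the number of characters written, as snprintf() does.
 */
static int format_dump_line(char *buf, size_t size, const char *title,
			    const char *path, const char *extra)
{
	if (!extra)
		return snprintf(buf, size, "%-16s%s\n", title, path);

	return snprintf(buf, size, "%-16s%-32s%s\n", title, path, extra);
}
```

Calling it with a title of "unlink", a path of "./baz.0/file_a" and no extra argument produces a line that is terminated on its own, so a following "utimes" line no longer runs into it.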



Re: BTRFS critical: corrupt leaf, slot offset bad; then read-only

2017-02-22 Thread Hans van Kranenburg
On 02/22/2017 08:44 AM, Lukas Tribus wrote:
> Upgrading to 4.8, the FS no longer causes a kernel calltrace and does
> not go read-only. It only shows the "corrupt leaf, slot offset bad"
> message.
> 
> A scrub completed without errors on 3 devices, while it was aborted on 2
> devices. Not sure why it was aborted, since there is no error message in
> dmesg?
> 
> Any suggestions why the scrub was aborted?

Maybe because of the "corrupt leaf" error.

> # uname -a
> Linux srv1-dom0 4.8.0-36-generic #36~16.04.1-Ubuntu SMP Sun Feb 5
> 09:39:57 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> # btrfs scrub status /storage/users/
> scrub status for f50f980e-7640-49c7-bf8d-20d55cfe6005
> scrub started at Wed Feb 22 00:07:33 2017 and was aborted after
> 06:35:42
> total bytes scrubbed: 10.60TiB with 0 errors
> /# btrfs scrub status /storage/users/ -d
> scrub status for f50f980e-7640-49c7-bf8d-20d55cfe6005
> scrub device /dev/dm-5 (id 1) history
> scrub started at Wed Feb 22 00:07:33 2017 and finished after
> 06:35:36
> total bytes scrubbed: 2.30TiB with 0 errors
> scrub device /dev/dm-6 (id 2) history
> scrub started at Wed Feb 22 00:07:33 2017 and finished after
> 06:35:30
> total bytes scrubbed: 2.30TiB with 0 errors
> scrub device /dev/dm-7 (id 3) history
> scrub started at Wed Feb 22 00:07:33 2017 and finished after
> 06:35:42
> total bytes scrubbed: 2.30TiB with 0 errors
> scrub device /dev/dm-8 (id 4) history
> scrub started at Wed Feb 22 00:07:33 2017 and was aborted after
> 05:01:37
> total bytes scrubbed: 1.85TiB with 0 errors
> scrub device /dev/mapper/sde3_crypt (id 5) history
> scrub started at Wed Feb 22 00:07:33 2017 and was aborted after
> 05:01:37
> total bytes scrubbed: 1.85TiB with 0 errors
> #dmesg | grep BTRFS
> [  929.737119] BTRFS critical (device dm-9): corrupt leaf, slot offset
> bad: block=5242107641856,root=1, slot=39
> [19772.594129] BTRFS critical (device dm-9): corrupt leaf, slot offset
> bad: block=5242107641856,root=1, slot=39
> [19777.127704] BTRFS critical (device dm-9): corrupt leaf, slot offset
> bad: block=5242107641856,root=1, slot=39
> [19777.552191] BTRFS critical (device dm-9): corrupt leaf, slot offset
> bad: block=5242107641856,root=1, slot=39

Ok, this is not a csum failure, so probably not the disk giving other
data back than what was sent to it when doing the writes, or a disk
controller which corrupted the data while writing.

And, it's a metadata page, in which part of the entries do not make
sense any more to btrfs. Specifically, it's in root 1, which is the tree
which contains information about all other subtrees containing metadata,
so it's quite an important one.

So, the corruption which is now present in there likely happened in
memory before writing it out. This is also a scenario in which DUP or
RAIDx on disk doesn't help you, because in memory it's stored just once.

If this is a bitflip-like thing in memory, it would probably be possible
to spot it and manually correct it (using a btrfs check patched with the
bitflip patch, or manually by hexediting++).

Another option is memory corruption or a bug somewhere else in the
kernel, which led to a memory address of a pointer being changed,
causing a write to memory to end up in the middle of some btrfs metadata
waiting to be checksummed and written to disk.

Question here is... is it easier for you to nuke the filesystem and
restore the files from somewhere else, or do you want to figure out
manually if it's recoverable, and spend some time with dd, hexedit,
reading struct definitions in btrfs kernel C code etc...

If the regular --repair can't fix it (and it can't do magic if you shoot
a hole in it with a shotgun), then there's no automated other tool that
can do it now.

Since it's block 5242107641856 all the time, it might be worthwhile to
have a look at it. Either it's that block, or there's a bigger mess
hidden behind it.
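
As a rough way to test the bitflip theory on a pair of values, one can check whether they differ in exactly one bit. The helper below is a generic sketch with a hypothetical name, not part of any btrfs tool. Applied to the pair printed by btrfs check in this thread ("incorrect offsets 14927 14415"), the XOR is 512 (0x200), a single bit. That is consistent with a bitflip but does not prove one.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if a and b differ in exactly one bit position.
 * a ^ b has a bit set wherever the inputs differ; a non-zero value
 * with only one bit set satisfies (x & (x - 1)) == 0.
 */
static bool differs_by_one_bit(uint64_t a, uint64_t b)
{
	uint64_t diff = a ^ b;

	return diff != 0 && (diff & (diff - 1)) == 0;
}
```

The same test can be run over candidate fields while reading a hexdump of the block, comparing each suspicious value with the value that would make the leaf consistent.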

-- 
Hans van Kranenburg


Re: BTRFS critical: corrupt leaf, slot offset bad; then read-only

2017-02-22 Thread Lukas Tribus

I did a "btrfs check" (--readonly):

Summary:
589x filetype 1 errors 4, no inode ref (--> Files)
597x filetype 2 errors 4, no inode ref (--> Directories)
1183x root xxx inode YY errors 2001, no inode item, link count wrong

I looked at a handful of reported files which are verifiable via public 
MD5/SHA1 checksums and they are not corrupted, the checksum is correct.


Any hints or suggestions would be much appreciated, please see below for 
the btrfs check output (repeating lines omitted and some filenames 
redacted):


Checking filesystem on /dev/dm-9
UUID: f50f980e-7640-49c7-bf8d-20d55cfe6005
checking extents [.]
[...]
incorrect offsets 14927 14415
bad block 5242107641856

Errors found in extent allocation tree or chunk allocation
checking free space cache [.]
[...]
checking fs roots [.]
[...]
incorrect offsets 14927 14415
incorrect offsets 14927 14415
root 261 inode 127094 errors 500, file extent discount, nbytes wrong
Found file extent holes:
start: 0, len: 499712
unresolved ref dir 127093 index 2 namelen 24 name ABC DE Fghij 
Klmnopr.tuv filetype 1 errors 4, no inode ref

root 261 inode 127095 errors 2001, no inode item, link count wrong
unresolved ref dir 127080 index 13 namelen 17 name 
Whateverdir123456 filetype 2 errors 4, no inode ref

root 261 inode 127097 errors 2001, no inode item, link count wrong
unresolved ref dir 127080 index 14 namelen 12 name 
WhateverDirectory2 filetype 2 errors 4, no inode ref

root 261 inode 127099 errors 2001, no inode item, link count wrong
unresolved ref dir 127080 index 15 namelen 11 name AnyDir filetype 
2 errors 4, no inode ref

root 261 inode 127105 errors 2001, no inode item, link count wrong
unresolved ref dir 127080 index 16 namelen 10 name AnotherDir 
filetype 2 errors 4, no inode ref

root 261 inode 127107 errors 2001, no inode item, link count wrong
unresolved ref dir 127080 index 17 namelen 11 name Folder11 
filetype 2 errors 4, no inode ref

root 261 inode 127112 errors 2001, no inode item, link count wrong
unresolved ref dir 126959 index 51 namelen 11 name Folder120 
filetype 2 errors 4, no inode ref

root 261 inode 127114 errors 2001, no inode item, link count wrong
unresolved ref dir 126146 index 40 namelen 13 name GVC-dir filetype 
2 errors 4, no inode ref

root 261 inode 127396 errors 2001, no inode item, link count wrong
unresolved ref dir 126146 index 41 namelen 4 name G3-dir filetype 2 
errors 4, no inode ref

root 261 inode 127527 errors 2001, no inode item, link count wrong
unresolved ref dir 126146 index 42 namelen 11 name Hello Dir 2 
filetype 2 errors 4, no inode ref

root 261 inode 127535 errors 2001, no inode item, link count wrong
unresolved ref dir 126146 index 43 namelen 4 name Hellodir filetype 
2 errors 4, no inode ref

root 261 inode 127573 errors 2001, no inode item, link count wrong
unresolved ref dir 126146 index 44 namelen 6 name Hello 2 filetype 
2 errors 4, no inode ref

root 261 inode 127620 errors 2001, no inode item, link count wrong
[...]
root 261 inode 177273 errors 2001, no inode item, link count wrong
unresolved ref dir 23439 index 23 namelen 24 name Firefox Setup 
51.0.1.exe filetype 1 errors 4, no inode ref

root 261 inode 177275 errors 2001, no inode item, link count wrong
unresolved ref dir 23439 index 26 namelen 27 name Firefox Setup 
45.7.0esr.exe filetype 1 errors 4, no inode ref

root 261 inode 180457 errors 2001, no inode item, link count wrong
[...]
checking fs roots [o]
incorrect offsets 14927 14415
checking fs roots [.]
[...]
checking fs roots [o]
The following tree block(s) is corrupted in tree 263:
tree block bytenr: 5242107641856, level: 0, node key: 
(5241902333952, 169, 0)

checking fs roots [o]
incorrect offsets 14927 14415
checking fs roots [O]
The following tree block(s) is corrupted in tree 6685:
tree block bytenr: 5242107641856, level: 0, node key: 
(5241902333952, 169, 0)

checking fs roots [o]
checking fs roots [.]
incorrect offsets 14927 14415
The following tree block(s) is corrupted in tree 6879:
tree block bytenr: 5242107641856, level: 0, node key: 
(5241902333952, 169, 0)

checking fs roots [o]
incorrect offsets 14927 14415
incorrect offsets 14927 14415
root 6893 inode 127094 errors 500, file extent discount, nbytes wrong
Found file extent holes:
start: 0, len: 499712
unresolved ref dir 127093 index 2 namelen 24 name ABC DE Fghij 
Klmnopr.tuv filetype 1 errors 4, no inode ref

root 6893 inode 127095 errors 2001, no inode item, link count wrong
unresolved ref dir 127080 index 13 namelen 17 name 
Whateverdir123456 filetype 2 errors 4, no inode ref

root 6893 inode 127097 errors 2001, no inode item, link count wrong
[...]
root 6893 inode 177273 errors 2001, no inode item, link count wrong
unresolved ref dir 23439 index 23 namelen 24 name Firefox Setup 
51.0.1.exe filetype 1 errors 4, no inode ref

root 6893 inode 177275 errors 2001, no inode item, link count wrong
unresolved ref dir 23439 index 26 

Re: [PATCH] Btrfs: try harder to migrate items to left sibling before splitting a leaf

2017-02-22 Thread Liu Bo
On Wed, Feb 22, 2017 at 02:05:05PM +, Filipe Manana wrote:
> On Wed, Feb 22, 2017 at 12:07 AM, Liu Bo  wrote:
> > On Sun, Feb 19, 2017 at 08:56:39PM +, fdman...@kernel.org wrote:
> >> From: Filipe Manana 
> >>
> >> Before attempting to split a leaf we try to migrate items from the leaf to
> >> its right and left siblings. We start by trying to move items into the
> >> rigth sibling and, if the new item is meant to be inserted at the end of
> >> our leaf, we try to free from our leaf an amount of bytes equal to the
> >> number of bytes used by the new item, by setting the variable space_needed
> >> to the byte size of that new item. However if we fail to move enough items
> >> to the right sibling due to lack of space in that sibling, we then try
> >> to move items into the left sibling, and in that case we try to free
> >> an amount equal to the size of the new item from our leaf, when we need
> >> only to free an amount corresponding to the size of the new item minus
> >> the current free space of our leaf. So make sure that before we try to
> >> move items to the left sibling we do set the variable space_needed with
> >> a value corresponding to the new item's size minus the leaf's current
> >> free space.
> >>
> >> Signed-off-by: Filipe Manana 
> >> ---
> >>  fs/btrfs/ctree.c | 7 +++
> >>  1 file changed, 7 insertions(+)
> >>
> >> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> >> index a426dc8..1d66761 100644
> >> --- a/fs/btrfs/ctree.c
> >> +++ b/fs/btrfs/ctree.c
> >> @@ -4160,6 +4160,9 @@ static noinline int push_for_double_split(struct 
> >> btrfs_trans_handle *trans,
> >>
> >>   /* try to push all the items before our slot into the next leaf */
> >>   slot = path->slots[0];
> >> + space_needed = data_size;
> >> + if (slot > 0)
> >> + space_needed -= btrfs_leaf_free_space(fs_info, 
> >> path->nodes[0]);
> >
> > Good point.
> >
> >>   ret = push_leaf_left(trans, root, path, 1, space_needed, 0, slot);
> >>   if (ret < 0)
> >>   return ret;
> >> @@ -4215,6 +4218,10 @@ static noinline int split_leaf(struct 
> >> btrfs_trans_handle *trans,
> >>   if (wret < 0)
> >>   return wret;
> >>   if (wret) {
> >> + space_needed = data_size;
> >> + if (slot > 0)
> >> + space_needed -= 
> >> btrfs_leaf_free_space(fs_info,
> >> +   l);
> >
> > Not sure if we need this, the above push_leaf_right() was called with
> > @min_data_size == space_needed, thus if @wret == 1, no items have been 
> > moved in
> > push_leaf_right() so that leaf 'l' remains unchanged.
> 
> That's exactly the point. If the item is meant to be appended and we
> failed to move it into the right sibling, we want to try to move items
> from the leaf into its left sibling with the goal of freeing
> "new_item_size - leaf_free_space(l)" bytes from our leaf. Without this
> new assignment, we try to free new_item_size bytes, which has higher
> chances of not being successful at migrating items into the left
> sibling (example, the left sibling has less than new_item_size bytes
> free, but has at least "new_item_size - leaf_free_space(l)" bytes
> free).
> 

I see, I missed that we didn't check (slot > 0) before
push_leaf_right(), you may add

Reviewed-by: Liu Bo 

Thanks,

-liubo
> >
> > Thanks,
> >
> > -liubo
> >>   wret = push_leaf_left(trans, root, path, 
> >> space_needed,
> >> space_needed, 0, (u32)-1);
> >>   if (wret < 0)
> >> --
> >> 2.7.0.rc3
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> >> the body of a message to majord...@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: try harder to migrate items to left sibling before splitting a leaf

2017-02-22 Thread Filipe Manana
On Wed, Feb 22, 2017 at 12:07 AM, Liu Bo  wrote:
> On Sun, Feb 19, 2017 at 08:56:39PM +, fdman...@kernel.org wrote:
>> From: Filipe Manana 
>>
>> Before attempting to split a leaf we try to migrate items from the leaf to
>> its right and left siblings. We start by trying to move items into the
>> rigth sibling and, if the new item is meant to be inserted at the end of
>> our leaf, we try to free from our leaf an amount of bytes equal to the
>> number of bytes used by the new item, by setting the variable space_needed
>> to the byte size of that new item. However if we fail to move enough items
>> to the right sibling due to lack of space in that sibling, we then try
>> to move items into the left sibling, and in that case we try to free
>> an amount equal to the size of the new item from our leaf, when we need
>> only to free an amount corresponding to the size of the new item minus
>> the current free space of our leaf. So make sure that before we try to
>> move items to the left sibling we do set the variable space_needed with
>> a value corresponding to the new item's size minus the leaf's current
>> free space.
>>
>> Signed-off-by: Filipe Manana 
>> ---
>>  fs/btrfs/ctree.c | 7 +++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
>> index a426dc8..1d66761 100644
>> --- a/fs/btrfs/ctree.c
>> +++ b/fs/btrfs/ctree.c
>> @@ -4160,6 +4160,9 @@ static noinline int push_for_double_split(struct btrfs_trans_handle *trans,
>>
>>   /* try to push all the items before our slot into the next leaf */
>>   slot = path->slots[0];
>> + space_needed = data_size;
>> + if (slot > 0)
>> + space_needed -= btrfs_leaf_free_space(fs_info, path->nodes[0]);
>
> Good point.
>
>>   ret = push_leaf_left(trans, root, path, 1, space_needed, 0, slot);
>>   if (ret < 0)
>>   return ret;
>> @@ -4215,6 +4218,10 @@ static noinline int split_leaf(struct btrfs_trans_handle *trans,
>>   if (wret < 0)
>>   return wret;
>>   if (wret) {
>> + space_needed = data_size;
>> + if (slot > 0)
>> + space_needed -= btrfs_leaf_free_space(fs_info,
>> +   l);
>
> Not sure if we need this, the above push_leaf_right() was called with
> @min_data_size == space_needed, thus if @wret == 1, no items have been moved
> in push_leaf_right() so that leaf 'l' remains unchanged.

That's exactly the point. If the item is meant to be appended and we
failed to move it into the right sibling, we want to try to move items
from the leaf into its left sibling with the goal of freeing
"new_item_size - leaf_free_space(l)" bytes from our leaf. Without this
new assignment, we try to free new_item_size bytes, which has higher
chances of not being successful at migrating items into the left
sibling (for example, the left sibling has less than new_item_size bytes
free, but has at least "new_item_size - leaf_free_space(l)" bytes
free).
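
To make the example concrete, here is a minimal user-space sketch of the
arithmetic. It is not the kernel code: left_push_space_needed() and
left_push_can_succeed() are hypothetical helpers, and the success check is a
simplified assumption (the push is treated as possible only when the left
sibling has at least space_needed bytes free).

```c
#include <stdbool.h>

/*
 * Hypothetical model of the patched assignment: when the new item does
 * not go into slot 0, only the part of it that does not already fit in
 * the leaf's free space has to be freed by pushing items to the left.
 */
static int left_push_space_needed(int data_size, int slot, int leaf_free_space)
{
	int space_needed = data_size;

	if (slot > 0)
		space_needed -= leaf_free_space;
	return space_needed;
}

/*
 * Simplified assumption: the push can only succeed if the left sibling
 * has at least space_needed bytes free.
 */
static bool left_push_can_succeed(int space_needed, int left_sibling_free)
{
	return left_sibling_free >= space_needed;
}
```

With made-up numbers, say a 300 byte item, 100 bytes free in our leaf and 250
bytes free in the left sibling, the old request (300 bytes) cannot be met,
while the new request (300 - 100 = 200 bytes) can.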

>
> Thanks,
>
> -liubo
>>   wret = push_leaf_left(trans, root, path, space_needed,
>> space_needed, 0, (u32)-1);
>>   if (wret < 0)
>> --
>> 2.7.0.rc3


Fwd:

2017-02-22 Thread Ilan Schwarts
Hi,
I previously had a problem where the same device had 2 device IDs on btrfs.

To solve it, I had to use
BTRFS_I(inode)->root->anon_super.s_dev instead
of inode->i_sb->s_dev

My question is whether there is something like that for inodes. All directories
on my top-level device have the same inode number (512). I get the inode number with:
inode->i_ino

Is there another way to get the inode number?

Thanks!


Re: [PATCH] generic/311: Disable dmesg check

2017-02-22 Thread Chandan Rajendra
On Monday, February 20, 2017 11:03:11 PM Anand Jain wrote:
> 
> Hi Chandan,
> 
> On 07/17/15 12:56, Chandan Rajendra wrote:
> > When running generic/311 on Btrfs' subpagesize-blocksize patchset (on ppc64
> > with 4k sectorsize and 16k node/leaf size) I noticed the following call 
> > trace,
> >
> > BTRFS (device dm-0): parent transid verify failed on 29720576 wanted 160 found 158
> > BTRFS (device dm-0): parent transid verify failed on 29720576 wanted 160 found 158
> > BTRFS: Transaction aborted (error -5)
> >
> > WARNING: at /root/repos/linux/fs/btrfs/super.c:260
> > Modules linked in:
> > CPU: 3 PID: 30769 Comm: umount Tainted: GWL 4.0.0-rc5-11671-g8b82e73e #63
> > task: c00079aaddb0 ti: c00079a48000 task.ti: c00079a48000
> > NIP: c0499aa0 LR: c0499a9c CTR: c0779630
> > REGS: c00079a4b480 TRAP: 0700   Tainted: GW   L   (4.0.0-rc5-11671-g8b82e73e)
> > MSR: 800100029032   CR: 28008828  XER: 2000
> > CFAR: c0a23914 SOFTE: 1
> > GPR00: c0499a9c c00079a4b700 c103bdf8 0025
> > GPR04: 0001 0502 c107e918 0cda
> > GPR08: 0007 0007 0001 c10f5044
> > GPR12: 28008822 cfdc0d80 2000 10152e00
> > GPR16: 010002979380 10140724  
> > GPR20:    
> > GPR24: c000151f61a8  c00055e5e800 c0aac270
> > GPR28: 04a4 fffb c00055e5e800 c000679204d0
> > NIP [c0499aa0] .__btrfs_abort_transaction+0x180/0x190
> > LR [c0499a9c] .__btrfs_abort_transaction+0x17c/0x190
> > Call Trace:
> > [c00079a4b700] [c0499a9c] .__btrfs_abort_transaction+0x17c/0x190 (unreliable)
> > [c00079a4b7a0] [c0541678] .__btrfs_run_delayed_items+0xe8/0x220
> > [c00079a4b850] [c04d5b3c] .btrfs_commit_transaction+0x37c/0xca0
> > [c00079a4b960] [c049824c] .btrfs_sync_fs+0x6c/0x1a0
> > [c00079a4ba00] [c0255270] .sync_filesystem+0xd0/0x100
> > [c00079a4ba80] [c0218070] .generic_shutdown_super+0x40/0x170
> > [c00079a4bb10] [c0218598] .kill_anon_super+0x18/0x30
> > [c00079a4bb90] [c0498418] .btrfs_kill_super+0x18/0xc0
> > [c00079a4bc10] [c0218ac8] .deactivate_locked_super+0x98/0xe0
> > [c00079a4bc90] [c023e744] .cleanup_mnt+0x54/0xa0
> > [c00079a4bd10] [c00b7d14] .task_work_run+0x114/0x150
> > [c00079a4bdb0] [c0015f84] .do_notify_resume+0x74/0x80
> > [c00079a4be30] [c0009838] .ret_from_except_lite+0x64/0x68
> > Instruction dump:
> > ebc1fff0 ebe1fff8 4bfffb28 6000 3ce2ffcd 38e7e818 4bbc 3c62ffd2
> > 7fa4eb78 3863b808 48589e1d 6000 <0fe0> 4bfffedc 6000 6000
> > BTRFS: error (device dm-0) in __btrfs_run_delayed_items:1188: errno=-5 IO failure
> >
> >
> > The call trace is seen when executing _run_test() for the 8th time.
> > The above trace is actually a false-positive failure as indicated below,
> >  fsync-tester
> >fsync(fd)
> >Write delayed inode item to fs tree
> >  (assume transid to be 160)
> >  (assume tree block to start at logical address 29720576)
> >  md5sum $testfile
> >This causes a delayed inode to be added
> >  Load flakey table
> >i.e. drop writes that are initiated from now onwards
> >  Unmount filesystem
> >btrfs_sync_fs is invoked
> >  Write 29720576 metadata block to disk
> >  free_extent_buffer(29720576)
> >release_extent_buffer(29720576)
> >Start writing delayed inode
> >  Traverse the fs tree
> >(assume the parent tree block of 29720576 is still in memory)
> >When reading 29720576 from disk, parent's blkptr will have generation
> >set to 160. But the on-disk tree block will have an older
> >generation (say, 158). Transid verification fails and hence the
> >transaction gets aborted
> >
> > The test only cares about the FS instance before the unmount
> > operation (i.e. the synced FS). Hence to get the test to pass, ignore the
> > false-positive trace that could be generated.
> 
>   Looks like this patch didn't make it. Is there any kernel patch
>   which fixed this bug? Or any hints on how to reproduce this bug?
> 

Hi Anand,

This bug is easily recreated when executing the test on Btrfs with
subpage-blocksize patchset applied. I haven't been able to test the recently
rebased subpage-blocksize patchset yet.

Coming back to the issue ... The problem exists because the test code uses
dm-flakey. Josef had suggested that using dm-log-writes instead of dm-flakey
should fix the problem. I will work on this and post a patch soon.
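
For reference, the false positive from the quoted description can be modelled
in user space as below. This is only a sketch under stated assumptions:
struct tree_block, flakey_write() and check_transid() are hypothetical names,
not the kernel's or dm-flakey's code, and the generations 158 and 160 are the
ones assumed in the description.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a metadata block: logical address plus generation. */
struct tree_block {
	uint64_t bytenr;
	uint64_t generation;
};

/*
 * Model of a write going through a flakey device: with drop_writes set,
 * the write is silently discarded and the on-disk copy keeps its old
 * generation.
 */
static void flakey_write(struct tree_block *on_disk,
			 const struct tree_block *in_memory,
			 bool drop_writes)
{
	if (!drop_writes)
		*on_disk = *in_memory;
}

/*
 * Model of the transid check done when the block is read back: the
 * generation found on disk must match the one recorded in the parent's
 * block pointer, otherwise the read fails with -EIO.
 */
static int check_transid(const struct tree_block *on_disk,
			 uint64_t parent_transid)
{
	return on_disk->generation == parent_transid ? 0 : -EIO;
}
```

If the block at 29720576 is on disk with generation 158, its in-memory copy
has generation 160, and the write happens after the flakey table is loaded,
the write is dropped. Reading the block back and checking it against the
parent's 160 then fails, which is the error -5 in the trace.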

-- 
chandan

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of