segmentation fault in btrfs tool v4.15

2018-02-05 Thread Ralph Gauges

Hi,

I recently started using the btrfs file system on my backup disk, and 
until a power failure during a backup everything seemed to work well.
Since the power failure, however, the file system seems to have become 
corrupted, and all attempts to check or repair it so far have led to a 
segmentation fault.


Normally I am using Ubuntu 17.10 and the btrfs version that goes along 
with it, but for testing purposes, I compiled my own version 
(btrfs-progs v4.15).
Unfortunately the segmentation fault is also present in this latest 
version. At the end of this email, you can see the output from btrfs 
when I try to check the partition as well as the backtrace from gdb. I 
hope that helps.


My system is running kernel 4.13.0-32-generic on x86_64.

If you need any additional information, please contact me directly since 
I don't subscribe to the mailing list.


Sincerely

Ralph Gauges






(gdb) set args check /dev/sdf1
(gdb) run
Starting program: /home/gauges/Applications/bin/btrfs check /dev/sdf1
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
parent transid verify failed on 266195058688 wanted 1857 found 1864
parent transid verify failed on 266195058688 wanted 1857 found 1864
parent transid verify failed on 266195058688 wanted 1857 found 1864
parent transid verify failed on 266195058688 wanted 1857 found 1864
Ignoring transid failure
Checking filesystem on /dev/sdf1
UUID: 9ccb1eaa-a9ae-46f3-8885-fba4799d6e85
parent transid verify failed on 266196516864 wanted 1858 found 1864
parent transid verify failed on 266196516864 wanted 1858 found 1864
parent transid verify failed on 266196516864 wanted 1858 found 1864
parent transid verify failed on 266196516864 wanted 1858 found 1864
Ignoring transid failure
parent transid verify failed on 266196205568 wanted 1858 found 1864
parent transid verify failed on 266196205568 wanted 1858 found 1864
parent transid verify failed on 266196205568 wanted 1858 found 1864
parent transid verify failed on 266196205568 wanted 1858 found 1864
Ignoring transid failure

Program received signal SIGSEGV, Segmentation fault.
0x555bf7a7 in btrfs_extent_flags (s=0xcc344065, 
eb=0x584acce0) at ctree.h:1694
1694    BTRFS_SETGET_FUNCS(extent_flags, struct btrfs_extent_item, 
flags, 64);

(gdb) bt
#0  0x555bf7a7 in btrfs_extent_flags (s=0xcc344065, 
eb=0x584acce0) at ctree.h:1694

#1  build_roots_info_cache (info=0x55821a00) at cmds-check.c:14285
#2  repair_root_items (info=0x55821a00) at cmds-check.c:14450
#3  cmd_check (argc=, argv=) at 
cmds-check.c:14965

#4  0x55566320 in main (argc=2, argv=0x7fffde30) at btrfs.c:302
(gdb)
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] btrfs-progs: fsck-tests: Cleanup the restored image for 028

2018-02-05 Thread Qu Wenruo
Signed-off-by: Qu Wenruo 
---
 tests/fsck-tests/028-unaligned-super-dev-sizes/test.sh | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/fsck-tests/028-unaligned-super-dev-sizes/test.sh 
b/tests/fsck-tests/028-unaligned-super-dev-sizes/test.sh
index 3928f548c3f9..4bbcfbae662e 100755
--- a/tests/fsck-tests/028-unaligned-super-dev-sizes/test.sh
+++ b/tests/fsck-tests/028-unaligned-super-dev-sizes/test.sh
@@ -21,3 +21,5 @@ run_check "$TOP/btrfs" check "$TEST_DEV"
 # mount test
 run_check_mount_test_dev
 run_check_umount_test_dev "$TEST_MNT"
+# don't forget to clean it up
+rm "$TEST_DEV"
-- 
2.16.1



[PATCH 3/3] btrfs-progs: Move check/main.c to cmds-check.c to maintain the subcommand hierarchy

2018-02-05 Thread Qu Wenruo
cmds-check.c is back now, with include files cleaned up.

Signed-off-by: Qu Wenruo 
---
 Makefile |  8 
 check/main.c => cmds-check.c | 17 -
 2 files changed, 4 insertions(+), 21 deletions(-)
 rename check/main.c => cmds-check.c (97%)

diff --git a/Makefile b/Makefile
index f51a8d701a2c..58e894c0171b 100644
--- a/Makefile
+++ b/Makefile
@@ -100,6 +100,7 @@ CHECKER_FLAGS := -include $(check_defs) -D__CHECKER__ \
-D__CHECK_ENDIAN__ -Wbitwise -Wuninitialized -Wshadow -Wundef \
-U_FORTIFY_SOURCE -Wdeclaration-after-statement -Wdefault-bitfield-sign
 
+check_objects = check/mode-lowmem.o check/mode-original.o check/mode-common.o
 objects = ctree.o disk-io.o kernel-lib/radix-tree.o extent-tree.o print-tree.o 
\
  root-tree.o dir-item.o file-item.o inode-item.o inode-map.o \
  extent-cache.o extent_io.o volumes.o utils.o repair.o \
@@ -109,12 +110,11 @@ objects = ctree.o disk-io.o kernel-lib/radix-tree.o 
extent-tree.o print-tree.o \
  fsfeatures.o kernel-lib/tables.o kernel-lib/raid56.o transaction.o
 cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
   cmds-inspect.o cmds-balance.o cmds-send.o cmds-receive.o \
-  cmds-quota.o cmds-qgroup.o cmds-replace.o check/main.o \
+  cmds-quota.o cmds-qgroup.o cmds-replace.o cmds-check.o \
   cmds-restore.o cmds-rescue.o chunk-recover.o super-recover.o \
   cmds-property.o cmds-fi-usage.o cmds-inspect-dump-tree.o \
   cmds-inspect-dump-super.o cmds-inspect-tree-stats.o cmds-fi-du.o 
\
-  mkfs/common.o check/mode-common.o check/mode-lowmem.o \
-  check/mode-original.o
+  mkfs/common.o
 libbtrfs_objects = send-stream.o send-utils.o kernel-lib/rbtree.o btrfs-list.o 
\
   kernel-lib/crc32c.o messages.o \
   uuid-tree.o utils-lib.o rbtree-utils.o
@@ -391,7 +391,7 @@ btrfs-%: btrfs-%.o $(objects) $(standalone_deps) 
$(libs_static)
$(libs_static) \
$(LDFLAGS) $(LIBS) $($(subst -,_,$@-libs))
 
-btrfs: btrfs.o $(objects) $(cmds_objects) $(libs_static)
+btrfs: btrfs.o $(objects) $(cmds_objects) $(libs_static) $(check_objects)
@echo "[LD] $@"
$(Q)$(CC) -o $@ $^ $(LDFLAGS) $(LIBS) $(LIBS_COMP)
 
diff --git a/check/main.c b/cmds-check.c
similarity index 97%
rename from check/main.c
rename to cmds-check.c
index 1bb2142e113e..28746712fac1 100644
--- a/check/main.c
+++ b/cmds-check.c
@@ -16,32 +16,15 @@
  * Boston, MA 021110-1307, USA.
  */
 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
 #include 
 #include 
-#include "ctree.h"
 #include "volumes.h"
 #include "repair.h"
 #include "disk-io.h"
-#include "print-tree.h"
 #include "task-utils.h"
 #include "transaction.h"
 #include "utils.h"
-#include "commands.h"
-#include "free-space-cache.h"
-#include "free-space-tree.h"
-#include "btrfsck.h"
 #include "qgroup-verify.h"
-#include "rbtree-utils.h"
-#include "backref.h"
-#include "kernel-shared/ulist.h"
-#include "hash.h"
 #include "help.h"
 #include "check/mode-common.h"
 #include "check/mode-original.h"
-- 
2.16.1



[PATCH 0/3] btrfs-progs: Split original mode check to its own

2018-02-05 Thread Qu Wenruo
Two patches in this series are too large to reach the mailing list, so please
fetch the whole patchset from GitHub as usual:
https://github.com/adam900710/btrfs-progs/tree/split_check_part2

The branch is based on devel branch, whose HEAD is:
commit 3aa1bbdd89ee9c9c48d260a6192fae08328f1b2f (david/devel)
Author: David Sterba 
Date:   Sat Feb 3 01:15:42 2018 +0100

btrfs-progs: mkfs: fix build on musl

Another build failure on musl.

Issue: #90
Signed-off-by: David Sterba 


The first patch moves the remaining common code to mode-common.c.
Things like the transid fix, which is handled in read_tree_block(), get
moved to mode-common.c.
Comments are also added for the exported functions.

The 2nd patch moves the original mode code to mode-original.c.
Unlike lowmem mode, original mode has more functions exported, as things
like bad item repair are integrated into the main function.

The last patch moves check/main.c back to cmds-check.c to keep
subcommand hierarchy.

With the split done, we now have a clear view of the check code size:
$ wc -l check/*.[ch] cmds-check.c | sort -h
    67 check/mode-lowmem.h
   137 check/mode-common.h
   308 check/mode-original.h
   661 cmds-check.c
  2062 check/mode-common.c
  4573 check/mode-lowmem.c
  7577 check/mode-original.c

Qu Wenruo (3):
  btrfs-progs: check: Move more shared codes to mode-common.c
  btrfs-progs: Move the remaining original mode code to mode-original.c
  btrfs-progs: Move check/main.c to cmds-check.c to maintain the
subcommand hierarchy

 Makefile  |7 +-
 check/mode-common.c   | 1711 +
 check/mode-common.h   |   39 +-
 check/{main.c => mode-original.c} | 6954 -
 check/mode-original.h |   32 +-
 cmds-check.c  |  661 
 6 files changed, 4742 insertions(+), 4662 deletions(-)
 rename check/{main.c => mode-original.c} (76%)
 create mode 100644 cmds-check.c

-- 
2.16.1



[PATCH v4 3/3] btrfs-progs: Add readme for export testsuits

2018-02-05 Thread Gu Jinxiang
Document the command for exporting the testsuite,
and describe how to run the exported testsuite.

Signed-off-by: Gu Jinxiang 
---
 tests/README.md | 13 +
 1 file changed, 13 insertions(+)

diff --git a/tests/README.md b/tests/README.md
index 04d2ce2a..23f35cfc 100644
--- a/tests/README.md
+++ b/tests/README.md
@@ -48,6 +48,19 @@ $ TEST=001\* ./fsck-tests.sh
 will run the first test in fsck-tests subdirectory.
 
 
+## Package testsuite
+
+The tests can be exported as btrfs-progs-tests.tar.gz in the current directory. Use:
+
+```shell
+$ make testsuite
+```
+
+
+After decompressing btrfs-progs-tests.tar.gz, tests can be run selectively
+from the `tests/` directory as introduced above.
+
+
 ## Test structure
 
 *tests/fsck-tests/:*
-- 
2.14.3





[PATCH v4 0/3] Add support for export testsuits

2018-02-05 Thread Gu Jinxiang
This series achieves the following:
1. Export the testsuite with:
 $ make testsuite
The files listed in testsuites-list are added to the tarball 
btrfs-progs-tests.tar.gz.

2. After decompressing btrfs-progs-tests.tar.gz, run the tests with:
 $ TEST=`MASK` ./tests/mkfs-tests.sh
Omitting MASK also works.
Additional note:
 $ tar -xzvf ./btrfs-progs-tests.tar.gz
 $ ls
   btrfs-progs
The tests directory and the other files are inside btrfs-progs.

Changelog:
v3->v4: modify patch 2.
1. Keep TOP for binaries, and introduce TEST_TOP for other resources.
v2->v3:
patch 1:
1. Change the command from `make package` to `make testsuite`.
2. Create btrfs-progs-tests.tar.gz in the current directory,
   so remove the EXPORT variable.
3. Add a list file which lists the files to be added to the tarball,
   and add Documentation to the list. Revert patch 3 of v2.
4. Add some identification info to the tarball.
5. Add the temporary file testsuites-id to .gitignore.

patch 3: modify the readme according to the changes in patch 1.


v1->v2:
Major change of approach: instead of running the testsuite through a
given EXEC parameter, export the testsuite files to a separate tarball
and run them from a script.


Gu Jinxiang (3):
  btrfs-progs: Add make testsuite command for export tests
  btrfs-progs: introduce TEST_TOP for resources except binaries
  btrfs-progs: Add readme for export testsuits

 .gitignore |  1 +
 Makefile   |  4 +++
 tests/README.md| 13 
 tests/cli-tests.sh | 15 ++---
 tests/cli-tests/001-btrfs/test.sh  |  2 +-
 .../cli-tests/002-balance-full-no-filters/test.sh  |  2 +-
 tests/cli-tests/003-fi-resize-args/test.sh |  2 +-
 .../cli-tests/004-send-parent-multi-subvol/test.sh |  2 +-
 tests/cli-tests/005-qgroup-show/test.sh|  2 +-
 tests/cli-tests/006-qgroup-show-sync/test.sh   |  2 +-
 tests/cli-tests/007-check-force/test.sh|  2 +-
 .../008-subvolume-get-set-default/test.sh  |  2 +-
 tests/common   | 16 ++
 tests/convert-tests.sh | 15 ++---
 tests/convert-tests/001-ext2-basic/test.sh |  4 +--
 tests/convert-tests/002-ext3-basic/test.sh |  4 +--
 tests/convert-tests/003-ext4-basic/test.sh |  4 +--
 .../004-ext2-backup-superblock-ranges/test.sh  |  2 +-
 .../convert-tests/005-delete-all-rollback/test.sh  |  4 +--
 tests/convert-tests/006-large-hole-extent/test.sh  |  4 +--
 .../007-unsupported-block-sizes/test.sh|  4 +--
 tests/convert-tests/008-readonly-image/test.sh |  4 +--
 tests/convert-tests/009-common-inode-flags/test.sh |  4 +--
 tests/convert-tests/010-reiserfs-basic/test.sh |  4 +--
 .../011-reiserfs-delete-all-rollback/test.sh   |  4 +--
 .../012-reiserfs-large-hole-extent/test.sh |  4 +--
 .../013-reiserfs-common-inode-flags/test.sh|  4 +--
 .../014-reiserfs-tail-handling/test.sh |  4 +--
 .../015-no-rollback-after-balance/test.sh  |  4 +--
 tests/export-tests.sh  | 37 ++
 tests/fsck-tests.sh| 15 ++---
 tests/fsck-tests/006-bad-root-items/test.sh|  2 +-
 tests/fsck-tests/012-leaf-corruption/test.sh   |  2 +-
 tests/fsck-tests/013-extent-tree-rebuild/test.sh   |  4 +--
 tests/fsck-tests/018-leaf-crossing-stripes/test.sh |  2 +-
 .../fsck-tests/019-non-skinny-false-alert/test.sh  |  2 +-
 tests/fsck-tests/020-extent-ref-cases/test.sh  |  2 +-
 .../021-partially-dropped-snapshot-case/test.sh|  2 +-
 tests/fsck-tests/022-qgroup-rescan-halfway/test.sh |  2 +-
 tests/fsck-tests/023-qgroup-stack-overflow/test.sh |  2 +-
 tests/fsck-tests/024-clear-space-cache/test.sh |  2 +-
 tests/fsck-tests/025-file-extents/test.sh  |  2 +-
 tests/fsck-tests/026-bad-dir-item-name/test.sh |  2 +-
 tests/fsck-tests/027-tree-reloc-tree/test.sh   |  2 +-
 .../028-unaligned-super-dev-sizes/test.sh  |  2 +-
 tests/fuzz-tests.sh| 15 ++---
 .../fuzz-tests/001-simple-check-unmounted/test.sh  |  4 +--
 tests/fuzz-tests/002-simple-image/test.sh  |  4 +--
 tests/fuzz-tests/003-multi-check-unmounted/test.sh |  4 +--
 tests/fuzz-tests/004-simple-dump-tree/test.sh  |  4 +--
 tests/fuzz-tests/005-simple-dump-super/test.sh |  4 +--
 tests/fuzz-tests/006-simple-tree-stats/test.sh |  4 +--
 tests/fuzz-tests/007-simple-super-recover/test.sh  |  4 +--
 tests/fuzz-tests/008-simple-chunk-recover/test.sh  |  4 +--
 tests/fuzz-tests/009-simple-zero-log/test.sh   |  4 +--
 tests/misc-tests.sh| 15 ++---
 tests/misc-tests/001-btrfstune-features/test.sh|  2 +-
 tests/misc-tests/002-uuid-rewrite/test.sh  |  6 ++--
 tests/misc-tests/003-zero-log/test.sh  | 

[PATCH v4 1/3] btrfs-progs: Add make testsuite command for export tests

2018-02-05 Thread Gu Jinxiang
Export the testsuite files to a separate tarball.
The fsck tests depend on btrfs-corrupt-block, and the misc
tests depend on both btrfs-corrupt-block and fssum,
so both are set as prerequisites of the testsuite target.

Although fssum can be built from sources that all live in the
tests directory and does not depend on btrfs's structures,
btrfs-corrupt-block relies deeply on them.
For consistency, at the present stage both are generated
when the test tarball is created.

Signed-off-by: Gu Jinxiang 
---
 .gitignore|  1 +
 Makefile  |  4 
 tests/export-tests.sh | 37 +
 testsuites-list   | 22 ++
 4 files changed, 64 insertions(+)
 create mode 100755 tests/export-tests.sh
 create mode 100644 testsuites-list

diff --git a/.gitignore b/.gitignore
index 8e607f6e..a41ad8ce 100644
--- a/.gitignore
+++ b/.gitignore
@@ -43,6 +43,7 @@ libbtrfs.so.0.1
 library-test
 library-test-static
 /fssum
+testsuites-id
 
 /tests/*-tests-results.txt
 /tests/test-console.txt
diff --git a/Makefile b/Makefile
index 6369e8f4..7eab0f4f 100644
--- a/Makefile
+++ b/Makefile
@@ -333,6 +333,10 @@ test-inst: all
 
 test: test-fsck test-mkfs test-convert test-misc test-fuzz test-cli
 
+testsuite: btrfs-corrupt-block fssum
+   @echo "Export tests as a package"
+   $(Q)bash tests/export-tests.sh
+
 #
 # NOTE: For static compiles, you need to have all the required libs
 #  static equivalent available
diff --git a/tests/export-tests.sh b/tests/export-tests.sh
new file mode 100755
index ..0ed7dd99
--- /dev/null
+++ b/tests/export-tests.sh
@@ -0,0 +1,37 @@
+#!/bin/bash
+# export the testsuite files to a separate tar
+
+TESTSUITES_LIST_FILE=$PWD/testsuites-list
+if ! [ -f "$TESTSUITES_LIST_FILE" ]; then
+   echo "testsuites list file does not exist."
+   exit 1
+fi
+
+TESTSUITES_LIST=$(cat "$TESTSUITES_LIST_FILE")
+if [ -z "$TESTSUITES_LIST" ]; then
+   echo "no files are listed in testsuites-list"
+   exit 1
+fi
+
+DEST="btrfs-progs-tests.tar.gz"
+if [ -f "$DEST" ]; then
+   echo "removing existing package: $DEST"
+   rm "$DEST"
+fi
+
+TEST_ID=$PWD/testsuites-id
+if [ -f $TEST_ID ];then
+   rm $TEST_ID
+fi
+VERSION=`./version.sh`
+TIMESTAMP=`date -u "+%Y-%m-%d %T %Z"`
+
+echo "git version: " $VERSION > $TEST_ID
+echo "this tar is created in: " $TIMESTAMP >> $TEST_ID
+
+echo "begin create tar:  " $DEST
+tar --exclude-vcs-ignores -zScf $DEST -C ../ $TESTSUITES_LIST
+if [ $? -eq 0 ]; then
+   echo "create tar successfully."
+fi
+rm $TEST_ID
diff --git a/testsuites-list b/testsuites-list
new file mode 100644
index ..a24591f5
--- /dev/null
+++ b/testsuites-list
@@ -0,0 +1,22 @@
+btrfs-progs/testsuites-id
+btrfs-progs/fssum
+btrfs-progs/btrfs-corrupt-block
+btrfs-progs/Documentation/
+btrfs-progs/tests/cli-tests
+btrfs-progs/tests/cli-tests.sh
+btrfs-progs/tests/common
+btrfs-progs/tests/common.convert
+btrfs-progs/tests/common.local
+btrfs-progs/tests/convert-tests
+btrfs-progs/tests/convert-tests.sh
+btrfs-progs/tests/fsck-tests
+btrfs-progs/tests/fsck-tests.sh
+btrfs-progs/tests/fuzz-tests/
+btrfs-progs/tests/fuzz-tests.sh
+btrfs-progs/tests/misc-tests/
+btrfs-progs/tests/misc-tests.sh
+btrfs-progs/tests/mkfs-tests/
+btrfs-progs/tests/mkfs-tests.sh
+btrfs-progs/tests/README.md
+btrfs-progs/tests/scan-results.sh
+btrfs-progs/tests/test-console.sh
-- 
2.14.3





Re: [PATCH] btrfs: Add enospc_debug printing in metadata_reserve_bytes

2018-02-05 Thread Qu Wenruo


On 2018-02-06 00:20, Nikolay Borisov wrote:
> 
> 
> On 15.12.2017 12:05, Nikolay Borisov wrote:
>> Currently when enospc_debug mount option is turned on we do not print
>> any debug info in case metadata reservation failures happen. Fix this
>> by adding the necessary hook in reserve_metadata_bytes.
>>
>> Signed-off-by: Nikolay Borisov 

Looks good.

Reviewed-by: Qu Wenruo 

Thanks,
Qu

>> ---
>>  fs/btrfs/extent-tree.c | 7 ++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>> index 4497f937e8fb..7a281fc97bc5 100644
>> --- a/fs/btrfs/extent-tree.c
>> +++ b/fs/btrfs/extent-tree.c
>> @@ -5382,10 +5382,15 @@ static int reserve_metadata_bytes(struct btrfs_root 
>> *root,
>>  !block_rsv_use_bytes(global_rsv, orig_bytes))
>>  ret = 0;
>>  }
>> -if (ret == -ENOSPC)
>> +if (ret == -ENOSPC) {
>>  trace_btrfs_space_reservation(fs_info, "space_info:enospc",
>>block_rsv->space_info->flags,
>>orig_bytes, 1);
>> +
>> +if (btrfs_test_opt(fs_info, ENOSPC_DEBUG))
>> +dump_space_info(fs_info, block_rsv->space_info,
>> +orig_bytes, 0);
>> +}
>>  return ret;
>>  }
>>  
>>
> 
> Ping
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 





Re: [PATCH RFC] Btrfs: expose bad chunks in sysfs

2018-02-05 Thread Qu Wenruo


On 2018-02-06 07:15, Liu Bo wrote:
> Btrfs tries its best to tolerate write errors, but kind of silently
> (except some messages in kernel log).
> 
> For raid1 and raid10, this is usually not a problem because there is a
> copy as backup, while for parity based raid setup, i.e. raid5 and
> raid6, the problem is that, if a write error occurs due to some bad
> sectors, one horizontal stripe becomes degraded and the number of write
> errors it can tolerate gets reduced by one, so if two disks then fail,
> data may be lost forever.
> 
> One way to mitigate the data loss pain is to expose 'bad chunks',
> i.e. degraded chunks, to users, so that they can use 'btrfs balance'
> to relocate the whole chunk and get the full raid6 protection again
> (if the relocation works).
> 
> This introduces 'bad_chunks' in btrfs's per-fs sysfs directory.  Once
> a chunk of raid5 or raid6 becomes degraded, it will appear in
> 'bad_chunks'.

Sysfs looks good.

Other systems use their own interfaces to report such status, though:
mdadm uses /proc/mdstat, and LVM uses lvdisplay/lvs.

So this would be yet another, new sysfs interface.

> 
> Signed-off-by: Liu Bo 
> ---
> - In this patch, 'bad chunks' is not persistent on disk, but it can be
>   added if it's thought to be a good idea.

IMHO such a bad chunk list can be built from the existing dev status at
mount time, although using dev status may cause extra problems such as
false alerts.

> - This is lightly tested, comments are very welcome.

I just checked the code, and there are 2 concerns:

1) The way bad chunks are removed
   Currently a bad chunk can only be removed from the list when the
   chunk itself is removed.
   If a transient write error happens, the bad chunk will just stay
   there forever (unless the chunk is removed).

   That seems likely to cause false alerts.

   And extra logic in the kernel to determine whether it's a really bad
   chunk seems a little complex and less flexible.
   (Maybe an interface to inform userspace where the problem happens
   would be more flexible?)

2) Bad chunks are only added when writing
   The read routine should also be able to detect bad chunks, with better
   accuracy.

> 
>  fs/btrfs/ctree.h   |  8 +++
>  fs/btrfs/disk-io.c |  2 ++
>  fs/btrfs/extent-tree.c | 13 +++
>  fs/btrfs/raid56.c  | 59 
> --
>  fs/btrfs/sysfs.c   | 26 ++
>  fs/btrfs/volumes.c | 15 +++--
>  fs/btrfs/volumes.h |  2 ++
>  7 files changed, 121 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 13c260b..08aad65 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -1101,6 +1101,9 @@ struct btrfs_fs_info {
>   spinlock_t ref_verify_lock;
>   struct rb_root block_tree;
>  #endif
> +
> + struct list_head bad_chunks;

An rbtree may be better here, since iterating a list to remove a bad
chunk can sometimes be slow.

> + seqlock_t bc_lock;
>  };
>  
>  static inline struct btrfs_fs_info *btrfs_sb(struct super_block *sb)
> @@ -2568,6 +2571,11 @@ static inline gfp_t btrfs_alloc_write_mask(struct 
> address_space *mapping)
>  
>  /* extent-tree.c */
>  
> +struct btrfs_bad_chunk {
> + u64 chunk_offset;

It would be better to also have chunk_size to inform the user.
The chunk start alone won't tell the user how serious the problem is.

And according to the code, any write error marks the chunk as bad,
no matter whether the error can be tolerated.

It would be better to have different error types: bad for errors that
cannot be tolerated, and degraded for errors that can.
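
A rough userspace sketch of that idea follows. It is only an illustration, not code from the patch: `bad_chunk_state`, `bad_chunk_info` and `classify_chunk()` are hypothetical names, and `max_errors` is assumed to be the number of failed stripes the profile can tolerate (for example 1 for raid5 and 2 for raid6).

```c
#include <stdint.h>

/* Hypothetical severity levels; these names are not part of the patch. */
enum bad_chunk_state {
	CHUNK_OK = 0,       /* no write errors recorded */
	CHUNK_DEGRADED = 1, /* errors seen, but still within the tolerated count */
	CHUNK_BAD = 2,      /* more errors than the profile can tolerate */
};

/* Hypothetical record: offset plus size, so the affected range is visible. */
struct bad_chunk_info {
	uint64_t chunk_offset;
	uint64_t chunk_size;
	enum bad_chunk_state state;
};

/*
 * Classify a chunk from the number of write errors seen on a stripe.
 * max_errors is assumed to be the tolerated number of failed stripes.
 */
enum bad_chunk_state classify_chunk(int err_cnt, int max_errors)
{
	if (err_cnt <= 0)
		return CHUNK_OK;
	if (err_cnt > max_errors)
		return CHUNK_BAD;
	return CHUNK_DEGRADED;
}
```

With a distinction like this, a single failed write on raid6 would show up as degraded rather than bad, which is the case where a balance can still restore full protection.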

Thanks,
Qu

> + struct list_head list;
> +};
> +
>  enum btrfs_inline_ref_type {
>   BTRFS_REF_TYPE_INVALID = 0,
>   BTRFS_REF_TYPE_BLOCK =   1,
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index a8ecccf..061e7f94 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2568,6 +2568,8 @@ int open_ctree(struct super_block *sb,
>   init_waitqueue_head(&fs_info->async_submit_wait);
>  
>   INIT_LIST_HEAD(&fs_info->pinned_chunks);
> + INIT_LIST_HEAD(&fs_info->bad_chunks);
> + seqlock_init(&fs_info->bc_lock);
>  
>   /* Usable values until the real ones are cached from the superblock */
>   fs_info->nodesize = 4096;
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 2f43285..3ca7cb4 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -9903,6 +9903,19 @@ int btrfs_free_block_groups(struct btrfs_fs_info *info)
>   kobject_del(&space_info->kobj);
>   kobject_put(&space_info->kobj);
>   }
> +
> + /* Clean up bad chunks. */
> + write_seqlock_irq(&info->bc_lock);
> + while (!list_empty(&info->bad_chunks)) {
> + struct btrfs_bad_chunk *bc;
> +
> + bc = list_first_entry(&info->bad_chunks,
> +   struct btrfs_bad_chunk, list);
> + list_del_init(&bc->list);
> + kfree(bc);
> + }
> + write_sequnlock_irq(&info->bc_lock);
> +
>   return 0;
>  }
>  
> 

[PATCH RFC] Btrfs: expose bad chunks in sysfs

2018-02-05 Thread Liu Bo
Btrfs tries its best to tolerate write errors, but kind of silently
(except some messages in kernel log).

For raid1 and raid10, this is usually not a problem because there is a
copy as backup, while for parity based raid setup, i.e. raid5 and
raid6, the problem is that, if a write error occurs due to some bad
sectors, one horizontal stripe becomes degraded and the number of write
errors it can tolerate gets reduced by one, so if two disks then fail,
data may be lost forever.

One way to mitigate the data loss pain is to expose 'bad chunks',
i.e. degraded chunks, to users, so that they can use 'btrfs balance'
to relocate the whole chunk and get the full raid6 protection again
(if the relocation works).

This introduces 'bad_chunks' in btrfs's per-fs sysfs directory.  Once
a chunk of raid5 or raid6 becomes degraded, it will appear in
'bad_chunks'.

Signed-off-by: Liu Bo 
---
- In this patch, 'bad chunks' is not persistent on disk, but it can be
  added if it's thought to be a good idea.
- This is lightly tested, comments are very welcome.

 fs/btrfs/ctree.h   |  8 +++
 fs/btrfs/disk-io.c |  2 ++
 fs/btrfs/extent-tree.c | 13 +++
 fs/btrfs/raid56.c  | 59 --
 fs/btrfs/sysfs.c   | 26 ++
 fs/btrfs/volumes.c | 15 +++--
 fs/btrfs/volumes.h |  2 ++
 7 files changed, 121 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 13c260b..08aad65 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1101,6 +1101,9 @@ struct btrfs_fs_info {
spinlock_t ref_verify_lock;
struct rb_root block_tree;
 #endif
+
+   struct list_head bad_chunks;
+   seqlock_t bc_lock;
 };
 
 static inline struct btrfs_fs_info *btrfs_sb(struct super_block *sb)
@@ -2568,6 +2571,11 @@ static inline gfp_t btrfs_alloc_write_mask(struct 
address_space *mapping)
 
 /* extent-tree.c */
 
+struct btrfs_bad_chunk {
+   u64 chunk_offset;
+   struct list_head list;
+};
+
 enum btrfs_inline_ref_type {
BTRFS_REF_TYPE_INVALID = 0,
BTRFS_REF_TYPE_BLOCK =   1,
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index a8ecccf..061e7f94 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2568,6 +2568,8 @@ int open_ctree(struct super_block *sb,
init_waitqueue_head(&fs_info->async_submit_wait);
 
INIT_LIST_HEAD(&fs_info->pinned_chunks);
+   INIT_LIST_HEAD(&fs_info->bad_chunks);
+   seqlock_init(&fs_info->bc_lock);
 
/* Usable values until the real ones are cached from the superblock */
fs_info->nodesize = 4096;
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 2f43285..3ca7cb4 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -9903,6 +9903,19 @@ int btrfs_free_block_groups(struct btrfs_fs_info *info)
kobject_del(&space_info->kobj);
kobject_put(&space_info->kobj);
}
+
+   /* Clean up bad chunks. */
+   write_seqlock_irq(&info->bc_lock);
+   while (!list_empty(&info->bad_chunks)) {
+   struct btrfs_bad_chunk *bc;
+
+   bc = list_first_entry(&info->bad_chunks,
+ struct btrfs_bad_chunk, list);
+   list_del_init(&bc->list);
+   kfree(bc);
+   }
+   write_sequnlock_irq(&info->bc_lock);
+
return 0;
 }
 
diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index a7f7925..e960247 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -888,14 +888,19 @@ static void rbio_orig_end_io(struct btrfs_raid_bio *rbio, 
blk_status_t err)
 }
 
 /*
- * end io function used by finish_rmw.  When we finally
- * get here, we've written a full stripe
+ * end io function used by finish_rmw.  When we finally get here, we've written
+ * a full stripe.
+ *
+ * Note that this is not under interrupt context as we queued endio to workers.
  */
 static void raid_write_end_io(struct bio *bio)
 {
struct btrfs_raid_bio *rbio = bio->bi_private;
blk_status_t err = bio->bi_status;
int max_errors;
+   u64 stripe_start = rbio->bbio->raid_map[0];
+   struct btrfs_fs_info *fs_info = rbio->fs_info;
+   int err_cnt;
 
if (err)
fail_bio_stripe(rbio, bio);
@@ -908,12 +913,58 @@ static void raid_write_end_io(struct bio *bio)
err = BLK_STS_OK;
 
/* OK, we have read all the stripes we need to. */
+   err_cnt = atomic_read(&rbio->error);
max_errors = (rbio->operation == BTRFS_RBIO_PARITY_SCRUB) ?
 0 : rbio->bbio->max_errors;
if (atomic_read(&rbio->error) > max_errors)
err = BLK_STS_IOERR;
 
rbio_orig_end_io(rbio, err);
+
+   /*
+* If there is any error, this stripe is a degraded one, so is the whole
+* chunk, expose this chunk info to sysfs.
+*/
+   if (unlikely(err_cnt)) {
+   struct btrfs_bad_chunk *bc;
+   struct btrfs_bad_chunk 

Re: [PATCH 0/2] Remove custom crc32c init code from btrfs

2018-02-05 Thread David Sterba
On Mon, Jan 08, 2018 at 11:45:03AM +0200, Nikolay Borisov wrote:
> So here is a small 2-patch set which removes btrfs' manual initialisation of
> the lower level crc32c module. An explanation of why this is OK can be found
> in Patch 2/2.
> 
> Patch 1/2 just adds a function to the generic crc32c header which allows
> querying the actual crc32c implementation used (i.e. software or
> hw-accelerated) to retain current btrfs behavior. This is mainly used for
> debugging purposes and is independent.
> 
> Nikolay Borisov (2):
>   libcrc32c: Add crc32c_impl function
>   btrfs: Remove custom crc32c init code

As we got ack from Herbert, I'm going to add the patches to my for-next.


Re: [PATCH 2/2] btrfs: Remove custom crc32c init code

2018-02-05 Thread David Sterba
On Fri, Feb 02, 2018 at 09:46:40PM +0200, Andy Shevchenko wrote:
> On Mon, Jan 8, 2018 at 11:45 AM, Nikolay Borisov wrote:
> > The custom crc32 init code was introduced in
> > 14a958e678cd ("Btrfs: fix btrfs boot when compiled as built-in") to
> > enable using btrfs as a built-in. However, later as pointed out by
> > 60efa5eb2e88 ("Btrfs: use late_initcall instead of module_init") this
> > wasn't enough and finally btrfs was switched to late_initcall which
> > comes after the generic crc32c implementation is initialised. The
> > latter commit superseded the former. Now that we don't have to
> > maintain our own code let's just remove it and switch to using the
> > generic implementation.
> 
> >  fs/btrfs/hash.c| 54 
> > --
> >  fs/btrfs/hash.h| 43 
> 
> IIRC Adding -D to git format-patch will make it shorter and nicer.
> But please, double check that git am actually detects removals.

While the patch looks better, my 'git am' does not want to apply the
deletion:

$ git am 0001-btrfs-Remove-custom-crc32c-init-code.patch
Applying: btrfs: Remove custom crc32c init code
error: removal patch leaves file contents
error: fs/btrfs/hash.c: patch does not apply
error: removal patch leaves file contents
error: fs/btrfs/hash.h: patch does not apply
Patch failed at 0001 btrfs: Remove custom crc32c init code
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

so the ordinary diff shall be preferred.


How to efficiently remove a subvolume with children located further than BTRFS_INO_LOOKUP_PATH_MAX?

2018-02-05 Thread Nikita Gerasimov

Hi All,

I have an issue with recursive subvolume deletion when a child is located
further than PATH_MAX.

As I understand it, the current algorithm is:
- Find the child with BTRFS_IOC_TREE_SEARCH, which works fine.
- Get the relative path to the child with BTRFS_IOC_INO_LOOKUP (that's the problem).
- Open the child fd and repeat recursively for as long as
BTRFS_IOC_TREE_SEARCH finds something.

So when the relative path does not fit into BTRFS_INO_LOOKUP_PATH_MAX,
BTRFS_IOC_INO_LOOKUP returns ENAMETOOLONG and zeros in
btrfs_ioctl_ino_lookup_args.
The only way I can see so far is to walk through the parent fs tree to find
objects whose names match the BTRFS_IOC_TREE_SEARCH results and check them
with fstat. That doesn't look very good in terms of performance. Is there any
shorter way to get an fd from a child btrfs_root_ref?
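
For reference, a minimal sketch of the lookup step described above, assuming
the struct btrfs_ioctl_ino_lookup_args and BTRFS_IOC_INO_LOOKUP definitions
from <linux/btrfs.h>. The helper name ino_lookup_path and its output-buffer
convention are made up for illustration; this is not how btrfs-progs
structures the call, and it only shows where ENAMETOOLONG surfaces rather
than a way around it.

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/btrfs.h>

/*
 * Hypothetical helper: ask the kernel for the path of `objectid` within
 * tree `treeid`, using `fd` as the handle into the filesystem.
 * Returns 0 on success or a negative errno, e.g. -ENAMETOOLONG when the
 * kernel cannot fit the path into BTRFS_INO_LOOKUP_PATH_MAX bytes.
 */
static int ino_lookup_path(int fd, __u64 treeid, __u64 objectid,
			   char *out, size_t out_len)
{
	struct btrfs_ioctl_ino_lookup_args args;
	size_t len;

	memset(&args, 0, sizeof(args));
	args.treeid = treeid;
	args.objectid = objectid;

	if (ioctl(fd, BTRFS_IOC_INO_LOOKUP, &args) < 0)
		return -errno;

	/* Do not trust the terminator blindly; bound the scan to the buffer. */
	len = strnlen(args.name, sizeof(args.name));
	if (len >= out_len)
		return -ENAMETOOLONG;

	memcpy(out, args.name, len);
	out[len] = '\0';
	return 0;
}
```

A caller would presumably pass the parent subvolume fd together with the
tree id and dirid taken from a BTRFS_IOC_TREE_SEARCH result; a return value
of -ENAMETOOLONG is the case this mail is about.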


Also, btrfs-progs fails to delete subvolumes with a canonical path longer
than PATH_MAX because realpath also returns ENAMETOOLONG. At the same time,
creation by relative path works fine.



Re: [PATCH v2] btrfs: handle failure of add_pending_csums

2018-02-05 Thread David Sterba
On Fri, Jan 26, 2018 at 09:28:58AM -0500, Josef Bacik wrote:
> On Mon, Jan 08, 2018 at 10:59:43AM +0200, Nikolay Borisov wrote:
> > add_pending_csums was added as part of the new data=ordered
> > implementation in e6dcd2dc9c48 ("Btrfs: New data=ordered
> > implementation"). Even back then it called btrfs_csum_file_blocks,
> > which can fail, but it never bothered handling the failure. In an
> > ENOMEM situation this could lead to the filesystem failing to write
> > the checksums for a particular extent and not detecting this. On read
> > this could lead to the filesystem erroring out due to a crc mismatch.
> > Fix it by propagating failure from add_pending_csums and handling it.
> > 
> > Signed-off-by: Nikolay Borisov 
> 
> Reviewed-by: Josef Bacik 

Reviewed-by: David Sterba 

and added to next, thanks.


Re: [PATCH v3] btrfs: print error if primary super block write fails

2018-02-05 Thread David Sterba
On Mon, Feb 05, 2018 at 02:45:21PM +0800, Anand Jain wrote:
> On 02/05/2018 02:38 PM, Anand Jain wrote:
> > On 02/03/2018 03:09 AM, Howard McLauchlan wrote:
> >> Presently, failing a primary super block write but succeeding in at
> >> least one super block write in general will appear to users as if
> >> nothing important went wrong.
> > 
> >> However, upon unmounting and re-mounting,
> >> the file system will be in a rolled back state.
> > 
> > Right! In case of non-datasync IO, or where applications depend on
> > transaction_commit() to confirm the IO.
> > 
> > David, Qu,
> > 
> >   In this context (single disk btrfs, the write to primary SB has
> >   been failing but read is fine), if a system is rebooted and continues
> >   production, data loss is imminent! btrfs_err() won't help to
> >   avoid the data loss, nor could we blame the user for not taking any
> >   action on the btrfs_err().
> 
>   I was wrong. I missed the return -1 which this patch adds, which
>   means it is choice a. as below. IMO choice b. would have been
>   better, though.

I'm concerned about the cases where automatically mounting from the 2nd
superblock would make things worse, and IIRC there is not enough data to
decide in the mount callback.


Re: [PATCH v3] btrfs: print error if primary super block write fails

2018-02-05 Thread David Sterba
On Fri, Feb 02, 2018 at 11:19:59AM -0800, Howard McLauchlan wrote:
> On 02/02/2018 11:09 AM, Howard McLauchlan wrote:
> > Presently, failing a primary super block write but succeeding in at
> > least one super block write in general will appear to users as if
> > nothing important went wrong. However, upon unmounting and re-mounting,
> > the file system will be in a rolled back state. This was discovered
> > with a BCC program that uses bpf_override_return() to fail super block
> > writes.
> > 
> > This patch outputs an error clarifying that the primary super block
> > write has failed, so users can expect potentially erroneous behaviour.
> > It also forces wait_dev_supers() to return an error to its caller if
> > the primary super block write fails.
> > 
> > Signed-off-by: Howard McLauchlan 
> Reviewed-by: Qu Wenruo 

Added to next, thanks.
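
The return-value rule described in the commit message can be sketched in
userspace as follows. This is only a model built from that description: the
function name wait_supers_result and the write_ok array are hypothetical,
and the real wait_dev_supers() inspects buffer heads rather than booleans.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model: write_ok[i] says whether super block copy i reached
 * the disk; index 0 is the primary copy.
 */
static int wait_supers_result(const bool *write_ok, size_t copies)
{
	size_t errors = 0;
	bool primary_failed = false;

	for (size_t i = 0; i < copies; i++) {
		if (!write_ok[i]) {
			errors++;
			if (i == 0)
				primary_failed = true;
		}
	}

	/* Behaviour described by the patch: a failed primary write is an error. */
	if (primary_failed)
		return -1;

	/* Otherwise succeed as long as at least one copy was written. */
	return errors < copies ? 0 : -1;
}
```

In this model a failed backup copy is still tolerated, while a failed
primary copy is reported to the caller even if the backups succeeded.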


Re: [RFC PATCH v2 2/4] btrfs-progs: Add EXEC represent path of executable file

2018-02-05 Thread David Sterba
On Sat, Feb 03, 2018 at 09:07:47AM +, Gu, Jinxiang wrote:
> 
> 
> > -Original Message-
> > From: David Sterba [mailto:dste...@suse.cz]
> > Sent: Saturday, February 03, 2018 2:08 AM
> > To: Gu, Jinxiang/顾 金香 
> > Cc: linux-btrfs@vger.kernel.org; dste...@suse.cz; quwenruo.bt...@gmx.com
> > Subject: Re: [RFC PATCH v2 2/4] btrfs-progs: Add EXEC represent path of 
> > executable file
> > 
> > On Fri, Feb 02, 2018 at 04:34:03PM +0800, Gu Jinxiang wrote:
> > > > Use EXEC instead of TOP to represent the path of executable file.
> > > > EXEC is set to TOP by default, but when there is no executable file in
> > > > TOP, use the path where btrfs is installed as EXEC.
> > > 
> > > What if we just allow changing TOP (i.e. do not overwrite it in the test 
> > > driver scripts)? The logic will be the same as with EXEC, but we won't
> > > have to rewrite essentially all paths in the testsuite.
> > 
> Besides executable files, TOP is also used to find
> $TOP/tests/common, $TOP/Documentation, $TOP/tests/fuzz-tests/images,
> etc., so changing TOP will also affect those resources.

Yeah, $TOP is used for too many different things. It works from
inside git, but this needs to change for the exported testsuite. So
TEST_TOP for the sourced 'common' scripts and maybe other variables for
the internal binaries that do not typically get installed.

> So, I introduce EXEC to differentiate executable files and other resources.

I understand why, but I don't like this approach. TOP should be enough
for all binaries that are expected to exist in the path; the rest need
their own variables.


[PATCH] btrfs: alloc_chunk: fix DUP stripe size handling

2018-02-05 Thread Hans van Kranenburg
In case of using DUP, we search for enough unallocated disk space on a
device to hold two stripes.

The devices_info[ndevs-1].max_avail that holds the amount of unallocated
space found is directly assigned to stripe_size, while it's actually
twice the stripe size.

Later on in the code, an unconditional division of stripe_size by
dev_stripes corrects the value, but in the meantime there's a check to
see if the stripe_size does not exceed max_chunk_size. Since stripe_size
is twice the intended amount during this check, the check will reduce
stripe_size to max_chunk_size whenever the correct stripe_size is more
than half of max_chunk_size.

The unconditional division later tries to correct stripe_size, but will
actually make sure we can't allocate more than half the max_chunk_size.

Fix this by moving the division by dev_stripes before the max chunk size
check, so it always contains the right value, instead of putting a duct
tape division in further on to get it fixed again.

Since in all other cases than DUP, dev_stripes is 1, this change only
affects DUP.
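
To illustrate the ordering problem with numbers, here is a userspace sketch
under simplifying assumptions: any extra rounding or clamping the real max
chunk size check performs is left out, and struct chunk_params,
stripe_size_old() and stripe_size_new() are hypothetical names, not kernel
code. Only the order of the division and the limit check is modelled.

```c
#include <stdint.h>

/* Stand-in for the kernel's BTRFS_STRIPE_LEN (64KiB). */
#define STRIPE_LEN (64ULL * 1024)

/* Simplified inputs; field names mirror the patch but this is a model. */
struct chunk_params {
	uint64_t max_avail;      /* unallocated space found on the device */
	uint64_t max_chunk_size; /* upper limit checked by the allocator */
	uint64_t data_stripes;   /* stripes carrying distinct data */
	uint64_t dev_stripes;    /* stripes per device: 2 for DUP, else 1 */
};

static uint64_t round_down_stripe(uint64_t v)
{
	return v - (v % STRIPE_LEN);
}

/* Old order: limit check first, division by dev_stripes afterwards. */
static uint64_t stripe_size_old(const struct chunk_params *p)
{
	uint64_t stripe_size = p->max_avail;

	if (stripe_size * p->data_stripes > p->max_chunk_size)
		stripe_size = p->max_chunk_size / p->data_stripes;
	stripe_size /= p->dev_stripes;
	return round_down_stripe(stripe_size);
}

/* New order: divide first, so the limit check sees the real stripe size. */
static uint64_t stripe_size_new(const struct chunk_params *p)
{
	uint64_t stripe_size = p->max_avail / p->dev_stripes;

	if (stripe_size * p->data_stripes > p->max_chunk_size)
		stripe_size = p->max_chunk_size / p->data_stripes;
	return round_down_stripe(stripe_size);
}
```

With 4GiB unallocated, a 1GiB max_chunk_size, one data stripe and
dev_stripes = 2, the old order yields a 512MiB stripe and the new order a
1GiB stripe. With dev_stripes = 1 both orders give the same result.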

Other attempts in the past were made to fix this:
* 37db63a400 "Btrfs: fix max chunk size check in chunk allocator" tried
to fix the same problem, but still resulted in part of the code acting
on a wrongly doubled stripe_size value.
* 86db25785a "Btrfs: fix max chunk size on raid5/6" unintentionally
broke this fix again.

The real problem was already introduced with the rest of the code in
73c5de0051.

The user visible result however will be that the max chunk size for DUP
will suddenly double, while it's actually acting according to the limits
in the code again like it was 5 years ago.

Reported-by: Naohiro Aota 
Link: https://www.spinics.net/lists/linux-btrfs/msg69752.html
Fixes: 73c5de0051 ("btrfs: quasi-round-robin for chunk allocation")
Fixes: 86db25785a ("Btrfs: fix max chunk size on raid5/6")
Signed-off-by: Hans van Kranenburg 
Cc: Naohiro Aota 
Cc: Arne Jansen 
Cc: Chris Mason 
---
 fs/btrfs/volumes.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 4006b2a1233d..a50bd02b7ada 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4737,7 +4737,7 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 * the primary goal is to maximize the number of stripes, so use as many
 * devices as possible, even if the stripes are not maximum sized.
 */
-   stripe_size = devices_info[ndevs-1].max_avail;
+   stripe_size = div_u64(devices_info[ndevs-1].max_avail, dev_stripes);
num_stripes = ndevs * dev_stripes;
 
/*
@@ -4772,8 +4772,6 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
stripe_size = devices_info[ndevs-1].max_avail;
}
 
-   stripe_size = div_u64(stripe_size, dev_stripes);
-
/* align to BTRFS_STRIPE_LEN */
stripe_size = round_down(stripe_size, BTRFS_STRIPE_LEN);
 
-- 
2.11.0



Re: [PATCH] btrfs: Add enospc_debug printing in metadata_reserve_bytes

2018-02-05 Thread Nikolay Borisov


On 15.12.2017 12:05, Nikolay Borisov wrote:
> Currently when enospc_debug mount option is turned on we do not print
> any debug info in case metadata reservation failures happen. Fix this
> by adding the necessary hook in reserve_metadata_bytes.
> 
> Signed-off-by: Nikolay Borisov 
> ---
>  fs/btrfs/extent-tree.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 4497f937e8fb..7a281fc97bc5 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -5382,10 +5382,15 @@ static int reserve_metadata_bytes(struct btrfs_root *root,
>   !block_rsv_use_bytes(global_rsv, orig_bytes))
>   ret = 0;
>   }
> - if (ret == -ENOSPC)
> + if (ret == -ENOSPC) {
>   trace_btrfs_space_reservation(fs_info, "space_info:enospc",
> block_rsv->space_info->flags,
> orig_bytes, 1);
> +
> + if (btrfs_test_opt(fs_info, ENOSPC_DEBUG))
> + dump_space_info(fs_info, block_rsv->space_info,
> + orig_bytes, 0);
> + }
>   return ret;
>  }
>  
> 

Ping


Re: [PATCH v2 1/4] btrfs: Remove userspace transaction ioctls

2018-02-05 Thread Sage Weil
On Mon, 5 Feb 2018, David Sterba wrote:
> On Mon, Feb 05, 2018 at 05:52:52PM +0900, Wang Shilong wrote:
> > Were these ioctls originally introduced by Sage Weil for Ceph use?
> > Not sure whether they are still useful, Cc Sage just in case.
> 
> We have checked that the ioctl is not used in ceph, the reasons why we
> think it's ok to remove the ioctl were in the cover letter of v1.

Yeah, +1 on removal.

Acked-by: Sage Weil 

sage


Re: [PATCH v2 1/4] btrfs: Remove userspace transaction ioctls

2018-02-05 Thread David Sterba
On Mon, Feb 05, 2018 at 05:52:52PM +0900, Wang Shilong wrote:
> Were these ioctls originally introduced by Sage Weil for Ceph use?
> Not sure whether they are still useful, Cc Sage just in case.

We have checked that the ioctl is not used in ceph, the reasons why we
think it's ok to remove the ioctl were in the cover letter of v1.


Re: [PATCH v2 1/4] btrfs: Remove userspace transaction ioctls

2018-02-05 Thread Wang Shilong

Were these ioctls originally introduced by Sage Weil for Ceph use?
Not sure whether they are still useful, Cc Sage just in case.


On Feb 5, 2018, at 5:41 PM, Nikolay Borisov  wrote:

Commit 3558d4f88ec8 ("btrfs: Deprecate userspace transaction ioctls")
marked the beginning of the end of userspace transaction. This commit
finishes the job!

Signed-off-by: Nikolay Borisov 

---

V2:
* Also remove the usage of btrfs_ioctl_trans_end from btrfs_release_file so 
that the patch compiles on its own as well.

fs/btrfs/ctree.h |  1 -
fs/btrfs/file.c  |  8 -
fs/btrfs/ioctl.c | 95 
3 files changed, 104 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 1a462ab85c49..6a4752177ad8 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3193,7 +3193,6 @@ void btrfs_destroy_inode(struct inode *inode);
int btrfs_drop_inode(struct inode *inode);
int __init btrfs_init_cachep(void);
void btrfs_destroy_cachep(void);
-long btrfs_ioctl_trans_end(struct file *file);
struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location,
 struct btrfs_root *root, int *was_new);
struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index cba2ac371ce0..101e0c7fea92 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1996,8 +1996,6 @@ int btrfs_release_file(struct inode *inode, struct file *filp)
{
struct btrfs_file_private *private = filp->private_data;

-   if (private && private->trans)
-   btrfs_ioctl_trans_end(filp);
if (private && private->filldir_buf)
kfree(private->filldir_buf);
kfree(private);
@@ -2189,12 +2187,6 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
}

/*
-* ok we haven't committed the transaction yet, lets do a commit
-*/
-   if (file->private_data)
-   btrfs_ioctl_trans_end(file);
-
-   /*
 * We use start here because we will need to wait on the IO to complete
 * in btrfs_sync_log, which could require joining a transaction (for
 * example checking cross references in the nocow path).  If we use join
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index f573cad72b7e..3094e079fc4f 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3935,73 +3935,6 @@ int btrfs_clone_file_range(struct file *src_file, loff_t off,
return btrfs_clone_files(dst_file, src_file, off, len, destoff);
}

-/*
- * there are many ways the trans_start and trans_end ioctls can lead
- * to deadlocks.  They should only be used by applications that
- * basically own the machine, and have a very in depth understanding
- * of all the possible deadlocks and enospc problems.
- */
-static long btrfs_ioctl_trans_start(struct file *file)
-{
-   struct inode *inode = file_inode(file);
-   struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
-   struct btrfs_root *root = BTRFS_I(inode)->root;
-   struct btrfs_trans_handle *trans;
-   struct btrfs_file_private *private;
-   int ret;
-   static bool warned = false;
-
-   ret = -EPERM;
-   if (!capable(CAP_SYS_ADMIN))
-   goto out;
-
-   if (!warned) {
-   btrfs_warn(fs_info,
-   "Userspace transaction mechanism is considered "
-   "deprecated and slated to be removed in 4.17. "
-   "If you have a valid use case please "
-   "speak up on the mailing list");
-   WARN_ON(1);
-   warned = true;
-   }
-
-   ret = -EINPROGRESS;
-   private = file->private_data;
-   if (private && private->trans)
-   goto out;
-   if (!private) {
-   private = kzalloc(sizeof(struct btrfs_file_private),
- GFP_KERNEL);
-   if (!private)
-   return -ENOMEM;
-   file->private_data = private;
-   }
-
-   ret = -EROFS;
-   if (btrfs_root_readonly(root))
-   goto out;
-
-   ret = mnt_want_write_file(file);
-   if (ret)
-   goto out;
-
-   atomic_inc(&fs_info->open_ioctl_trans);
-
-   ret = -ENOMEM;
-   trans = btrfs_start_ioctl_transaction(root);
-   if (IS_ERR(trans))
-   goto out_drop;
-
-   private->trans = trans;
-   return 0;
-
-out_drop:
-   atomic_dec(&fs_info->open_ioctl_trans);
-   mnt_drop_write_file(file);
-out:
-   return ret;
-}
-
static long btrfs_ioctl_default_subvol(struct file *file, void __user *argp)
{
struct inode *inode = file_inode(file);
@@ -4243,30 +4176,6 @@ static long btrfs_ioctl_space_info(struct btrfs_fs_info *fs_info,
return ret;
}

-/*
- * there are many ways the trans_start and trans_end ioctls can lead
- * to deadlocks.  They should only be used by 

[PATCH v2 4/4] btrfs: Remove btrfs_fs_info::open_ioctl_trans

2018-02-05 Thread Nikolay Borisov
Since userspace transactions have been removed, we no longer have any use
for this field, so delete it.

Signed-off-by: Nikolay Borisov 
---

V2: 
 * No changes
 fs/btrfs/ctree.h   | 1 -
 fs/btrfs/extent-tree.c | 3 +--
 fs/btrfs/transaction.c | 9 +++--
 3 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index dc679246b8e8..57a0d0b0ea74 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -877,7 +877,6 @@ struct btrfs_fs_info {
struct rb_root tree_mod_log;
 
atomic_t async_delalloc_pages;
-   atomic_t open_ioctl_trans;
 
/*
 * this is used to protect the following list -- ordered_roots.
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 8d51e4bb67c1..dcb059a46b77 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4329,8 +4329,7 @@ int btrfs_alloc_data_chunk_ondemand(struct btrfs_inode *inode, u64 bytes)
 
/* commit the current transaction and try again */
 commit_trans:
-   if (need_commit &&
-   !atomic_read(&fs_info->open_ioctl_trans)) {
+   if (need_commit) {
need_commit--;
 
if (need_commit > 0) {
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index d61d1fd59ccd..c5ef42318058 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -443,8 +443,7 @@ static int may_wait_transaction(struct btrfs_fs_info *fs_info, int type)
if (test_bit(BTRFS_FS_LOG_RECOVERING, &fs_info->flags))
return 0;
 
-   if (type == TRANS_START &&
-   !atomic_read(&fs_info->open_ioctl_trans))
+   if (type == TRANS_START)
return 1;
 
return 0;
@@ -774,8 +773,7 @@ int btrfs_wait_for_commit(struct btrfs_fs_info *fs_info, u64 transid)
 
 void btrfs_throttle(struct btrfs_fs_info *fs_info)
 {
-   if (!atomic_read(&fs_info->open_ioctl_trans))
-   wait_current_trans(fs_info);
+   wait_current_trans(fs_info);
 }
 
 static int should_end_transaction(struct btrfs_trans_handle *trans)
@@ -857,8 +855,7 @@ static int __btrfs_end_transaction(struct btrfs_trans_handle *trans,
 
btrfs_trans_release_chunk_metadata(trans);
 
-   if (lock && !atomic_read(&info->open_ioctl_trans) &&
-   should_end_transaction(trans) &&
+   if (lock && should_end_transaction(trans) &&
READ_ONCE(cur_trans->state) == TRANS_STATE_RUNNING) {
spin_lock(&info->trans_lock);
if (cur_trans->state == TRANS_STATE_RUNNING)
-- 
2.7.4



[PATCH v2 2/4] btrfs: Remove btrfs_file_private::trans

2018-02-05 Thread Nikolay Borisov
Now that the userspace transaction IOCTLs have been removed, this member
is no longer used, so just remove it.

Signed-off-by: Nikolay Borisov 
---

V2: 
 * This was 3/4 before, but now move it to 2/4 since it makes more sense
 * Reword the subject to be more concise

 fs/btrfs/ctree.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 6a4752177ad8..dc679246b8e8 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1265,7 +1265,6 @@ struct btrfs_root {
 };
 
 struct btrfs_file_private {
-   struct btrfs_trans_handle *trans;
void *filldir_buf;
 };
 
-- 
2.7.4



[PATCH v2 1/4] btrfs: Remove userspace transaction ioctls

2018-02-05 Thread Nikolay Borisov
Commit 3558d4f88ec8 ("btrfs: Deprecate userspace transaction ioctls")
marked the beginning of the end of userspace transaction. This commit
finishes the job!

Signed-off-by: Nikolay Borisov 
---

V2:
 * Also remove the usage of btrfs_ioctl_trans_end from btrfs_release_file so 
 that the patch compiles on its own as well.

 fs/btrfs/ctree.h |  1 -
 fs/btrfs/file.c  |  8 -
 fs/btrfs/ioctl.c | 95 
 3 files changed, 104 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 1a462ab85c49..6a4752177ad8 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3193,7 +3193,6 @@ void btrfs_destroy_inode(struct inode *inode);
 int btrfs_drop_inode(struct inode *inode);
 int __init btrfs_init_cachep(void);
 void btrfs_destroy_cachep(void);
-long btrfs_ioctl_trans_end(struct file *file);
 struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location,
 struct btrfs_root *root, int *was_new);
 struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index cba2ac371ce0..101e0c7fea92 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1996,8 +1996,6 @@ int btrfs_release_file(struct inode *inode, struct file *filp)
 {
struct btrfs_file_private *private = filp->private_data;
 
-   if (private && private->trans)
-   btrfs_ioctl_trans_end(filp);
if (private && private->filldir_buf)
kfree(private->filldir_buf);
kfree(private);
@@ -2189,12 +2187,6 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
}
 
/*
-* ok we haven't committed the transaction yet, lets do a commit
-*/
-   if (file->private_data)
-   btrfs_ioctl_trans_end(file);
-
-   /*
 * We use start here because we will need to wait on the IO to complete
 * in btrfs_sync_log, which could require joining a transaction (for
 * example checking cross references in the nocow path).  If we use join
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index f573cad72b7e..3094e079fc4f 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3935,73 +3935,6 @@ int btrfs_clone_file_range(struct file *src_file, loff_t off,
return btrfs_clone_files(dst_file, src_file, off, len, destoff);
 }
 
-/*
- * there are many ways the trans_start and trans_end ioctls can lead
- * to deadlocks.  They should only be used by applications that
- * basically own the machine, and have a very in depth understanding
- * of all the possible deadlocks and enospc problems.
- */
-static long btrfs_ioctl_trans_start(struct file *file)
-{
-   struct inode *inode = file_inode(file);
-   struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
-   struct btrfs_root *root = BTRFS_I(inode)->root;
-   struct btrfs_trans_handle *trans;
-   struct btrfs_file_private *private;
-   int ret;
-   static bool warned = false;
-
-   ret = -EPERM;
-   if (!capable(CAP_SYS_ADMIN))
-   goto out;
-
-   if (!warned) {
-   btrfs_warn(fs_info,
-   "Userspace transaction mechanism is considered "
-   "deprecated and slated to be removed in 4.17. "
-   "If you have a valid use case please "
-   "speak up on the mailing list");
-   WARN_ON(1);
-   warned = true;
-   }
-
-   ret = -EINPROGRESS;
-   private = file->private_data;
-   if (private && private->trans)
-   goto out;
-   if (!private) {
-   private = kzalloc(sizeof(struct btrfs_file_private),
- GFP_KERNEL);
-   if (!private)
-   return -ENOMEM;
-   file->private_data = private;
-   }
-
-   ret = -EROFS;
-   if (btrfs_root_readonly(root))
-   goto out;
-
-   ret = mnt_want_write_file(file);
-   if (ret)
-   goto out;
-
-   atomic_inc(&fs_info->open_ioctl_trans);
-
-   ret = -ENOMEM;
-   trans = btrfs_start_ioctl_transaction(root);
-   if (IS_ERR(trans))
-   goto out_drop;
-
-   private->trans = trans;
-   return 0;
-
-out_drop:
-   atomic_dec(&fs_info->open_ioctl_trans);
-   mnt_drop_write_file(file);
-out:
-   return ret;
-}
-
 static long btrfs_ioctl_default_subvol(struct file *file, void __user *argp)
 {
struct inode *inode = file_inode(file);
@@ -4243,30 +4176,6 @@ static long btrfs_ioctl_space_info(struct btrfs_fs_info *fs_info,
return ret;
 }
 
-/*
- * there are many ways the trans_start and trans_end ioctls can lead
- * to deadlocks.  They should only be used by applications that
- * basically own the machine, and have a very in depth understanding
- * of all the possible deadlocks and enospc problems.
- */
-long 

[PATCH v2 3/4] btrfs: Remove code referencing unused TRANS_USERSPACE

2018-02-05 Thread Nikolay Borisov
Now that the userspace transaction ioctls have been removed,
TRANS_USERSPACE is no longer used hence we can remove it.

Signed-off-by: Nikolay Borisov 
---

V2: 
 * This was 2/4 but now is 3/4
 * Also remove the declaration of btrfs_start_ioctl_transaction from 
 transaction.h
 fs/btrfs/transaction.c | 27 ++-
 fs/btrfs/transaction.h |  6 +-
 2 files changed, 7 insertions(+), 26 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 04f07144b45c..d61d1fd59ccd 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -37,22 +37,16 @@
 
 static const unsigned int btrfs_blocked_trans_types[TRANS_STATE_MAX] = {
[TRANS_STATE_RUNNING]   = 0U,
-   [TRANS_STATE_BLOCKED]   = (__TRANS_USERSPACE |
-  __TRANS_START),
-   [TRANS_STATE_COMMIT_START]  = (__TRANS_USERSPACE |
-  __TRANS_START |
-  __TRANS_ATTACH),
-   [TRANS_STATE_COMMIT_DOING]  = (__TRANS_USERSPACE |
-  __TRANS_START |
+   [TRANS_STATE_BLOCKED]   =  __TRANS_START,
+   [TRANS_STATE_COMMIT_START]  = (__TRANS_START | __TRANS_ATTACH),
+   [TRANS_STATE_COMMIT_DOING]  = (__TRANS_START |
   __TRANS_ATTACH |
   __TRANS_JOIN),
-   [TRANS_STATE_UNBLOCKED] = (__TRANS_USERSPACE |
-  __TRANS_START |
+   [TRANS_STATE_UNBLOCKED] = (__TRANS_START |
   __TRANS_ATTACH |
   __TRANS_JOIN |
   __TRANS_JOIN_NOLOCK),
-   [TRANS_STATE_COMPLETED] = (__TRANS_USERSPACE |
-  __TRANS_START |
+   [TRANS_STATE_COMPLETED] = (__TRANS_START |
   __TRANS_ATTACH |
   __TRANS_JOIN |
   __TRANS_JOIN_NOLOCK),
@@ -449,9 +443,6 @@ static int may_wait_transaction(struct btrfs_fs_info *fs_info, int type)
if (test_bit(BTRFS_FS_LOG_RECOVERING, &fs_info->flags))
return 0;
 
-   if (type == TRANS_USERSPACE)
-   return 1;
-
if (type == TRANS_START &&
!atomic_read(&fs_info->open_ioctl_trans))
return 1;
@@ -593,7 +584,7 @@ start_transaction(struct btrfs_root *root, unsigned int num_items,
 got_it:
btrfs_record_root_in_trans(h, root);
 
-   if (!current->journal_info && type != TRANS_USERSPACE)
+   if (!current->journal_info)
current->journal_info = h;
return h;
 
@@ -678,12 +669,6 @@ struct btrfs_trans_handle *btrfs_join_transaction_nolock(struct btrfs_root *root
 BTRFS_RESERVE_NO_FLUSH, true);
 }
 
-struct btrfs_trans_handle *btrfs_start_ioctl_transaction(struct btrfs_root *root)
-{
-   return start_transaction(root, 0, TRANS_USERSPACE,
-BTRFS_RESERVE_NO_FLUSH, true);
-}
-
 /*
  * btrfs_attach_transaction() - catch the running transaction
  *
diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
index 6beee072b1bd..8a6361828c69 100644
--- a/fs/btrfs/transaction.h
+++ b/fs/btrfs/transaction.h
@@ -89,21 +89,18 @@ struct btrfs_transaction {
 
 #define __TRANS_FREEZABLE  (1U << 0)
 
-#define __TRANS_USERSPACE  (1U << 8)
 #define __TRANS_START  (1U << 9)
 #define __TRANS_ATTACH (1U << 10)
 #define __TRANS_JOIN   (1U << 11)
 #define __TRANS_JOIN_NOLOCK(1U << 12)
 #define __TRANS_DUMMY  (1U << 13)
 
-#define TRANS_USERSPACE(__TRANS_USERSPACE | __TRANS_FREEZABLE)
 #define TRANS_START(__TRANS_START | __TRANS_FREEZABLE)
 #define TRANS_ATTACH   (__TRANS_ATTACH)
 #define TRANS_JOIN (__TRANS_JOIN | __TRANS_FREEZABLE)
 #define TRANS_JOIN_NOLOCK  (__TRANS_JOIN_NOLOCK)
 
-#define TRANS_EXTWRITERS   (__TRANS_USERSPACE | __TRANS_START |\
-__TRANS_ATTACH)
+#define TRANS_EXTWRITERS   (__TRANS_START | __TRANS_ATTACH)
 
 #define BTRFS_SEND_TRANS_STUB  ((void *)1)
 
@@ -194,7 +191,6 @@ struct btrfs_trans_handle *btrfs_join_transaction_nolock(struct btrfs_root *root
 struct btrfs_trans_handle *btrfs_attach_transaction(struct btrfs_root *root);
 struct btrfs_trans_handle *btrfs_attach_transaction_barrier(
struct btrfs_root *root);
-struct btrfs_trans_handle *btrfs_start_ioctl_transaction(struct btrfs_root *root);
 int btrfs_wait_for_commit(struct btrfs_fs_info *fs_info, u64 transid);
 
 void btrfs_add_dead_root(struct btrfs_root *root);
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe