read-only for no good reason on 4.9.30
I have a system with less than 50% disk space used. It just started rejecting writes due to lack of disk space. I ran "btrfs balance" and then it started working correctly again. It seems that a btrfs filesystem, if left alone, will eventually get fragmented enough that it rejects writes (I've had similar issues with other systems running BTRFS with other kernel versions).

Is this a known issue? Is there any good way of recognising when it's likely to happen? Is there anything I can do other than rewriting a medium size file to determine when it's happened?

# uname -a
Linux trex 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26) x86_64 GNU/Linux
# df -h /
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdc        239G  113G  126G  48% /
# btrfs fi df /
Data, RAID1: total=117.00GiB, used=111.81GiB
System, RAID1: total=32.00MiB, used=48.00KiB
Metadata, RAID1: total=1.00GiB, used=516.00MiB
GlobalReserve, single: total=246.59MiB, used=0.00B
# btrfs dev usa /
/dev/sdc, ID: 1
   Device size:           238.47GiB
   Device slack:              0.00B
   Data,RAID1:            117.00GiB
   Metadata,RAID1:          1.00GiB
   System,RAID1:           32.00MiB
   Unallocated:           120.44GiB

/dev/sdd, ID: 2
   Device size:           238.47GiB
   Device slack:              0.00B
   Data,RAID1:            117.00GiB
   Metadata,RAID1:          1.00GiB
   System,RAID1:           32.00MiB
   Unallocated:           120.44GiB

--
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] btrfs-progs: inspect-internal rootid: Allow a file to be specified
Since cmd_inspect_rootid() calls btrfs_open_dir(), it rejects a file being specified. But as the documentation says, a file should be supported.

This patch introduces btrfs_open_file_or_dir(), a counterpart of btrfs_open_dir(), to safely check and open a btrfs file or directory. The original btrfs_open_dir() code is moved to btrfs_open() and shared by both functions.

Signed-off-by: Tomohiro Misono
---
 cmds-inspect.c |  2 +-
 utils.c        | 16 +++++++++++++---
 utils.h        |  2 ++
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/cmds-inspect.c b/cmds-inspect.c
index d1a3a0e..885f3ab 100644
--- a/cmds-inspect.c
+++ b/cmds-inspect.c
@@ -318,7 +318,7 @@ static int cmd_inspect_rootid(int argc, char **argv)
 	if (check_argc_exact(argc - optind, 1))
 		usage(cmd_inspect_rootid_usage);
 
-	fd = btrfs_open_dir(argv[optind], &dirstream, 1);
+	fd = btrfs_open_file_or_dir(argv[optind], &dirstream, 1);
 	if (fd < 0) {
 		ret = -ENOENT;
 		goto out;
diff --git a/utils.c b/utils.c
index bb04913..9db39eb 100644
--- a/utils.c
+++ b/utils.c
@@ -568,9 +568,9 @@ int open_path_or_dev_mnt(const char *path, DIR **dirstream, int verbose)
 /*
  * Do the following checks before calling open_file_or_dir():
  * 1: path is in a btrfs filesystem
- * 2: path is a directory
+ * 2: path is a directory if dir_only is 1
  */
-int btrfs_open_dir(const char *path, DIR **dirstream, int verbose)
+int btrfs_open(const char *path, DIR **dirstream, int verbose, int dir_only)
 {
 	struct statfs stfs;
 	struct stat st;
@@ -593,7 +593,7 @@ int btrfs_open_dir(const char *path, DIR **dirstream, int verbose)
 		return -1;
 	}
 
-	if (!S_ISDIR(st.st_mode)) {
+	if (dir_only && !S_ISDIR(st.st_mode)) {
 		error_on(verbose, "not a directory: %s", path);
 		return -3;
 	}
@@ -607,6 +607,16 @@ int btrfs_open_dir(const char *path, DIR **dirstream, int verbose)
 	return ret;
 }
 
+int btrfs_open_dir(const char *path, DIR **dirstream, int verbose)
+{
+	return btrfs_open(path, dirstream, verbose, 1);
+}
+
+int btrfs_open_file_or_dir(const char *path, DIR **dirstream, int verbose)
+{
+	return btrfs_open(path, dirstream, verbose, 0);
+}
+
 /* checks if a device is a loop device */
 static int is_loop_device (const char* device) {
 	struct stat statbuf;
diff --git a/utils.h b/utils.h
index 091f8fa..d28a05a 100644
--- a/utils.h
+++ b/utils.h
@@ -108,7 +108,9 @@ int is_block_device(const char *file);
 int is_mount_point(const char *file);
 int check_arg_type(const char *input);
 int open_path_or_dev_mnt(const char *path, DIR **dirstream, int verbose);
+int btrfs_open(const char *path, DIR **dirstream, int verbose, int dir_only);
 int btrfs_open_dir(const char *path, DIR **dirstream, int verbose);
+int btrfs_open_file_or_dir(const char *path, DIR **dirstream, int verbose);
 u64 btrfs_device_size(int fd, struct stat *st);
 /* Helper to always get proper size of the destination string */
 #define strncpy_null(dest, src) __strncpy_null(dest, src, sizeof(dest))
--
2.9.5
Re: block group 11778977169408 has wrong amount of free space
On 2017-09-04 09:51, Christoph Anton Mitterer wrote:
> Did another mount with clear_cache,rw (cause it was ro before)... now I
> get even more errors:
> # btrfs check /dev/mapper/data-a2 ; echo $?
> Checking filesystem on /dev/mapper/data-a2
> UUID: f8acb432-7604-46ba-b3ad-0abe8e92c4db
> checking extents
> checking free space cache
> block group 9857516175360 has wrong amount of free space
> failed to load free space cache for block group 9857516175360
> block group 11778977169408 has wrong amount of free space
> failed to load free space cache for block group 11778977169408
> checking fs roots
> checking csums
> checking root refs
> found 4404625330176 bytes used, no error found
> total csum bytes: 4293007908
> total tree bytes: 7511883776
> total fs tree bytes: 1856258048
> total extent tree bytes: 1097842688
> btree space waste bytes: 887738230
> file data blocks allocated: 4397113446400
>  referenced 4515055595520
> 0
>
> what the???

IIRC clear_cache will only clear the cache of modified block groups for v1 space cache.

And that's why we have btrfs check --clear-space-cache v1, which will wipe out all (v1) space cache.

Thanks,
Qu
[PATCH 1/2] btrfs-progs: mkfs: Fix wrong file type for dir items and indexes when specifying root directory
[Bug]
If mkfs.btrfs is used with the "-r" parameter and the specified directory contains fifo/socket/char/block special files, the created btrfs can't pass fsck:
--
checking fs roots
unresolved ref dir 241158 index 3 namelen 9 name S.dirmngr filetype 0 errors 80, filetype mismatch
ERROR: errors found in fs roots
--

[Reason]
Btrfs dir items/indexes record the inode type, while "-r" only handles directories, regular files and soft links. It therefore records such special files as regular files, which causes the problem.

[Fix]
Add the missing types to add_directory_items(), so that the result of "mkfs.btrfs -r" can pass fsck.

Signed-off-by: Qu Wenruo
---
 mkfs/main.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mkfs/main.c b/mkfs/main.c
index afd68bc5..84ff300b 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -435,6 +435,14 @@ static int add_directory_items(struct btrfs_trans_handle *trans,
 		filetype = BTRFS_FT_REG_FILE;
 	if (S_ISLNK(st->st_mode))
 		filetype = BTRFS_FT_SYMLINK;
+	if (S_ISSOCK(st->st_mode))
+		filetype = BTRFS_FT_SOCK;
+	if (S_ISCHR(st->st_mode))
+		filetype = BTRFS_FT_CHRDEV;
+	if (S_ISBLK(st->st_mode))
+		filetype = BTRFS_FT_BLKDEV;
+	if (S_ISFIFO(st->st_mode))
+		filetype = BTRFS_FT_FIFO;
 
 	ret = btrfs_insert_dir_item(trans, root, name, name_len,
 				    parent_inum, &location,
--
2.14.1
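The classification that the fix performs can also be sketched outside of C. The following Python is only an illustration and not btrfs-progs code: dir_item_type is a hypothetical helper, and it returns the BTRFS_FT_* names used in the patch as plain strings rather than the real constants.

```python
import stat


def dir_item_type(mode: int) -> str:
    """Return the BTRFS_FT_* name that matches an inode mode (illustrative only)."""
    checks = [
        (stat.S_ISREG, "BTRFS_FT_REG_FILE"),
        (stat.S_ISDIR, "BTRFS_FT_DIR"),
        (stat.S_ISLNK, "BTRFS_FT_SYMLINK"),
        # The four kinds below are the ones the patch adds.
        (stat.S_ISSOCK, "BTRFS_FT_SOCK"),
        (stat.S_ISCHR, "BTRFS_FT_CHRDEV"),
        (stat.S_ISBLK, "BTRFS_FT_BLKDEV"),
        (stat.S_ISFIFO, "BTRFS_FT_FIFO"),
    ]
    for predicate, name in checks:
        if predicate(mode):
            return name
    # Nothing matched: report unknown instead of guessing a type.
    return "BTRFS_FT_UNKNOWN"


fifo_type = dir_item_type(stat.S_IFIFO | 0o644)
sock_type = dir_item_type(stat.S_IFSOCK | 0o755)
```

Because the file-kind predicates are mutually exclusive, the order of the checks does not matter here, which is also why the chain of independent if statements in the patch is safe.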
[PATCH 2/2] btrfs-progs: test/mkfs: Add test case for rootdir parameter
Add a test case which checks that the -r|--rootdir option can handle softlink/char/block/fifo files.

Signed-off-by: Qu Wenruo
---
 .../009-special-files-for-rootdir/test.sh | 40 ++
 1 file changed, 40 insertions(+)
 create mode 100755 tests/mkfs-tests/009-special-files-for-rootdir/test.sh

diff --git a/tests/mkfs-tests/009-special-files-for-rootdir/test.sh b/tests/mkfs-tests/009-special-files-for-rootdir/test.sh
new file mode 100755
index 00000000..bc5297d0
--- /dev/null
+++ b/tests/mkfs-tests/009-special-files-for-rootdir/test.sh
@@ -0,0 +1,40 @@
+#!/bin/bash
+# Check if --rootdir can handle special files (socket/fifo/char/block) correctly
+#
+# --rootdir had a problem of filling dir items/indexes with wrong type
+# and caused btrfs check to report such error
+
+source "$TOP/tests/common"
+
+check_prereq mkfs.btrfs
+check_prereq btrfs
+
+setup_root_helper	# For mknod
+prepare_test_dev 128M
+
+# mknod can create FIFO/CHAR/BLOCK file but not SOCK.
+# No neat tool to create socket file, unless using python or similar.
+# So no SOCK is tested here
+check_global_prereq mknod
+
+# Also check regular file
+check_global_prereq dd
+
+# And dir
+check_global_prereq mkdir
+
+tmp="/tmp/btrfs_selftest_$$"
+
+run_check mkdir $tmp
+run_check mkdir $tmp/dir
+run_check mkdir -p $tmp/dir/in/dir
+run_check mknod $tmp/fifo p
+run_check $SUDO_HELPER mknod $tmp/char c 1 1
+run_check $SUDO_HELPER mknod $tmp/block b 1 1
+run_check dd if=/dev/zero bs=1M count=1 of=$tmp/regular
+
+run_check $SUDO_HELPER "$TOP/mkfs.btrfs" -f -r "$tmp" $TEST_DEV
+
+rm "$tmp" -rf
+
+run_check $SUDO_HELPER "$TOP/btrfs" check $TEST_DEV
--
2.14.1
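The test comment notes that creating a socket file needs Python or similar. A minimal sketch of that idea follows, assuming a Linux host with AF_UNIX support. make_socket_file and the temporary directory are hypothetical names for illustration and are not part of the btrfs-progs test suite.

```python
import os
import socket
import stat
import tempfile


def make_socket_file(path: str) -> None:
    """Create a Unix domain socket node at path; the node remains after close()."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)  # bind() is what creates the socket file on disk
    finally:
        sock.close()


# Hypothetical scratch directory, standing in for the test's $tmp.
workdir = tempfile.mkdtemp(prefix="btrfs_selftest_")
sock_path = os.path.join(workdir, "sock")
make_socket_file(sock_path)

# lstat() so that the node itself is inspected, not a link target.
is_sock = stat.S_ISSOCK(os.lstat(sock_path).st_mode)
```

A test script could call something like this before running mkfs.btrfs -r, so that the SOCK case is covered as well.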
Re: How to disable/revoke 'compression'?
On 2017-09-04 08:14, Adam Borowski wrote:
> On Mon, Sep 04, 2017 at 07:55:27AM +0800, Qu Wenruo wrote:
>> On 2017-09-04 02:06, Adam Borowski wrote:
>>> I've once written a tool which does this, but 1. it's extremely slow, 2.
>>> insane, 3. so insane a certain member of this list would kill me had I
>>> distributed the tool.  Thus, I'd need to rewrite it first...
>>
>> AFAIK the only method to determine the compression ratio is to check the
>> EXTENT_DATA key and its corresponding file_extent_item structure.
>> (Which I assume Adam is doing this way)
>>
>> In that structure is records its on-disk data size and in-memory data size.
>> (All rounded up to sectorsize, which is 4K in most case)
>> So in theory it's possible to determine the compression ratio.
>>
>> The only method I can think of (maybe I forgot some methods?) is to use
>> offline tool (btrfs-debug-tree) to check that.
>> FS APIs like fiemap doesn't even support to report on-disk data size so we
>> can't use it.
>
> BTRFS_IOC_TREE_SEARCH_V2 returns all we want to know; its only downside is
> being root only.

Just forgot that.

>> But the problem is more complicated, especially when compressed CoW is
>> involved.
>>
>> For example, there is an extent (A) which represents the data for inode 258,
>> range [0,128k).
>> On disk size its just 4K.
>>
>> And when we write the range [32K, 64K), which get CoWed and compressed,
>> resulting a new file extent (B) for inode 258, range [32K, 64K), and on disk
>> size is 4K as an example.
>>
>> Then file extent layout for 258 will be:
>> [0,32k):     range [0,32K) of uncompressed Extent A
>> [32k, 64k):  range [0,32k) of uncompressed Extent B
>> [64k, 128k): range [64k, 128K) of uncompressed Extent A.
>>
>> And on disk extent size is 4K (compressed Extent A) + 4K (compressed Extent
>> B) = 8K.
>>
>> Before the write, the compression ratio is 4K/128K = 3.125%
>> While after write, the compression ratio is 8K/128K = 6.25%
>
> There's no real meaningful way to speak about compression ratio of a partial
> extent.  Thus, I decided to, for every extent, take compressed:uncompressed
> sizes of the whole extent, no matter whether the file uses only a few bytes
> of that extent or references it a thousand times.

Very clever move.

>> Not to mention that it's possible to have uncompressed file extent.
>
> Yeah, the tool gives a report like:
> all   74%  9.2M/ 13M
> lzo   68%  7.1M/ 11M
> none 100%  2.1M/ 2.1M
>
> as you typically have a mix of compressible and uncompressible data.

Looks quite nice!

Thanks,
Qu

> Meow!
Re: block group 11778977169408 has wrong amount of free space
Did another mount with clear_cache,rw (cause it was ro before)... now I get even more errors:

# btrfs check /dev/mapper/data-a2 ; echo $?
Checking filesystem on /dev/mapper/data-a2
UUID: f8acb432-7604-46ba-b3ad-0abe8e92c4db
checking extents
checking free space cache
block group 9857516175360 has wrong amount of free space
failed to load free space cache for block group 9857516175360
block group 11778977169408 has wrong amount of free space
failed to load free space cache for block group 11778977169408
checking fs roots
checking csums
checking root refs
found 4404625330176 bytes used, no error found
total csum bytes: 4293007908
total tree bytes: 7511883776
total fs tree bytes: 1856258048
total extent tree bytes: 1097842688
btree space waste bytes: 887738230
file data blocks allocated: 4397113446400
 referenced 4515055595520
0

what the???
Re: block group 11778977169408 has wrong amount of free space
Just checked, and mounting with clear_cache, and then re-fscking doesn't even fix the problem... Output stays the same.

Cheers,
Chris.
Re: BTRFS: error (device dm-2) in btrfs_run_delayed_refs:2960: errno=-17 Object already exists (since 3.4 / 2012)
Ok this output looked fishy and so I went and tested it on my box again. It looks like I wasn't testing modifying a snapshot with an existing fs so I never saw these errors, but I see them as well. I definitely fucked the building of the initial ref tree. It's too late tonight for me to rework it and have it working for you, but I should be able to get it into shape in the morning. I'll let you know when I have something useful to test, sorry about the mess,

Josef

Sent from my iPhone

> On Sep 3, 2017, at 4:21 PM, Marc MERLIN wrote:
>
>> On Sun, Sep 03, 2017 at 05:33:33PM +0000, Josef Bacik wrote:
>> Alright pushed, sorry about that.
>
> I'm reasonably sure I'm running the new code, but still got this:
> [ 2104.336513] Dropping a ref for a root that doesn't have a ref on the block
> [ 2104.358226] Dumping block entry [115253923840 155648], num_refs 1, metadata 0, from disk 1
> [ 2104.384037] Ref root 0, parent 3414272884736, owner 262813, offset 0, num_refs 18446744073709551615
> [ 2104.412766] Ref root 418, parent 0, owner 262813, offset 0, num_refs 1
> [ 2104.433888] Root entry 418, num_refs 1
> [ 2104.446648] Root entry 69869, num_refs 0
> [ 2104.459904] Ref action 2, root 69869, ref_root 0, parent 3414272884736, owner 262813, offset 0, num_refs 18446744073709551615
> [ 2104.496244] No Stacktrace
>
> Now, in the background I had a monthly md check of the underlying device
> (mdadm raid 5), and got some of those. Obviously that's not good, and
> I'm assuming that md raid5 may not have a checksum on blocks, so it won't know
> which drive has the corrupted data.
> Does that sound right?
>
> Now, the good news is that btrfs on top does have checksums, so running a
> scrub should hopefully find those corrupted blocks if they happen to be in
> use by the filesystem (maybe they are free).
> But as a reminder, this whole thread started with my FS maybe not being in a
> good state, but both check --repair and scrub returning clean. Maybe I'll use
> the opportunity to re-run a check --repair and a scrub after that to see what
> state things are in.
>
> md6: mismatch sector in range 3581539536-3581539544
> md6: mismatch sector in range 3581539544-3581539552
> md6: mismatch sector in range 3581539552-3581539560
> md6: mismatch sector in range 3581539560-3581539568
> md6: mismatch sector in range 3581543792-3581543800
> md6: mismatch sector in range 3581543800-3581543808
> md6: mismatch sector in range 3581543808-3581543816
> md6: mismatch sector in range 3581543816-3581543824
> md6: mismatch sector in range 3581544112-3581544120
> md6: mismatch sector in range 3581544120-3581544128
>
> As for your patch, no idea why it's not giving me a stacktrace, sorry :-/
>
> Git log of my tree does show:
> commit aa162d2908bd7452805ea812b7550232b0b6ed53
> Author: Josef Bacik
> Date:   Sun Sep 3 13:32:17 2017 -0400
>
>     Btrfs: use be->metadata just in case
>
>     I suspect we're not getting the owner in some cases, so we want to just
>     use the known value.
>
>     Signed-off-by: Josef Bacik
>
> Marc
> --
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
> Microsoft is to operating systems what McDonalds is to gourmet cooking
> Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
block group 11778977169408 has wrong amount of free space
Hey.

Just got the following:

$ uname -a
Linux heisenberg 4.12.0-1-amd64 #1 SMP Debian 4.12.6-1 (2017-08-12) x86_64 GNU/Linux
$ btrfs version
btrfs-progs v4.12

on a filesystem:

# btrfs check /dev/mapper/data-a2 ; echo $?
Checking filesystem on /dev/mapper/data-a2
UUID: f8acb432-7604-46ba-b3ad-0abe8e92c4db
checking extents
checking free space cache
block group 11778977169408 has wrong amount of free space
failed to load free space cache for block group 11778977169408
checking fs roots
checking csums
checking root refs
found 4404625739776 bytes used, no error found
total csum bytes: 4293007908
total tree bytes: 7511900160
total fs tree bytes: 1856258048
total extent tree bytes: 1097859072
btree space waste bytes: 887753954
file data blocks allocated: 4397113839616
 referenced 4515055988736
0

Any idea what could cause these free space issues and how to clean them up? I thought that should work with recent kernels. Could it mean some data will be corrupted when I do e.g. a mount with clear_cache?

Interestingly, $? is still 0... even though errors were found. And the kernel log shows nothing.

Cheers,
Chris.
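Since the exit status is 0 here even though free space cache problems are printed, a script cannot rely on $? alone. A minimal sketch of scanning the text instead, assuming the message wording stays as in the output above. bad_block_groups is a hypothetical helper and not a btrfs-progs interface.

```python
import re
from typing import List

# Wording copied from the btrfs check output quoted in this message.
BAD_FREE_SPACE = re.compile(r"block group (\d+) has wrong amount of free space")


def bad_block_groups(check_output: str) -> List[int]:
    """Return the block group numbers flagged in captured 'btrfs check' output."""
    return [int(match.group(1)) for match in BAD_FREE_SPACE.finditer(check_output)]


sample = """checking free space cache
block group 11778977169408 has wrong amount of free space
failed to load free space cache for block group 11778977169408
checking fs roots
found 4404625739776 bytes used, no error found
"""

flagged = bad_block_groups(sample)
```

Only the "has wrong amount of free space" line is matched, so each block group is reported once even though the follow-up "failed to load" line names it again.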
Re: How to disable/revoke 'compression'?
On Mon, Sep 04, 2017 at 07:55:27AM +0800, Qu Wenruo wrote:
> On 2017-09-04 02:06, Adam Borowski wrote:
> > I've once written a tool which does this, but 1. it's extremely slow, 2.
> > insane, 3. so insane a certain member of this list would kill me had I
> > distributed the tool.  Thus, I'd need to rewrite it first...
>
> AFAIK the only method to determine the compression ratio is to check the
> EXTENT_DATA key and its corresponding file_extent_item structure.
> (Which I assume Adam is doing this way)
>
> In that structure is records its on-disk data size and in-memory data size.
> (All rounded up to sectorsize, which is 4K in most case)
> So in theory it's possible to determine the compression ratio.
>
> The only method I can think of (maybe I forgot some methods?) is to use
> offline tool (btrfs-debug-tree) to check that.
> FS APIs like fiemap doesn't even support to report on-disk data size so we
> can't use it.

BTRFS_IOC_TREE_SEARCH_V2 returns all we want to know; its only downside is being root only.

> But the problem is more complicated, especially when compressed CoW is
> involved.
>
> For example, there is an extent (A) which represents the data for inode 258,
> range [0,128k).
> On disk size its just 4K.
>
> And when we write the range [32K, 64K), which get CoWed and compressed,
> resulting a new file extent (B) for inode 258, range [32K, 64K), and on disk
> size is 4K as an example.
>
> Then file extent layout for 258 will be:
> [0,32k):     range [0,32K) of uncompressed Extent A
> [32k, 64k):  range [0,32k) of uncompressed Extent B
> [64k, 128k): range [64k, 128K) of uncompressed Extent A.
>
> And on disk extent size is 4K (compressed Extent A) + 4K (compressed Extent
> B) = 8K.
>
> Before the write, the compression ratio is 4K/128K = 3.125%
> While after write, the compression ratio is 8K/128K = 6.25%

There's no real meaningful way to speak about compression ratio of a partial extent. Thus, I decided to, for every extent, take compressed:uncompressed sizes of the whole extent, no matter whether the file uses only a few bytes of that extent or references it a thousand times.

> Not to mention that it's possible to have uncompressed file extent.

Yeah, the tool gives a report like:
all   74%  9.2M/ 13M
lzo   68%  7.1M/ 11M
none 100%  2.1M/ 2.1M

as you typically have a mix of compressible and uncompressible data.

Meow!
--
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
⢿⡄⠘⠷⠚⠋⠀                                 -- Genghis Ht'rok'din
⠈⠳⣄
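The whole-extent accounting described above can be sketched as follows. This is not the tool itself: ratio_report and the extent tuples are hypothetical, and the sizes are made-up inputs rather than data read from a filesystem.

```python
from collections import defaultdict

KIB = 1024


def ratio_report(extents):
    """Sum whole-extent sizes per compression type and overall.

    extents: iterable of (compression, disk_bytes, ram_bytes), one entry per
    extent, counted once no matter how many files reference it.
    Returns {type: percentage of disk size over uncompressed size}.
    """
    totals = defaultdict(lambda: [0, 0])
    for compression, disk_bytes, ram_bytes in extents:
        for key in ("all", compression):
            totals[key][0] += disk_bytes
            totals[key][1] += ram_bytes
    return {key: round(100 * disk / ram) for key, (disk, ram) in totals.items()}


# Hypothetical input: one lzo extent at half size, one uncompressed extent.
report = ratio_report([
    ("lzo", 64 * KIB, 128 * KIB),
    ("none", 64 * KIB, 64 * KIB),
])
```

With these inputs the lzo row comes out at 50%, the none row at 100% and the all row at 67%, which mirrors the shape of the report quoted above.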
Re: How to disable/revoke 'compression'?
On 2017-09-04 02:06, Adam Borowski wrote:
> On Sun, Sep 03, 2017 at 07:32:01PM +0200, Cloud Admin wrote:
>> Hi,
>> I used the mount option 'compression' on some mounted sub volumes. How
>> can I revoke the compression? Means to delete the option and get all
>> data uncompressed on this volume.
>> Is it enough to remount the sub volume without this option? Or is it
>> necessary to do some additional step (balancing?) to get all stored data
>> uncompressed.
>
> If you set it via mount option, removing the option is enough to disable
> compression for _new_ files.  Other ways are chattr +c and btrfs-property,
> but if you haven't heard about those you almost surely don't have such
> attributes set.
>
> After remounting, you may uncompress existing files.  Balancing won't do
> this as it moves extents around without looking inside; defrag on the other
> hand rewrites extents thus as a side effect it applies new [non]compression
> settings.  Thus: 「btrfs fi defrag -r /path/to/filesystem」.
>
>> Beside of it, is it possible to find out what the real and compressed size
>> of a file, for example or the ratio?
>
> Currently not.
>
> I've once written a tool which does this, but 1. it's extremely slow, 2.
> insane, 3. so insane a certain member of this list would kill me had I
> distributed the tool.  Thus, I'd need to rewrite it first...

AFAIK the only method to determine the compression ratio is to check the EXTENT_DATA key and its corresponding file_extent_item structure. (Which I assume is how Adam is doing it.)

That structure records its on-disk data size and in-memory data size. (All rounded up to sectorsize, which is 4K in most cases.) So in theory it's possible to determine the compression ratio.

The only method I can think of (maybe I forgot some methods?) is to use an offline tool (btrfs-debug-tree) to check that. FS APIs like fiemap don't even support reporting the on-disk data size, so we can't use them.

But the problem is more complicated, especially when compressed CoW is involved.

For example, there is an extent (A) which represents the data for inode 258, range [0,128K). Its on-disk size is just 4K.

And when we write the range [32K, 64K), which gets CoWed and compressed, the result is a new file extent (B) for inode 258, range [32K, 64K), with an on-disk size of 4K as an example.

Then the file extent layout for 258 will be:
[0, 32K):    range [0, 32K) of uncompressed Extent A
[32K, 64K):  range [0, 32K) of uncompressed Extent B
[64K, 128K): range [64K, 128K) of uncompressed Extent A

And the on-disk extent size is 4K (compressed Extent A) + 4K (compressed Extent B) = 8K.

Before the write, the compression ratio is 4K/128K = 3.125%.
After the write, the compression ratio is 8K/128K = 6.25%.

Not to mention that it's possible to have uncompressed file extents.

So it's complicated even if we're just using an offline tool to determine the compression ratio of a btrfs compressed file.

Thanks,
Qu

> Meow!
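The arithmetic of the example above can be checked with a few lines of Python. This is only a worked calculation on the numbers from the message. The names layout and ratio_percent are introduced here for illustration.

```python
KIB = 1024


def ratio_percent(disk_bytes: int, logical_bytes: int) -> float:
    """On-disk size as a percentage of the logical (uncompressed) size."""
    return 100 * disk_bytes / logical_bytes


# File layout of inode 258 after the write: (file start, file end, extent).
layout = [
    (0 * KIB, 32 * KIB, "A"),
    (32 * KIB, 64 * KIB, "B"),
    (64 * KIB, 128 * KIB, "A"),
]
logical_size = sum(end - start for start, end, _ in layout)

# Extent A stays fully allocated although only part of it is referenced,
# so both compressed extents count towards the on-disk size.
on_disk = {"A": 4 * KIB, "B": 4 * KIB}

before = ratio_percent(on_disk["A"], logical_size)
after = ratio_percent(sum(on_disk.values()), logical_size)
```

Here before evaluates to 3.125 and after to 6.25, matching the percentages given in the message.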
Re: BTRFS: error (device dm-2) in btrfs_run_delayed_refs:2960: errno=-17 Object already exists (since 3.4 / 2012)
On Sun, Sep 03, 2017 at 05:33:33PM +0000, Josef Bacik wrote:
> Alright pushed, sorry about that.

I'm reasonably sure I'm running the new code, but still got this:
[ 2104.336513] Dropping a ref for a root that doesn't have a ref on the block
[ 2104.358226] Dumping block entry [115253923840 155648], num_refs 1, metadata 0, from disk 1
[ 2104.384037] Ref root 0, parent 3414272884736, owner 262813, offset 0, num_refs 18446744073709551615
[ 2104.412766] Ref root 418, parent 0, owner 262813, offset 0, num_refs 1
[ 2104.433888] Root entry 418, num_refs 1
[ 2104.446648] Root entry 69869, num_refs 0
[ 2104.459904] Ref action 2, root 69869, ref_root 0, parent 3414272884736, owner 262813, offset 0, num_refs 18446744073709551615
[ 2104.496244] No Stacktrace

Now, in the background I had a monthly md check of the underlying device (mdadm raid 5), and got some of those. Obviously that's not good, and I'm assuming that md raid5 may not have a checksum on blocks, so it won't know which drive has the corrupted data.
Does that sound right?

Now, the good news is that btrfs on top does have checksums, so running a scrub should hopefully find those corrupted blocks if they happen to be in use by the filesystem (maybe they are free).
But as a reminder, this whole thread started with my FS maybe not being in a good state, but both check --repair and scrub returning clean. Maybe I'll use the opportunity to re-run a check --repair and a scrub after that to see what state things are in.

md6: mismatch sector in range 3581539536-3581539544
md6: mismatch sector in range 3581539544-3581539552
md6: mismatch sector in range 3581539552-3581539560
md6: mismatch sector in range 3581539560-3581539568
md6: mismatch sector in range 3581543792-3581543800
md6: mismatch sector in range 3581543800-3581543808
md6: mismatch sector in range 3581543808-3581543816
md6: mismatch sector in range 3581543816-3581543824
md6: mismatch sector in range 3581544112-3581544120
md6: mismatch sector in range 3581544120-3581544128

As for your patch, no idea why it's not giving me a stacktrace, sorry :-/

Git log of my tree does show:
commit aa162d2908bd7452805ea812b7550232b0b6ed53
Author: Josef Bacik
Date:   Sun Sep 3 13:32:17 2017 -0400

    Btrfs: use be->metadata just in case

    I suspect we're not getting the owner in some cases, so we want to just
    use the known value.

    Signed-off-by: Josef Bacik

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
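The mismatch lines above cover adjacent 8-sector ranges, so merging them gives a shorter list of regions to look at. A sketch, assuming the log format shown in this message. merge_ranges is a hypothetical helper and not part of mdadm or btrfs-progs.

```python
import re

RANGE = re.compile(r"mismatch sector in range (\d+)-(\d+)")


def merge_ranges(log_text):
    """Merge touching or overlapping sector ranges found in md check log lines."""
    spans = sorted((int(start), int(end)) for start, end in RANGE.findall(log_text))
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            # This range starts at or before the end of the previous one.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(span) for span in merged]


log = """md6: mismatch sector in range 3581539536-3581539544
md6: mismatch sector in range 3581539544-3581539552
md6: mismatch sector in range 3581539552-3581539560
md6: mismatch sector in range 3581539560-3581539568
md6: mismatch sector in range 3581543792-3581543800
md6: mismatch sector in range 3581543800-3581543808
md6: mismatch sector in range 3581543808-3581543816
md6: mismatch sector in range 3581543816-3581543824
md6: mismatch sector in range 3581544112-3581544120
md6: mismatch sector in range 3581544120-3581544128
"""

regions = merge_ranges(log)
```

For the ten lines in this message the result is three regions: 3581539536-3581539568, 3581543792-3581543824 and 3581544112-3581544128.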
Re: speed up big btrfs volumes with ssds
> [ ... ] - needed volume size is 60TB

I wonder how long that takes to 'scrub', 'balance', 'check', 'subvolume delete', 'find', etc.

> [ ... ] 4x HW Raid 5 with 1GB controller memory of 4TB 3,5"
> devices and using btrfs as raid 0 for data and metadata on top
> of those 4 raid 5. [ ... ] the write speed is not as good as
> i would like - especially for random 8k-16k I/O. [ ... ]

Also I noticed that the rain is wet and cold - especially if one walks around for a few hours in a t-shirt, shorts and sandals. :-)

> My current idea is to use a pcie flash card with bcache on top
> of each raid 5. Is this something which makes sense to speed
> up the write speed.

Well 'bcache' in the role of write buffer allegedly helps turning unaligned writes into aligned writes, so might help, but I wonder how effective that will be in this case, plus it won't turn low random IOPS-per-TB 4TB devices into high ones. Anyhow if they are battery-backed the 1GB of HW HBA cache/buffer should do exactly that, except that again in this case that is rather optimistic.

But this reminds me of the common story: "Doctor, if I stab repeatedly my hand with a fork it hurts a lot, how to fix that?" "Don't do it". :-)

PS Random writes of 8-16KiB over 60TB might seem like storing small records/images in small files. That would be "brave". On a 60TB RAID50 of 20x 4TB disk drives that might mean around 5-10MB/s of random small writes, including both data and metadata.
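The 5-10MB/s estimate can be roughly reproduced with back-of-envelope arithmetic. The figure of about 100 random IOPS per disk and the RAID5 small-write penalty of 4 I/Os per write (read data, read parity, write data, write parity) are assumptions chosen for illustration, not numbers from this thread, and controller caching is ignored.

```python
def random_write_mb_per_s(disks, iops_per_disk, write_penalty, io_kib):
    """Estimate sustained random-write throughput in MB/s for a parity array."""
    write_iops = disks * iops_per_disk / write_penalty
    return write_iops * io_kib * 1024 / 1e6


# Assumed inputs: 20 disks, ~100 IOPS each, penalty of 4 I/Os per small write.
low = random_write_mb_per_s(disks=20, iops_per_disk=100, write_penalty=4, io_kib=8)
high = random_write_mb_per_s(disks=20, iops_per_disk=100, write_penalty=4, io_kib=16)
```

Under these assumptions the result is roughly 4 MB/s for 8KiB writes and 8 MB/s for 16KiB writes, the same order of magnitude as the estimate in the message.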
Re: How to disable/revoke 'compression'?
On 09/03/2017 08:06 PM, Adam Borowski wrote:
> On Sun, Sep 03, 2017 at 07:32:01PM +0200, Cloud Admin wrote:
>> Hi,
>> I used the mount option 'compression' on some mounted sub volumes. How
>> can I revoke the compression? Means to delete the option and get all
>> data uncompressed on this volume.
>> Is it enough to remount the sub volume without this option? Or is it
>> necessary to do some additional step (balancing?) to get all stored data
>> uncompressed.
>
> If you set it via mount option, removing the option is enough to disable
> compression for _new_ files.  Other ways are chattr +c and btrfs-property,
> but if you haven't heard about those you almost surely don't have such
> attributes set.
>
> After remounting, you may uncompress existing files.  Balancing won't do
> this as it moves extents around without looking inside; defrag on the other
> hand rewrites extents thus as a side effect it applies new [non]compression
> settings.  Thus: 「btrfs fi defrag -r /path/to/filesystem」.
>
>> Beside of it, is it possible to find out what the real and compressed size
>> of a file, for example or the ratio?
>
> Currently not.
>
> I've once written a tool which does this, but 1. it's extremely slow, 2.
> insane, 3. so insane a certain member of this list would kill me had I
> distributed the tool.  Thus, I'd need to rewrite it first...

Heh, I wouldn't do that, since I need you to do my debian uploads. :D

But it would certainly help to be a bit less stubborn only wanting to code in the language that matches your country code. :O

Or maybe I can help a bit, since it sounds like a nice one for the coding examples in the lib. ;]

Days are getting shorter again, so the amount of indoor coding activity will hopefully increase a bit again soon.

--
Hans van Kranenburg
speed up big btrfs volumes with ssds
Hello,

I'm trying to speed up big btrfs volumes.

Some facts:
- Kernel will be 4.13-rc7
- needed volume size is 60TB

Currently, without any SSDs, I get the best speed with:
- 4x HW Raid 5 with 1GB controller memory of 4TB 3,5" devices

and using btrfs as raid 0 for data and metadata on top of those 4 raid 5.

I can live with a data loss every now and then ;-) so a raid 0 on top of the 4x raid5 is acceptable for me.

Currently the write speed is not as good as I would like - especially for random 8k-16k I/O.

My current idea is to use a pcie flash card with bcache on top of each raid 5.

Is this something which makes sense to speed up the write speed?

Greets,
Stefan
Re: How to disable/revoke 'compression'?
On Sun, Sep 03, 2017 at 07:32:01PM +0200, Cloud Admin wrote:
> Hi,
> I used the mount option 'compression' on some mounted sub volumes. How
> can I revoke the compression? Means to delete the option and get all
> data uncompressed on this volume.
> Is it enough to remount the sub volume without this option? Or is it
> necessary to do some additional step (balancing?) to get all stored data
> uncompressed.

If you set it via mount option, removing the option is enough to disable compression for _new_ files. Other ways are chattr +c and btrfs-property, but if you haven't heard about those you almost surely don't have such attributes set.

After remounting, you may uncompress existing files. Balancing won't do this as it moves extents around without looking inside; defrag on the other hand rewrites extents thus as a side effect it applies new [non]compression settings. Thus: 「btrfs fi defrag -r /path/to/filesystem」.

> Beside of it, is it possible to find out what the real and compressed size
> of a file, for example or the ratio?

Currently not.

I've once written a tool which does this, but 1. it's extremely slow, 2. insane, 3. so insane a certain member of this list would kill me had I distributed the tool. Thus, I'd need to rewrite it first...

Meow!
--
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢰⠒⠀⣿⡁ Vat kind uf sufficiently advanced technology iz dis!?
⢿⡄⠘⠷⠚⠋⠀                                 -- Genghis Ht'rok'din
⠈⠳⣄
How to disable/revoke 'compression'?
Hi,

I used the mount option 'compression' on some mounted subvolumes. How can I revoke the compression? That is, remove the option and get all data on this volume uncompressed. Is it enough to remount the subvolume without this option? Or is it necessary to do some additional step (balancing?) to get all stored data uncompressed?

Besides that, is it possible to find out the real and compressed size of a file, or the compression ratio?

Bye
Frank
Re: BTRFS: error (device dm-2) in btrfs_run_delayed_refs:2960: errno=-17 Object already exists (since 3.4 / 2012)
Alright pushed, sorry about that.

Josef

Sent from my iPhone

> On Sep 3, 2017, at 10:42 AM, Marc MERLIN wrote:
>
>> On Sun, Sep 03, 2017 at 02:38:57PM +, Josef Bacik wrote:
>> Oh yeah you need CONFIG_STACKTRACE turned on, otherwise this is going to be
>> difficult ;). Thanks,
>
> Right, except that I thought I did:
>
> saruman:/usr/src/linux-btrfs/btrfs-next# grep STACKTRACE .config
> CONFIG_STACKTRACE_SUPPORT=y
> CONFIG_HAVE_RELIABLE_STACKTRACE=y
> CONFIG_STACKTRACE=y
> CONFIG_USER_STACKTRACE_SUPPORT=y
>
> Marc
> --
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
> Microsoft is to operating systems
> what McDonalds is to gourmet cooking
> Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: BTRFS: error (device dm-2) in btrfs_run_delayed_refs:2960: errno=-17 Object already exists (since 3.4 / 2012)
Jesus Christ I misspelled it, I'll fix it up when I get home. Thanks,

Josef

Sent from my iPhone

> On Sep 3, 2017, at 10:42 AM, Marc MERLIN wrote:
>
>> On Sun, Sep 03, 2017 at 02:38:57PM +, Josef Bacik wrote:
>> Oh yeah you need CONFIG_STACKTRACE turned on, otherwise this is going to be
>> difficult ;). Thanks,
>
> Right, except that I thought I did:
>
> saruman:/usr/src/linux-btrfs/btrfs-next# grep STACKTRACE .config
> CONFIG_STACKTRACE_SUPPORT=y
> CONFIG_HAVE_RELIABLE_STACKTRACE=y
> CONFIG_STACKTRACE=y
> CONFIG_USER_STACKTRACE_SUPPORT=y
>
> Marc
> --
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
> Microsoft is to operating systems
> what McDonalds is to gourmet cooking
> Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: BTRFS: error (device dm-2) in btrfs_run_delayed_refs:2960: errno=-17 Object already exists (since 3.4 / 2012)
On Sun, Sep 03, 2017 at 02:38:57PM +, Josef Bacik wrote:
> Oh yeah you need CONFIG_STACKTRACE turned on, otherwise this is going to be
> difficult ;). Thanks,

Right, except that I thought I did:

saruman:/usr/src/linux-btrfs/btrfs-next# grep STACKTRACE .config
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_RELIABLE_STACKTRACE=y
CONFIG_STACKTRACE=y
CONFIG_USER_STACKTRACE_SUPPORT=y

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems
what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: BTRFS: error (device dm-2) in btrfs_run_delayed_refs:2960: errno=-17 Object already exists (since 3.4 / 2012)
Oh yeah you need CONFIG_STACKTRACE turned on, otherwise this is going to be difficult ;). Thanks,

Josef

Sent from my iPhone

> On Sep 3, 2017, at 10:31 AM, Marc MERLIN wrote:
>
>> On Sun, Sep 03, 2017 at 03:26:34AM +, Josef Bacik wrote:
>> I was looking through the code for other ways to cut down memory usage when
>> I noticed we only catch improper re-allocations, not adding another ref for
>> metadata which is what I suspect your problem is. I added another patch and
>> pushed it out, sorry for the churn.
>
> Installed.
>
> For now, I've seen this once, but otherwise no issues:
> Dropping a ref for a root that doesn't have a ref on the block
> Dumping block entry [26538725376 4096], num_refs 2, metadata 0, from disk 1
> Ref root 0, parent 29818880, owner 23608, offset 0, num_refs 18446744073709551615
> Ref root 0, parent 202129408, owner 23608, offset 0, num_refs 1
> Ref root 418, parent 0, owner 23608, offset 0, num_refs 1
> Root entry 418, num_refs 1
> Root entry 69809, num_refs 0
> Ref action 1, root 418, ref_root 0, parent 202129408, owner 23608, offset 0, num_refs 1
> No stacktrace support
> Ref action 2, root 69809, ref_root 0, parent 29818880, owner 23608, offset 0, num_refs 18446744073709551615
> No stacktrace support
>
> I'm assuming this was done by your patch?
> Should I worry about 'No stacktrace support' ?
>
> Marc
> --
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
> Microsoft is to operating systems
> what McDonalds is to gourmet cooking
> Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: BTRFS: error (device dm-2) in btrfs_run_delayed_refs:2960: errno=-17 Object already exists (since 3.4 / 2012)
On Sun, Sep 03, 2017 at 03:26:34AM +, Josef Bacik wrote:
> I was looking through the code for other ways to cut down memory usage when I
> noticed we only catch improper re-allocations, not adding another ref for
> metadata which is what I suspect your problem is. I added another patch and
> pushed it out, sorry for the churn.

Installed.

For now, I've seen this once, but otherwise no issues:
Dropping a ref for a root that doesn't have a ref on the block
Dumping block entry [26538725376 4096], num_refs 2, metadata 0, from disk 1
Ref root 0, parent 29818880, owner 23608, offset 0, num_refs 18446744073709551615
Ref root 0, parent 202129408, owner 23608, offset 0, num_refs 1
Ref root 418, parent 0, owner 23608, offset 0, num_refs 1
Root entry 418, num_refs 1
Root entry 69809, num_refs 0
Ref action 1, root 418, ref_root 0, parent 202129408, owner 23608, offset 0, num_refs 1
No stacktrace support
Ref action 2, root 69809, ref_root 0, parent 29818880, owner 23608, offset 0, num_refs 18446744073709551615
No stacktrace support

I'm assuming this was done by your patch?
Should I worry about 'No stacktrace support' ?

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems
what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
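A note on reading the dump: the num_refs value 18446744073709551615 is 2^64 - 1, which is how -1 appears when it is held in or printed as an unsigned 64-bit integer. Whether the checker actually keeps the count as an unsigned 64-bit field is an assumption here, not something stated in the thread; the arithmetic itself can be checked with a short Python sketch (the helper names are made up for illustration):

```python
# Illustration only: helper names are hypothetical, and treating num_refs
# as an unsigned 64-bit counter is an assumption, not taken from the thread.
U64_MASK = (1 << 64) - 1


def as_u64(value: int) -> int:
    """Reinterpret an integer as an unsigned 64-bit value (wraps modulo 2**64)."""
    return value & U64_MASK


def as_s64(value: int) -> int:
    """Reinterpret the low 64 bits of an integer as a signed 64-bit value."""
    value &= U64_MASK
    return value - (1 << 64) if value >= (1 << 63) else value


if __name__ == "__main__":
    print(as_u64(-1))                     # -1 seen as unsigned 64-bit
    print(as_s64(18446744073709551615))   # the logged value seen as signed 64-bit
```

Read this way, the two entries with that value would correspond to a reference count of -1, while the others carry a count of 1.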
Re: joining to contribute
Hello,

> Alongside this, there's also a requirement for being able to do
> round-trip send/receive while preserving the ability to do incremental
> sends. This is likely to be related to the above bug-fix. I did a
> complete write-up of what's happening, and what needs to happen, here:
>
> http://www.spinics.net/lists/linux-btrfs/msg44089.html

I missed that discussion, but I proposed a different solution in a similar thread about send/receive (https://www.spinics.net/lists/linux-btrfs/msg60694.html).

I think it's not very useful that received_uuid encodes where the subvolume comes from. All that send/receive should care about is that the contents of the source(s) used for incremental send match the contents of subvolumes on the receive side. Let's call it, for example, "contents_uuid". The rules would be simple: operations that preserve contents preserve contents_uuid; operations that change contents change contents_uuid.

For simplicity and performance reasons, in order to not need tracking of changes, we could allow for some false positives, where contents_uuid changes even though the data did not. A simpler-to-implement set of rules could look like this:

- rw subvolumes have no contents_uuid
- changing rw subvolume to ro assigns a random contents_uuid
- ro snapshot of rw subvolume gets a random contents_uuid
- ro snapshot of ro subvolume preserves contents_uuid
- send/receive preserves contents_uuid (after successful receive)

And then the rule for send/receive would be:

- send transmits contents_uuid of subvolumes used as clone sources, which are matched to subvolumes having identical contents_uuid on the receive side.

Does it make sense? Did I miss something? I didn't receive any feedback last time, which is why I bring it up again for discussion.
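The rules above can be modelled in a few lines of Python to see how they interact. This is only a toy sketch of the proposal: `contents_uuid` is the name suggested in the message, not an existing btrfs field, and the `Subvolume` class and every function name below are hypothetical. Dropping the uuid when a subvolume goes back to rw is inferred from the first rule ("rw subvolumes have no contents_uuid").

```python
import uuid
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Subvolume:
    """Toy stand-in for a subvolume; not a real btrfs structure."""
    name: str
    read_only: bool
    contents_uuid: Optional[uuid.UUID] = None  # rw subvolumes have none


def set_read_only(sub: Subvolume) -> None:
    """Changing rw -> ro assigns a random contents_uuid."""
    if not sub.read_only:
        sub.read_only = True
        sub.contents_uuid = uuid.uuid4()


def set_read_write(sub: Subvolume) -> None:
    """ro -> rw: contents may change from now on, so the uuid is dropped."""
    sub.read_only = False
    sub.contents_uuid = None


def snapshot_ro(src: Subvolume, name: str) -> Subvolume:
    """ro snapshot: keeps the uuid of an ro source, gets a random one otherwise."""
    new_uuid = src.contents_uuid if src.read_only else uuid.uuid4()
    return Subvolume(name, True, new_uuid)


def receive(sent: Subvolume, name: str) -> Subvolume:
    """A successful receive preserves the sender's contents_uuid."""
    if not sent.read_only:
        raise ValueError("only read-only subvolumes can be sent")
    return Subvolume(name, True, sent.contents_uuid)


def find_clone_source(wanted: Optional[uuid.UUID],
                      receiver_subvols: Iterable[Subvolume]) -> Optional[Subvolume]:
    """Match a transmitted contents_uuid against subvolumes on the receive side."""
    if wanted is None:
        return None
    for sub in receiver_subvols:
        if sub.read_only and sub.contents_uuid == wanted:
            return sub
    return None


if __name__ == "__main__":
    home = Subvolume("home", read_only=False)
    snap1 = snapshot_ro(home, "snap1")    # random contents_uuid
    snap2 = snapshot_ro(snap1, "snap2")   # same contents_uuid as snap1
    backup = receive(snap1, "backup")     # same contents_uuid on the receiver
    match = find_clone_source(snap2.contents_uuid, [backup])
    print(match.name if match else "no clone source found")
```

In this model a received subvolume stays usable as a clone source for as long as it remains read-only, regardless of which machine it originally came from; flipping it to rw and back gives it a new identity, which is one of the false positives the proposal accepts.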