Re: Can't mount array with super_total_bytes mismatch with fs_devices total_rw_bytes
On 2017年10月04日 12:00, Asif Youssuff wrote:
> Thanks for the advice.
>
> On 10/03/2017 09:38 PM, Qu Wenruo wrote:
>>> [210017.281912] BTRFS info (device sdb): disk space caching is enabled
>>> [210017.281915] BTRFS info (device sdb): has skinny extents
>>> [210017.402084] BTRFS error (device sdb): super_total_bytes 92017859088384 mismatch with fs_devices total_rw_bytes 92017859094528
>>
>> One of your device sizes is not aligned to 4K. Which is fine, but the recently enhanced validation checker does not allow it. (Which should be a regression, and there is some other WARN_ON related to it)
>>
>>> [210017.402126] BTRFS error (device sdb): failed to read chunk tree: -22
>>> [210017.461473] BTRFS error (device sdb): open_ctree failed
>>>
>>> I've tried a few steps -- btrfs-chunk-recover, super-recover -- and I have run btrfs check --repair on two of the disks in the array (this takes a very long time, so I'm hoping I don't have to run this on all of the disks). I had run into this problem once before, and I'm not sure how I recovered from it; I may have simply rolled back the booted kernel to escape the extra checks around this mismatch. I'm at a loss for ideas and am running a btrfs-image so I can also report an issue -- I'm not sure whether 'btrfs-image -c9 -t4 /dev/sdo btrfs.image' is the right command to run for a multi-device array. Any ideas would be helpful, and I am happy to provide further information.
>>>
>>> root@ubuntu-server:~# uname -a
>>> Linux ubuntu-server 4.14.0-041400rc2-generic #201709242031 SMP Mon Sep 25 00:33:13 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>>
>> You can roll back to an earlier kernel to mount the fs.
>> And manually find and resize the device with unaligned size:
>>
>> # btrfs fi show -b
>>
>> And check each device for its size:
>>
>> Label: none  uuid: 839ddcfa-5701-4437-aff3-bcb2a26ae6dd
>>         Total devices 1 FS bytes used 397312
>>         devid 1 size <<10737418240>> used 2172649472 path /dev/mapper/data-btrfs
>>
>> If it's not aligned, round it down to 4K, and resize it using the devid:
>>
>> # btrfs fi resize <devid>:<new_size> <mount_point>
>>
>> All devices must be rounded. And the command should finish almost in no time.
>
> I was able to mount the fs using kernel version 4.4 and rounded down (took the size in bytes, and rounded down to a smaller number divisible by 4096). This is what btrfs fi show looks like now:
>
> asif@ubuntu-server:~$ sudo btrfs fi show --raw
> [sudo] password for asif:
> Label: none  uuid: 48ed8a66-731d-499b-829e-dd07dd7260cc
>         Total devices 13 FS bytes used 44783241728000
>         devid  4 size 6001141571584 used 5955045097472 path /dev/sdh
>         devid  5 size 6001141571584 used 5955128983552 path /dev/sdg
>         devid  7 size 6001141571584 used 5955709960192 path /dev/sdk
>         devid  9 size 6001175126016 used 5955716186112 path /dev/sde
>         devid 10 size 6001141571584 used 5955145760768 path /dev/sdc
>         devid 11 size 8001563222016 used 7955416416256 path /dev/sdl
>         devid 12 size 6001175126016 used 5956054286336 path /dev/sdf
>         devid 14 size 8001563222016 used 7956009123840 path /dev/sdb
>         devid 15 size 8001563222016 used 7956373831680 path /dev/sdj
>         devid 17 size 8001563222016 used 6341094866944 path /dev/sdd
>         devid 18 size 8001563222016 used 7955827064832 path /dev/sdn
>         devid 20 size 8001563222016 used 7955378339840 path /dev/sdi
>         devid 21 size 8001563222016 used 7955386728448 path /dev/sdo
>
>> Then check if latest kernel can mount it.
>
> Unfortunately, the latest kernel still cannot mount it, showing the same errors as before.
> [ 139.852862] BTRFS error (device sdj): super_total_bytes 92017859086336 mismatch with fs_devices total_rw_bytes 92017859092480
> [ 139.852894] BTRFS error (device sdj): failed to read chunk tree: -22
> [ 139.916645] BTRFS error (device sdj): open_ctree failed

Then the problem is that btrfs doesn't update its super_total_bytes correctly. From what I can see, grow/shrink only applies the delta, and does not re-calculate it. Before we have a good way to fix it in the kernel, the only way is to manually modify the superblock so that it passes the kernel validation checker.

Thanks,
Qu

>> I think it can be made part of "btrfs check" to fix it. (Although it should be handled well by the kernel)
>>
>> Thanks,
>> Qu
>
> Hope there are some other ideas (or please correct me if I have done something wrong!).
>
> Thanks,
> Asif
>
> root@ubuntu-server:~# btrfs --version
> btrfs-progs v4.13.1
> root@ubuntu-server:~# btrfs fi show
> Label: none  uuid: 48ed8a66-731d-499b-829e-dd07dd7260cc
>         Total devices 13 FS bytes used 40.73TiB
>         devid  4 size 5.46TiB used 5.42TiB path /dev/sdo
>         devid  5 size 5.46TiB used 5.42TiB path /dev/sdn
>         devid  7 size 5.46TiB used 5.42TiB path /dev/sdc
>         devid  9 size 5.46TiB used 5.42TiB path /dev/sdk
>         devid 10 size 5.46TiB used 5.42TiB path /dev/sdj
>         devid 11 size 7.28TiB used 7.24TiB path /dev/sdd
>         devid 12 size 5.46TiB used 5.42TiB path /dev/sdm
>         devid 14 size 7.28TiB used 7.24TiB path /dev/sdh
>         devid 15 size 7.28TiB used 7.24TiB path /dev/sdb
>         devid 17 siz
Re: Btrfs "failed to repair damaged filesystem" - RAID10 going RO when any write attempts are made
Any suggestions on this? Or do I just blow it away and hope the bug is fixed in a newer version?

Regards

Tim

On Mon, Oct 2, 2017 at 8:44 PM, Timothy White wrote:
> I have a BTRFS RAID 10 filesystem that was crashing and going into RO mode. I did a kernel upgrade and upgraded btrfs tools to the latest. A scrub was going OK-ish. btrfs check showed a number of messages such as:
>
> Backref 16562625503232 root 14628 owner 3609 offset 23793664 num_refs 0 not found in extent tree
> Incorrect local backref count on 16562625503232 root 14628 owner 3609 offset 23793664 found 1 wanted 0 back 0x5639c37689d0
> backpointer mismatch on [16562625503232 2703360]
>
> Root 14628 was a subvolume root ID (for docker); given that I didn't need any of that data, I removed all those subvolumes under the docker subvolume, and the docker subvolume itself. This still showed errors, so I tried a btrfs check with repair, which eventually gave me the following.
>
> Backref 16563772112896 root 14628 owner 3608 offset 0 num_refs 0 not found in extent tree
> Incorrect local backref count on 16563772112896 root 14628 owner 3608 offset 0 found 1 wanted 0 back 0x55f3403f40b0
> Backref disk bytenr does not match extent record, bytenr=16563772112896, ref bytenr=16563813335040
> Backref bytes do not match extent backref, bytenr=16563772112896, ref bytes=134217728, backref bytes=133079040
> Backref 16563772112896 root 14628 owner 3607 offset 0 num_refs 0 not found in extent tree
> Incorrect local backref count on 16563772112896 root 14628 owner 3607 offset 0 found 1 wanted 0 back 0x55f3470cfed0
> Backref bytes do not match extent backref, bytenr=16563772112896, ref bytes=134217728, backref bytes=41222144
> backpointer mismatch on [16563772112896 134217728]
> attempting to repair backref discrepency for bytenr 16563772112896
> Ref is past the entry end, please take a btrfs-image of this file system and send it to a btrfs developer, ref 16563813335040
> failed to repair damaged
> filesystem, aborting
>
> I've taken a btrfs-image, however it's 12GB; not sure if the developers want that, but I do have it.
>
> The filesystem still crashes and goes read-only if I try to make changes (even deletes). The latest dmesg that includes that crash is at https://drive.google.com/open?id=0B5bmQmu6UugIRFM0RUxwWFdqOGc. I don't have the earlier ones.
>
> https://drive.google.com/open?id=0B5bmQmu6UugIYjZkYnA4ZFFpdlE is the output from running btrfs check with repair, followed by a second run with repair to see if it got anything different.
>
> At this stage, I expect to just blow away the filesystem and restore from backups. However, it would be nice to fix whatever the issue is. Smartctl shows no underlying errors.
>
> What else do the devs want from me before I blow this away? (Or is it fixable with something I've missed? That would save me many hours of restoration.)
>
> The 12GB btrfs-image can be uploaded to Google Drive (or FTP) if needed.
>
> Thanks
>
> Tim
>
> $ lsb_release -a
> No LSB modules are available.
> Distributor ID: Debian
> Description:    Debian GNU/Linux 9.1 (stretch)
> Release:        9.1
> Codename:       stretch
>
> $ uname -a
> Linux bruce 4.12.0-0.bpo.1-amd64 #1 SMP Debian 4.12.6-1~bpo9+1 (2017-08-27) x86_64 GNU/Linux
>
> $ btrfs --version
> btrfs-progs v4.9.1
>
> $ btrfs fi show
> Label: 'Butter1'  uuid: b8d081ac-0271-4481-9a58-c113c921bf49
>         Total devices 4 FS bytes used 5.19TiB
>         devid 1 size 3.64TiB used 2.60TiB path /dev/sde
>         devid 2 size 3.64TiB used 2.60TiB path /dev/sdf
>         devid 3 size 3.64TiB used 2.60TiB path /dev/sdd
>         devid 4 size 3.64TiB used 2.60TiB path /dev/sdc
>
> $ btrfs fi df /mnt/Butter1
> Data, RAID10: total=5.19TiB, used=5.17TiB
> System, RAID10: total=128.00MiB, used=560.00KiB
> Metadata, RAID10: total=12.00GiB, used=10.34GiB
> Metadata, single: total=16.00MiB, used=0.00B
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> $ btrfs subvolume list /mnt/Butter1
> ID 258 gen 250368 top level 5 path data1
> ID 259 gen 250516 top level 5 path data2
> ID 3648 gen 250476 top level 5 path Photos
> ID 4597 gen 203041 top level 5 path Snapshots/Photos/20160304
> ID 4608 gen 203041 top level 5 path Snapshots/Photos/20160328
> ID 4628 gen 203041 top level 5 path Snapshots/Photos/20160421
> ID 4654 gen 250368 top level 5 path imap
> ID 4656 gen 203041 top level 5 path Snapshots/Photos/20160523
> ID 4893 gen 203041 top level 5 path Snapshots/Photos/20160702
> ID 4946 gen 203041 top level 5 path Snapshots/Photos/20160731
> ID 4947 gen 203041 top level 5 path Snapshots/Photos/20160813
> ID 4948 gen 203041 top level 5 path Snapshots/Photos/2016081301
> ID 4970 gen 203041 top level 5 path Snapshots/Photos/20160919
> ID 5038 gen 203041 top level 5 path Snapshots/Photos/20161229_0911
> ID 5063 gen 250515 top level 5 path mirror
> ID 13170 gen 250428 top level 5 path BizBackups
> ID 13214 gen 250368 top level 5 path SaraLaptopBackup
> ID 13485 gen 250368 top level 5 path PCOMDisks
> ID 16176 g
Re: Can't mount array with super_total_bytes mismatch with fs_devices total_rw_bytes
Thanks for the advice.

On 10/03/2017 09:38 PM, Qu Wenruo wrote:
>> [210017.281912] BTRFS info (device sdb): disk space caching is enabled
>> [210017.281915] BTRFS info (device sdb): has skinny extents
>> [210017.402084] BTRFS error (device sdb): super_total_bytes 92017859088384 mismatch with fs_devices total_rw_bytes 92017859094528
>
> One of your device sizes is not aligned to 4K. Which is fine, but the recently enhanced validation checker does not allow it. (Which should be a regression, and there is some other WARN_ON related to it)
>
>> [210017.402126] BTRFS error (device sdb): failed to read chunk tree: -22
>> [210017.461473] BTRFS error (device sdb): open_ctree failed
>>
>> I've tried a few steps -- btrfs-chunk-recover, super-recover -- and I have run btrfs check --repair on two of the disks in the array (this takes a very long time, so I'm hoping I don't have to run this on all of the disks). I had run into this problem once before, and I'm not sure how I recovered from it; I may have simply rolled back the booted kernel to escape the extra checks around this mismatch. I'm at a loss for ideas and am running a btrfs-image so I can also report an issue -- I'm not sure whether 'btrfs-image -c9 -t4 /dev/sdo btrfs.image' is the right command to run for a multi-device array. Any ideas would be helpful, and I am happy to provide further information.
>>
>> root@ubuntu-server:~# uname -a
>> Linux ubuntu-server 4.14.0-041400rc2-generic #201709242031 SMP Mon Sep 25 00:33:13 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>
> You can roll back to an earlier kernel to mount the fs.
>
> And manually find and resize the device with unaligned size:
>
> # btrfs fi show -b
>
> And check each device for its size:
>
> Label: none  uuid: 839ddcfa-5701-4437-aff3-bcb2a26ae6dd
>         Total devices 1 FS bytes used 397312
>         devid 1 size <<10737418240>> used 2172649472 path /dev/mapper/data-btrfs
>
> If it's not aligned, round it down to 4K, and resize it using the devid:
>
> # btrfs fi resize <devid>:<new_size> <mount_point>
>
> All devices must be rounded.
> And the command should finish almost in no time.

I was able to mount the fs using kernel version 4.4 and rounded down (took the size in bytes, and rounded down to a smaller number divisible by 4096). This is what btrfs fi show looks like now:

asif@ubuntu-server:~$ sudo btrfs fi show --raw
[sudo] password for asif:
Label: none  uuid: 48ed8a66-731d-499b-829e-dd07dd7260cc
        Total devices 13 FS bytes used 44783241728000
        devid  4 size 6001141571584 used 5955045097472 path /dev/sdh
        devid  5 size 6001141571584 used 5955128983552 path /dev/sdg
        devid  7 size 6001141571584 used 5955709960192 path /dev/sdk
        devid  9 size 6001175126016 used 5955716186112 path /dev/sde
        devid 10 size 6001141571584 used 5955145760768 path /dev/sdc
        devid 11 size 8001563222016 used 7955416416256 path /dev/sdl
        devid 12 size 6001175126016 used 5956054286336 path /dev/sdf
        devid 14 size 8001563222016 used 7956009123840 path /dev/sdb
        devid 15 size 8001563222016 used 7956373831680 path /dev/sdj
        devid 17 size 8001563222016 used 6341094866944 path /dev/sdd
        devid 18 size 8001563222016 used 7955827064832 path /dev/sdn
        devid 20 size 8001563222016 used 7955378339840 path /dev/sdi
        devid 21 size 8001563222016 used 7955386728448 path /dev/sdo

> Then check if latest kernel can mount it.

Unfortunately, the latest kernel still cannot mount it, showing the same errors as before.

[ 139.852862] BTRFS error (device sdj): super_total_bytes 92017859086336 mismatch with fs_devices total_rw_bytes 92017859092480
[ 139.852894] BTRFS error (device sdj): failed to read chunk tree: -22
[ 139.916645] BTRFS error (device sdj): open_ctree failed

> I think it can be made part of "btrfs check" to fix it. (Although it should be handled well by the kernel)
>
> Thanks,
> Qu

Hope there are some other ideas (or please correct me if I have done something wrong!).
Thanks,
Asif

root@ubuntu-server:~# btrfs --version
btrfs-progs v4.13.1
root@ubuntu-server:~# btrfs fi show
Label: none  uuid: 48ed8a66-731d-499b-829e-dd07dd7260cc
        Total devices 13 FS bytes used 40.73TiB
        devid  4 size 5.46TiB used 5.42TiB path /dev/sdo
        devid  5 size 5.46TiB used 5.42TiB path /dev/sdn
        devid  7 size 5.46TiB used 5.42TiB path /dev/sdc
        devid  9 size 5.46TiB used 5.42TiB path /dev/sdk
        devid 10 size 5.46TiB used 5.42TiB path /dev/sdj
        devid 11 size 7.28TiB used 7.24TiB path /dev/sdd
        devid 12 size 5.46TiB used 5.42TiB path /dev/sdm
        devid 14 size 7.28TiB used 7.24TiB path /dev/sdh
        devid 15 size 7.28TiB used 7.24TiB path /dev/sdb
        devid 17 size 7.28TiB used 5.77TiB path /dev/sdl
        devid 18 size 7.28TiB used 7.24TiB path /dev/sdf
        devid 20 size 7.28TiB used 7.24TiB path /dev/sdi
        devid 21 size 7.28TiB used 7.24TiB path /dev/sdg

Thanks,
Asif
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@v
[PATCH v2] Btrfs: fix overlap of fs_info->flags values
Because the values of BTRFS_FS_EXCL_OP and BTRFS_FS_QUOTA_OVERRIDE overlap, we should change one of the values.

First, BTRFS_FS_EXCL_OP was set to 14.
  commit 171938e52807 ("btrfs: track exclusive filesystem operation in flags")
Next, the value of BTRFS_FS_QUOTA_OVERRIDE was also set to 14.
  commit f29efe292198 ("btrfs: add quota override flag to enable quota override for CAP_SYS_RESOURCE")
As a result, the value 14 overlapped. This problem is solved by defining the value of BTRFS_FS_QUOTA_OVERRIDE as 16.

Fixes: f29efe292198 ("btrfs: add quota override flag to enable quota override for CAP_SYS_RESOURCE")
CC: sta...@vger.kernel.org # 4.13+
Signed-off-by: Tsutomu Itoh
---
v2: changed the value of BTRFS_FS_QUOTA_OVERRIDE, instead of BTRFS_FS_EXCL_OP, to 16.

 fs/btrfs/ctree.h | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 899ddaeeacec..d265ea7f763e 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -714,15 +714,14 @@ struct btrfs_delayed_root;
 #define BTRFS_FS_BTREE_ERR 11
 #define BTRFS_FS_LOG1_ERR 12
 #define BTRFS_FS_LOG2_ERR 13
-#define BTRFS_FS_QUOTA_OVERRIDE 14
-/* Used to record internally whether fs has been frozen */
-#define BTRFS_FS_FROZEN 15
-
 /*
  * Indicate that a whole-filesystem exclusive operation is running
  * (device replace, resize, device add/delete, balance)
  */
 #define BTRFS_FS_EXCL_OP 14
+/* Used to record internally whether fs has been frozen */
+#define BTRFS_FS_FROZEN 15
+#define BTRFS_FS_QUOTA_OVERRIDE 16
 
 struct btrfs_fs_info {
 	u8 fsid[BTRFS_FSID_SIZE];
--
2.13.2
Re: Can't mount array with super_total_bytes mismatch with fs_devices total_rw_bytes
On 2017年10月04日 07:32, Asif Youssuff wrote:
> Hi,
>
> My power went out at my home, and I'm now having trouble mounting my array. I'm mounting with the 'recovery' option in fstab. When mounting, dmesg output shows:
>
> [210017.281912] BTRFS info (device sdb): disk space caching is enabled
> [210017.281915] BTRFS info (device sdb): has skinny extents
> [210017.402084] BTRFS error (device sdb): super_total_bytes 92017859088384 mismatch with fs_devices total_rw_bytes 92017859094528

One of your device sizes is not aligned to 4K. Which is fine, but the recently enhanced validation checker does not allow it. (Which should be a regression, and there is some other WARN_ON related to it)

> [210017.402126] BTRFS error (device sdb): failed to read chunk tree: -22
> [210017.461473] BTRFS error (device sdb): open_ctree failed
>
> I've tried a few steps -- btrfs-chunk-recover, super-recover -- and I have run btrfs check --repair on two of the disks in the array (this takes a very long time, so I'm hoping I don't have to run this on all of the disks). I had run into this problem once before, and I'm not sure how I recovered from it; I may have simply rolled back the booted kernel to escape the extra checks around this mismatch. I'm at a loss for ideas and am running a btrfs-image so I can also report an issue -- I'm not sure whether 'btrfs-image -c9 -t4 /dev/sdo btrfs.image' is the right command to run for a multi-device array. Any ideas would be helpful, and I am happy to provide further information.
>
> root@ubuntu-server:~# uname -a
> Linux ubuntu-server 4.14.0-041400rc2-generic #201709242031 SMP Mon Sep 25 00:33:13 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

You can roll back to an earlier kernel to mount the fs.
And manually find and resize the device with unaligned size:

# btrfs fi show -b

And check each device for its size:

Label: none  uuid: 839ddcfa-5701-4437-aff3-bcb2a26ae6dd
        Total devices 1 FS bytes used 397312
        devid 1 size <<10737418240>> used 2172649472 path /dev/mapper/data-btrfs

If it's not aligned, round it down to 4K, and resize it using the devid:

# btrfs fi resize <devid>:<new_size> <mount_point>

All devices must be rounded. And the command should finish almost in no time.

Then check if latest kernel can mount it.

I think it can be made part of "btrfs check" to fix it. (Although it should be handled well by the kernel)

Thanks,
Qu

> root@ubuntu-server:~# btrfs --version
> btrfs-progs v4.13.1
> root@ubuntu-server:~# btrfs fi show
> Label: none  uuid: 48ed8a66-731d-499b-829e-dd07dd7260cc
>         Total devices 13 FS bytes used 40.73TiB
>         devid  4 size 5.46TiB used 5.42TiB path /dev/sdo
>         devid  5 size 5.46TiB used 5.42TiB path /dev/sdn
>         devid  7 size 5.46TiB used 5.42TiB path /dev/sdc
>         devid  9 size 5.46TiB used 5.42TiB path /dev/sdk
>         devid 10 size 5.46TiB used 5.42TiB path /dev/sdj
>         devid 11 size 7.28TiB used 7.24TiB path /dev/sdd
>         devid 12 size 5.46TiB used 5.42TiB path /dev/sdm
>         devid 14 size 7.28TiB used 7.24TiB path /dev/sdh
>         devid 15 size 7.28TiB used 7.24TiB path /dev/sdb
>         devid 17 size 7.28TiB used 5.77TiB path /dev/sdl
>         devid 18 size 7.28TiB used 7.24TiB path /dev/sdf
>         devid 20 size 7.28TiB used 7.24TiB path /dev/sdi
>         devid 21 size 7.28TiB used 7.24TiB path /dev/sdg
>
> Thanks,
> Asif
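The check-and-round procedure Qu describes can be sketched in a few lines of Python. This is only an illustration: the devid-to-size mapping and the /mnt mount point are made up, and the printed commands mirror the resize syntax above; it also shows why the reported mismatch points at an unaligned device (the excess over super_total_bytes is not a multiple of 4096):

```python
SECTORSIZE = 4096

def round_down_4k(size_bytes):
    """Largest multiple of 4096 not exceeding size_bytes."""
    return size_bytes - (size_bytes % SECTORSIZE)

# The reported mismatch: superblock total vs the total the kernel
# computed from the device sizes at mount time.
super_total_bytes = 92017859088384
fs_devices_total = 92017859094528
delta = fs_devices_total - super_total_bytes
print(delta, delta % SECTORSIZE)   # 6144 2048: the excess is not 4K-aligned

# Round any unaligned device down and emit the matching resize command
# (devid -> size-in-bytes pairs here are made up for illustration):
devices = {1: 6001141571584, 2: 6001175128064}
for devid, size in devices.items():
    if size % SECTORSIZE:
        print(f"btrfs fi resize {devid}:{round_down_4k(size)} /mnt")
```

A size that is already a multiple of 4096 is left untouched by the rounding, so running this over every device is harmless.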
[RFC] Is it advisable to use btrfs check --repair flag to fix/find errors?
Hi,

We are researchers from UT Austin, working on building CrashMonkey[1], a simple, flexible, file-system-agnostic test framework to systematically check file systems for inconsistencies when a failure occurs during a file operation.

Here is a brief description of what we are trying to do. First we mount the file system (fs), run a few tests on the mounted fs, and log the bio requests sent to the fs. We then construct the different crash states that are possible by starting with a snapshot of the initial state of the disk and applying different permutations of a subset of the logged bio requests, respecting the ordering rules set by the FUA and flush flags. Later, we run file-system consistency checks/repairs on these generated crash states to repair the possible inconsistencies and find out whether there are still any irreparable inconsistencies. The HotStorage'17 paper, CrashMonkey[2]: A Framework to Automatically Test File-System Crash Consistency, has a detailed explanation of the methodology.

For this purpose, is it advisable to run btrfs check with the --repair flag to fix or find errors? We have seen: "Warning: Do not use --repair unless you are advised to by a developer, an experienced user or accept the fact that fsck cannot possibly fix all sorts of damage that could happen to a filesystem because of software and hardware bugs". Hence, please let us know what you think regarding this!

Also, the output of `btrfs check` only hints that something is wrong by setting err to -1. Is there a way to find out what exactly was found by btrfs?

Thanks,
Soujanya.
$ uname -r
4.4.0-62-generic

$ btrfs --version
btrfs-progs v4.4

$ btrfs fi show
Label: 'btrfs'  uuid: 3e6e7154-79b0-44b2-9193-945a86d61550
        Total devices 1 FS bytes used 392.00KiB
        devid 1 size 10.00GiB used 2.02GiB path /dev/sda3

$ btrfs fi df /mountpoint
Data, single: total=8.00MiB, used=264.00KiB
System, DUP: total=8.00MiB, used=16.00KiB
Metadata, DUP: total=1.00GiB, used=112.00KiB
GlobalReserve, single: total=16.00MiB, used=0.00B

$ dmesg > dmesg.log
www.cs.utexas.edu/~soujanya/dmesg.log

[1] https://github.com/utsaslab/crashmonkey
[2] http://www.cs.utexas.edu/%7Evijay/papers/hotstorage17-crashmonkey.pdf
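The crash-state construction CrashMonkey describes can be sketched as follows. This is a simplified model, not CrashMonkey's actual API: the bio log uses hypothetical ('write', ...) / ('flush',) records, the log is split into epochs at flush barriers, and for brevity it enumerates subsets of the in-flight epoch rather than full permutations:

```python
from itertools import combinations

def crash_states(bio_log):
    """bio_log: list of ('write', block) or ('flush',) records.

    Yield lists of write records representing possible on-disk
    states after a crash."""
    # Split the log into epochs at flush barriers: once a flush
    # completes, every write logged before it is durable.
    epochs, current = [], []
    for rec in bio_log:
        if rec[0] == 'flush':
            epochs.append(current)
            current = []
        else:
            current.append(rec)
    epochs.append(current)

    # A crash during epoch i leaves all earlier epochs applied, plus
    # any subset of epoch i's writes that happened to reach the disk.
    for i, epoch in enumerate(epochs):
        persisted = [w for e in epochs[:i] for w in e]
        for k in range(len(epoch) + 1):
            for subset in combinations(epoch, k):
                yield persisted + list(subset)

log = [('write', 'A'), ('flush',), ('write', 'B'), ('write', 'C')]
states = list(crash_states(log))
print(len(states))   # 6 candidate crash states (duplicates possible)
```

Each yielded state is then what a consistency checker such as btrfs check would be run against, which is where the question above about trusting --repair comes from.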
Can't mount array with super_total_bytes mismatch with fs_devices total_rw_bytes
Hi,

My power went out at my home, and I'm now having trouble mounting my array. I'm mounting with the 'recovery' option in fstab. When mounting, dmesg output shows:

[210017.281912] BTRFS info (device sdb): disk space caching is enabled
[210017.281915] BTRFS info (device sdb): has skinny extents
[210017.402084] BTRFS error (device sdb): super_total_bytes 92017859088384 mismatch with fs_devices total_rw_bytes 92017859094528
[210017.402126] BTRFS error (device sdb): failed to read chunk tree: -22
[210017.461473] BTRFS error (device sdb): open_ctree failed

I've tried a few steps -- btrfs-chunk-recover, super-recover -- and I have run btrfs check --repair on two of the disks in the array (this takes a very long time, so I'm hoping I don't have to run this on all of the disks). I had run into this problem once before, and I'm not sure how I recovered from it; I may have simply rolled back the booted kernel to escape the extra checks around this mismatch. I'm at a loss for ideas and am running a btrfs-image so I can also report an issue -- I'm not sure whether 'btrfs-image -c9 -t4 /dev/sdo btrfs.image' is the right command to run for a multi-device array. Any ideas would be helpful, and I am happy to provide further information.
root@ubuntu-server:~# uname -a
Linux ubuntu-server 4.14.0-041400rc2-generic #201709242031 SMP Mon Sep 25 00:33:13 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
root@ubuntu-server:~# btrfs --version
btrfs-progs v4.13.1
root@ubuntu-server:~# btrfs fi show
Label: none  uuid: 48ed8a66-731d-499b-829e-dd07dd7260cc
        Total devices 13 FS bytes used 40.73TiB
        devid  4 size 5.46TiB used 5.42TiB path /dev/sdo
        devid  5 size 5.46TiB used 5.42TiB path /dev/sdn
        devid  7 size 5.46TiB used 5.42TiB path /dev/sdc
        devid  9 size 5.46TiB used 5.42TiB path /dev/sdk
        devid 10 size 5.46TiB used 5.42TiB path /dev/sdj
        devid 11 size 7.28TiB used 7.24TiB path /dev/sdd
        devid 12 size 5.46TiB used 5.42TiB path /dev/sdm
        devid 14 size 7.28TiB used 7.24TiB path /dev/sdh
        devid 15 size 7.28TiB used 7.24TiB path /dev/sdb
        devid 17 size 7.28TiB used 5.77TiB path /dev/sdl
        devid 18 size 7.28TiB used 7.24TiB path /dev/sdf
        devid 20 size 7.28TiB used 7.24TiB path /dev/sdi
        devid 21 size 7.28TiB used 7.24TiB path /dev/sdg

Thanks,
Asif

[Attached: dmesg boot log from Linux 4.14.0-041400rc2-generic on a Supermicro X10SLM-F (BIOS 3.0, 04/24/2015), booted with root=/dev/mapper/ubuntu--server--vg-root. The kernel banner, command line, e820 memory map, and early boot messages are garbled and truncated in the archive and are omitted here.]
Re: Seeking Help on Corruption Issues
On 10/3/2017 2:11 PM, Hugo Mills wrote:
> Hi, Stephen,
>
> On Tue, Oct 03, 2017 at 08:52:04PM +0000, Stephen Nesbitt wrote:
>> Here it is. There are a couple of out-of-order entries beginning at 117. And yes, I did uncover a bad stick of RAM:
>>
>> btrfs-progs v4.9.1
>> leaf 2589782867968 items 134 free space 6753 generation 3351574 owner 2
>> fs uuid 24b768c3-2141-44bf-ae93-1c3833c8c8e3
>> chunk uuid 19ce12f0-d271-46b8-a691-e0d26c1790c6
> [snip]
>> item 116 key (1623012749312 EXTENT_ITEM 45056) itemoff 10908 itemsize 53
>>         extent refs 1 gen 3346444 flags DATA
>>         extent data backref root 271 objectid 2478 offset 0 count 1
>> item 117 key (1621939052544 EXTENT_ITEM 8192) itemoff 10855 itemsize 53
>>         extent refs 1 gen 3346495 flags DATA
>>         extent data backref root 271 objectid 21751764 offset 6733824 count 1
>> item 118 key (1623012450304 EXTENT_ITEM 8192) itemoff 10802 itemsize 53
>>         extent refs 1 gen 3351513 flags DATA
>>         extent data backref root 271 objectid 5724364 offset 680640512 count 1
>> item 119 key (1623012802560 EXTENT_ITEM 12288) itemoff 10749 itemsize 53
>>         extent refs 1 gen 3346376 flags DATA
>>         extent data backref root 271 objectid 21751764 offset 6701056 count 1
>
> >>> hex(1623012749312)
> '0x179e3193000'
> >>> hex(1621939052544)
> '0x179a319e000'
> >>> hex(1623012450304)
> '0x179e314a000'
> >>> hex(1623012802560)
> '0x179e31a0000'
>
> That's "e" -> "a" in the fourth hex digit, which is a single-bit flip, and should be fixable by btrfs check (I think). However, even fixing that, it's not ordered, because 118 is then before 117, which could be another bitflip ("9" -> "4" in the 7th digit), but two bad bits that close to each other seems unlikely to me.
>
> Hugo.

Hope this isn't a duplicate reply - I might have fat-fingered something.

The underlying file is disposable/replaceable. Any way to zero out/zap the bad BTRFS entry?

-steve
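Hugo's single-bit-flip hypothesis can be checked mechanically. In the sketch below, `expected` is the hypothetical pre-flip key implied by his "e" -> "a" observation (it is not a value from the dump itself):

```python
def bit_diff(a, b):
    """Bit positions at which two integers differ."""
    x = a ^ b
    return [i for i in range(x.bit_length()) if (x >> i) & 1]

stored   = 0x179a319e000   # item 117's key offset as found on disk
expected = 0x179e319e000   # hypothetical pre-flip value ('a' back to 'e')

print(bit_diff(stored, expected))   # [30]: exactly one flipped bit

# Sanity check: the corrected key would sort after item 116's key,
# restoring the required ordering at that point in the leaf.
item_116 = 0x179e3193000
print(item_116 < expected)          # True
```

A single differing bit position is exactly what a bad RAM cell tends to produce, which is consistent with the bad stick of RAM Stephen found.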
Re: Something like ZFS Channel Programs for BTRFS & probably XFS or even VFS?
On Tue, Oct 03, 2017 at 01:40:51PM -0700, Matthew Wilcox wrote:
> On Wed, Oct 04, 2017 at 07:10:35AM +1100, Dave Chinner wrote:
> > On Tue, Oct 03, 2017 at 03:19:18PM +0200, Martin Steigerwald wrote:
> > > [repost. I didn´t notice autocompletion gave me wrong address for fsdevel, blacklisted now]
> > >
> > > Hello.
> > >
> > > What do you think of
> > >
> > > http://open-zfs.org/wiki/Projects/ZFS_Channel_Programs
> >
> > Domain not found.
>
> Must be an Australian problem ...

Probably. I forgot to stand on my head, so everything must have been sent to the server upside down.

Though it is a curious failure - it failed until I went to "openzfs.org", and that redirected to "open-zfs.org", and now it all works. Somewhat bizarre.

> A ZFS channel program (ZCP) is a small script written in a domain specific
> language that manipulates ZFS internals in a single, atomically-visible
> operation. For instance, to delete all snapshots of a filesystem a ZCP
> could be written which 1) generates the list of snapshots, 2) traverses
> that list, and 3) destroys each snapshot unconditionally. Because
> each of these statements would be evaluated from within the kernel,
> ZCPs can guarantee safety from interference with other concurrent ZFS
> modifications. Executing from inside the kernel allows us to guarantee
> atomic visibility of these operations (correctness) and allows them to
> be performed in a single transaction group (performance).
>
> A successful implementation of ZCP will:
>
> 1. Support equivalent functionality for all of the current ZFS commands
> with improved performance and correctness from the point of view of the
> user of ZFS.
>
> 2. Facilitate the quick addition of new and useful commands as
> ZCP enables the implementation of more powerful operations which
> previously would have been unsafe to implement in user programs, or
> would require modifications to the kernel for correctness.
> Since the ZCP layer guarantees the atomicity of each ZCP, we only
> need to write new sync_tasks for individual simple operations, then
> can use ZCPs to chain those simple operations together into more
> complicated operations.
>
> 3. Allow ZFS users to safely implement their own ZFS operations without
> performing operations they don't have the privileges for.
>
> 4. Improve the performance and correctness of existing applications
> built on ZFS operations.

/me goes and looks at the slides

Seems like they are trying to solve a problem of their own making, in that admin operations are run by the kernel from a separate task that is really, really slow. So this scripting is a method of aggregating multiple "sync tasks" into a single operation so there aren't delays between tasks.

/me chokes on slide 8/8

"Add a Lua interpreter to the kernel, implement ZFS intrinsics (...) as extensions to the Lua language"

Somehow, I don't see that happening in Linux. Yes, I can see us potentially adding some custom functionality to filesystems with eBPF (e.g. custom allocation policies), but I think admin operations need to be done from userspace through a clear, stable interface that supports all the necessary primitives to customise admin operations for different needs.

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
Re: Seeking Help on Corruption Issues
Hi, Stephen, On Tue, Oct 03, 2017 at 08:52:04PM +, Stephen Nesbitt wrote: > Here it is. There are a couple of out-of-order entries beginning at 117. And > yes I did uncover a bad stick of RAM: > > btrfs-progs v4.9.1 > leaf 2589782867968 items 134 free space 6753 generation 3351574 owner 2 > fs uuid 24b768c3-2141-44bf-ae93-1c3833c8c8e3 > chunk uuid 19ce12f0-d271-46b8-a691-e0d26c1790c6 [snip] > item 116 key (1623012749312 EXTENT_ITEM 45056) itemoff 10908 itemsize 53 > extent refs 1 gen 3346444 flags DATA > extent data backref root 271 objectid 2478 offset 0 count 1 > item 117 key (1621939052544 EXTENT_ITEM 8192) itemoff 10855 itemsize 53 > extent refs 1 gen 3346495 flags DATA > extent data backref root 271 objectid 21751764 offset 6733824 count 1 > item 118 key (1623012450304 EXTENT_ITEM 8192) itemoff 10802 itemsize 53 > extent refs 1 gen 3351513 flags DATA > extent data backref root 271 objectid 5724364 offset 680640512 count 1 > item 119 key (1623012802560 EXTENT_ITEM 12288) itemoff 10749 itemsize 53 > extent refs 1 gen 3346376 flags DATA > extent data backref root 271 objectid 21751764 offset 6701056 count 1 >>> hex(1623012749312) '0x179e3193000' >>> hex(1621939052544) '0x179a319e000' >>> hex(1623012450304) '0x179e314a000' >>> hex(1623012802560) '0x179e31a' That's "e" -> "a" in the fourth hex digit, which is a single-bit flip, and should be fixable by btrfs check (I think). However, even fixing that, it's not ordered, because 118 is then before 117, which could be another bitflip ("9" -> "4" in the 7th digit), but two bad bits that close to each other seems unlikely to me. Hugo. -- Hugo Mills | Great films about cricket: Silly Point Break hugo@... carfax.org.uk | http://carfax.org.uk/ | PGP: E2AB1DE4
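Hugo's single-bit-flip hypothesis is easy to check numerically: XOR the suspect key against a candidate corrected value and count the set bits. A minimal Python sketch follows; the "candidate" value is an assumption derived from the neighbouring keys (restoring the fourth hex digit to 'e'), not something read from the disk.

```python
# Item 117's key as found on disk, vs. an assumed correct value whose
# 4th hex digit matches items 116, 118 and 119 ('a' -> 'e').
on_disk = 0x179a319e000    # 1621939052544
candidate = 0x179e319e000  # hypothetical uncorrupted key

diff = on_disk ^ candidate
print(bin(diff).count("1"))   # number of differing bits
print(diff.bit_length() - 1)  # position of the (single) flipped bit
```

A single set bit in the XOR (here, bit 30) is exactly the pattern a one-bit memory corruption would produce, consistent with the bad DIMM found later in the thread.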
Re: Something like ZFS Channel Programs for BTRFS & probably XFS or even VFS?
On Wed, Oct 04, 2017 at 07:10:35AM +1100, Dave Chinner wrote: > On Tue, Oct 03, 2017 at 03:19:18PM +0200, Martin Steigerwald wrote: > > [repost. I didn´t notice autocompletion gave me wrong address for fsdevel, > > blacklisted now] > > > > Hello. > > > > What do you think of > > > > http://open-zfs.org/wiki/Projects/ZFS_Channel_Programs > > Domain not found. Must be an Australian problem ... A ZFS channel program (ZCP) is a small script written in a domain specific language that manipulate ZFS internals in a single, atomically-visible operation. For instance, to delete all snapshots of a filesystem a ZCP could be written which 1) generates the list of snapshots, 2) traverses that list, and 3) destroys each snapshot unconditionally. Because each of these statements would be evaluated from within the kernel, ZCPs can guarantee safety from interference with other concurrent ZFS modifications. Executing from inside the kernel allows us to guarantee atomic visibility of these operations (correctness) and allows them to be performed in a single transaction group (performance). A successful implementation of ZCP will: 1. Support equivalent functionality for all of the current ZFS commands with improved performance and correctness from the point of view of the user of ZFS. 2. Facilitate the quick addition of new and useful commands as ZCP enables the implementation of more powerful operations which previously would have been unsafe to implement in user programs, or would require modifications to the kernel for correctness. Since the ZCP layer guarantees the atomicity of each ZCP, we only need to write new sync_tasks for individual simple operations, then can use ZCPs to chain those simple operations together into more complicated operations. 3. Allow ZFS users to safely implement their own ZFS operations without performing operations they don’t have the privileges for. 4. Improve the performance and correctness of existing applications built on ZFS operations. 
Re: Something like ZFS Channel Programs for BTRFS & probably XFS or even VFS?
On 10/03/17 13:10, Dave Chinner wrote: > On Tue, Oct 03, 2017 at 03:19:18PM +0200, Martin Steigerwald wrote: >> [repost. I didn´t notice autocompletion gave me wrong address for fsdevel, >> blacklisted now] >> >> Hello. >> >> What do you think of >> >> http://open-zfs.org/wiki/Projects/ZFS_Channel_Programs > > Domain not found. It works for me. -- ~Randy
Re: Something like ZFS Channel Programs for BTRFS & probably XFS or even VFS?
On Tue, Oct 03, 2017 at 03:19:18PM +0200, Martin Steigerwald wrote: > [repost. I didn´t notice autocompletion gave me wrong address for fsdevel, > blacklisted now] > > Hello. > > What do you think of > > http://open-zfs.org/wiki/Projects/ZFS_Channel_Programs Domain not found. -Dave. -- Dave Chinner da...@fromorbit.com
Re: Seeking Help on Corruption Issues
On Tue, Oct 03, 2017 at 01:06:50PM -0700, Stephen Nesbitt wrote: > All: > > I came back to my computer yesterday to find my filesystem in read > only mode. Running a btrfs scrub start -dB aborts as follows: > > btrfs scrub start -dB /mnt > ERROR: scrubbing /mnt failed for device id 4: ret=-1, errno=5 > (Input/output error) > ERROR: scrubbing /mnt failed for device id 5: ret=-1, errno=5 > (Input/output error) > scrub device /dev/sdb (id 4) canceled > scrub started at Mon Oct 2 21:51:46 2017 and was aborted after > 00:09:02 > total bytes scrubbed: 75.58GiB with 1 errors > error details: csum=1 > corrected errors: 0, uncorrectable errors: 1, unverified errors: 0 > scrub device /dev/sdc (id 5) canceled > scrub started at Mon Oct 2 21:51:46 2017 and was aborted after > 00:11:11 > total bytes scrubbed: 50.75GiB with 0 errors > > The resulting dmesg is: > [ 699.534066] BTRFS error (device sdc): bdev /dev/sdb errs: wr 0, > rd 0, flush 0, corrupt 6, gen 0 > [ 699.703045] BTRFS error (device sdc): unable to fixup (regular) > error at logical 1609808347136 on dev /dev/sdb > [ 783.306525] BTRFS critical (device sdc): corrupt leaf, bad key > order: block=2589782867968, root=1, slot=116 This error usually means bad RAM. Can you show us the output of "btrfs-debug-tree -b 2589782867968 /dev/sdc"? Hugo. 
> [ 789.776132] BTRFS critical (device sdc): corrupt leaf, bad key > order: block=2589782867968, root=1, slot=116 > [ 911.529842] BTRFS critical (device sdc): corrupt leaf, bad key > order: block=2589782867968, root=1, slot=116 > [ 918.365225] BTRFS critical (device sdc): corrupt leaf, bad key > order: block=2589782867968, root=1, slot=116 > > Running btrfs check /dev/sdc results in: > btrfs check /dev/sdc > Checking filesystem on /dev/sdc > UUID: 24b768c3-2141-44bf-ae93-1c3833c8c8e3 > checking extents > bad key ordering 116 117 > bad block 2589782867968 > ERROR: errors found in extent allocation tree or chunk allocation > checking free space cache > There is no free space entry for 1623012450304-1623012663296 > There is no free space entry for 1623012450304-1623225008128 > cache appears valid but isn't 1622151266304 > found 288815742976 bytes used err is -22 > total csum bytes: 0 > total tree bytes: 350781440 > total fs tree bytes: 0 > total extent tree bytes: 350027776 > btree space waste bytes: 115829777 > file data blocks allocated: 156499968 > > uname -a: > Linux sysresccd 4.9.24-std500-amd64 #2 SMP Sat Apr 22 17:14:43 UTC > 2017 x86_64 Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz GenuineIntel > GNU/Linux > > btrfs --version: btrfs-progs v4.9.1 > > btrfs fi show: > Label: none uuid: 24b768c3-2141-44bf-ae93-1c3833c8c8e3 > Total devices 2 FS bytes used 475.08GiB > devid 4 size 931.51GiB used 612.06GiB path /dev/sdb > devid 5 size 931.51GiB used 613.09GiB path /dev/sdc > > btrfs fi df /mnt: > Data, RAID1: total=603.00GiB, used=468.03GiB > System, RAID1: total=64.00MiB, used=112.00KiB > System, single: total=32.00MiB, used=0.00B > Metadata, RAID1: total=9.00GiB, used=7.04GiB > Metadata, single: total=1.00GiB, used=0.00B > GlobalReserve, single: total=512.00MiB, used=0.00B > > What is the recommended procedure at this point? Run btrfs check > --repair? 
I have backups so losing a file or two isn't critical, but > I really don't want to go through the effort of a bare metal > reinstall. > > In the process of researching this I did uncover a bad DIMM. Am I > correct that the problems I'm seeing are likely linked to the > resulting memory errors. > > Thx in advance, > > -steve > -- Hugo Mills | Quidquid latine dictum sit, altum videtur hugo@... carfax.org.uk | http://carfax.org.uk/ | PGP: E2AB1DE4
Seeking Help on Corruption Issues
All: I came back to my computer yesterday to find my filesystem in read only mode. Running a btrfs scrub start -dB aborts as follows: btrfs scrub start -dB /mnt ERROR: scrubbing /mnt failed for device id 4: ret=-1, errno=5 (Input/output error) ERROR: scrubbing /mnt failed for device id 5: ret=-1, errno=5 (Input/output error) scrub device /dev/sdb (id 4) canceled scrub started at Mon Oct 2 21:51:46 2017 and was aborted after 00:09:02 total bytes scrubbed: 75.58GiB with 1 errors error details: csum=1 corrected errors: 0, uncorrectable errors: 1, unverified errors: 0 scrub device /dev/sdc (id 5) canceled scrub started at Mon Oct 2 21:51:46 2017 and was aborted after 00:11:11 total bytes scrubbed: 50.75GiB with 0 errors The resulting dmesg is: [ 699.534066] BTRFS error (device sdc): bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 6, gen 0 [ 699.703045] BTRFS error (device sdc): unable to fixup (regular) error at logical 1609808347136 on dev /dev/sdb [ 783.306525] BTRFS critical (device sdc): corrupt leaf, bad key order: block=2589782867968, root=1, slot=116 [ 789.776132] BTRFS critical (device sdc): corrupt leaf, bad key order: block=2589782867968, root=1, slot=116 [ 911.529842] BTRFS critical (device sdc): corrupt leaf, bad key order: block=2589782867968, root=1, slot=116 [ 918.365225] BTRFS critical (device sdc): corrupt leaf, bad key order: block=2589782867968, root=1, slot=116 Running btrfs check /dev/sdc results in: btrfs check /dev/sdc Checking filesystem on /dev/sdc UUID: 24b768c3-2141-44bf-ae93-1c3833c8c8e3 checking extents bad key ordering 116 117 bad block 2589782867968 ERROR: errors found in extent allocation tree or chunk allocation checking free space cache There is no free space entry for 1623012450304-1623012663296 There is no free space entry for 1623012450304-1623225008128 cache appears valid but isn't 1622151266304 found 288815742976 bytes used err is -22 total csum bytes: 0 total tree bytes: 350781440 total fs tree bytes: 0 total extent tree bytes: 
350027776 btree space waste bytes: 115829777 file data blocks allocated: 156499968 uname -a: Linux sysresccd 4.9.24-std500-amd64 #2 SMP Sat Apr 22 17:14:43 UTC 2017 x86_64 Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz GenuineIntel GNU/Linux btrfs --version: btrfs-progs v4.9.1 btrfs fi show: Label: none uuid: 24b768c3-2141-44bf-ae93-1c3833c8c8e3 Total devices 2 FS bytes used 475.08GiB devid 4 size 931.51GiB used 612.06GiB path /dev/sdb devid 5 size 931.51GiB used 613.09GiB path /dev/sdc btrfs fi df /mnt: Data, RAID1: total=603.00GiB, used=468.03GiB System, RAID1: total=64.00MiB, used=112.00KiB System, single: total=32.00MiB, used=0.00B Metadata, RAID1: total=9.00GiB, used=7.04GiB Metadata, single: total=1.00GiB, used=0.00B GlobalReserve, single: total=512.00MiB, used=0.00B What is the recommended procedure at this point? Run btrfs check --repair? I have backups so losing a file or two isn't critical, but I really don't want to go through the effort of a bare metal reinstall. In the process of researching this I did uncover a bad DIMM. Am I correct that the problems I'm seeing are likely linked to the resulting memory errors? Thx in advance, -steve
[PATCH] btrfs: avoid overflow when sector_t is 32 bit
From: Goffredo Baroncelli Jean-Denis Girard noticed commit c821e7f3 "pass bytes to btrfs_bio_alloc" (https://patchwork.kernel.org/patch/9763081/) introduces a regression on 32 bit machines. When CONFIG_LBDAF is _not_ defined (CONFIG_LBDAF == Support for large (2TB+) block devices and files) sector_t is 32 bit on 32bit machines. In the function submit_extent_page, 'sector' (which is sector_t type) is multiplied by 512 to convert it from sectors to bytes, leading to an overflow when the disk is bigger than 4GB (!). I added a cast to u64 to avoid overflow. Based on v4.14-rc3. Signed-off-by: Goffredo Baroncelli Tested-by: Jean-Denis Girard --- fs/btrfs/extent_io.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 12ab19a4b93e..970190cd347e 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2801,7 +2801,7 @@ static int submit_extent_page(unsigned int opf, struct extent_io_tree *tree, } } - bio = btrfs_bio_alloc(bdev, sector << 9); + bio = btrfs_bio_alloc(bdev, (u64)sector << 9); bio_add_page(bio, page, page_size, offset); bio->bi_end_io = end_io_func; bio->bi_private = tree; -- 2.14.2 -- gpg @keyserver.linux.it: Goffredo Baroncelli Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
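The overflow the patch fixes can be reproduced with plain integer arithmetic. The Python sketch below models a 32-bit `sector_t`; the masking stands in for C's 32-bit truncation, and the device offset is an arbitrary example value:

```python
# With CONFIG_LBDAF unset, sector_t is 32 bits wide, so `sector << 9`
# wraps once the byte offset passes 4 GiB; widening to u64 first avoids it.
U32_MASK = 0xFFFFFFFF

def byte_offset_32bit(sector: int) -> int:
    """What a 32-bit `sector << 9` computes (truncated)."""
    return (sector << 9) & U32_MASK

def byte_offset_u64(sector: int) -> int:
    """What the patched `(u64)sector << 9` computes."""
    return sector << 9

sector = 10 * 1024**3 // 512          # a sector 10 GiB into the device
print(byte_offset_32bit(sector))      # wraps to 2147483648 (2 GiB)
print(byte_offset_u64(sector))        # 10737418240 (the real 10 GiB)
```

The truncated value silently points the bio 8 GiB short of the intended offset, which is why the bug only shows up on devices larger than the wrap point.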
Re: [PATCH] btrfs-progs: misc-test: use raid1 for data to enable mount with -o degraded
On Tue, Oct 03, 2017 at 03:47:26PM +0900, Misono, Tomohiro wrote: > kernel 4.14 introduces new function for checking if all chunks is ok for > mount with -o degraded option. > > commit 21634a19f646 ("btrfs: Introduce a function to check if all > chunks a OK for degraded rw mount") > > As a result, raid0 profile cannot be mounted with -o degraded on 4.14. > This causes failure of the misc-test 011 "delete missing device". > > Fix this by using raid1 profile for both data and metadata. > This also should work for kernel before 4.13. > > Signed-off-by: Tomohiro Misono Applied, thanks.
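The degraded-mount rule the commit message describes can be sketched as a tolerance check: each chunk profile tolerates some number of missing devices, and the filesystem is mountable `-o degraded` only if every chunk's tolerance covers the missing count. This Python model is a simplification of the kernel logic, with assumed tolerance values, not the actual implementation:

```python
# Max missing devices each profile can tolerate (simplified assumption).
TOLERANCE = {"single": 0, "dup": 0, "raid0": 0,
             "raid1": 1, "raid10": 1, "raid5": 1, "raid6": 2}

def mountable_degraded(chunk_profiles, missing_devices):
    """Every chunk must survive the given number of missing devices."""
    return all(TOLERANCE[p] >= missing_devices for p in chunk_profiles)

print(mountable_degraded(["raid0", "raid1"], 1))  # raid0 data fails the test
print(mountable_degraded(["raid1", "raid1"], 1))  # all-raid1, as in the fix
```

This is why the misc-test had to switch both data and metadata to raid1: a single raid0 data chunk is enough to make the whole degraded mount fail.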
Re: [PATCH] Btrfs: fix fs_info->flags value
On Mon, Oct 02, 2017 at 05:34:12PM +0900, Tsutomu Itoh wrote: > Because the values of BTRFS_FS_QUOTA_OVERRIDE and BTRFS_FS_EXCL_OP overlap, > we should change the value. > > Signed-off-by: Tsutomu Itoh Please write a more descriptive subject and changelog. > --- > fs/btrfs/ctree.h | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h > index 899ddaeeacec..566c0ba8dfb8 100644 > --- a/fs/btrfs/ctree.h > +++ b/fs/btrfs/ctree.h > @@ -717,12 +717,11 @@ struct btrfs_delayed_root; > #define BTRFS_FS_QUOTA_OVERRIDE 14 > /* Used to record internally whether fs has been frozen */ > #define BTRFS_FS_FROZEN 15 > - Unrelated change. > /* > * Indicate that a whole-filesystem exclusive operation is running > * (device replace, resize, device add/delete, balance) > */ > -#define BTRFS_FS_EXCL_OP 14 > +#define BTRFS_FS_EXCL_OP 16 Strange how this could have got there. I was suspecting a mis-merge but the patches for number 14 went in in different releases so this actually slipped through the review. Please update and resend the patch with the following tags: Fixes: f29efe292198b ("btrfs: add quota override flag to enable quota override for CAP_SYS_RESOURCE") CC: sta...@vger.kernel.org # 4.13+
[PATCH v8 2/2] btrfs: check device for critical errors and mark failed
From: Anand Jain Write and flush errors are critical errors, upon which the device fd must be closed and marked as failed. There are two type of device close in btrfs, one, close as part of clean up where we shall release the struct btrfs_device and or btrfs_fs_devices as well. And the other type which is introduced here is where we close the device fd for the reason that it has failed and the mounted FS is still present using the other redundant device. In this new case we shall keep the failed device's struct btrfs_device similar to missing device. Further the approach here is to monitor the device statistics and trigger the action based on one or more device state. Signed-off-by: Anand Jain Tested-by: Austin S. Hemmelgarn --- V8: General misc cleanup. Based on v4.14-rc2 fs/btrfs/ctree.h | 2 ++ fs/btrfs/disk-io.c | 78 +- fs/btrfs/volumes.c | 1 + fs/btrfs/volumes.h | 4 +++ 4 files changed, 84 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 5a8933da39a7..bad8fbaff18d 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -824,6 +824,7 @@ struct btrfs_fs_info { struct mutex tree_log_mutex; struct mutex transaction_kthread_mutex; struct mutex cleaner_mutex; + struct mutex health_mutex; struct mutex chunk_mutex; struct mutex volume_mutex; @@ -941,6 +942,7 @@ struct btrfs_fs_info { struct btrfs_workqueue *extent_workers; struct task_struct *transaction_kthread; struct task_struct *cleaner_kthread; + struct task_struct *health_kthread; int thread_pool_size; struct kobject *space_info_kobj; diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 487bbe4fb3c6..be22104bafbf 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1922,6 +1922,70 @@ static int cleaner_kthread(void *arg) return 0; } +static void btrfs_check_device_fatal_errors(struct btrfs_root *root) +{ + struct btrfs_device *device; + struct btrfs_fs_info *fs_info = root->fs_info; + + /* Mark devices with write or flush errors as failed. 
*/ + mutex_lock(&fs_info->volume_mutex); + list_for_each_entry_rcu(device, + &fs_info->fs_devices->devices, dev_list) { + int c_err; + + if (device->failed) + continue; + + /* Todo: Skip replace target for now. */ + if (device->is_tgtdev_for_dev_replace) + continue; + if (!device->dev_stats_valid) + continue; + + c_err = atomic_read(&device->new_critical_errs); + atomic_sub(c_err, &device->new_critical_errs); + if (c_err) { + btrfs_crit_in_rcu(fs_info, + "%s: Fatal write/flush error", + rcu_str_deref(device->name)); + btrfs_mark_device_failed(device); + } + } + mutex_unlock(&fs_info->volume_mutex); +} + +static int health_kthread(void *arg) +{ + struct btrfs_root *root = arg; + + do { + /* Todo rename the below function */ + if (btrfs_need_cleaner_sleep(root->fs_info)) + goto sleep; + + if (!mutex_trylock(&root->fs_info->health_mutex)) + goto sleep; + + if (btrfs_need_cleaner_sleep(root->fs_info)) { + mutex_unlock(&root->fs_info->health_mutex); + goto sleep; + } + + /* Check devices health */ + btrfs_check_device_fatal_errors(root); + + mutex_unlock(&root->fs_info->health_mutex); + +sleep: + set_current_state(TASK_INTERRUPTIBLE); + if (!kthread_should_stop()) + schedule(); + __set_current_state(TASK_RUNNING); + } while (!kthread_should_stop()); + + return 0; +} + static int transaction_kthread(void *arg) { struct btrfs_root *root = arg; @@ -1969,6 +2033,7 @@ static int transaction_kthread(void *arg) btrfs_end_transaction(trans); } sleep: + wake_up_process(fs_info->health_kthread); wake_up_process(fs_info->cleaner_kthread); mutex_unlock(&fs_info->transaction_kthread_mutex); @@ -2713,6 +2778,7 @@ int open_ctree(struct super_block *sb, mutex_init(&fs_info->chunk_mutex); mutex_init(&fs_info->transaction_kthread_mutex); mutex_init(&fs_info->cleaner_mutex); + mutex_init(&fs_info->health_mutex); mutex_init(&fs_info->volume_mutex); mutex_init(&fs_info->ro_block_group_mutex); init_rwsem(&fs_info->commit_root_sem); @@ -3049,11 +3115,16 @@ int open_ctree(struct super_block 
*sb, if (IS_ERR(fs_info->cleaner_kthread)
[PATCH v8 1/2] btrfs: introduce device dynamic state transition to failed
From: Anand Jain This patch provides helper functions to force a device to failed, and we need it for the following reasons, 1) a. It can be reported that device has failed when it does and b. Close the device when it goes offline so that blocklayer can cleanup 2) Identify the candidate for the auto replace 3) Stop further RW to the failing device and 4) A device in the multi device btrfs may fail, but as of now in some system config whole of btrfs gets unmounted. Signed-off-by: Anand Jain Tested-by: Austin S. Hemmelgarn --- V8: General misc cleanup. Based on v4.14-rc2 fs/btrfs/volumes.c | 104 + fs/btrfs/volumes.h | 15 +++- 2 files changed, 118 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 0e8f16c305df..06e7cf4cef81 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -7255,3 +7255,107 @@ void btrfs_reset_fs_info_ptr(struct btrfs_fs_info *fs_info) fs_devices = fs_devices->seed; } } + +static void do_close_device(struct work_struct *work) +{ + struct btrfs_device *device; + + device = container_of(work, struct btrfs_device, rcu_work); + + if (device->closing_bdev) + blkdev_put(device->closing_bdev, device->mode); + + device->closing_bdev = NULL; +} + +static void btrfs_close_one_device(struct rcu_head *head) +{ + struct btrfs_device *device; + + device = container_of(head, struct btrfs_device, rcu); + + INIT_WORK(&device->rcu_work, do_close_device); + schedule_work(&device->rcu_work); +} + +void btrfs_force_device_close(struct btrfs_device *device) +{ + struct btrfs_fs_info *fs_info; + struct btrfs_fs_devices *fs_devices; + + fs_devices = device->fs_devices; + fs_info = fs_devices->fs_info; + + btrfs_sysfs_rm_device_link(fs_devices, device); + + mutex_lock(&fs_devices->device_list_mutex); + mutex_lock(&fs_devices->fs_info->chunk_mutex); + + btrfs_assign_next_active_device(fs_devices->fs_info, device, NULL); + + if (device->bdev) + fs_devices->open_devices--; + + if (device->writeable) { + 
list_del_init(&device->dev_alloc_list); + fs_devices->rw_devices--; + } + device->writeable = 0; + + /* +* Todo: We have miss-used missing flag all around, and here +* too for now. (In the long run I want to keep missing to only +* indicate that it was not present when RAID was assembled.) +*/ + device->missing = 1; + fs_devices->missing_devices++; + device->closing_bdev = device->bdev; + device->bdev = NULL; + + call_rcu(&device->rcu, btrfs_close_one_device); + + mutex_unlock(&fs_devices->fs_info->chunk_mutex); + mutex_unlock(&fs_devices->device_list_mutex); + + rcu_barrier(); + + btrfs_warn_in_rcu(fs_info, "device %s failed", + rcu_str_deref(device->name)); + + /* +* We lost one/more disk, which means its not as it +* was configured by the user. Show mount should show +* degraded. +*/ + btrfs_set_opt(fs_info->mount_opt, DEGRADED); + + /* +* Now having lost one of the device, check if chunk stripe +* is incomplete and handle fatal error if needed. +*/ + if (!btrfs_check_rw_degradable(fs_info)) + btrfs_handle_fs_error(fs_info, -EIO, + "devices below critical level"); +} + +void btrfs_mark_device_failed(struct btrfs_device *dev) +{ + struct btrfs_fs_devices *fs_devices = dev->fs_devices; + + /* This shouldn't be called if device is already missing */ + if (dev->missing || !dev->bdev) + return; + if (dev->failed) + return; + dev->failed = 1; + + /* Last RW device is requested to force close let FS handle it. */ + if (fs_devices->rw_devices == 1) { + btrfs_handle_fs_error(fs_devices->fs_info, -EIO, + "Last RW device failed"); + return; + } + + /* Point of no return start here. 
*/ + btrfs_force_device_close(dev); +} diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 6108fdfec67f..05b150c03995 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -65,13 +65,26 @@ struct btrfs_device { struct btrfs_pending_bios pending_sync_bios; struct block_device *bdev; + struct block_device *closing_bdev; /* the mode sent to blkdev_get */ fmode_t mode; int writeable; int in_fs_metadata; + /* missing: device not found at the time of mount */ int missing; + /* failed: device confirmed to have experienced critical io failure */ + int failed; + /* + * offline: system or user or block layer transport has removed +
[PATCH v8 0/2] [RFC] Introduce device state 'failed'
When one device fails, it has to be closed and marked as failed. Further, it needs a sysfs (or similar) interface to provide complete information about the device and volume status to user land from the kernel. Next, when the disappeared device reappears, we need to resilver/resync it depending on the RAID profile, which should be handled per RAID profile. The efforts here are to fix the above three missing items. To begin with, this patch set brings a device with write/flush failures to a failed state. The next part, bringing the device back to the alloc list, verifying its consistency, and kicking off the re-silvering, is still WIP, and feedback helps. For RAID1, converting the single raid profile back to all raid1 will help. For RAID56 I am counting on Luibo's recent RAID56 write hole work; I have yet to look deeper at that. Next, for RAID1 there can be a split-brain scenario where each of the devices was mounted independently; to fix this I am planning to set a (new) incompatible flag if either device is written without the other. Now, when they are brought together, the incompatible flag should be there on only one of the devices; however, if the incompatible flag is on both devices, then it's a split-brain scenario where user intervention will be required. On the sysfs part there are patches in the ML which were sent before; I shall be reviving them as well. Thanks, Anand Anand Jain (2): btrfs: introduce device dynamic state transition to failed btrfs: check device for critical errors and mark failed fs/btrfs/ctree.h | 2 + fs/btrfs/disk-io.c | 78 ++- fs/btrfs/volumes.c | 105 + fs/btrfs/volumes.h | 19 +- 4 files changed, 202 insertions(+), 2 deletions(-) -- 2.7.0
Re: Lost about 3TB
On Tue, Oct 03, 2017 at 05:45:54PM +0200, fred.lar...@free.fr wrote: > Hi, > > > > What does "btrfs sub list -a /RAID01/" say? > Nothing (no lines displayed) > > > Also "grep /RAID01/ /proc/self/mountinfo"? > Nothing (no lines displayed) > > > Also server has been rebooted many times and no process has left "deleted > open files" on the volume (lsof...). OK. The second command (the grep) was incorrect -- I should have omitted the slashes. However, it doesn't matter too much, because the first command indicates that you don't have any subvolumes or snapshots anyway. This means that you're probably looking at the kind of issue Timofey mentioned in his mail, where writes into the middle of an existing extent don't free up the overwritten data. This is most likely to happen on database or VM files, but could happen on others, depending on the application and how it uses files. Since you don't seem to have any snapshots, I _think_ you can deal with the issue most easily by defragmenting the affected files. It's worth just getting a second opinion on this one before you try it for the whole FS. I'm not 100% sure about what defrag will do in this case, and there are some people round here who have investigated the behaviour of partially-overwritten extents in more detail than I have. Hugo. > Fred. > > > - Mail original - > De: "Hugo Mills - h...@carfax.org.uk" > > À: "btrfs fredo" > Cc: linux-btrfs@vger.kernel.org > Envoyé: Mardi 3 Octobre 2017 12:54:05 > Objet: Re: Lost about 3TB > > On Tue, Oct 03, 2017 at 12:44:29PM +0200, btrfs.fr...@xoxy.net wrote: > > Hi, > > > > I can't figure out were 3TB on a 36 TB BTRFS volume (on LVM) are gone ! > > > > I know BTRFS can be tricky when speaking about space usage when using many > > physical drives in a RAID setup, but my conf is a very simple BTRFS volume > > without RAID(single Data type) using the whole disk (perhaps did I do > > something wrong with the LVM setup ?). > > > > My BTRFS volume is mounted on /RAID01/. 
> > > > There's only one folder in /RAID01/ shared with Samba, Windows also see a > > total of 28 TB used. > > > > It only contains 443 files (big backup files created by Veeam), most of the > > file size is greater than 1GB and be be up to 5TB. > > > > ##> du -hs /RAID01/ > > 28T /RAID01/ > > > > If I sum up the result of : ##> find . -printf '%s\n' > > I also find 28TB. > > > > I extracted btrfs binary from rpm version v4.9.1 and used ##> btrfs fi > > du > > on each file and the result is 28TB. > >The conclusion here is that there are things that aren't being > found by these processes. This is usually in the form of dot-files > (but I think you've covered that case in what you did above) or > snapshots/subvolumes outside the subvol you've mounted. > >What does "btrfs sub list -a /RAID01/" say? >Also "grep /RAID01/ /proc/self/mountinfo"? > >There are other possibilities for missing space, but let's cover > the obvious ones first. > >Hugo. > > > OS : CentOS Linux release 7.3.1611 (Core) > > btrfs-progs v4.4.1 > > > > > > ##> ssm list > > > > - > > DeviceFree Used Total Pool Mount point > > - > > /dev/sda 36.39 TB PARTITIONED > > /dev/sda1 200.00 MB /boot/efi > > /dev/sda2 1.00 GB /boot > > /dev/sda3 0.00 KB 36.32 TB 36.32 TB lvm_pool > > /dev/sda4 0.00 KB 54.00 GB 54.00 GB cl_xxx-xxxamrepo-01 > > - > > --- > > PoolType Devices Free Used Total > > --- > > cl_xxx-xxxamrepo-01 lvm10.00 KB 54.00 GB 54.00 GB > > lvm_poollvm10.00 KB 36.32 TB 36.32 TB > > btrfs_lvm_pool-lvol001 btrfs 14.84 TB 36.32 TB 36.32 TB > > --- > > - > > Volume PoolVolume size FS > > FS size Free TypeMount point > > - > > /dev/cl_xxx-xxxamrepo-01/root cl_xxx-xxxamrepo-0150.00 GB xfs > > 49.97 GB 48.50 GB linear / > > /dev/cl_xxx-xxxamrepo-01/swap cl_xxx-xxxamrepo-01 4.00 GB > > linear > > /dev/lvm_pool/lvol001 lvm_pool 36.32 TB > >
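Hugo's explanation of partially-overwritten extents can be illustrated with a toy accounting model. This is a deliberate simplification of btrfs's real extent bookkeeping, with made-up structures, just to show why `du` (referenced bytes) and the allocated total diverge:

```python
# Toy model: a whole extent stays allocated as long as any part of it is
# still referenced; rewriting its middle allocates a brand-new extent.
extents = []  # each: {"length": bytes allocated, "referenced": bytes live}

def write_extent(length):
    extents.append({"length": length, "referenced": length})

def overwrite_middle(extent, length):
    # Old head and tail are still referenced, so the old extent cannot be
    # freed; the new data lands in a fresh extent.
    extent["referenced"] -= length
    write_extent(length)

write_extent(5 * 1024**3)               # a 5 GiB backup file
overwrite_middle(extents[0], 1024**3)   # rewrite 1 GiB in the middle

referenced = sum(e["referenced"] for e in extents)
allocated = sum(e["length"] for e in extents)
print(referenced // 1024**3)  # 5  -> what du and find report
print(allocated // 1024**3)   # 6  -> what the filesystem has pinned
```

Repeated in-place rewrites of large backup/VM files compound this, which matches the multi-TB discrepancy Fred is seeing; defragmenting rewrites the files into fresh extents and lets the old ones go.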
Re: Lost about 3TB
Hi, > What does "btrfs sub list -a /RAID01/" say? Nothing (no lines displayed) > Also "grep /RAID01/ /proc/self/mountinfo"? Nothing (no lines displayed) Also server has been rebooted many times and no process has left "deleted open files" on the volume (lsof...). Fred. - Mail original - De: "Hugo Mills - h...@carfax.org.uk" À: "btrfs fredo" Cc: linux-btrfs@vger.kernel.org Envoyé: Mardi 3 Octobre 2017 12:54:05 Objet: Re: Lost about 3TB On Tue, Oct 03, 2017 at 12:44:29PM +0200, btrfs.fr...@xoxy.net wrote: > Hi, > > I can't figure out were 3TB on a 36 TB BTRFS volume (on LVM) are gone ! > > I know BTRFS can be tricky when speaking about space usage when using many > physical drives in a RAID setup, but my conf is a very simple BTRFS volume > without RAID(single Data type) using the whole disk (perhaps did I do > something wrong with the LVM setup ?). > > My BTRFS volume is mounted on /RAID01/. > > There's only one folder in /RAID01/ shared with Samba, Windows also see a > total of 28 TB used. > > It only contains 443 files (big backup files created by Veeam), most of the > file size is greater than 1GB and be be up to 5TB. > > ##> du -hs /RAID01/ > 28T /RAID01/ > > If I sum up the result of : ##> find . -printf '%s\n' > I also find 28TB. > > I extracted btrfs binary from rpm version v4.9.1 and used ##> btrfs fi du > on each file and the result is 28TB. The conclusion here is that there are things that aren't being found by these processes. This is usually in the form of dot-files (but I think you've covered that case in what you did above) or snapshots/subvolumes outside the subvol you've mounted. What does "btrfs sub list -a /RAID01/" say? Also "grep /RAID01/ /proc/self/mountinfo"? There are other possibilities for missing space, but let's cover the obvious ones first. Hugo. 
> OS : CentOS Linux release 7.3.1611 (Core) > btrfs-progs v4.4.1 > > > ##> ssm list > > - > DeviceFree Used Total Pool Mount point > - > /dev/sda 36.39 TB PARTITIONED > /dev/sda1 200.00 MB /boot/efi > /dev/sda2 1.00 GB /boot > /dev/sda3 0.00 KB 36.32 TB 36.32 TB lvm_pool > /dev/sda4 0.00 KB 54.00 GB 54.00 GB cl_xxx-xxxamrepo-01 > - > --- > PoolType Devices Free Used Total > --- > cl_xxx-xxxamrepo-01 lvm10.00 KB 54.00 GB 54.00 GB > lvm_poollvm10.00 KB 36.32 TB 36.32 TB > btrfs_lvm_pool-lvol001 btrfs 14.84 TB 36.32 TB 36.32 TB > --- > - > Volume PoolVolume size FS > FS size Free TypeMount point > - > /dev/cl_xxx-xxxamrepo-01/root cl_xxx-xxxamrepo-0150.00 GB xfs > 49.97 GB 48.50 GB linear / > /dev/cl_xxx-xxxamrepo-01/swap cl_xxx-xxxamrepo-01 4.00 GB > linear > /dev/lvm_pool/lvol001 lvm_pool 36.32 TB > linear /RAID01 > btrfs_lvm_pool-lvol001 btrfs_lvm_pool-lvol001 36.32 TB btrfs > 36.32 TB4.84 TB btrfs /RAID01 > /dev/sda1200.00 MB vfat > part/boot/efi > /dev/sda2 1.00 GB xfs > 1015.00 MB 882.54 MB part/boot > - > > > ##> btrfs fi sh > > Label: none uuid: df7ce232-056a-4c27-bde4-6f785d5d9f68 > Total devices 1 FS bytes used 31.48TiB > devid1 size 36.32TiB used 31.66TiB path > /dev/mapper/lvm_pool-lvol001 > > > > ##> btrfs fi df /RAID01/ > > Data, single: total=31.58TiB, used=31.44TiB > System, DUP: total=8.00MiB, used=3.67MiB > Metadata, DUP: total=38.00GiB, used=35.37GiB > GlobalReserve, single: total=512.00MiB, used=0.00B > > > > I tried to repair it : > > > ##> btrfs check --repair -p /dev/mapper/lvm_pool-lvol001 > > enabling repair mode > Checking filesystem on /dev/mapper/lvm_pool-lvol001 > UUID: df7ce232-056a-4c27-bde4-6f785d5d9f68 > checking extents > Fixed 0 roo
[PATCH 1/4] Btrfs: compress_file_range() remove dead variable num_bytes
Remove the dead assignment of num_bytes. Also, since num_bytes is only used in the will_compress block as a copy of total_in, just replace it with total_in and drop num_bytes entirely.

Signed-off-by: Timofey Titovets
Reviewed-by: Nikolay Borisov
---
 fs/btrfs/inode.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index b728397ba6e1..237df8fdf7b8 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -458,7 +458,6 @@ static noinline void compress_file_range(struct inode *inode,
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct btrfs_root *root = BTRFS_I(inode)->root;
-	u64 num_bytes;
 	u64 blocksize = fs_info->sectorsize;
 	u64 actual_end;
 	u64 isize = i_size_read(inode);
@@ -508,8 +507,6 @@ static noinline void compress_file_range(struct inode *inode,
 	total_compressed = min_t(unsigned long, total_compressed,
 			BTRFS_MAX_UNCOMPRESSED);
 
-	num_bytes = ALIGN(end - start + 1, blocksize);
-	num_bytes = max(blocksize, num_bytes);
 	total_in = 0;
 	ret = 0;
 
@@ -628,7 +625,6 @@ static noinline void compress_file_range(struct inode *inode,
 		 */
 		total_in = ALIGN(total_in, PAGE_SIZE);
 		if (total_compressed + blocksize <= total_in) {
-			num_bytes = total_in;
 			*num_added += 1;
 
 			/*
@@ -636,12 +632,12 @@ static noinline void compress_file_range(struct inode *inode,
 			 * allocation on disk for these compressed pages, and
 			 * will submit them to the elevator.
 			 */
-			add_async_extent(async_cow, start, num_bytes,
+			add_async_extent(async_cow, start, total_in,
 					total_compressed, pages, nr_pages,
 					compress_type);
 
-			if (start + num_bytes < end) {
-				start += num_bytes;
+			if (start + total_in < end) {
+				start += total_in;
 				pages = NULL;
 				cond_resched();
 				goto again;
-- 
2.14.2
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/4] Btrfs: btrfs_dedupe_file_range() ioctl, remove 16MiB restriction
Currently btrfs_dedupe_file_range() is restricted to a 16MiB range, to limit locking time and the memory requirements of the dedup ioctl(). For too big an input range, the code silently truncates the range to 16MiB.

Let's remove that restriction by iterating over the dedup range. That's backward compatible and will not change anything for requests smaller than 16MiB.

Signed-off-by: Timofey Titovets
Reviewed-by: Qu Wenruo
---
 fs/btrfs/ioctl.c | 22 ++
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 31407c62da63..4b468e5dfa11 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3200,11 +3200,9 @@ ssize_t btrfs_dedupe_file_range(struct file *src_file, u64 loff, u64 olen,
 	struct inode *src = file_inode(src_file);
 	struct inode *dst = file_inode(dst_file);
 	u64 bs = BTRFS_I(src)->root->fs_info->sb->s_blocksize;
+	u64 i, tail_len, chunk_count;
 	ssize_t res;
 
-	if (olen > BTRFS_MAX_DEDUPE_LEN)
-		olen = BTRFS_MAX_DEDUPE_LEN;
-
 	if (WARN_ON_ONCE(bs < PAGE_SIZE)) {
 		/*
 		 * Btrfs does not support blocksize < page_size. As a
@@ -3214,7 +3212,23 @@ ssize_t btrfs_dedupe_file_range(struct file *src_file, u64 loff, u64 olen,
 		return -EINVAL;
 	}
 
-	res = btrfs_extent_same(src, loff, olen, dst, dst_loff);
+	tail_len = olen % BTRFS_MAX_DEDUPE_LEN;
+	chunk_count = div_u64(olen, BTRFS_MAX_DEDUPE_LEN);
+
+	for (i = 0; i < chunk_count; i++) {
+		res = btrfs_extent_same(src, loff, BTRFS_MAX_DEDUPE_LEN,
+					dst, dst_loff);
+		if (res)
+			return res;
+
+		loff += BTRFS_MAX_DEDUPE_LEN;
+		dst_loff += BTRFS_MAX_DEDUPE_LEN;
+	}
+
+	if (tail_len > 0)
+		res = btrfs_extent_same(src, loff, tail_len,
+					dst, dst_loff);
+
 	if (res)
 		return res;
 	return olen;
-- 
2.14.2
[PATCH 3/4] Btrfs: handle unaligned tail of data ranges more efficient
Currently, while switching page bits in data ranges, we always handle +1 page, to cover the case where the end of the data range is not page aligned.

Let's handle that case more obviously and efficiently: check the end alignment directly and touch the +1 page only when needed.

Signed-off-by: Timofey Titovets
---
 fs/btrfs/extent_io.c | 12 ++--
 fs/btrfs/inode.c | 6 +-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 0538bf85adc3..131b7d1df9f7 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1359,7 +1359,11 @@ void extent_range_clear_dirty_for_io(struct inode *inode, u64 start, u64 end)
 	unsigned long end_index = end >> PAGE_SHIFT;
 	struct page *page;
 
-	while (index <= end_index) {
+	/* Don't miss unaligned end */
+	if (!IS_ALIGNED(end, PAGE_SIZE))
+		end_index++;
+
+	while (index < end_index) {
 		page = find_get_page(inode->i_mapping, index);
 		BUG_ON(!page); /* Pages should be in the extent_io_tree */
 		clear_page_dirty_for_io(page);
@@ -1374,7 +1378,11 @@ void extent_range_redirty_for_io(struct inode *inode, u64 start, u64 end)
 	unsigned long end_index = end >> PAGE_SHIFT;
 	struct page *page;
 
-	while (index <= end_index) {
+	/* Don't miss unaligned end */
+	if (!IS_ALIGNED(end, PAGE_SIZE))
+		end_index++;
+
+	while (index < end_index) {
 		page = find_get_page(inode->i_mapping, index);
 		BUG_ON(!page); /* Pages should be in the extent_io_tree */
 		__set_page_dirty_nobuffers(page);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index b6e81bd650ea..b4974d969f67 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -10799,7 +10799,11 @@ void btrfs_set_range_writeback(void *private_data, u64 start, u64 end)
 	unsigned long end_index = end >> PAGE_SHIFT;
 	struct page *page;
 
-	while (index <= end_index) {
+	/* Don't miss unaligned end */
+	if (!IS_ALIGNED(end, PAGE_SIZE))
+		end_index++;
+
+	while (index < end_index) {
 		page = find_get_page(inode->i_mapping, index);
 		ASSERT(page); /* Pages should be in the extent_io_tree */
 		set_page_writeback(page);
-- 
2.14.2
[PATCH 2/4] Btrfs: clear_dirty only on pages only in compression range
We need to call extent_range_clear_dirty_for_io() on the compression range to prevent an application from changing page content while the pages are being compressed. But "(end - start)" can be much (up to 1024 times) bigger than the compression range (BTRFS_MAX_UNCOMPRESSED), so optimize that by calculating the compression range for that loop iteration, and flip bits only on that range.

v1 -> v2:
- Make it more obvious and safer
v2 -> v3:
- Rebased on: Btrfs: compress_file_range() remove dead variable num_bytes
- Update change log
- Add comments

Signed-off-by: Timofey Titovets
---
 fs/btrfs/inode.c | 27 ++-
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 237df8fdf7b8..b6e81bd650ea 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -460,6 +460,7 @@ static noinline void compress_file_range(struct inode *inode,
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	u64 blocksize = fs_info->sectorsize;
 	u64 actual_end;
+	u64 current_end;
 	u64 isize = i_size_read(inode);
 	int ret = 0;
 	struct page **pages = NULL;
@@ -505,6 +506,21 @@ static noinline void compress_file_range(struct inode *inode,
 	    (start > 0 || end + 1 < BTRFS_I(inode)->disk_i_size))
 		goto cleanup_and_bail_uncompressed;
 
+	/*
+	 * We need to call extent_range_clear_dirty_for_io()
+	 * on compression range to prevent application from changing
+	 * page content, while pages compressing.
+	 *
+	 * but (end - start) can be much (up to 1024 times) bigger
+	 * then compression range, so optimize that
+	 * by calculating compression range for
+	 * that iteration, and flip bits only on that range
+	 */
+	if (end - start > BTRFS_MAX_UNCOMPRESSED)
+		current_end = start + BTRFS_MAX_UNCOMPRESSED;
+	else
+		current_end = end;
+
 	total_compressed = min_t(unsigned long, total_compressed,
 			BTRFS_MAX_UNCOMPRESSED);
 	total_in = 0;
@@ -515,7 +531,7 @@ static noinline void compress_file_range(struct inode *inode,
 	 * inode has not been flagged as nocompress. This flag can
 	 * change at any time if we discover bad compression ratios.
 	 */
-	if (inode_need_compress(inode, start, end)) {
+	if (inode_need_compress(inode, start, current_end)) {
 		WARN_ON(pages);
 		pages = kcalloc(nr_pages, sizeof(struct page *), GFP_NOFS);
 		if (!pages) {
@@ -530,14 +546,15 @@ static noinline void compress_file_range(struct inode *inode,
 
 		/*
 		 * we need to call clear_page_dirty_for_io on each
-		 * page in the range. Otherwise applications with the file
-		 * mmap'd can wander in and change the page contents while
+		 * page in compression the range.
+		 * Otherwise applications with the file mmap'd
+		 * can wander in and change the page contents while
 		 * we are compressing them.
 		 *
 		 * If the compression fails for any reason, we set the pages
 		 * dirty again later on.
 		 */
-		extent_range_clear_dirty_for_io(inode, start, end);
+		extent_range_clear_dirty_for_io(inode, start, current_end);
 		redirty = 1;
 
 		/* Compression level is applied here and only here */
@@ -678,7 +695,7 @@ static noinline void compress_file_range(struct inode *inode,
 
 		/* unlocked later on in the async handlers */
 		if (redirty)
-			extent_range_redirty_for_io(inode, start, end);
+			extent_range_redirty_for_io(inode, start, current_end);
 		add_async_extent(async_cow, start, end - start + 1, 0, NULL, 0,
 				 BTRFS_COMPRESS_NONE);
 		*num_added += 1;
-- 
2.14.2
[PATCH 0/4] Just bunch of btrfs patches
Some patches have reviews, some not; all are compile tested and hand tested (i.e. booted into a patched system and ran some small tests).

All are based on kdave's for-next branch.

Patches:
1. Just remove the useless u64 num_bytes from compress_file_range(). No functional changes.
2. To make compression on mmap'd files safe, while the compression logic works we switch the page dirty bit on the whole input range, but the input range can be much bigger than 128KiB. So try to optimize that by only switching bits on the current compression range.
3. The functions extent_range_clear_dirty_for_io(), extent_range_redirty_for_io() and btrfs_set_range_writeback() are used to switch some bits on pages, but use a non-obvious while (index <= end_index) to cover an end that is unaligned to pages. (I don't think it is non-obvious for me only, as on IRC no one could help me understand it until I found the answer.) So I change the handling of the unaligned end to a more obvious way.
4. btrfs_dedupe_file_range() on a range bigger than 16MiB, instead of returning an error, silently sets it to 16MiB. So just add a loop over the input range, to get bigger ranges working.

P.S. Maybe it makes sense to change the loop iterator to some lower value if one of the deduped files is compressed?

Thanks.

Timofey Titovets (4):
  Btrfs: compress_file_range() remove dead variable num_bytes
  Btrfs: clear_dirty only on pages in compression range
  Btrfs: handle unaligned tail of data ranges more efficient
  Btrfs: btrfs_dedupe_file_range() ioctl, remove 16MiB restriction

 fs/btrfs/extent_io.c | 12 ++--
 fs/btrfs/inode.c | 43 ++-
 fs/btrfs/ioctl.c | 22 ++
 3 files changed, 58 insertions(+), 19 deletions(-)

-- 
2.14.2
Kickstarting snapshot-aware defrag?
Hi,

It seems to me that the proposal[1] for a snapshot-aware defrag has long been abandoned. Since many people badly need this feature, I thought about how to possibly speed up the achievement of this goal.

I know of several bounty-based kickstarting platforms; among them the best ones are probably bountysource.com[2] and freedomsponsors.org[3]. With both platforms everyone interested can place a bounty on the issue, and if/when someone is interested in implementing it, they will get the bounty.

I created an issue on both of them just to show how the platform would handle it. Since btrfs is a small community, before actually placing bounties and sponsoring it I would like to know if there is someone against this development model, or someone interested in implementing a feature because of a bounty.

Bests,
Niccolò

[1] https://www.spinics.net/lists/linux-btrfs/msg34539.html
[2] https://www.bountysource.com/issues/50004702-feature-request-snapshot-aware-defrag
[3] https://freedomsponsors.org/issue/817/feature-request-snapshot-aware-defrag?alert=KICKSTART
Something like ZFS Channel Programs for BTRFS & probably XFS or even VFS?
[repost. I didn´t notice autocompletion gave me wrong address for fsdevel, blacklisted now] Hello. What do you think of http://open-zfs.org/wiki/Projects/ZFS_Channel_Programs ? There are quite some BTRFS maintenance programs like the deduplication stuff. Also regular scrubs… and in certain circumstances probably balances can make sense. In addition to this XFS got scrub functionality as well. Now putting the foundation for such a functionality in the kernel I think would only be reasonable if it cannot be done purely within user space, so I wonder about the safety from other concurrent ZFS modification and atomicity that are mentioned on the wiki page. The second set of slides, those the OpenZFS Developer Commit 2014, which are linked to on the wiki page explain this more. (I didn´t look the first ones, as I am no fan of slideshare.net and prefer a simple PDF to download and view locally anytime, not for privacy reasons alone, but also to avoid a using a crappy webpage over a wonderfully functional PDF viewer fat client like Okular) Also I wonder about putting a lua interpreter into the kernel, but it seems at least NetBSD developers added one to their kernel with version 7.0¹. I also ask this cause I wondered about a kind of fsmaintd or volmaintd for quite a while, and thought… it would be nice to do this in a generic way, as BTRFS is not the only filesystem which supports maintenance operations. However if it can all just nicely be done in userspace, I am all for it. [1] http://www.netbsd.org/releases/formal-7/NetBSD-7.0.html (tons of presentation PDFs on their site as well) Thanks, -- Martin -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Lost about 3TB
On Tue, 3 Oct 2017 10:54:05 + Hugo Mills wrote: >There are other possibilities for missing space, but let's cover > the obvious ones first. One more obvious thing would be files that are deleted, but still kept open by some app (possibly even from network, via NFS or SMB!). @Frederic, did you try rebooting the system? -- With respect, Roman -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Lost about 3TB
2017-10-03 13:54 GMT+03:00 Hugo Mills : > On Tue, Oct 03, 2017 at 12:44:29PM +0200, btrfs.fr...@xoxy.net wrote: >> Hi, >> >> I can't figure out were 3TB on a 36 TB BTRFS volume (on LVM) are gone ! >> >> I know BTRFS can be tricky when speaking about space usage when using many >> physical drives in a RAID setup, but my conf is a very simple BTRFS volume >> without RAID(single Data type) using the whole disk (perhaps did I do >> something wrong with the LVM setup ?). >> >> My BTRFS volume is mounted on /RAID01/. >> >> There's only one folder in /RAID01/ shared with Samba, Windows also see a >> total of 28 TB used. >> >> It only contains 443 files (big backup files created by Veeam), most of the >> file size is greater than 1GB and be be up to 5TB. >> >> ##> du -hs /RAID01/ >> 28T /RAID01/ >> >> If I sum up the result of : ##> find . -printf '%s\n' >> I also find 28TB. >> >> I extracted btrfs binary from rpm version v4.9.1 and used ##> btrfs fi du >> on each file and the result is 28TB. > >The conclusion here is that there are things that aren't being > found by these processes. This is usually in the form of dot-files > (but I think you've covered that case in what you did above) or > snapshots/subvolumes outside the subvol you've mounted. > >What does "btrfs sub list -a /RAID01/" say? >Also "grep /RAID01/ /proc/self/mountinfo"? > >There are other possibilities for missing space, but let's cover > the obvious ones first. > >Hugo. 
> >> OS : CentOS Linux release 7.3.1611 (Core) >> btrfs-progs v4.4.1 >> >> >> ##> ssm list >> >> - >> DeviceFree Used Total Pool Mount point >> - >> /dev/sda 36.39 TB PARTITIONED >> /dev/sda1 200.00 MB /boot/efi >> /dev/sda2 1.00 GB /boot >> /dev/sda3 0.00 KB 36.32 TB 36.32 TB lvm_pool >> /dev/sda4 0.00 KB 54.00 GB 54.00 GB cl_xxx-xxxamrepo-01 >> - >> --- >> PoolType Devices Free Used Total >> --- >> cl_xxx-xxxamrepo-01 lvm10.00 KB 54.00 GB 54.00 GB >> lvm_poollvm10.00 KB 36.32 TB 36.32 TB >> btrfs_lvm_pool-lvol001 btrfs 14.84 TB 36.32 TB 36.32 TB >> --- >> - >> Volume PoolVolume size FS >> FS size Free TypeMount point >> - >> /dev/cl_xxx-xxxamrepo-01/root cl_xxx-xxxamrepo-0150.00 GB xfs >> 49.97 GB 48.50 GB linear / >> /dev/cl_xxx-xxxamrepo-01/swap cl_xxx-xxxamrepo-01 4.00 GB >> linear >> /dev/lvm_pool/lvol001 lvm_pool 36.32 TB >> linear /RAID01 >> btrfs_lvm_pool-lvol001 btrfs_lvm_pool-lvol001 36.32 TB btrfs >> 36.32 TB4.84 TB btrfs /RAID01 >> /dev/sda1200.00 MB vfat >> part/boot/efi >> /dev/sda2 1.00 GB xfs >> 1015.00 MB 882.54 MB part/boot >> - >> >> >> ##> btrfs fi sh >> >> Label: none uuid: df7ce232-056a-4c27-bde4-6f785d5d9f68 >> Total devices 1 FS bytes used 31.48TiB >> devid1 size 36.32TiB used 31.66TiB path >> /dev/mapper/lvm_pool-lvol001 >> >> >> >> ##> btrfs fi df /RAID01/ >> >> Data, single: total=31.58TiB, used=31.44TiB >> System, DUP: total=8.00MiB, used=3.67MiB >> Metadata, DUP: total=38.00GiB, used=35.37GiB >> GlobalReserve, single: total=512.00MiB, used=0.00B >> >> >> >> I tried to repair it : >> >> >> ##> btrfs check --repair -p /dev/mapper/lvm_pool-lvol001 >> >> enabling repair mode >> Checking filesystem on /dev/mapper/lvm_pool-lvol001 >> UUID: df7ce232-056a-4c27-bde4-6f785d5d9f68 >> checking extents >> Fixed 0 roots. 
>> cache and super generation don't match, space cache will be invalidated >> checking fs roots >> checking csums >> checking root refs >> found 34600611349019 bytes used err is 0 >> total csum bytes: 33752513152 >> total tree bytes: 38037848064 >> total fs tree bytes: 583942144 >> total extent tree bytes: 653754368 >> btree
Re: Lost about 3TB
On Tue, Oct 03, 2017 at 12:44:29PM +0200, btrfs.fr...@xoxy.net wrote: > Hi, > > I can't figure out were 3TB on a 36 TB BTRFS volume (on LVM) are gone ! > > I know BTRFS can be tricky when speaking about space usage when using many > physical drives in a RAID setup, but my conf is a very simple BTRFS volume > without RAID(single Data type) using the whole disk (perhaps did I do > something wrong with the LVM setup ?). > > My BTRFS volume is mounted on /RAID01/. > > There's only one folder in /RAID01/ shared with Samba, Windows also see a > total of 28 TB used. > > It only contains 443 files (big backup files created by Veeam), most of the > file size is greater than 1GB and be be up to 5TB. > > ##> du -hs /RAID01/ > 28T /RAID01/ > > If I sum up the result of : ##> find . -printf '%s\n' > I also find 28TB. > > I extracted btrfs binary from rpm version v4.9.1 and used ##> btrfs fi du > on each file and the result is 28TB. The conclusion here is that there are things that aren't being found by these processes. This is usually in the form of dot-files (but I think you've covered that case in what you did above) or snapshots/subvolumes outside the subvol you've mounted. What does "btrfs sub list -a /RAID01/" say? Also "grep /RAID01/ /proc/self/mountinfo"? There are other possibilities for missing space, but let's cover the obvious ones first. Hugo. 
> OS : CentOS Linux release 7.3.1611 (Core) > btrfs-progs v4.4.1 > > > ##> ssm list > > - > DeviceFree Used Total Pool Mount point > - > /dev/sda 36.39 TB PARTITIONED > /dev/sda1 200.00 MB /boot/efi > /dev/sda2 1.00 GB /boot > /dev/sda3 0.00 KB 36.32 TB 36.32 TB lvm_pool > /dev/sda4 0.00 KB 54.00 GB 54.00 GB cl_xxx-xxxamrepo-01 > - > --- > PoolType Devices Free Used Total > --- > cl_xxx-xxxamrepo-01 lvm10.00 KB 54.00 GB 54.00 GB > lvm_poollvm10.00 KB 36.32 TB 36.32 TB > btrfs_lvm_pool-lvol001 btrfs 14.84 TB 36.32 TB 36.32 TB > --- > - > Volume PoolVolume size FS > FS size Free TypeMount point > - > /dev/cl_xxx-xxxamrepo-01/root cl_xxx-xxxamrepo-0150.00 GB xfs > 49.97 GB 48.50 GB linear / > /dev/cl_xxx-xxxamrepo-01/swap cl_xxx-xxxamrepo-01 4.00 GB > linear > /dev/lvm_pool/lvol001 lvm_pool 36.32 TB > linear /RAID01 > btrfs_lvm_pool-lvol001 btrfs_lvm_pool-lvol001 36.32 TB btrfs > 36.32 TB4.84 TB btrfs /RAID01 > /dev/sda1200.00 MB vfat > part/boot/efi > /dev/sda2 1.00 GB xfs > 1015.00 MB 882.54 MB part/boot > - > > > ##> btrfs fi sh > > Label: none uuid: df7ce232-056a-4c27-bde4-6f785d5d9f68 > Total devices 1 FS bytes used 31.48TiB > devid1 size 36.32TiB used 31.66TiB path > /dev/mapper/lvm_pool-lvol001 > > > > ##> btrfs fi df /RAID01/ > > Data, single: total=31.58TiB, used=31.44TiB > System, DUP: total=8.00MiB, used=3.67MiB > Metadata, DUP: total=38.00GiB, used=35.37GiB > GlobalReserve, single: total=512.00MiB, used=0.00B > > > > I tried to repair it : > > > ##> btrfs check --repair -p /dev/mapper/lvm_pool-lvol001 > > enabling repair mode > Checking filesystem on /dev/mapper/lvm_pool-lvol001 > UUID: df7ce232-056a-4c27-bde4-6f785d5d9f68 > checking extents > Fixed 0 roots. 
> cache and super generation don't match, space cache will be invalidated > checking fs roots > checking csums > checking root refs > found 34600611349019 bytes used err is 0 > total csum bytes: 33752513152 > total tree bytes: 38037848064 > total fs tree bytes: 583942144 > total extent tree bytes: 653754368 > btree space waste bytes: 2197658704 > file data blocks allocated: 183716661284864 ?? what's this ?? > referenced 30095956975616 = 27.3 TB !! > > >
Lost about 3TB
Hi,

I can't figure out where 3TB on a 36 TB BTRFS volume (on LVM) are gone!

I know BTRFS can be tricky when speaking about space usage when using many physical drives in a RAID setup, but my conf is a very simple BTRFS volume without RAID (single Data type) using the whole disk (perhaps I did something wrong with the LVM setup?).

My BTRFS volume is mounted on /RAID01/.

There's only one folder in /RAID01/ shared with Samba; Windows also sees a total of 28 TB used.

It only contains 443 files (big backup files created by Veeam); most of the files are greater than 1GB and can be up to 5TB.

##> du -hs /RAID01/
28T /RAID01/

If I sum up the result of: ##> find . -printf '%s\n'
I also find 28TB.

I extracted the btrfs binary from the rpm version v4.9.1 and used ##> btrfs fi du
on each file and the result is 28TB.

OS: CentOS Linux release 7.3.1611 (Core)
btrfs-progs v4.4.1

##> ssm list

-
Device  Free  Used  Total  Pool  Mount point
-
/dev/sda  36.39 TB  PARTITIONED
/dev/sda1  200.00 MB  /boot/efi
/dev/sda2  1.00 GB  /boot
/dev/sda3  0.00 KB  36.32 TB  36.32 TB  lvm_pool
/dev/sda4  0.00 KB  54.00 GB  54.00 GB  cl_xxx-xxxamrepo-01
-
---
Pool  Type  Devices  Free  Used  Total
---
cl_xxx-xxxamrepo-01  lvm  1  0.00 KB  54.00 GB  54.00 GB
lvm_pool  lvm  1  0.00 KB  36.32 TB  36.32 TB
btrfs_lvm_pool-lvol001  btrfs  1  4.84 TB  36.32 TB  36.32 TB
---
-
Volume  Pool  Volume size  FS  FS size  Free  Type  Mount point
-
/dev/cl_xxx-xxxamrepo-01/root  cl_xxx-xxxamrepo-01  50.00 GB  xfs  49.97 GB  48.50 GB  linear  /
/dev/cl_xxx-xxxamrepo-01/swap  cl_xxx-xxxamrepo-01  4.00 GB  linear
/dev/lvm_pool/lvol001  lvm_pool  36.32 TB  linear  /RAID01
btrfs_lvm_pool-lvol001  btrfs_lvm_pool-lvol001  36.32 TB  btrfs  36.32 TB  4.84 TB  btrfs  /RAID01
/dev/sda1  200.00 MB  vfat  part  /boot/efi
/dev/sda2  1.00 GB  xfs  1015.00 MB  882.54 MB  part  /boot
-

##> btrfs fi sh

Label: none  uuid: df7ce232-056a-4c27-bde4-6f785d5d9f68
        Total devices 1 FS bytes used 31.48TiB
        devid 1 size 36.32TiB used 31.66TiB path /dev/mapper/lvm_pool-lvol001

##> btrfs fi df /RAID01/

Data, single: total=31.58TiB, used=31.44TiB
System, DUP: total=8.00MiB, used=3.67MiB
Metadata, DUP: total=38.00GiB, used=35.37GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

I tried to repair it:

##> btrfs check --repair -p /dev/mapper/lvm_pool-lvol001

enabling repair mode
Checking filesystem on /dev/mapper/lvm_pool-lvol001
UUID: df7ce232-056a-4c27-bde4-6f785d5d9f68
checking extents
Fixed 0 roots.
cache and super generation don't match, space cache will be invalidated
checking fs roots
checking csums
checking root refs
found 34600611349019 bytes used err is 0
total csum bytes: 33752513152
total tree bytes: 38037848064
total fs tree bytes: 583942144
total extent tree bytes: 653754368
btree space waste bytes: 2197658704
file data blocks allocated: 183716661284864  ?? what's this ??
referenced 30095956975616 = 27.3 TB !!

Tried the "new usage" display but the problem is the same: 31 TB used but total file size is 28TB

Overall:
    Device size:         36.32TiB
    Device allocated:    31.65TiB
    Device unallocated:  4.67TiB
    Device missing:      0.00B
    Used:                31.52TiB
    Free (estimated):    4.80TiB (min: 2.46TiB)
    Data ratio:          1.00
    Metadata ratio:      2.00
    Global reserve:      512.00MiB (used: 0.00B)

Data,single: Size:31.58TiB, Used:31.45TiB
    /dev/mapper/lvm_pool-lvol001  31.58TiB

Metadata,DUP: Size:38.00GiB, Used:35.37GiB
    /dev/mapper/lvm_pool-lvol001  76.00GiB

System,DUP: Size:8.00MiB, U
[PATCH] btrfs-progs: doc: update help/document of btrfs device remove
This patch updates help/document of "btrfs device remove" in two points: 1. Add explanation of 'missing' for 'device remove'. This is only written in wikipage currently. (https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices) 2. Add example of device removal in the man document. This is because that explanation of "remove" says "See the example section below", but there is no example of removal currently. Signed-off-by: Tomohiro Misono --- Documentation/btrfs-device.asciidoc | 19 +++ cmds-device.c | 10 +- 2 files changed, 28 insertions(+), 1 deletion(-) diff --git a/Documentation/btrfs-device.asciidoc b/Documentation/btrfs-device.asciidoc index 88822ec..dc523a9 100644 --- a/Documentation/btrfs-device.asciidoc +++ b/Documentation/btrfs-device.asciidoc @@ -75,6 +75,10 @@ The operation can take long as it needs to move all data from the device. It is possible to delete the device that was used to mount the filesystem. The device entry in mount table will be replaced by another device name with the lowest device id. ++ +If device is mounted as degraded mode (-o degraded), special term "missing" +can be used for . In that case, the first device that is described by +the filesystem metadata, but not presented at the mount time will be removed. *delete* | [|...] :: Alias of remove kept for backward compatibility @@ -206,6 +210,21 @@ data or the block groups occupy the whole first device. The device size of '/dev/sdb' as seen by the filesystem remains unchanged, but the logical space from 50-100GiB will be unused. + REMOVE DEVICE + +Device removal must satisfy the profile constraints, otherwise the command +fails. 
For example: + + $ btrfs device remove /dev/sda /mnt + $ ERROR: error removing device '/dev/sda': unable to go below two devices on raid1 + + +In order to remove a device, you need to convert profile in this case: + + $ btrfs balance start -mconvert=dup /mnt + $ btrfs balance start -dconvert=single /mnt + $ btrfs device remove /dev/sda /mnt + DEVICE STATS diff --git a/cmds-device.c b/cmds-device.c index 4337eb2..6cb53ff 100644 --- a/cmds-device.c +++ b/cmds-device.c @@ -224,9 +224,16 @@ static int _cmd_device_remove(int argc, char **argv, return !!ret; } +#define COMMON_USAGE_REMOVE_DELETE \ + "", \ + "If 'missing' is specified for , the first device that is", \ + "described by the filesystem metadata, but not presented at the", \ + "mount time will be removed." + static const char * const cmd_device_remove_usage[] = { "btrfs device remove | [|...] ", "Remove a device from a filesystem", + COMMON_USAGE_REMOVE_DELETE, NULL }; @@ -237,7 +244,8 @@ static int cmd_device_remove(int argc, char **argv) static const char * const cmd_device_delete_usage[] = { "btrfs device delete | [|...] ", - "Remove a device from a filesystem", + "Remove a device from a filesystem (alias of \"btrfs device remove\")", + COMMON_USAGE_REMOVE_DELETE, NULL }; -- 2.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html