Re: Can't mount array with super_total_bytes mismatch with fs_devices total_rw_bytes

2017-10-03 Thread Qu Wenruo



On 2017-10-04 12:00, Asif Youssuff wrote:

Thanks for the advice.

On 10/03/2017 09:38 PM, Qu Wenruo wrote:




[210017.281912] BTRFS info (device sdb): disk space caching is enabled
[210017.281915] BTRFS info (device sdb): has skinny extents
[210017.402084] BTRFS error (device sdb): super_total_bytes 92017859088384
mismatch with fs_devices total_rw_bytes 92017859094528


One of your device sizes is not aligned to 4K.
Which is fine in itself, but the recently enhanced validation checker does
not allow it.
(That should be considered a regression, and there are some other WARN_ONs
related to it.)



[210017.402126] BTRFS error (device sdb): failed to read chunk tree: -22
[210017.461473] BTRFS error (device sdb): open_ctree failed

I've tried a few steps --

btrfs-chunk-recover, super-recover and I have run a btrfs check 
--repair on two of the disks in the array (this takes a very long 
time, so I'm hoping I don't have to run this on all of the disks).


I ran into this problem once before, and I'm not sure 
how I recovered from it; I may have simply rolled back the booted 
kernel to escape the extra checks around this mismatch.


I'm at a loss for ideas and am running a btrfs-image so I can also 
report an issue -- I'm not sure whether 'btrfs-image -c9 -t4 /dev/sdo 
btrfs.image' is the right command to run if it is a multi-device array.


Any ideas would be helpful, and I am happy to provide further 
information.


root@ubuntu-server:~#   uname -a
Linux ubuntu-server 4.14.0-041400rc2-generic #201709242031 SMP Mon 
Sep 25 00:33:13 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux


You can roll back to an earlier kernel to mount the fs.

Then manually find and resize the device with the unaligned size:

# btrfs fi show -b
And check each device for its size:
Label: none  uuid: 839ddcfa-5701-4437-aff3-bcb2a26ae6dd
 Total devices 1 FS bytes used 397312
 devid    1 size <<10737418240>> used 2172649472 path 
/dev/mapper/data-btrfs


If it's not aligned, round it down to 4K, and resize it using the devid:

# btrfs fi resize <devid>:<new size> <mount point>

All devices must be rounded. The command should finish almost in no
time.
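
For example, the round-down is just truncation to a 4096-byte boundary;
a minimal Python sketch (the unaligned size here is made up for
illustration):

size = 6001175126017            # hypothetical unaligned device size in bytes
aligned = size - (size % 4096)  # round down to the 4K boundary
print(aligned)                  # 6001175126016, safe to pass to fi resize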


I was able to mount the fs using kernel version 4.4 and rounded each
device down (took the size in bytes and rounded down to the nearest
smaller multiple of 4096).


This is what btrfs fi show looks like now:

asif@ubuntu-server:~$ sudo btrfs fi show --raw
[sudo] password for asif:
Label: none  uuid: 48ed8a66-731d-499b-829e-dd07dd7260cc
 Total devices 13 FS bytes used 44783241728000
 devid    4 size 6001141571584 used 5955045097472 path /dev/sdh
 devid    5 size 6001141571584 used 5955128983552 path /dev/sdg
 devid    7 size 6001141571584 used 5955709960192 path /dev/sdk
 devid    9 size 6001175126016 used 5955716186112 path /dev/sde
 devid   10 size 6001141571584 used 5955145760768 path /dev/sdc
 devid   11 size 8001563222016 used 7955416416256 path /dev/sdl
 devid   12 size 6001175126016 used 5956054286336 path /dev/sdf
 devid   14 size 8001563222016 used 7956009123840 path /dev/sdb
 devid   15 size 8001563222016 used 7956373831680 path /dev/sdj
 devid   17 size 8001563222016 used 6341094866944 path /dev/sdd
 devid   18 size 8001563222016 used 7955827064832 path /dev/sdn
 devid   20 size 8001563222016 used 7955378339840 path /dev/sdi
 devid   21 size 8001563222016 used 7955386728448 path /dev/sdo



Then check if the latest kernel can mount it.


Unfortunately, the latest kernel still cannot mount it, showing the 
same errors as before.


[  139.852862] BTRFS error (device sdj): super_total_bytes 
92017859086336 mismatch with fs_devices total_rw_bytes 92017859092480

[  139.852894] BTRFS error (device sdj): failed to read chunk tree: -22
[  139.916645] BTRFS error (device sdj): open_ctree failed


Then the problem is that btrfs doesn't update its super_total_bytes
correctly.

From what I can see, grow/shrink only updates the delta rather than
recalculating the total.


Before we have a good way to fix it in the kernel, the only way is to
manually modify the superblock so that it passes the kernel's validation
checker.
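
For example, summing the device sizes from your `fi show --raw` output
above reproduces the kernel's numbers exactly (a quick Python check,
sizes copied verbatim from that listing):

sizes = [6001141571584, 6001141571584, 6001141571584, 6001175126016,
         6001141571584, 8001563222016, 6001175126016, 8001563222016,
         8001563222016, 8001563222016, 8001563222016, 8001563222016,
         8001563222016]
total_rw = sum(sizes)
print(total_rw)                   # 92017859092480 = fs_devices total_rw_bytes
print(total_rw - 92017859086336)  # 6144 bytes missing from super_total_bytes

So every device is now aligned, but super_total_bytes was never
recalculated after the resizes.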


Thanks,
Qu




I think the fix can be made part of "btrfs check".
(Although it should really be handled well by the kernel.)

Thanks,
Qu


Hope there are some other ideas (or please correct me if I have done 
something wrong!).


Thanks,
Asif





root@ubuntu-server:~#   btrfs --version
btrfs-progs v4.13.1

root@ubuntu-server:~#   btrfs fi show
Label: none  uuid: 48ed8a66-731d-499b-829e-dd07dd7260cc
Total devices 13 FS bytes used 40.73TiB
devid    4 size 5.46TiB used 5.42TiB path /dev/sdo
devid    5 size 5.46TiB used 5.42TiB path /dev/sdn
devid    7 size 5.46TiB used 5.42TiB path /dev/sdc
devid    9 size 5.46TiB used 5.42TiB path /dev/sdk
devid   10 size 5.46TiB used 5.42TiB path /dev/sdj
devid   11 size 7.28TiB used 7.24TiB path /dev/sdd
devid   12 size 5.46TiB used 5.42TiB path /dev/sdm
devid   14 size 7.28TiB used 7.24TiB path /dev/sdh
devid   15 size 7.28TiB used 7.24TiB path /dev/sdb
devid   17 siz

Re: Btrfs "failed to repair damaged filesystem" - RAID10 going RO when any write attempts are made

2017-10-03 Thread Timothy White
Any suggestions on this? Or do I just blow it away and hope the bug is
fixed in a newer version?

Regards

Tim

On Mon, Oct 2, 2017 at 8:44 PM, Timothy White  wrote:
> I have a BTRFS RAID 10 filesystem that was crashing and going into RO
> mode. I did a kernel upgrade and upgraded the btrfs tools to the latest. A
> scrub was going OK-ish. btrfs check showed a number of messages such
> as:
>
> Backref 16562625503232 root 14628 owner 3609 offset 23793664 num_refs
> 0 not found in extent tree
> Incorrect local backref count on 16562625503232 root 14628 owner 3609
> offset 23793664 found 1 wanted 0 back 0x5639c37689d0
> backpointer mismatch on [16562625503232 2703360]
>
> Root 14628 was a subvolume root ID (for docker); given that I didn't
> need any of that data, I removed all the subvolumes under the docker
> subvolume, and then the docker subvolume itself. This still showed
> errors, so I tried a btrfs check with repair, which eventually gave me
> the following.
>
> Backref 16563772112896 root 14628 owner 3608 offset 0 num_refs 0 not
> found in extent tree
> Incorrect local backref count on 16563772112896 root 14628 owner 3608
> offset 0 found 1 wanted 0 back 0x55f3403f40b0
> Backref disk bytenr does not match extent record,
> bytenr=16563772112896, ref bytenr=16563813335040
> Backref bytes do not match extent backref, bytenr=16563772112896, ref
> bytes=134217728, backref bytes=133079040
> Backref 16563772112896 root 14628 owner 3607 offset 0 num_refs 0 not
> found in extent tree
> Incorrect local backref count on 16563772112896 root 14628 owner 3607
> offset 0 found 1 wanted 0 back 0x55f3470cfed0
> Backref bytes do not match extent backref, bytenr=16563772112896, ref
> bytes=134217728, backref bytes=41222144
> backpointer mismatch on [16563772112896 134217728]
> attempting to repair backref discrepency for bytenr 16563772112896
> Ref is past the entry end, please take a btrfs-image of this file
> system and send it to a btrfs developer, ref 16563813335040
> failed to repair damaged filesystem, aborting
>
> I've taken a btrfs-image; however, it's 12GB. Not sure if the
> developers want that, but I do have it.
>
> Filesystem still crashes and goes read only if I try and make changes
> (even deletes). The latest dmesg that includes that crash is at
> https://drive.google.com/open?id=0B5bmQmu6UugIRFM0RUxwWFdqOGc. I don't
> have the earlier ones.
>
> https://drive.google.com/open?id=0B5bmQmu6UugIYjZkYnA4ZFFpdlE is the
> output from running btrfs check with repair, followed by a second run
> with repair to see if it got anything different.
>
> At this stage, I expect to just blow away the filesystem and restore
> from backups. However it would be nice to fix whatever the issue is.
> Smartctl shows no underlying errors.
>
> What else do the devs want from me before I blow this away? (Or is it
> fixable with something I've missed, as that would save me many hours
> of restoration.)
>
> The 12GB btrfs-image can be uploaded to Google Drive (or FTP) if needed.
>
> Thanks
>
> Tim
>
> $ lsb_release -a
> No LSB modules are available.
> Distributor ID: Debian
> Description: Debian GNU/Linux 9.1 (stretch)
> Release: 9.1
> Codename: stretch
>
> $ uname -a
> Linux bruce 4.12.0-0.bpo.1-amd64 #1 SMP Debian 4.12.6-1~bpo9+1
> (2017-08-27) x86_64 GNU/Linux
>
> $ btrfs --version
> btrfs-progs v4.9.1
>
> $ btrfs fi show
> Label: 'Butter1'  uuid: b8d081ac-0271-4481-9a58-c113c921bf49
> Total devices 4 FS bytes used 5.19TiB
> devid1 size 3.64TiB used 2.60TiB path /dev/sde
> devid2 size 3.64TiB used 2.60TiB path /dev/sdf
> devid3 size 3.64TiB used 2.60TiB path /dev/sdd
> devid4 size 3.64TiB used 2.60TiB path /dev/sdc
>
> $ btrfs fi df /mnt/Butter1
> Data, RAID10: total=5.19TiB, used=5.17TiB
> System, RAID10: total=128.00MiB, used=560.00KiB
> Metadata, RAID10: total=12.00GiB, used=10.34GiB
> Metadata, single: total=16.00MiB, used=0.00B
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> $ btrfs subvolume list /mnt/Butter1
> ID 258 gen 250368 top level 5 path data1
> ID 259 gen 250516 top level 5 path data2
> ID 3648 gen 250476 top level 5 path Photos
> ID 4597 gen 203041 top level 5 path Snapshots/Photos/20160304
> ID 4608 gen 203041 top level 5 path Snapshots/Photos/20160328
> ID 4628 gen 203041 top level 5 path Snapshots/Photos/20160421
> ID 4654 gen 250368 top level 5 path imap
> ID 4656 gen 203041 top level 5 path Snapshots/Photos/20160523
> ID 4893 gen 203041 top level 5 path Snapshots/Photos/20160702
> ID 4946 gen 203041 top level 5 path Snapshots/Photos/20160731
> ID 4947 gen 203041 top level 5 path Snapshots/Photos/20160813
> ID 4948 gen 203041 top level 5 path Snapshots/Photos/2016081301
> ID 4970 gen 203041 top level 5 path Snapshots/Photos/20160919
> ID 5038 gen 203041 top level 5 path Snapshots/Photos/20161229_0911
> ID 5063 gen 250515 top level 5 path mirror
> ID 13170 gen 250428 top level 5 path BizBackups
> ID 13214 gen 250368 top level 5 path SaraLaptopBackup
> ID 13485 gen 250368 top level 5 path PCOMDisks
> ID 16176 g

Re: Can't mount array with super_total_bytes mismatch with fs_devices total_rw_bytes

2017-10-03 Thread Asif Youssuff

Thanks for the advice.

On 10/03/2017 09:38 PM, Qu Wenruo wrote:




[210017.281912] BTRFS info (device sdb): disk space caching is enabled
[210017.281915] BTRFS info (device sdb): has skinny extents
[210017.402084] BTRFS error (device sdb): super_total_bytes 92017859088384
mismatch with fs_devices total_rw_bytes 92017859094528


One of your device sizes is not aligned to 4K.
Which is fine in itself, but the recently enhanced validation checker does
not allow it.
(That should be considered a regression, and there are some other WARN_ONs
related to it.)



[210017.402126] BTRFS error (device sdb): failed to read chunk tree: -22
[210017.461473] BTRFS error (device sdb): open_ctree failed

I've tried a few steps --

btrfs-chunk-recover, super-recover and I have run a btrfs check 
--repair on two of the disks in the array (this takes a very long 
time, so I'm hoping I don't have to run this on all of the disks).


I ran into this problem once before, and I'm not sure 
how I recovered from it; I may have simply rolled back the booted 
kernel to escape the extra checks around this mismatch.


I'm at a loss for ideas and am running a btrfs-image so I can also 
report an issue -- I'm not sure whether 'btrfs-image -c9 -t4 /dev/sdo 
btrfs.image' is the right command to run if it is a multi-device array.


Any ideas would be helpful, and I am happy to provide further 
information.


root@ubuntu-server:~#   uname -a
Linux ubuntu-server 4.14.0-041400rc2-generic #201709242031 SMP Mon Sep 
25 00:33:13 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux


You can roll back to an earlier kernel to mount the fs.

Then manually find and resize the device with the unaligned size:

# btrfs fi show -b
And check each device for its size:
Label: none  uuid: 839ddcfa-5701-4437-aff3-bcb2a26ae6dd
 Total devices 1 FS bytes used 397312
 devid    1 size <<10737418240>> used 2172649472 path 
/dev/mapper/data-btrfs


If it's not aligned, round it down to 4K, and resize it using the devid:

# btrfs fi resize <devid>:<new size> <mount point>

All devices must be rounded. The command should finish almost in no
time.


I was able to mount the fs using kernel version 4.4 and rounded each
device down (took the size in bytes and rounded down to the nearest
smaller multiple of 4096).


This is what btrfs fi show looks like now:

asif@ubuntu-server:~$ sudo btrfs fi show --raw
[sudo] password for asif:
Label: none  uuid: 48ed8a66-731d-499b-829e-dd07dd7260cc
Total devices 13 FS bytes used 44783241728000
devid4 size 6001141571584 used 5955045097472 path /dev/sdh
devid5 size 6001141571584 used 5955128983552 path /dev/sdg
devid7 size 6001141571584 used 5955709960192 path /dev/sdk
devid9 size 6001175126016 used 5955716186112 path /dev/sde
devid   10 size 6001141571584 used 5955145760768 path /dev/sdc
devid   11 size 8001563222016 used 7955416416256 path /dev/sdl
devid   12 size 6001175126016 used 5956054286336 path /dev/sdf
devid   14 size 8001563222016 used 7956009123840 path /dev/sdb
devid   15 size 8001563222016 used 7956373831680 path /dev/sdj
devid   17 size 8001563222016 used 6341094866944 path /dev/sdd
devid   18 size 8001563222016 used 7955827064832 path /dev/sdn
devid   20 size 8001563222016 used 7955378339840 path /dev/sdi
devid   21 size 8001563222016 used 7955386728448 path /dev/sdo



Then check if the latest kernel can mount it.


Unfortunately, the latest kernel still cannot mount it, showing the 
same errors as before.


[  139.852862] BTRFS error (device sdj): super_total_bytes 
92017859086336 mismatch with fs_devices total_rw_bytes 92017859092480

[  139.852894] BTRFS error (device sdj): failed to read chunk tree: -22
[  139.916645] BTRFS error (device sdj): open_ctree failed



I think the fix can be made part of "btrfs check".
(Although it should really be handled well by the kernel.)

Thanks,
Qu


Hope there are some other ideas (or please correct me if I have done 
something wrong!).


Thanks,
Asif





root@ubuntu-server:~#   btrfs --version
btrfs-progs v4.13.1

root@ubuntu-server:~#   btrfs fi show
Label: none  uuid: 48ed8a66-731d-499b-829e-dd07dd7260cc
Total devices 13 FS bytes used 40.73TiB
devid    4 size 5.46TiB used 5.42TiB path /dev/sdo
devid    5 size 5.46TiB used 5.42TiB path /dev/sdn
devid    7 size 5.46TiB used 5.42TiB path /dev/sdc
devid    9 size 5.46TiB used 5.42TiB path /dev/sdk
devid   10 size 5.46TiB used 5.42TiB path /dev/sdj
devid   11 size 7.28TiB used 7.24TiB path /dev/sdd
devid   12 size 5.46TiB used 5.42TiB path /dev/sdm
devid   14 size 7.28TiB used 7.24TiB path /dev/sdh
devid   15 size 7.28TiB used 7.24TiB path /dev/sdb
devid   17 size 7.28TiB used 5.77TiB path /dev/sdl
devid   18 size 7.28TiB used 7.24TiB path /dev/sdf
devid   20 size 7.28TiB used 7.24TiB path /dev/sdi
devid   21 size 7.28TiB used 7.24TiB path /dev/sdg

Thanks,
Asif


[PATCH v2] Btrfs: fix overlap of fs_info->flags values

2017-10-03 Thread Tsutomu Itoh
Because the values of BTRFS_FS_EXCL_OP and BTRFS_FS_QUOTA_OVERRIDE overlap,
one of them should be changed.

First, BTRFS_FS_EXCL_OP was set to 14.

  commit 171938e52807 ("btrfs: track exclusive filesystem operation in flags")

Next, the value of BTRFS_FS_QUOTA_OVERRIDE was set to 14.

  commit f29efe292198 ("btrfs: add quota override flag to enable quota override 
for CAP_SYS_RESOURCE")

As a result, both flags used the value 14.
This problem is solved by defining the value of BTRFS_FS_QUOTA_OVERRIDE
as 16.

Fixes: f29efe292198 ("btrfs: add quota override flag to enable quota override 
for CAP_SYS_RESOURCE")
CC: sta...@vger.kernel.org # 4.13+
Signed-off-by: Tsutomu Itoh 
---
v2: changed the value of BTRFS_FS_QUOTA_OVERRIDE instead of BTRFS_FS_EXCL_OP
to 16.

 fs/btrfs/ctree.h | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 899ddaeeacec..d265ea7f763e 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -714,15 +714,14 @@ struct btrfs_delayed_root;
 #define BTRFS_FS_BTREE_ERR 11
 #define BTRFS_FS_LOG1_ERR  12
 #define BTRFS_FS_LOG2_ERR  13
-#define BTRFS_FS_QUOTA_OVERRIDE14
-/* Used to record internally whether fs has been frozen */
-#define BTRFS_FS_FROZEN15
-
 /*
  * Indicate that a whole-filesystem exclusive operation is running
  * (device replace, resize, device add/delete, balance)
  */
 #define BTRFS_FS_EXCL_OP   14
+/* Used to record internally whether fs has been frozen */
+#define BTRFS_FS_FROZEN15
+#define BTRFS_FS_QUOTA_OVERRIDE16
 
 struct btrfs_fs_info {
u8 fsid[BTRFS_FSID_SIZE];
-- 
2.13.2
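
To see why the overlap matters: these constants are bit numbers in the
single fs_info->flags bitmap, so two defines with the same value alias
the same bit. A minimal Python illustration of the aliasing (the names
mirror the kernel defines; this is not the kernel code itself):

BTRFS_FS_EXCL_OP        = 14
BTRFS_FS_QUOTA_OVERRIDE = 14   # the overlapping pre-patch value

flags = 0
flags |= 1 << BTRFS_FS_EXCL_OP   # set_bit: an exclusive op starts

# test_bit on the quota flag now reads back as set, even though
# quota override was never enabled
print(bool(flags & (1 << BTRFS_FS_QUOTA_OVERRIDE)))   # True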



Re: Can't mount array with super_total_bytes mismatch with fs_devices total_rw_bytes

2017-10-03 Thread Qu Wenruo



On 2017-10-04 07:32, Asif Youssuff wrote:

Hi,

My power went out at my home, and I'm now having trouble mounting my array.

I'm mounting with the 'recovery' option in fstab.

When mounting, dmesg output shows:

[210017.281912] BTRFS info (device sdb): disk space caching is enabled
[210017.281915] BTRFS info (device sdb): has skinny extents
[210017.402084] BTRFS error (device sdb): super_total_bytes 92017859088384
mismatch with fs_devices total_rw_bytes 92017859094528


One of your device sizes is not aligned to 4K.
Which is fine in itself, but the recently enhanced validation checker does
not allow it.
(That should be considered a regression, and there are some other WARN_ONs
related to it.)



[210017.402126] BTRFS error (device sdb): failed to read chunk tree: -22
[210017.461473] BTRFS error (device sdb): open_ctree failed

I've tried a few steps --

btrfs-chunk-recover, super-recover and I have run a btrfs check --repair 
on two of the disks in the array (this takes a very long time, so I'm 
hoping I don't have to run this on all of the disks).


I ran into this problem once before, and I'm not sure 
how I recovered from it; I may have simply rolled back the booted kernel 
to escape the extra checks around this mismatch.


I'm at a loss for ideas and am running a btrfs-image so I can also 
report an issue -- I'm not sure whether 'btrfs-image -c9 -t4 /dev/sdo 
btrfs.image' is the right command to run if it is a multi-device array.


Any ideas would be helpful, and I am happy to provide further information.

root@ubuntu-server:~#   uname -a
Linux ubuntu-server 4.14.0-041400rc2-generic #201709242031 SMP Mon Sep 
25 00:33:13 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux


You can roll back to an earlier kernel to mount the fs.

Then manually find and resize the device with the unaligned size:

# btrfs fi show -b
And check each device for its size:
Label: none  uuid: 839ddcfa-5701-4437-aff3-bcb2a26ae6dd
Total devices 1 FS bytes used 397312
devid1 size <<10737418240>> used 2172649472 path 
/dev/mapper/data-btrfs

If it's not aligned, round it down to 4K, and resize it using the devid:

# btrfs fi resize <devid>:<new size> <mount point>

All devices must be rounded. The command should finish almost in no time.

Then check if the latest kernel can mount it.

I think the fix can be made part of "btrfs check".
(Although it should really be handled well by the kernel.)

Thanks,
Qu



root@ubuntu-server:~#   btrfs --version
btrfs-progs v4.13.1

root@ubuntu-server:~#   btrfs fi show
Label: none  uuid: 48ed8a66-731d-499b-829e-dd07dd7260cc
Total devices 13 FS bytes used 40.73TiB
devid    4 size 5.46TiB used 5.42TiB path /dev/sdo
devid    5 size 5.46TiB used 5.42TiB path /dev/sdn
devid    7 size 5.46TiB used 5.42TiB path /dev/sdc
devid    9 size 5.46TiB used 5.42TiB path /dev/sdk
devid   10 size 5.46TiB used 5.42TiB path /dev/sdj
devid   11 size 7.28TiB used 7.24TiB path /dev/sdd
devid   12 size 5.46TiB used 5.42TiB path /dev/sdm
devid   14 size 7.28TiB used 7.24TiB path /dev/sdh
devid   15 size 7.28TiB used 7.24TiB path /dev/sdb
devid   17 size 7.28TiB used 5.77TiB path /dev/sdl
devid   18 size 7.28TiB used 7.24TiB path /dev/sdf
devid   20 size 7.28TiB used 7.24TiB path /dev/sdi
devid   21 size 7.28TiB used 7.24TiB path /dev/sdg

Thanks,
Asif



[RFC] Is it advisable to use btrfs check --repair flag to fix/find errors?

2017-10-03 Thread Soujanya Ponnapalli
Hi,

We are researchers from UT Austin, working on building CrashMonkey[1], a
simple, flexible, file-system-agnostic test framework that systematically
checks file systems for inconsistencies when a failure occurs during a
file operation.

Here is a brief description of what we are trying to do:

First, we mount the filesystem (fs), run a few tests on the mounted fs,
and log the bio requests sent to it. We then construct the different
crash states that are possible by starting with a snapshot of the
initial state of the disk and applying different permutations of a
subset of the logged bio requests, respecting the ordering rules set
by the FUA and flush flags. Later, we run file system consistency
checks/repairs on these generated crash states to repair the possible
inconsistencies and to find out whether any irreparable inconsistencies
remain. Our HotStorage'17 paper, CrashMonkey[2]: A Framework to
Automatically Test File-System Crash Consistency, has a detailed
explanation of the methodology.
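
As a rough sketch of how we construct the crash states (a simplified
Python model, not our actual implementation: it treats everything up to
the last flush as durable and lets any subset of the remaining bios
land):

from itertools import combinations

def crash_states(bio_log):
    # bio_log: list of (bio, is_flush) tuples in submission order.
    # A real replayer also honors FUA and finer-grained ordering.
    last_flush = max((i for i, (_, is_flush) in enumerate(bio_log)
                      if is_flush), default=-1)
    durable = [bio for bio, _ in bio_log[:last_flush + 1]]
    tail = [bio for bio, _ in bio_log[last_flush + 1:]]
    for r in range(len(tail) + 1):
        for subset in combinations(tail, r):
            yield durable + list(subset)

Each yielded list is applied on top of the initial disk snapshot before
running the consistency check.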

For this purpose, is it advisable to run btrfs check with the --repair
flag to fix or find errors? We have seen the warning: "Do not use
--repair unless you are advised to by a developer, an experienced user,
or accept the fact that fsck cannot possibly fix all sorts of damage
that could happen to a filesystem because of software and hardware
bugs." Hence, please let us know what you think! Also, the output of
`btrfs check` only hints that something is wrong by setting err to -1.
Is there a way to find out exactly what btrfs found?

Thanks,
Soujanya.

$ uname -r
4.4.0-62-generic

$ btrfs --version
btrfs-progs v4.4

$ btrfs fi show
Label: 'btrfs'  uuid: 3e6e7154-79b0-44b2-9193-945a86d61550
Total devices 1 FS bytes used 392.00KiB
devid1 size 10.00GiB used 2.02GiB path /dev/sda3

$ btrfs fi df /mountpoint
Data, single: total=8.00MiB, used=264.00KiB
System, DUP: total=8.00MiB, used=16.00KiB
Metadata, DUP: total=1.00GiB, used=112.00KiB
GlobalReserve, single: total=16.00MiB, used=0.00B

$ dmesg > dmesg.log
www.cs.utexas.edu/~soujanya/dmesg.log

[1] https://github.com/utsaslab/crashmonkey
[2] http://www.cs.utexas.edu/%7Evijay/papers/hotstorage17-crashmonkey.pdf


Can't mount array with super_total_bytes mismatch with fs_devices total_rw_bytes

2017-10-03 Thread Asif Youssuff

Hi,

My power went out at my home, and I'm now having trouble mounting my array.

I'm mounting with the 'recovery' option in fstab.

When mounting, dmesg output shows:

[210017.281912] BTRFS info (device sdb): disk space caching is enabled
[210017.281915] BTRFS info (device sdb): has skinny extents
[210017.402084] BTRFS error (device sdb): super_total_bytes 92017859088384
mismatch with fs_devices total_rw_bytes 92017859094528
[210017.402126] BTRFS error (device sdb): failed to read chunk tree: -22
[210017.461473] BTRFS error (device sdb): open_ctree failed

I've tried a few steps --

btrfs-chunk-recover, super-recover and I have run a btrfs check --repair 
on two of the disks in the array (this takes a very long time, so I'm 
hoping I don't have to run this on all of the disks).


I ran into this problem once before, and I'm not sure 
how I recovered from it; I may have simply rolled back the booted kernel 
to escape the extra checks around this mismatch.


I'm at a loss for ideas and am running a btrfs-image so I can also 
report an issue -- I'm not sure whether 'btrfs-image -c9 -t4 /dev/sdo 
btrfs.image' is the right command to run if it is a multi-device array.


Any ideas would be helpful, and I am happy to provide further information.

root@ubuntu-server:~#   uname -a
Linux ubuntu-server 4.14.0-041400rc2-generic #201709242031 SMP Mon Sep 
25 00:33:13 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux


root@ubuntu-server:~#   btrfs --version
btrfs-progs v4.13.1

root@ubuntu-server:~#   btrfs fi show
Label: none  uuid: 48ed8a66-731d-499b-829e-dd07dd7260cc
Total devices 13 FS bytes used 40.73TiB
devid4 size 5.46TiB used 5.42TiB path /dev/sdo
devid5 size 5.46TiB used 5.42TiB path /dev/sdn
devid7 size 5.46TiB used 5.42TiB path /dev/sdc
devid9 size 5.46TiB used 5.42TiB path /dev/sdk
devid   10 size 5.46TiB used 5.42TiB path /dev/sdj
devid   11 size 7.28TiB used 7.24TiB path /dev/sdd
devid   12 size 5.46TiB used 5.42TiB path /dev/sdm
devid   14 size 7.28TiB used 7.24TiB path /dev/sdh
devid   15 size 7.28TiB used 7.24TiB path /dev/sdb
devid   17 size 7.28TiB used 5.77TiB path /dev/sdl
devid   18 size 7.28TiB used 7.24TiB path /dev/sdf
devid   20 size 7.28TiB used 7.24TiB path /dev/sdi
devid   21 size 7.28TiB used 7.24TiB path /dev/sdg

Thanks,
Asif
[0.00] Linux version 4.14.0-041400rc2-generic (kernel@tangerine) (gcc version 7.2.0 (Ubuntu 7.2.0-6ubuntu1)) #201709242031 SMP Mon Sep 25 00:33:13 UTC 2017
[0.00] Command line: BOOT_IMAGE=/vmlinuz-4.14.0-041400rc2-generic root=/dev/mapper/ubuntu--server--vg-root ro
[0.00] KERNEL supported cpus:
[0.00]   Intel GenuineIntel
[0.00]   AMD AuthenticAMD
[0.00]   Centaur CentaurHauls
[0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[0.00] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[0.00] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x00099bff] usable
[0.00] BIOS-e820: [mem 0x00099c00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0xcd4fbfff] usable
[0.00] BIOS-e820: [mem 0xcd4fc000-0xcd502fff] ACPI NVS
[0.00] BIOS-e820: [mem 0xcd503000-0xdd7f1fff] usable
[0.00] BIOS-e820: [mem 0xdd7f2000-0xdd8d8fff] reserved
[0.00] BIOS-e820: [mem 0xdd8d9000-0xdd924fff] usable
[0.00] BIOS-e820: [mem 0xdd925000-0xdda5bfff] ACPI NVS
[0.00] BIOS-e820: [mem 0xdda5c000-0xdf7fefff] reserved
[0.00] BIOS-e820: [mem 0xdf7ff000-0xdf7f] usable
[0.00] BIOS-e820: [mem 0xf800-0xfbff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xfed0-0xfed03fff] reserved
[0.00] BIOS-e820: [mem 0xfed1c000-0xfed1] reserved
[0.00] BIOS-e820: [mem 0xfee0-0xfee00fff] reserved
[0.00] BIOS-e820: [mem 0xff00-0x] reserved
[0.00] BIOS-e820: [mem 0x0001-0x00081fff] usable
[0.00] NX (Execute Disable) protection: active
[0.00] random: fast init done
[0.00] SMBIOS 2.7 present.
[0.00] DMI: Supermicro X10SLM-F/X10SLM-F, BIOS 3.0 04/24/2015
[0.00] tsc: Fast TSC calibration using PIT
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[   

Re: Seeking Help on Corruption Issues

2017-10-03 Thread Stephen Nesbitt


On 10/3/2017 2:11 PM, Hugo Mills wrote:

Hi, Stephen,

On Tue, Oct 03, 2017 at 08:52:04PM +, Stephen Nesbitt wrote:

Here it is. There are a couple of out-of-order entries beginning at 117. And
yes, I did uncover a bad stick of RAM:

btrfs-progs v4.9.1
leaf 2589782867968 items 134 free space 6753 generation 3351574 owner 2
fs uuid 24b768c3-2141-44bf-ae93-1c3833c8c8e3
chunk uuid 19ce12f0-d271-46b8-a691-e0d26c1790c6

[snip]

item 116 key (1623012749312 EXTENT_ITEM 45056) itemoff 10908 itemsize 53
extent refs 1 gen 3346444 flags DATA
extent data backref root 271 objectid 2478 offset 0 count 1
item 117 key (1621939052544 EXTENT_ITEM 8192) itemoff 10855 itemsize 53
extent refs 1 gen 3346495 flags DATA
extent data backref root 271 objectid 21751764 offset 6733824 count 1
item 118 key (1623012450304 EXTENT_ITEM 8192) itemoff 10802 itemsize 53
extent refs 1 gen 3351513 flags DATA
extent data backref root 271 objectid 5724364 offset 680640512 count 1
item 119 key (1623012802560 EXTENT_ITEM 12288) itemoff 10749 itemsize 53
extent refs 1 gen 3346376 flags DATA
extent data backref root 271 objectid 21751764 offset 6701056 count 1

>>> hex(1623012749312)
'0x179e3193000'
>>> hex(1621939052544)
'0x179a319e000'
>>> hex(1623012450304)
'0x179e314a000'
>>> hex(1623012802560)
'0x179e31a0000'

That's "e" -> "a" in the fourth hex digit, which is a single-bit
flip, and should be fixable by btrfs check (I think). However, even
fixing that, it's not ordered, because 118 is then before 117, which
could be another bitflip ("9" -> "4" in the 7th digit), but two bad
bits that close to each other seems unlikely to me.

Hugo.


Hope this isn't a duplicate reply - I might have fat-fingered something.

The underlying file is disposable/replaceable. Any way to zero out/zap 
the bad BTRFS entry?


-steve



Re: Something like ZFS Channel Programs for BTRFS & probably XFS or even VFS?

2017-10-03 Thread Dave Chinner
On Tue, Oct 03, 2017 at 01:40:51PM -0700, Matthew Wilcox wrote:
> On Wed, Oct 04, 2017 at 07:10:35AM +1100, Dave Chinner wrote:
> > On Tue, Oct 03, 2017 at 03:19:18PM +0200, Martin Steigerwald wrote:
> > > [repost. I didn't notice autocompletion gave me wrong address for 
> > > fsdevel, 
> > > blacklisted now]
> > > 
> > > Hello.
> > > 
> > > What do you think of
> > > 
> > > http://open-zfs.org/wiki/Projects/ZFS_Channel_Programs
> > 
> > Domain not found.
> 
> Must be an Australian problem ...

Probably, I forgot to stand on my head so everything must have been
sent to the server upside down.

Though it is a curious failure - it failed until I went to
"openzfs.org" and that redirected to "open-zfs.org" and now it all
works. Somewhat bizarre.

> A ZFS channel program (ZCP) is a small script written in a domain-specific
> language that manipulates ZFS internals in a single, atomically-visible
> operation.  For instance, to delete all snapshots of a filesystem a ZCP
> could be written which 1) generates the list of snapshots, 2) traverses
> that list, and 3) destroys each snapshot unconditionally. Because
> each of these statements would be evaluated from within the kernel,
> ZCPs can guarantee safety from interference with other concurrent ZFS
> modifications. Executing from inside the kernel allows us to guarantee
> atomic visibility of these operations (correctness) and allows them to
> be performed in a single transaction group (performance).
>
> A successful implementation of ZCP will:
> 
> 1. Support equivalent functionality for all of the current ZFS commands
> with improved performance and correctness from the point of view of the
> user of ZFS.
> 
> 2. Facilitate the quick addition of new and useful commands as
> ZCP enables the implementation of more powerful operations which
> previously would have been unsafe to implement in user programs, or
> would require modifications to the kernel for correctness. Since the
> ZCP layer guarantees the atomicity of each ZCP, we only need to write
> new sync_tasks for individual simple operations, then can use ZCPs to
> chain those simple operations together into more complicated operations.
> 
> 3. Allow ZFS users to safely implement their own ZFS operations without
> performing operations they don’t have the privileges for.
> 
> 4. Improve the performance and correctness of existing applications
> built on ZFS operations.

/me goes and looks at the slides

Seems like they are trying to solve a problem of their own making,
in that admin operations are run by the kernel from a separate task
that is really, really slow. So this scripting is a method of aggregating
multiple "sync tasks" into a single operation so there aren't delays
between tasks.

/me chokes on slide 8/8

"Add a Lua interpreter to the kernel, implement ZFS intrinsics (...)
as extensions to the Lua language"

Somehow, I don't see that happening in Linux.

Yes, I can see us potentially adding some custom functionality in
filesystems with eBPF (e.g. custom allocation policies), but I think
admin operations need to be done from userspace through a clear,
stable interface that supports all the necessary primitives to
customise admin operations for different needs.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: Seeking Help on Corruption Issues

2017-10-03 Thread Hugo Mills
   Hi, Stephen,

On Tue, Oct 03, 2017 at 08:52:04PM +, Stephen Nesbitt wrote:
> Here it is. There are a couple of out-of-order entries beginning at 117. And
> yes, I did uncover a bad stick of RAM:
> 
> btrfs-progs v4.9.1
> leaf 2589782867968 items 134 free space 6753 generation 3351574 owner 2
> fs uuid 24b768c3-2141-44bf-ae93-1c3833c8c8e3
> chunk uuid 19ce12f0-d271-46b8-a691-e0d26c1790c6
[snip]
> item 116 key (1623012749312 EXTENT_ITEM 45056) itemoff 10908 itemsize 53
> extent refs 1 gen 3346444 flags DATA
> extent data backref root 271 objectid 2478 offset 0 count 1
> item 117 key (1621939052544 EXTENT_ITEM 8192) itemoff 10855 itemsize 53
> extent refs 1 gen 3346495 flags DATA
> extent data backref root 271 objectid 21751764 offset 6733824 count 1
> item 118 key (1623012450304 EXTENT_ITEM 8192) itemoff 10802 itemsize 53
> extent refs 1 gen 3351513 flags DATA
> extent data backref root 271 objectid 5724364 offset 680640512 count 1
> item 119 key (1623012802560 EXTENT_ITEM 12288) itemoff 10749 itemsize 53
> extent refs 1 gen 3346376 flags DATA
> extent data backref root 271 objectid 21751764 offset 6701056 count 1

>>> hex(1623012749312)
'0x179e3193000'
>>> hex(1621939052544)
'0x179a319e000'
>>> hex(1623012450304)
'0x179e314a000'
>>> hex(1623012802560)
'0x179e31a0000'

   That's "e" -> "a" in the fourth hex digit, which is a single-bit
flip, and should be fixable by btrfs check (I think). However, even
fixing that, it's not ordered, because 118 is then before 117, which
could be another bitflip ("9" -> "4" in the 7th digit), but two bad
bits that close to each other seems unlikely to me.
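
   (The quick way to test a single-bit-flip hypothesis, in the same
Python-session style as above: XOR the stored key against the candidate
and count the set bits.)

>>> stored   = 0x179a319e000   # item 117's key as stored
>>> expected = 0x179e319e000   # the same key with "a" corrected to "e"
>>> hex(stored ^ expected)
'0x40000000'
>>> bin(stored ^ expected).count('1')
1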

   Hugo.

-- 
Hugo Mills | Great films about cricket: Silly Point Break
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4  |




Re: Something like ZFS Channel Programs for BTRFS & probably XFS or even VFS?

2017-10-03 Thread Matthew Wilcox
On Wed, Oct 04, 2017 at 07:10:35AM +1100, Dave Chinner wrote:
> On Tue, Oct 03, 2017 at 03:19:18PM +0200, Martin Steigerwald wrote:
> > [repost. I didn't notice autocompletion gave me wrong address for fsdevel, 
> > blacklisted now]
> > 
> > Hello.
> > 
> > What do you think of
> > 
> > http://open-zfs.org/wiki/Projects/ZFS_Channel_Programs
> 
> Domain not found.

Must be an Australian problem ...

A ZFS channel program (ZCP) is a small script written in a domain-specific
language that manipulates ZFS internals in a single, atomically-visible
operation. For instance, to delete all snapshots of a filesystem a ZCP
could be written which 1) generates the list of snapshots, 2) traverses
that list, and 3) destroys each snapshot unconditionally. Because
each of these statements would be evaluated from within the kernel,
ZCPs can guarantee safety from interference with other concurrent ZFS
modifications. Executing from inside the kernel allows us to guarantee
atomic visibility of these operations (correctness) and allows them to
be performed in a single transaction group (performance).

A successful implementation of ZCP will:

1. Support equivalent functionality for all of the current ZFS commands
with improved performance and correctness from the point of view of the
user of ZFS.

2. Facilitate the quick addition of new and useful commands as
ZCP enables the implementation of more powerful operations which
previously would have been unsafe to implement in user programs, or
would require modifications to the kernel for correctness. Since the
ZCP layer guarantees the atomicity of each ZCP, we only need to write
new sync_tasks for individual simple operations, then can use ZCPs to
chain those simple operations together into more complicated operations.

3. Allow ZFS users to safely implement their own ZFS operations without
performing operations they don’t have the privileges for.

4. Improve the performance and correctness of existing applications
built on ZFS operations.


Re: Something like ZFS Channel Programs for BTRFS & probably XFS or even VFS?

2017-10-03 Thread Randy Dunlap
On 10/03/17 13:10, Dave Chinner wrote:
> On Tue, Oct 03, 2017 at 03:19:18PM +0200, Martin Steigerwald wrote:
>> [repost. I didn't notice autocompletion gave me wrong address for fsdevel, 
>> blacklisted now]
>>
>> Hello.
>>
>> What do you think of
>>
>> http://open-zfs.org/wiki/Projects/ZFS_Channel_Programs
> 
> Domain not found.

It works for me.


-- 
~Randy


Re: Something like ZFS Channel Programs for BTRFS & probably XFS or even VFS?

2017-10-03 Thread Dave Chinner
On Tue, Oct 03, 2017 at 03:19:18PM +0200, Martin Steigerwald wrote:
> [repost. I didn't notice autocompletion gave me wrong address for fsdevel, 
> blacklisted now]
> 
> Hello.
> 
> What do you think of
> 
> http://open-zfs.org/wiki/Projects/ZFS_Channel_Programs

Domain not found.

-Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: Seeking Help on Corruption Issues

2017-10-03 Thread Hugo Mills
On Tue, Oct 03, 2017 at 01:06:50PM -0700, Stephen Nesbitt wrote:
> All:
> 
> I came back to my computer yesterday to find my filesystem in read
> only mode. Running a btrfs scrub start -dB aborts as follows:
> 
> btrfs scrub start -dB /mnt
> ERROR: scrubbing /mnt failed for device id 4: ret=-1, errno=5
> (Input/output error)
> ERROR: scrubbing /mnt failed for device id 5: ret=-1, errno=5
> (Input/output error)
> scrub device /dev/sdb (id 4) canceled
>     scrub started at Mon Oct  2 21:51:46 2017 and was aborted after
> 00:09:02
>     total bytes scrubbed: 75.58GiB with 1 errors
>     error details: csum=1
>     corrected errors: 0, uncorrectable errors: 1, unverified errors: 0
> scrub device /dev/sdc (id 5) canceled
>     scrub started at Mon Oct  2 21:51:46 2017 and was aborted after
> 00:11:11
>     total bytes scrubbed: 50.75GiB with 0 errors
> 
> The resulting dmesg is:
> [  699.534066] BTRFS error (device sdc): bdev /dev/sdb errs: wr 0,
> rd 0, flush 0, corrupt 6, gen 0
> [  699.703045] BTRFS error (device sdc): unable to fixup (regular)
> error at logical 1609808347136 on dev /dev/sdb
> [  783.306525] BTRFS critical (device sdc): corrupt leaf, bad key
> order: block=2589782867968, root=1, slot=116

   This error usually means bad RAM. Can you show us the output of
"btrfs-debug-tree -b 2589782867968 /dev/sdc"?

   Hugo.

> [  789.776132] BTRFS critical (device sdc): corrupt leaf, bad key
> order: block=2589782867968, root=1, slot=116
> [  911.529842] BTRFS critical (device sdc): corrupt leaf, bad key
> order: block=2589782867968, root=1, slot=116
> [  918.365225] BTRFS critical (device sdc): corrupt leaf, bad key
> order: block=2589782867968, root=1, slot=116
> 
> Running btrfs check /dev/sdc results in:
> btrfs check /dev/sdc
> Checking filesystem on /dev/sdc
> UUID: 24b768c3-2141-44bf-ae93-1c3833c8c8e3
> checking extents
> bad key ordering 116 117
> bad block 2589782867968
> ERROR: errors found in extent allocation tree or chunk allocation
> checking free space cache
> There is no free space entry for 1623012450304-1623012663296
> There is no free space entry for 1623012450304-1623225008128
> cache appears valid but isn't 1622151266304
> found 288815742976 bytes used err is -22
> total csum bytes: 0
> total tree bytes: 350781440
> total fs tree bytes: 0
> total extent tree bytes: 350027776
> btree space waste bytes: 115829777
> file data blocks allocated: 156499968
> 
> uname -a:
> Linux sysresccd 4.9.24-std500-amd64 #2 SMP Sat Apr 22 17:14:43 UTC
> 2017 x86_64 Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz GenuineIntel
> GNU/Linux
> 
> btrfs --version: btrfs-progs v4.9.1
> 
> btrfs fi show:
> Label: none  uuid: 24b768c3-2141-44bf-ae93-1c3833c8c8e3
>     Total devices 2 FS bytes used 475.08GiB
>     devid    4 size 931.51GiB used 612.06GiB path /dev/sdb
>     devid    5 size 931.51GiB used 613.09GiB path /dev/sdc
> 
> btrfs fi df /mnt:
> Data, RAID1: total=603.00GiB, used=468.03GiB
> System, RAID1: total=64.00MiB, used=112.00KiB
> System, single: total=32.00MiB, used=0.00B
> Metadata, RAID1: total=9.00GiB, used=7.04GiB
> Metadata, single: total=1.00GiB, used=0.00B
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> What is the recommended procedure at this point? Run btrfs check
> --repair? I have backups so losing a file or two isn't critical, but
> I really don't want to go through the effort of a bare metal
> reinstall.
> 
> In the process of researching this I did uncover a bad DIMM. Am I
> correct that the problems I'm seeing are likely linked to the
> resulting memory errors?
> 
> Thx in advance,
> 
> -steve
> 

-- 
Hugo Mills | Quidquid latine dictum sit, altum videtur
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4  |




Seeking Help on Corruption Issues

2017-10-03 Thread Stephen Nesbitt

All:

I came back to my computer yesterday to find my filesystem in read only 
mode. Running a btrfs scrub start -dB aborts as follows:


btrfs scrub start -dB /mnt
ERROR: scrubbing /mnt failed for device id 4: ret=-1, errno=5 
(Input/output error)
ERROR: scrubbing /mnt failed for device id 5: ret=-1, errno=5 
(Input/output error)

scrub device /dev/sdb (id 4) canceled
    scrub started at Mon Oct  2 21:51:46 2017 and was aborted after 
00:09:02

    total bytes scrubbed: 75.58GiB with 1 errors
    error details: csum=1
    corrected errors: 0, uncorrectable errors: 1, unverified errors: 0
scrub device /dev/sdc (id 5) canceled
    scrub started at Mon Oct  2 21:51:46 2017 and was aborted after 
00:11:11

    total bytes scrubbed: 50.75GiB with 0 errors

The resulting dmesg is:
[  699.534066] BTRFS error (device sdc): bdev /dev/sdb errs: wr 0, rd 0, 
flush 0, corrupt 6, gen 0
[  699.703045] BTRFS error (device sdc): unable to fixup (regular) error 
at logical 1609808347136 on dev /dev/sdb
[  783.306525] BTRFS critical (device sdc): corrupt leaf, bad key order: 
block=2589782867968, root=1, slot=116
[  789.776132] BTRFS critical (device sdc): corrupt leaf, bad key order: 
block=2589782867968, root=1, slot=116
[  911.529842] BTRFS critical (device sdc): corrupt leaf, bad key order: 
block=2589782867968, root=1, slot=116
[  918.365225] BTRFS critical (device sdc): corrupt leaf, bad key order: 
block=2589782867968, root=1, slot=116


Running btrfs check /dev/sdc results in:
btrfs check /dev/sdc
Checking filesystem on /dev/sdc
UUID: 24b768c3-2141-44bf-ae93-1c3833c8c8e3
checking extents
bad key ordering 116 117
bad block 2589782867968
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
There is no free space entry for 1623012450304-1623012663296
There is no free space entry for 1623012450304-1623225008128
cache appears valid but isn't 1622151266304
found 288815742976 bytes used err is -22
total csum bytes: 0
total tree bytes: 350781440
total fs tree bytes: 0
total extent tree bytes: 350027776
btree space waste bytes: 115829777
file data blocks allocated: 156499968

uname -a:
Linux sysresccd 4.9.24-std500-amd64 #2 SMP Sat Apr 22 17:14:43 UTC 2017 
x86_64 Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz GenuineIntel GNU/Linux


btrfs --version: btrfs-progs v4.9.1

btrfs fi show:
Label: none  uuid: 24b768c3-2141-44bf-ae93-1c3833c8c8e3
    Total devices 2 FS bytes used 475.08GiB
    devid    4 size 931.51GiB used 612.06GiB path /dev/sdb
    devid    5 size 931.51GiB used 613.09GiB path /dev/sdc

btrfs fi df /mnt:
Data, RAID1: total=603.00GiB, used=468.03GiB
System, RAID1: total=64.00MiB, used=112.00KiB
System, single: total=32.00MiB, used=0.00B
Metadata, RAID1: total=9.00GiB, used=7.04GiB
Metadata, single: total=1.00GiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=0.00B

What is the recommended procedure at this point? Run btrfs check 
--repair? I have backups so losing a file or two isn't critical, but I 
really don't want to go through the effort of a bare metal reinstall.


In the process of researching this I did uncover a bad DIMM. Am I 
correct that the problems I'm seeing are likely linked to the resulting 
memory errors?


Thx in advance,

-steve



[PATCH] btrfs: avoid overflow when sector_t is 32 bit

2017-10-03 Thread Goffredo Baroncelli
From: Goffredo Baroncelli 

Jean-Denis Girard noticed that commit c821e7f3 "pass bytes to
btrfs_bio_alloc" (https://patchwork.kernel.org/patch/9763081/) introduces a
regression on 32-bit machines.
When CONFIG_LBDAF is _not_ defined (CONFIG_LBDAF == Support for large
(2TB+) block devices and files), sector_t is 32 bits on 32-bit machines.

In the function submit_extent_page, 'sector' (which is of type sector_t)
is multiplied by 512 to convert it from sectors to bytes, leading to an
overflow when the disk is bigger than 4GB (!).

I added a cast to u64 to avoid the overflow.

Based on v4.14-rc3.


Signed-off-by: Goffredo Baroncelli 
Tested-by: Jean-Denis Girard 

---
 fs/btrfs/extent_io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 12ab19a4b93e..970190cd347e 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2801,7 +2801,7 @@ static int submit_extent_page(unsigned int opf, struct 
extent_io_tree *tree,
}
}
 
-   bio = btrfs_bio_alloc(bdev, sector << 9);
+   bio = btrfs_bio_alloc(bdev, (u64)sector << 9);
bio_add_page(bio, page, page_size, offset);
bio->bi_end_io = end_io_func;
bio->bi_private = tree;
-- 
2.14.2
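
The wrap is easy to reproduce; a Python sketch emulating the 32-bit
sector_t arithmetic (the sector number is hypothetical, chosen to sit
right at the 4GiB boundary):

sector = (4 * 1024**3) // 512        # first sector whose byte offset is 4GiB
full   = sector << 9                 # with the u64 cast: 4294967296
narrow = (sector << 9) & 0xFFFFFFFF  # what 32-bit sector_t arithmetic keeps
print(full, narrow)                  # 4294967296 0 -- the offset wraps to zero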

--
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


Re: [PATCH] btrfs-progs: misc-test: use raid1 for data to enable mount with -o degraded

2017-10-03 Thread David Sterba
On Tue, Oct 03, 2017 at 03:47:26PM +0900, Misono, Tomohiro wrote:
> kernel 4.14 introduces a new function for checking whether all chunks are
> OK for mounting with the -o degraded option.
> 
>   commit 21634a19f646 ("btrfs: Introduce a function to check if all
>   chunks a OK for degraded rw mount")
> 
> As a result, a raid0 profile cannot be mounted with -o degraded on 4.14.
> This causes a failure of misc-test 011, "delete missing device".
> 
> Fix this by using the raid1 profile for both data and metadata.
> This should also work for kernels before 4.13.
> 
> Signed-off-by: Tomohiro Misono 

Applied, thanks.


Re: [PATCH] Btrfs: fix fs_info->flags value

2017-10-03 Thread David Sterba
On Mon, Oct 02, 2017 at 05:34:12PM +0900, Tsutomu Itoh wrote:
> Because the values of BTRFS_FS_QUOTA_OVERRIDE and BTRFS_FS_EXCL_OP overlap,
> we should change the value.
> 
> Signed-off-by: Tsutomu Itoh 

Please write a more descriptive subject and changelog.

> ---
>  fs/btrfs/ctree.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 899ddaeeacec..566c0ba8dfb8 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -717,12 +717,11 @@ struct btrfs_delayed_root;
>  #define BTRFS_FS_QUOTA_OVERRIDE  14
>  /* Used to record internally whether fs has been frozen */
>  #define BTRFS_FS_FROZEN  15
> -

Unrelated change.

>  /*
>   * Indicate that a whole-filesystem exclusive operation is running
>   * (device replace, resize, device add/delete, balance)
>   */
> -#define BTRFS_FS_EXCL_OP 14
> +#define BTRFS_FS_EXCL_OP 16

Strange how this could have got in there. I suspected a mis-merge, but
the patches for number 14 went in in different releases, so this actually
slipped through review.

Please update and resend the patch with the following tags:

Fixes: f29efe292198b ("btrfs: add quota override flag to enable quota override 
for CAP_SYS_RESOURCE")
CC: sta...@vger.kernel.org # 4.13+


[PATCH v8 2/2] btrfs: check device for critical errors and mark failed

2017-10-03 Thread Anand Jain
From: Anand Jain 

Write and flush errors are critical errors, upon which the device fd
must be closed and marked as failed.

There are two types of device close in btrfs. One is the close done as
part of cleanup, where we release the struct btrfs_device and/or the
btrfs_fs_devices as well. The other type, introduced here, is where we
close the device fd because it has failed, while the mounted FS remains
available through the remaining redundant devices. In this new case we
keep the failed device's struct btrfs_device around, similar to a
missing device.

Further, the approach here is to monitor the device statistics and
trigger the action based on one or more device states.

Signed-off-by: Anand Jain 
Tested-by: Austin S. Hemmelgarn 
---
V8: General misc cleanup. Based on v4.14-rc2

 fs/btrfs/ctree.h   |  2 ++
 fs/btrfs/disk-io.c | 78 +-
 fs/btrfs/volumes.c |  1 +
 fs/btrfs/volumes.h |  4 +++
 4 files changed, 84 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 5a8933da39a7..bad8fbaff18d 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -824,6 +824,7 @@ struct btrfs_fs_info {
struct mutex tree_log_mutex;
struct mutex transaction_kthread_mutex;
struct mutex cleaner_mutex;
+   struct mutex health_mutex;
struct mutex chunk_mutex;
struct mutex volume_mutex;
 
@@ -941,6 +942,7 @@ struct btrfs_fs_info {
struct btrfs_workqueue *extent_workers;
struct task_struct *transaction_kthread;
struct task_struct *cleaner_kthread;
+   struct task_struct *health_kthread;
int thread_pool_size;
 
struct kobject *space_info_kobj;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 487bbe4fb3c6..be22104bafbf 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1922,6 +1922,70 @@ static int cleaner_kthread(void *arg)
return 0;
 }
 
+static void btrfs_check_device_fatal_errors(struct btrfs_root *root)
+{
+   struct btrfs_device *device;
+   struct btrfs_fs_info *fs_info = root->fs_info;
+
+   /* Mark devices with write or flush errors as failed. */
+   mutex_lock(&fs_info->volume_mutex);
+   list_for_each_entry_rcu(device,
+   &fs_info->fs_devices->devices, dev_list) {
+   int c_err;
+
+   if (device->failed)
+   continue;
+
+   /* Todo: Skip replace target for now. */
+   if (device->is_tgtdev_for_dev_replace)
+   continue;
+   if (!device->dev_stats_valid)
+   continue;
+
+   c_err = atomic_read(&device->new_critical_errs);
+   atomic_sub(c_err, &device->new_critical_errs);
+   if (c_err) {
+   btrfs_crit_in_rcu(fs_info,
+   "%s: Fatal write/flush error",
+   rcu_str_deref(device->name));
+   btrfs_mark_device_failed(device);
+   }
+   }
+   mutex_unlock(&fs_info->volume_mutex);
+}
+
+static int health_kthread(void *arg)
+{
+   struct btrfs_root *root = arg;
+
+   do {
+   /* Todo rename the below function */
+   if (btrfs_need_cleaner_sleep(root->fs_info))
+   goto sleep;
+
+   if (!mutex_trylock(&root->fs_info->health_mutex))
+   goto sleep;
+
+   if (btrfs_need_cleaner_sleep(root->fs_info)) {
+   mutex_unlock(&root->fs_info->health_mutex);
+   goto sleep;
+   }
+
+   /* Check devices health */
+   btrfs_check_device_fatal_errors(root);
+
+   mutex_unlock(&root->fs_info->health_mutex);
+
+sleep:
+   set_current_state(TASK_INTERRUPTIBLE);
+   if (!kthread_should_stop())
+   schedule();
+   __set_current_state(TASK_RUNNING);
+   } while (!kthread_should_stop());
+
+   return 0;
+}
+
 static int transaction_kthread(void *arg)
 {
struct btrfs_root *root = arg;
@@ -1969,6 +2033,7 @@ static int transaction_kthread(void *arg)
btrfs_end_transaction(trans);
}
 sleep:
+   wake_up_process(fs_info->health_kthread);
wake_up_process(fs_info->cleaner_kthread);
mutex_unlock(&fs_info->transaction_kthread_mutex);
 
@@ -2713,6 +2778,7 @@ int open_ctree(struct super_block *sb,
mutex_init(&fs_info->chunk_mutex);
mutex_init(&fs_info->transaction_kthread_mutex);
mutex_init(&fs_info->cleaner_mutex);
+   mutex_init(&fs_info->health_mutex);
mutex_init(&fs_info->volume_mutex);
mutex_init(&fs_info->ro_block_group_mutex);
init_rwsem(&fs_info->commit_root_sem);
@@ -3049,11 +3115,16 @@ int open_ctree(struct super_block *sb,
if (IS_ERR(fs_info->cleaner_kthread)

[PATCH v8 1/2] btrfs: introduce device dynamic state transition to failed

2017-10-03 Thread Anand Jain
From: Anand Jain 

This patch provides helper functions to force a device into the failed
state. We need this for the following reasons:
1) a. The device can be reported as failed when it actually fails, and
   b. the device can be closed when it goes offline so that the block
      layer can clean up.
2) To identify the candidate for auto replace.
3) To stop further RW to the failing device.
4) A device in a multi-device btrfs may fail, but as of now in some
   system configs the whole of btrfs gets unmounted.

Signed-off-by: Anand Jain 
Tested-by: Austin S. Hemmelgarn 
---
V8: General misc cleanup. Based on v4.14-rc2

 fs/btrfs/volumes.c | 104 +
 fs/btrfs/volumes.h |  15 +++-
 2 files changed, 118 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 0e8f16c305df..06e7cf4cef81 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7255,3 +7255,107 @@ void btrfs_reset_fs_info_ptr(struct btrfs_fs_info 
*fs_info)
fs_devices = fs_devices->seed;
}
 }
+
+static void do_close_device(struct work_struct *work)
+{
+   struct btrfs_device *device;
+
+   device = container_of(work, struct btrfs_device, rcu_work);
+
+   if (device->closing_bdev)
+   blkdev_put(device->closing_bdev, device->mode);
+
+   device->closing_bdev = NULL;
+}
+
+static void btrfs_close_one_device(struct rcu_head *head)
+{
+   struct btrfs_device *device;
+
+   device = container_of(head, struct btrfs_device, rcu);
+
+   INIT_WORK(&device->rcu_work, do_close_device);
+   schedule_work(&device->rcu_work);
+}
+
+void btrfs_force_device_close(struct btrfs_device *device)
+{
+   struct btrfs_fs_info *fs_info;
+   struct btrfs_fs_devices *fs_devices;
+
+   fs_devices = device->fs_devices;
+   fs_info = fs_devices->fs_info;
+
+   btrfs_sysfs_rm_device_link(fs_devices, device);
+
+   mutex_lock(&fs_devices->device_list_mutex);
+   mutex_lock(&fs_devices->fs_info->chunk_mutex);
+
+   btrfs_assign_next_active_device(fs_devices->fs_info, device, NULL);
+
+   if (device->bdev)
+   fs_devices->open_devices--;
+
+   if (device->writeable) {
+   list_del_init(&device->dev_alloc_list);
+   fs_devices->rw_devices--;
+   }
+   device->writeable = 0;
+
+   /*
+* Todo: We have miss-used missing flag all around, and here
+* too for now. (In the long run I want to keep missing to only
+* indicate that it was not present when RAID was assembled.)
+*/
+   device->missing = 1;
+   fs_devices->missing_devices++;
+   device->closing_bdev = device->bdev;
+   device->bdev = NULL;
+
+   call_rcu(&device->rcu, btrfs_close_one_device);
+
+   mutex_unlock(&fs_devices->fs_info->chunk_mutex);
+   mutex_unlock(&fs_devices->device_list_mutex);
+
+   rcu_barrier();
+
+   btrfs_warn_in_rcu(fs_info, "device %s failed",
+   rcu_str_deref(device->name));
+
+   /*
+* We lost one or more disks, which means the fs is not as it
+* was configured by the user. So mount should show
+* degraded.
+*/
+   btrfs_set_opt(fs_info->mount_opt, DEGRADED);
+
+   /*
+* Now, having lost one of the devices, check if the chunk
+* stripe is incomplete and handle the fatal error if needed.
+*/
+   if (!btrfs_check_rw_degradable(fs_info))
+   btrfs_handle_fs_error(fs_info, -EIO,
+   "devices below critical level");
+}
+
+void btrfs_mark_device_failed(struct btrfs_device *dev)
+{
+   struct btrfs_fs_devices *fs_devices = dev->fs_devices;
+
+   /* This shouldn't be called if device is already missing */
+   if (dev->missing || !dev->bdev)
+   return;
+   if (dev->failed)
+   return;
+   dev->failed = 1;
+
+   /* The last RW device was asked to force-close; let the FS handle it. */
+   if (fs_devices->rw_devices == 1) {
+   btrfs_handle_fs_error(fs_devices->fs_info, -EIO,
+   "Last RW device failed");
+   return;
+   }
+
+   /* The point of no return starts here. */
+   btrfs_force_device_close(dev);
+}
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 6108fdfec67f..05b150c03995 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -65,13 +65,26 @@ struct btrfs_device {
struct btrfs_pending_bios pending_sync_bios;
 
struct block_device *bdev;
+   struct block_device *closing_bdev;
 
/* the mode sent to blkdev_get */
fmode_t mode;
 
int writeable;
int in_fs_metadata;
+   /* missing: device not found at the time of mount */
int missing;
+   /* failed: device confirmed to have experienced critical io failure */
+   int failed;
+   /*
+   * offline: system or user or block layer transport has removed
+   
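
Patch 2/2 of this series is what actually calls btrfs_mark_device_failed()
when write/flush errors are detected. For illustration only, a caller could
look roughly like the sketch below; the threshold policy here is made up,
and the real logic lives in patch 2/2:

/*
 * Sketch only: how an I/O error path might use the new helper. The
 * error-threshold policy below is hypothetical.
 */
static void example_handle_write_error(struct btrfs_device *device)
{
	btrfs_dev_stat_inc_and_print(device, BTRFS_DEV_STAT_WRITE_ERRS);

	/* Once the errors are considered critical, fail the device. */
	if (btrfs_dev_stat_read(device, BTRFS_DEV_STAT_WRITE_ERRS) >= 1)
		btrfs_mark_device_failed(device);
}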

[PATCH v8 0/2] [RFC] Introduce device state 'failed'

2017-10-03 Thread Anand Jain
When one device fails, it has to be closed and marked as failed.
Further, we need a sysfs (or similar) interface to provide complete
information about the device and volume status to user land from the
kernel. Next, when the disappeared device reappears, we need to
resilver/resync it, which should be handled per RAID profile.

The effort here is to fix the above three missing items.

To begin with, this series brings a device with write/flush failures
to the failed state.

Next, bringing the device back to the alloc list, verifying its
consistency, and kicking off the re-silvering are still WIP, and
feedback helps. For RAID1, a convert of the single raid profile back to
all-raid1 will help. For RAID56 I am banking on Liubo's recent RAID56
write hole work; I have yet to look deeper into that. Also, for RAID1
there can be a split-brain scenario where each of the devices was
mounted independently. To fix this I am planning to set a (new)
incompatibility flag if either device is written without the other.
When they are brought back together, the flag should then be present on
only one of the devices; if the flag is on both devices, it's a
split-brain scenario where user intervention will be required.
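
To make the split-brain idea concrete, here is a rough decision-table
sketch; the per-device "diverged" flag is entirely hypothetical and does
not exist in btrfs today:

#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical: assume each RAID1 device records a flag meaning "I was
 * written while my mirror was absent".
 */
static int example_split_brain_check(bool dev_a_diverged, bool dev_b_diverged)
{
	if (dev_a_diverged && dev_b_diverged)
		return -EINVAL;	/* split brain: user must pick a winner */
	if (dev_a_diverged || dev_b_diverged)
		return 1;	/* resilver the clean device from the flagged one */
	return 0;		/* devices are in sync */
}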

On the sysfs part, there are patches on the ML which were sent before;
I shall be reviving them as well.

Thanks, Anand

Anand Jain (2):
  btrfs: introduce device dynamic state transition to failed
  btrfs: check device for critical errors and mark failed

 fs/btrfs/ctree.h   |   2 +
 fs/btrfs/disk-io.c |  78 ++-
 fs/btrfs/volumes.c | 105 +
 fs/btrfs/volumes.h |  19 +-
 4 files changed, 202 insertions(+), 2 deletions(-)

-- 
2.7.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Lost about 3TB

2017-10-03 Thread Hugo Mills
On Tue, Oct 03, 2017 at 05:45:54PM +0200, fred.lar...@free.fr wrote:
> Hi,
> 
> 
> >   What does "btrfs sub list -a /RAID01/" say?
> Nothing (no lines displayed)
> 
> >   Also "grep /RAID01/ /proc/self/mountinfo"?
> Nothing (no lines displayed)
> 
> 
> Also server has been rebooted many times and no process has left "deleted 
> open files" on the volume (lsof...).

   OK. The second command (the grep) was incorrect -- I should have
omitted the slashes. However, it doesn't matter too much, because the
first command indicates that you don't have any subvolumes or
snapshots anyway.

   This means that you're probably looking at the kind of issue
Timofey mentioned in his mail, where writes into the middle of an
existing extent don't free up the overwritten data. This is most
likely to happen on database or VM files, but could happen on others,
depending on the application and how it uses files.

   Since you don't seem to have any snapshots, I _think_ you can deal
with the issue most easily by defragmenting the affected files. It's
worth just getting a second opinion on this one before you try it for
the whole FS. I'm not 100% sure about what defrag will do in this
case, and there are some people round here who have investigated the
behaviour of partially-overwritten extents in more detail than I have.
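
For reference, defragmenting the affected files is normally done with
"btrfs filesystem defragment" on each file; under the hood that is the
BTRFS_IOC_DEFRAG_RANGE ioctl. A minimal sketch (error handling trimmed),
subject to the caveat above about what defrag does to partially-overwritten
extents:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/btrfs.h>

/* Defragment one file: roughly what "btrfs fi defragment <file>" does. */
int main(int argc, char **argv)
{
	struct btrfs_ioctl_defrag_range_args args;
	int fd;

	if (argc != 2)
		return 1;
	fd = open(argv[1], O_RDWR);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	memset(&args, 0, sizeof(args));
	args.len = (__u64)-1;	/* whole file */
	if (ioctl(fd, BTRFS_IOC_DEFRAG_RANGE, &args) < 0)
		perror("BTRFS_IOC_DEFRAG_RANGE");
	close(fd);
	return 0;
}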

   Hugo.

> Fred.
> 
> 
> - Original Message -
> From: "Hugo Mills - h...@carfax.org.uk" 
> 
> To: "btrfs fredo" 
> Cc: linux-btrfs@vger.kernel.org
> Sent: Tuesday, 3 October 2017 12:54:05
> Subject: Re: Lost about 3TB
> 
> On Tue, Oct 03, 2017 at 12:44:29PM +0200, btrfs.fr...@xoxy.net wrote:
> > Hi,
> > 
> > I can't figure out were 3TB on a 36 TB BTRFS volume (on LVM) are gone !
> > 
> > I know BTRFS can be tricky when speaking about space usage when using many 
> > physical drives in a RAID setup, but my conf is a very simple BTRFS volume 
> > without RAID(single Data type) using the whole disk (perhaps did I do 
> > something wrong with the LVM setup ?).
> > 
> > My BTRFS volume is mounted on /RAID01/.
> > 
> > There's only one folder in /RAID01/ shared with Samba, Windows also see a 
> > total of 28 TB used.
> > 
> > It only contains 443 files (big backup files created by Veeam), most of the 
> > file size is greater than 1GB and be be up to 5TB.
> > 
> > ##> du -hs /RAID01/
> > 28T /RAID01/
> > 
> > If I sum up the result of : ##> find . -printf '%s\n'
> > I also find 28TB.
> > 
> > I extracted btrfs binary from rpm version v4.9.1 and used ##> btrfs fi 
> > du
> > on each file and the result is 28TB.
> 
>The conclusion here is that there are things that aren't being
> found by these processes. This is usually in the form of dot-files
> (but I think you've covered that case in what you did above) or
> snapshots/subvolumes outside the subvol you've mounted.
> 
>What does "btrfs sub list -a /RAID01/" say?
>Also "grep /RAID01/ /proc/self/mountinfo"?
> 
>There are other possibilities for missing space, but let's cover
> the obvious ones first.
> 
>Hugo.
> 
> > OS : CentOS Linux release 7.3.1611 (Core)
> > btrfs-progs v4.4.1
> > 
> > 
> > ##> ssm list
> > 
> > -
> > DeviceFree  Used  Total  Pool Mount point
> > -
> > /dev/sda   36.39 TB   PARTITIONED
> > /dev/sda1 200.00 MB   /boot/efi
> > /dev/sda2   1.00 GB   /boot
> > /dev/sda3  0.00 KB  36.32 TB   36.32 TB  lvm_pool
> > /dev/sda4  0.00 KB  54.00 GB   54.00 GB  cl_xxx-xxxamrepo-01
> > -
> > ---
> > PoolType   Devices Free  Used Total
> > ---
> > cl_xxx-xxxamrepo-01 lvm10.00 KB  54.00 GB  54.00 GB
> > lvm_poollvm10.00 KB  36.32 TB  36.32 TB
> > btrfs_lvm_pool-lvol001  btrfs  14.84 TB  36.32 TB  36.32 TB
> > ---
> > -
> > Volume PoolVolume size  FS  
> >   FS size   Free  TypeMount point
> > -
> > /dev/cl_xxx-xxxamrepo-01/root  cl_xxx-xxxamrepo-0150.00 GB  xfs 
> >  49.97 GB   48.50 GB  linear  /
> > /dev/cl_xxx-xxxamrepo-01/swap  cl_xxx-xxxamrepo-01 4.00 GB  
> >   linear
> > /dev/lvm_pool/lvol001  lvm_pool   36.32 TB  
> >  

Re: Lost about 3TB

2017-10-03 Thread fred . larive
Hi,


>   What does "btrfs sub list -a /RAID01/" say?
Nothing (no lines displayed)

>   Also "grep /RAID01/ /proc/self/mountinfo"?
Nothing (no lines displayed)


Also, the server has been rebooted many times and no process has left
"deleted open files" on the volume (lsof...).


Fred.


- Original Message -
From: "Hugo Mills - h...@carfax.org.uk" 

To: "btrfs fredo" 
Cc: linux-btrfs@vger.kernel.org
Sent: Tuesday, 3 October 2017 12:54:05
Subject: Re: Lost about 3TB

On Tue, Oct 03, 2017 at 12:44:29PM +0200, btrfs.fr...@xoxy.net wrote:
> Hi,
> 
> I can't figure out were 3TB on a 36 TB BTRFS volume (on LVM) are gone !
> 
> I know BTRFS can be tricky when speaking about space usage when using many 
> physical drives in a RAID setup, but my conf is a very simple BTRFS volume 
> without RAID(single Data type) using the whole disk (perhaps did I do 
> something wrong with the LVM setup ?).
> 
> My BTRFS volume is mounted on /RAID01/.
> 
> There's only one folder in /RAID01/ shared with Samba, Windows also see a 
> total of 28 TB used.
> 
> It only contains 443 files (big backup files created by Veeam), most of the 
> file size is greater than 1GB and be be up to 5TB.
> 
> ##> du -hs /RAID01/
> 28T /RAID01/
> 
> If I sum up the result of : ##> find . -printf '%s\n'
> I also find 28TB.
> 
> I extracted btrfs binary from rpm version v4.9.1 and used ##> btrfs fi du
> on each file and the result is 28TB.

   The conclusion here is that there are things that aren't being
found by these processes. This is usually in the form of dot-files
(but I think you've covered that case in what you did above) or
snapshots/subvolumes outside the subvol you've mounted.

   What does "btrfs sub list -a /RAID01/" say?
   Also "grep /RAID01/ /proc/self/mountinfo"?

   There are other possibilities for missing space, but let's cover
the obvious ones first.

   Hugo.

> OS : CentOS Linux release 7.3.1611 (Core)
> btrfs-progs v4.4.1
> 
> 
> ##> ssm list
> 
> -
> DeviceFree  Used  Total  Pool Mount point
> -
> /dev/sda   36.39 TB   PARTITIONED
> /dev/sda1 200.00 MB   /boot/efi
> /dev/sda2   1.00 GB   /boot
> /dev/sda3  0.00 KB  36.32 TB   36.32 TB  lvm_pool
> /dev/sda4  0.00 KB  54.00 GB   54.00 GB  cl_xxx-xxxamrepo-01
> -
> ---
> PoolType   Devices Free  Used Total
> ---
> cl_xxx-xxxamrepo-01 lvm10.00 KB  54.00 GB  54.00 GB
> lvm_poollvm10.00 KB  36.32 TB  36.32 TB
> btrfs_lvm_pool-lvol001  btrfs  14.84 TB  36.32 TB  36.32 TB
> ---
> -
> Volume PoolVolume size  FS
> FS size   Free  TypeMount point
> -
> /dev/cl_xxx-xxxamrepo-01/root  cl_xxx-xxxamrepo-0150.00 GB  xfs  
> 49.97 GB   48.50 GB  linear  /
> /dev/cl_xxx-xxxamrepo-01/swap  cl_xxx-xxxamrepo-01 4.00 GB
> linear
> /dev/lvm_pool/lvol001  lvm_pool   36.32 TB
> linear  /RAID01
> btrfs_lvm_pool-lvol001 btrfs_lvm_pool-lvol001 36.32 TB  btrfs
> 36.32 TB4.84 TB  btrfs   /RAID01
> /dev/sda1200.00 MB  vfat  
> part/boot/efi
> /dev/sda2  1.00 GB  xfs
> 1015.00 MB  882.54 MB  part/boot
> -
> 
> 
> ##> btrfs fi sh
> 
> Label: none  uuid: df7ce232-056a-4c27-bde4-6f785d5d9f68
> Total devices 1 FS bytes used 31.48TiB
> devid1 size 36.32TiB used 31.66TiB path 
> /dev/mapper/lvm_pool-lvol001
> 
> 
> 
> ##> btrfs fi df /RAID01/
> 
> Data, single: total=31.58TiB, used=31.44TiB
> System, DUP: total=8.00MiB, used=3.67MiB
> Metadata, DUP: total=38.00GiB, used=35.37GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> 
> 
> I tried to repair it :
> 
> 
> ##> btrfs check --repair -p /dev/mapper/lvm_pool-lvol001
> 
> enabling repair mode
> Checking filesystem on /dev/mapper/lvm_pool-lvol001
> UUID: df7ce232-056a-4c27-bde4-6f785d5d9f68
> checking extents
> Fixed 0 roo

[PATCH 1/4] Btrfs: compress_file_range() remove dead variable num_bytes

2017-10-03 Thread Timofey Titovets
Remove the dead assignment of num_bytes.

Also, as num_bytes is only used in the will_compress block as a copy
of total_in, just replace it with total_in and drop num_bytes entirely.

Signed-off-by: Timofey Titovets 
Reviewed-by: Nikolay Borisov 
---
 fs/btrfs/inode.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index b728397ba6e1..237df8fdf7b8 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -458,7 +458,6 @@ static noinline void compress_file_range(struct inode 
*inode,
 {
struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
struct btrfs_root *root = BTRFS_I(inode)->root;
-   u64 num_bytes;
u64 blocksize = fs_info->sectorsize;
u64 actual_end;
u64 isize = i_size_read(inode);
@@ -508,8 +507,6 @@ static noinline void compress_file_range(struct inode 
*inode,
 
total_compressed = min_t(unsigned long, total_compressed,
BTRFS_MAX_UNCOMPRESSED);
-   num_bytes = ALIGN(end - start + 1, blocksize);
-   num_bytes = max(blocksize,  num_bytes);
total_in = 0;
ret = 0;
 
@@ -628,7 +625,6 @@ static noinline void compress_file_range(struct inode 
*inode,
 */
total_in = ALIGN(total_in, PAGE_SIZE);
if (total_compressed + blocksize <= total_in) {
-   num_bytes = total_in;
*num_added += 1;
 
/*
@@ -636,12 +632,12 @@ static noinline void compress_file_range(struct inode 
*inode,
 * allocation on disk for these compressed pages, and
 * will submit them to the elevator.
 */
-   add_async_extent(async_cow, start, num_bytes,
+   add_async_extent(async_cow, start, total_in,
total_compressed, pages, nr_pages,
compress_type);
 
-   if (start + num_bytes < end) {
-   start += num_bytes;
+   if (start + total_in < end) {
+   start += total_in;
pages = NULL;
cond_resched();
goto again;
-- 
2.14.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/4] Btrfs: btrfs_dedupe_file_range() ioctl, remove 16MiB restriction

2017-10-03 Thread Timofey Titovets
Right now btrfs_dedupe_file_range() is restricted to a 16MiB range to
limit locking time and the memory requirement of the dedupe ioctl().

For too big an input range, the code silently clamps the range to 16MiB.

Let's remove that restriction by iterating over the dedupe range.
That's backward compatible and will not change anything for requests
smaller than 16MiB.

Signed-off-by: Timofey Titovets 
Reviewed-by: Qu Wenruo 
---
 fs/btrfs/ioctl.c | 22 ++
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 31407c62da63..4b468e5dfa11 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3200,11 +3200,9 @@ ssize_t btrfs_dedupe_file_range(struct file *src_file, 
u64 loff, u64 olen,
struct inode *src = file_inode(src_file);
struct inode *dst = file_inode(dst_file);
u64 bs = BTRFS_I(src)->root->fs_info->sb->s_blocksize;
+   u64 i, tail_len, chunk_count;
ssize_t res;

-   if (olen > BTRFS_MAX_DEDUPE_LEN)
-   olen = BTRFS_MAX_DEDUPE_LEN;
-
if (WARN_ON_ONCE(bs < PAGE_SIZE)) {
/*
 * Btrfs does not support blocksize < page_size. As a
@@ -3214,7 +3212,23 @@ ssize_t btrfs_dedupe_file_range(struct file *src_file, 
u64 loff, u64 olen,
return -EINVAL;
}

-   res = btrfs_extent_same(src, loff, olen, dst, dst_loff);
+   tail_len = olen % BTRFS_MAX_DEDUPE_LEN;
+   chunk_count = div_u64(olen, BTRFS_MAX_DEDUPE_LEN);
+
+   for (i = 0; i < chunk_count; i++) {
+   res = btrfs_extent_same(src, loff, BTRFS_MAX_DEDUPE_LEN,
+   dst, dst_loff);
+   if (res)
+   return res;
+
+   loff += BTRFS_MAX_DEDUPE_LEN;
+   dst_loff += BTRFS_MAX_DEDUPE_LEN;
+   }
+
+   if (tail_len > 0)
+   res = btrfs_extent_same(src, loff, tail_len,
+   dst, dst_loff);
+
if (res)
return res;
return olen;
--
2.14.2
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
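
For context, the ioctl this patch extends is exposed to user space as
FIDEDUPERANGE (also reachable as BTRFS_IOC_FILE_EXTENT_SAME). A minimal
caller sketch; with this patch applied, src_length may exceed 16MiB in a
single call:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>

/* Usage: dedupe <src> <dst> <length> -- dedupe <length> bytes from the
 * start of <src> against the start of <dst>. */
int main(int argc, char **argv)
{
	struct file_dedupe_range *arg;
	int src, dst;

	if (argc != 4)
		return 1;
	src = open(argv[1], O_RDONLY);
	dst = open(argv[2], O_RDWR);
	if (src < 0 || dst < 0)
		return 1;

	arg = calloc(1, sizeof(*arg) + sizeof(arg->info[0]));
	if (!arg)
		return 1;
	arg->src_length = strtoull(argv[3], NULL, 0);
	arg->dest_count = 1;
	arg->info[0].dest_fd = dst;

	if (ioctl(src, FIDEDUPERANGE, arg) < 0)
		perror("FIDEDUPERANGE");
	else
		printf("deduped %llu bytes, status %d\n",
		       (unsigned long long)arg->info[0].bytes_deduped,
		       arg->info[0].status);
	free(arg);
	return 0;
}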


[PATCH 3/4] Btrfs: handle unaligned tail of data ranges more efficiently

2017-10-03 Thread Timofey Titovets
Right now, while switching page bits in data ranges,
we always handle +1 page, to cover the case
where the end of the data range is not page aligned.

Let's handle that case more obviously and efficiently:
check the end alignment directly and touch the +1 page
only when needed.

Signed-off-by: Timofey Titovets 
---
 fs/btrfs/extent_io.c | 12 ++--
 fs/btrfs/inode.c |  6 +-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 0538bf85adc3..131b7d1df9f7 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1359,7 +1359,11 @@ void extent_range_clear_dirty_for_io(struct inode 
*inode, u64 start, u64 end)
unsigned long end_index = end >> PAGE_SHIFT;
struct page *page;

-   while (index <= end_index) {
+   /* Don't miss unaligned end */
+   if (!IS_ALIGNED(end, PAGE_SIZE))
+   end_index++;
+
+   while (index < end_index) {
page = find_get_page(inode->i_mapping, index);
BUG_ON(!page); /* Pages should be in the extent_io_tree */
clear_page_dirty_for_io(page);
@@ -1374,7 +1378,11 @@ void extent_range_redirty_for_io(struct inode *inode, 
u64 start, u64 end)
unsigned long end_index = end >> PAGE_SHIFT;
struct page *page;

-   while (index <= end_index) {
+   /* Don't miss unaligned end */
+   if (!IS_ALIGNED(end, PAGE_SIZE))
+   end_index++;
+
+   while (index < end_index) {
page = find_get_page(inode->i_mapping, index);
BUG_ON(!page); /* Pages should be in the extent_io_tree */
__set_page_dirty_nobuffers(page);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index b6e81bd650ea..b4974d969f67 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -10799,7 +10799,11 @@ void btrfs_set_range_writeback(void *private_data, u64 
start, u64 end)
unsigned long end_index = end >> PAGE_SHIFT;
struct page *page;

-   while (index <= end_index) {
+   /* Don't miss unaligned end */
+   if (!IS_ALIGNED(end, PAGE_SIZE))
+   end_index++;
+
+   while (index < end_index) {
page = find_get_page(inode->i_mapping, index);
ASSERT(page); /* Pages should be in the extent_io_tree */
set_page_writeback(page);
--
2.14.2
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
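
To see exactly when the old (index <= end_index) and new (conditional
increment, index < end_index) loop bounds differ, a small user-space
comparison helps; illustrative only, with 4KiB pages assumed:

#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE (1UL << PAGE_SHIFT)
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/* Print the page ranges visited by the two formulations; they differ
 * only when end is exactly page aligned. */
int main(void)
{
	unsigned long ends[] = { 8191, 8192, 8193 };
	int i;

	for (i = 0; i < 3; i++) {
		unsigned long end = ends[i];
		unsigned long old_last = end >> PAGE_SHIFT;
		unsigned long new_bound = end >> PAGE_SHIFT;

		if (!IS_ALIGNED(end, PAGE_SIZE))
			new_bound++;
		printf("end=%lu: old pages 0..%lu, new pages 0..%lu\n",
		       end, old_last, new_bound - 1);
	}
	return 0;
}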


[PATCH 2/4] Btrfs: clear_dirty only on pages in compression range

2017-10-03 Thread Timofey Titovets
We need to call extent_range_clear_dirty_for_io()
on the compression range to prevent applications from changing
page content while the pages are being compressed.

But "(end - start)" can be much (up to 1024 times) bigger
than the compression range (BTRFS_MAX_UNCOMPRESSED), so optimize
by calculating the compression range for the current loop
iteration and flipping the bits only on that range.

v1 -> v2:
 - Make it more obvious and less error-prone

v2 -> v3:
 - Rebased on:
   Btrfs: compress_file_range() remove dead variable num_bytes
 - Update change log
 - Add comments

Signed-off-by: Timofey Titovets 
---
 fs/btrfs/inode.c | 27 ++-
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 237df8fdf7b8..b6e81bd650ea 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -460,6 +460,7 @@ static noinline void compress_file_range(struct inode 
*inode,
struct btrfs_root *root = BTRFS_I(inode)->root;
u64 blocksize = fs_info->sectorsize;
u64 actual_end;
+   u64 current_end;
u64 isize = i_size_read(inode);
int ret = 0;
struct page **pages = NULL;
@@ -505,6 +506,21 @@ static noinline void compress_file_range(struct inode 
*inode,
   (start > 0 || end + 1 < BTRFS_I(inode)->disk_i_size))
goto cleanup_and_bail_uncompressed;

+   /*
+* We need to call extent_range_clear_dirty_for_io()
+* on the compression range to prevent applications from
+* changing page content while the pages are compressed.
+*
+* But (end - start) can be much (up to 1024 times) bigger
+* than the compression range, so optimize by calculating
+* the compression range for this iteration, and flip the
+* bits only on that range.
+*/
+   if (end - start > BTRFS_MAX_UNCOMPRESSED)
+   current_end = start + BTRFS_MAX_UNCOMPRESSED;
+   else
+   current_end = end;
+
total_compressed = min_t(unsigned long, total_compressed,
BTRFS_MAX_UNCOMPRESSED);
total_in = 0;
@@ -515,7 +531,7 @@ static noinline void compress_file_range(struct inode 
*inode,
 * inode has not been flagged as nocompress.  This flag can
 * change at any time if we discover bad compression ratios.
 */
-   if (inode_need_compress(inode, start, end)) {
+   if (inode_need_compress(inode, start, current_end)) {
WARN_ON(pages);
pages = kcalloc(nr_pages, sizeof(struct page *), GFP_NOFS);
if (!pages) {
@@ -530,14 +546,15 @@ static noinline void compress_file_range(struct inode 
*inode,

/*
 * we need to call clear_page_dirty_for_io on each
-* page in the range.  Otherwise applications with the file
-* mmap'd can wander in and change the page contents while
+* page in the compression range.
+* Otherwise applications with the file mmap'd
+* can wander in and change the page contents while
 * we are compressing them.
 *
 * If the compression fails for any reason, we set the pages
 * dirty again later on.
 */
-   extent_range_clear_dirty_for_io(inode, start, end);
+   extent_range_clear_dirty_for_io(inode, start, current_end);
redirty = 1;

/* Compression level is applied here and only here */
@@ -678,7 +695,7 @@ static noinline void compress_file_range(struct inode 
*inode,
/* unlocked later on in the async handlers */

if (redirty)
-   extent_range_redirty_for_io(inode, start, end);
+   extent_range_redirty_for_io(inode, start, current_end);
add_async_extent(async_cow, start, end - start + 1, 0, NULL, 0,
 BTRFS_COMPRESS_NONE);
*num_added += 1;
--
2.14.2
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/4] Just bunch of btrfs patches

2017-10-03 Thread Timofey Titovets
Some patches have reviews, some do not; all are compile tested and
hand tested (i.e. boot into the patched system and run some small
tests).

All based on kdave's for-next branch.

Patches:
1. Just remove the useless u64 num_bytes from compress_file_range().
   No functional changes.
2. To make compression of mmap'd files safe, while the compression
   logic runs we switch the page dirty bit on the whole input range,
   but the input range can be much bigger than 128KiB.
   So try to optimize that by only switching the bits on the current
   compression range.
3. The functions:
   extent_range_clear_dirty_for_io()
   extent_range_redirty_for_io()
   btrfs_set_range_writeback()
   are used to switch some bits on pages, but use a non-obvious
   while (index <= end_index) to cover an end that is not aligned
   to pages.
   (I don't think it is non-obvious to me only, as on IRC no one
   could help me understand it until I found the answer.)
   So I change the handling of the unaligned end to a more obvious
   way.
4. btrfs_dedupe_file_range(), for a range bigger than 16MiB, instead
   of returning an error, silently clamps it to 16MiB.
   So just add a loop over the input range to get bigger ranges
   working.
   P.S. Maybe it makes sense to change the loop step to some lower
   value if one of the deduped files is compressed?

Thanks.

Timofey Titovets (4):
  Btrfs: compress_file_range() remove dead variable num_bytes
  Btrfs: clear_dirty only on pages in compression range
  Btrfs: handle unaligned tail of data ranges more efficiently
  Btrfs: btrfs_dedupe_file_range() ioctl, remove 16MiB restriction

 fs/btrfs/extent_io.c | 12 ++--
 fs/btrfs/inode.c | 43 ++-
 fs/btrfs/ioctl.c | 22 ++
 3 files changed, 58 insertions(+), 19 deletions(-)

--
2.14.2
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Kickstarting snapshot-aware defrag?

2017-10-03 Thread Niccolò Belli

Hi,
It seems to me that the proposal[1] for a snapshot-aware defrag has long
been abandoned. Since many people badly need this feature, I thought about
how to speed up achieving this goal.


I know of several bounty-based kickstarting platforms; the best ones are
probably bountysource.com[2] and freedomsponsors.org[3].
With both platforms everyone interested can place a bounty on the issue,
and if/when someone implements it, they will get the bounty.
I created an issue on both of them just to show how the platforms would
handle it.


Since btrfs is a small community, before actually placing bounties and
sponsoring this I would like to know whether anyone is against this
development model, or whether anyone would be interested in implementing
a feature because of a bounty.


Bests,
Niccolò


[1]https://www.spinics.net/lists/linux-btrfs/msg34539.html
[2]https://www.bountysource.com/issues/50004702-feature-request-snapshot-aware-defrag
[3]https://freedomsponsors.org/issue/817/feature-request-snapshot-aware-defrag?alert=KICKSTART
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Something like ZFS Channel Programs for BTRFS & probably XFS or even VFS?

2017-10-03 Thread Martin Steigerwald
[repost. I didn't notice autocompletion gave me the wrong address for
fsdevel; blacklisted now]

Hello.

What do you think of

http://open-zfs.org/wiki/Projects/ZFS_Channel_Programs

?

There are quite a few BTRFS maintenance programs, like the deduplication
stuff. Also regular scrubs… and in certain circumstances balances can
probably make sense.

In addition to this, XFS got scrub functionality as well.

Now, putting the foundation for such functionality in the kernel would,
I think, only be reasonable if it cannot be done purely within user space,
so I wonder about the safety from other concurrent ZFS modifications and
the atomicity that are mentioned on the wiki page. The second set of
slides, those from the OpenZFS Developer Summit 2014, which are linked to
on the wiki page, explains this in more detail. (I didn't look at the
first ones, as I am no fan of slideshare.net and prefer a simple PDF to
download and view locally anytime, not for privacy reasons alone, but
also to avoid using a crappy webpage over a wonderfully functional
fat-client PDF viewer like Okular.)

Also, I wonder about putting a Lua interpreter into the kernel, but it
seems at least the NetBSD developers added one to their kernel with
version 7.0 [1].

I also ask this because I have wondered about a kind of fsmaintd or
volmaintd for quite a while, and thought… it would be nice to do this in
a generic way, as BTRFS is not the only filesystem which supports
maintenance operations. However, if it can all just nicely be done in
userspace, I am all for it.

[1] http://www.netbsd.org/releases/formal-7/NetBSD-7.0.html
(tons of presentation PDFs on their site as well)

Thanks,
-- 
Martin

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Lost about 3TB

2017-10-03 Thread Roman Mamedov
On Tue, 3 Oct 2017 10:54:05 +
Hugo Mills  wrote:

>There are other possibilities for missing space, but let's cover
> the obvious ones first.

One more obvious thing would be files that are deleted, but still kept open by
some app (possibly even from network, via NFS or SMB!). @Frederic, did you try
rebooting the system?
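
For reference, "lsof | grep deleted" is the quick way to check for that;
a minimal sketch of the same check done by hand (walking /proc/<pid>/fd
symlinks), for anyone curious what lsof looks for:

#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Report open files whose /proc/<pid>/fd link target ends in
 * "(deleted)" -- roughly what "lsof | grep deleted" shows. */
int main(void)
{
	DIR *proc = opendir("/proc");
	struct dirent *p;

	if (!proc)
		return 1;
	while ((p = readdir(proc))) {
		char fddir[64], link[128], target[PATH_MAX];
		struct dirent *f;
		DIR *fds;
		ssize_t n;

		if (p->d_name[0] < '0' || p->d_name[0] > '9')
			continue;	/* not a PID directory */
		snprintf(fddir, sizeof(fddir), "/proc/%s/fd", p->d_name);
		fds = opendir(fddir);
		if (!fds)
			continue;	/* process gone or no permission */
		while ((f = readdir(fds))) {
			if (f->d_name[0] == '.')
				continue;
			snprintf(link, sizeof(link), "%s/%s", fddir, f->d_name);
			n = readlink(link, target, sizeof(target) - 1);
			if (n < 0)
				continue;
			target[n] = '\0';
			if (strstr(target, "(deleted)"))
				printf("pid %s: %s\n", p->d_name, target);
		}
		closedir(fds);
	}
	closedir(proc);
	return 0;
}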

-- 
With respect,
Roman
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Lost about 3TB

2017-10-03 Thread Hugo Mills
On Tue, Oct 03, 2017 at 12:44:29PM +0200, btrfs.fr...@xoxy.net wrote:
> Hi,
> 
> I can't figure out were 3TB on a 36 TB BTRFS volume (on LVM) are gone !
> 
> I know BTRFS can be tricky when speaking about space usage when using many 
> physical drives in a RAID setup, but my conf is a very simple BTRFS volume 
> without RAID(single Data type) using the whole disk (perhaps did I do 
> something wrong with the LVM setup ?).
> 
> My BTRFS volume is mounted on /RAID01/.
> 
> There's only one folder in /RAID01/ shared with Samba, Windows also see a 
> total of 28 TB used.
> 
> It only contains 443 files (big backup files created by Veeam), most of the 
> file size is greater than 1GB and be be up to 5TB.
> 
> ##> du -hs /RAID01/
> 28T /RAID01/
> 
> If I sum up the result of : ##> find . -printf '%s\n'
> I also find 28TB.
> 
> I extracted btrfs binary from rpm version v4.9.1 and used ##> btrfs fi du
> on each file and the result is 28TB.

   The conclusion here is that there are things that aren't being
found by these processes. This is usually in the form of dot-files
(but I think you've covered that case in what you did above) or
snapshots/subvolumes outside the subvol you've mounted.

   What does "btrfs sub list -a /RAID01/" say?
   Also "grep /RAID01/ /proc/self/mountinfo"?

   There are other possibilities for missing space, but let's cover
the obvious ones first.

   Hugo.

> OS : CentOS Linux release 7.3.1611 (Core)
> btrfs-progs v4.4.1
> 
> 
> ##> ssm list
> 
> -
> DeviceFree  Used  Total  Pool Mount point
> -
> /dev/sda   36.39 TB   PARTITIONED
> /dev/sda1 200.00 MB   /boot/efi
> /dev/sda2   1.00 GB   /boot
> /dev/sda3  0.00 KB  36.32 TB   36.32 TB  lvm_pool
> /dev/sda4  0.00 KB  54.00 GB   54.00 GB  cl_xxx-xxxamrepo-01
> -
> ---
> PoolType   Devices Free  Used Total
> ---
> cl_xxx-xxxamrepo-01 lvm10.00 KB  54.00 GB  54.00 GB
> lvm_poollvm10.00 KB  36.32 TB  36.32 TB
> btrfs_lvm_pool-lvol001  btrfs  14.84 TB  36.32 TB  36.32 TB
> ---
> -
> Volume PoolVolume size  FS
> FS size   Free  TypeMount point
> -
> /dev/cl_xxx-xxxamrepo-01/root  cl_xxx-xxxamrepo-0150.00 GB  xfs  
> 49.97 GB   48.50 GB  linear  /
> /dev/cl_xxx-xxxamrepo-01/swap  cl_xxx-xxxamrepo-01 4.00 GB
> linear
> /dev/lvm_pool/lvol001  lvm_pool   36.32 TB
> linear  /RAID01
> btrfs_lvm_pool-lvol001 btrfs_lvm_pool-lvol001 36.32 TB  btrfs
> 36.32 TB4.84 TB  btrfs   /RAID01
> /dev/sda1200.00 MB  vfat  
> part/boot/efi
> /dev/sda2  1.00 GB  xfs
> 1015.00 MB  882.54 MB  part/boot
> -
> 
> 
> ##> btrfs fi sh
> 
> Label: none  uuid: df7ce232-056a-4c27-bde4-6f785d5d9f68
> Total devices 1 FS bytes used 31.48TiB
> devid1 size 36.32TiB used 31.66TiB path 
> /dev/mapper/lvm_pool-lvol001
> 
> 
> 
> ##> btrfs fi df /RAID01/
> 
> Data, single: total=31.58TiB, used=31.44TiB
> System, DUP: total=8.00MiB, used=3.67MiB
> Metadata, DUP: total=38.00GiB, used=35.37GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> 
> 
> I tried to repair it :
> 
> 
> ##> btrfs check --repair -p /dev/mapper/lvm_pool-lvol001
> 
> enabling repair mode
> Checking filesystem on /dev/mapper/lvm_pool-lvol001
> UUID: df7ce232-056a-4c27-bde4-6f785d5d9f68
> checking extents
> Fixed 0 roots.
> cache and super generation don't match, space cache will be invalidated
> checking fs roots
> checking csums
> checking root refs
> found 34600611349019 bytes used err is 0
> total csum bytes: 33752513152
> total tree bytes: 38037848064
> total fs tree bytes: 583942144
> total extent tree bytes: 653754368
> btree space waste bytes: 2197658704
> file data blocks allocated: 183716661284864 ?? what's this ??
>  referenced 30095956975616 = 27.3 TB !!
> 
> 
> 

Lost about 3TB

2017-10-03 Thread btrfs . fredo
Hi,

I can't figure out where 3TB on a 36 TB BTRFS volume (on LVM) have gone!

I know BTRFS can be tricky when speaking about space usage when using many
physical drives in a RAID setup, but my config is a very simple BTRFS volume
without RAID (single Data type) using the whole disk (perhaps I did something
wrong with the LVM setup?).

My BTRFS volume is mounted on /RAID01/.

There's only one folder in /RAID01/, shared with Samba; Windows also sees a
total of 28 TB used.

It only contains 443 files (big backup files created by Veeam); most files
are bigger than 1GB and can be up to 5TB.

##> du -hs /RAID01/
28T /RAID01/

If I sum up the result of : ##> find . -printf '%s\n'
I also find 28TB.

I extracted btrfs binary from rpm version v4.9.1 and used ##> btrfs fi du
on each file and the result is 28TB.



OS : CentOS Linux release 7.3.1611 (Core)
btrfs-progs v4.4.1


##> ssm list

-
DeviceFree  Used  Total  Pool Mount point
-
/dev/sda   36.39 TB   PARTITIONED
/dev/sda1 200.00 MB   /boot/efi
/dev/sda2   1.00 GB   /boot
/dev/sda3  0.00 KB  36.32 TB   36.32 TB  lvm_pool
/dev/sda4  0.00 KB  54.00 GB   54.00 GB  cl_xxx-xxxamrepo-01
-
---
PoolType   Devices Free  Used Total
---
cl_xxx-xxxamrepo-01 lvm10.00 KB  54.00 GB  54.00 GB
lvm_poollvm10.00 KB  36.32 TB  36.32 TB
btrfs_lvm_pool-lvol001  btrfs  14.84 TB  36.32 TB  36.32 TB
---
-
Volume PoolVolume size  FS
FS size   Free  TypeMount point
-
/dev/cl_xxx-xxxamrepo-01/root  cl_xxx-xxxamrepo-0150.00 GB  xfs  
49.97 GB   48.50 GB  linear  /
/dev/cl_xxx-xxxamrepo-01/swap  cl_xxx-xxxamrepo-01 4.00 GB  
  linear
/dev/lvm_pool/lvol001  lvm_pool   36.32 TB  
  linear  /RAID01
btrfs_lvm_pool-lvol001 btrfs_lvm_pool-lvol001 36.32 TB  btrfs
36.32 TB4.84 TB  btrfs   /RAID01
/dev/sda1200.00 MB  vfat
  part/boot/efi
/dev/sda2  1.00 GB  xfs
1015.00 MB  882.54 MB  part/boot
-


##> btrfs fi sh

Label: none  uuid: df7ce232-056a-4c27-bde4-6f785d5d9f68
Total devices 1 FS bytes used 31.48TiB
devid1 size 36.32TiB used 31.66TiB path /dev/mapper/lvm_pool-lvol001



##> btrfs fi df /RAID01/

Data, single: total=31.58TiB, used=31.44TiB
System, DUP: total=8.00MiB, used=3.67MiB
Metadata, DUP: total=38.00GiB, used=35.37GiB
GlobalReserve, single: total=512.00MiB, used=0.00B



I tried to repair it :


##> btrfs check --repair -p /dev/mapper/lvm_pool-lvol001

enabling repair mode
Checking filesystem on /dev/mapper/lvm_pool-lvol001
UUID: df7ce232-056a-4c27-bde4-6f785d5d9f68
checking extents
Fixed 0 roots.
cache and super generation don't match, space cache will be invalidated
checking fs roots
checking csums
checking root refs
found 34600611349019 bytes used err is 0
total csum bytes: 33752513152
total tree bytes: 38037848064
total fs tree bytes: 583942144
total extent tree bytes: 653754368
btree space waste bytes: 2197658704
file data blocks allocated: 183716661284864 ?? what's this ??
 referenced 30095956975616 = 27.3 TB !!



I tried the "new usage" display but the problem is the same: 31 TB used
but the total file size is 28 TB.

Overall:
Device size:  36.32TiB
Device allocated: 31.65TiB
Device unallocated:4.67TiB
Device missing:  0.00B
Used: 31.52TiB
Free (estimated):  4.80TiB  (min: 2.46TiB)
Data ratio:   1.00
Metadata ratio:   2.00
Global reserve:  512.00MiB  (used: 0.00B)

Data,single: Size:31.58TiB, Used:31.45TiB
   /dev/mapper/lvm_pool-lvol001   31.58TiB

Metadata,DUP: Size:38.00GiB, Used:35.37GiB
   /dev/mapper/lvm_pool-lvol001   76.00GiB

System,DUP: Size:8.00MiB, U


[PATCH] btrfs-progs: doc: update help/document of btrfs device remove

2017-10-03 Thread Misono, Tomohiro
This patch updates the help/documentation of "btrfs device remove" on two
points:

1. Add an explanation of 'missing' for 'device remove'. This is currently
only written on the wiki page.
(https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices)

2. Add an example of device removal to the man document. This is because
the explanation of "remove" says "See the example section below", but
there is currently no example of removal.

Signed-off-by: Tomohiro Misono 
---
 Documentation/btrfs-device.asciidoc | 19 +++
 cmds-device.c   | 10 +-
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/Documentation/btrfs-device.asciidoc 
b/Documentation/btrfs-device.asciidoc
index 88822ec..dc523a9 100644
--- a/Documentation/btrfs-device.asciidoc
+++ b/Documentation/btrfs-device.asciidoc
@@ -75,6 +75,10 @@ The operation can take long as it needs to move all data 
from the device.
 It is possible to delete the device that was used to mount the filesystem. The
 device entry in mount table will be replaced by another device name with the
 lowest device id.
++
+If the device is mounted in degraded mode (-o degraded), the special term
+"missing" can be used for <device>. In that case, the first device described
+by the filesystem metadata but not present at mount time will be removed.
 
*delete* <device>|<devid> [<device>|<devid>...] <path>::
 Alias of remove kept for backward compatibility
@@ -206,6 +210,21 @@ data or the block groups occupy the whole first device.
 The device size of '/dev/sdb' as seen by the filesystem remains unchanged, but
 the logical space from 50-100GiB will be unused.
 
+ REMOVE DEVICE 
+
+Device removal must satisfy the profile constraints, otherwise the command
+fails. For example:
+
+ $ btrfs device remove /dev/sda /mnt
+ $ ERROR: error removing device '/dev/sda': unable to go below two devices on 
raid1
+
+
+In order to remove a device, you need to convert profile in this case:
+
+ $ btrfs balance start -mconvert=dup /mnt
+ $ btrfs balance start -dconvert=single /mnt
+ $ btrfs device remove /dev/sda /mnt
+
 DEVICE STATS
 
 
diff --git a/cmds-device.c b/cmds-device.c
index 4337eb2..6cb53ff 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -224,9 +224,16 @@ static int _cmd_device_remove(int argc, char **argv,
return !!ret;
 }
 
+#define COMMON_USAGE_REMOVE_DELETE \
+   "", \
+   "If 'missing' is specified for <device>, the first device that is", \
+   "described by the filesystem metadata, but not present at the", \
+   "mount time will be removed."
+
static const char * const cmd_device_remove_usage[] = {
"btrfs device remove <device>|<devid> [<device>|<devid>...] <path>",
"Remove a device from a filesystem",
+   COMMON_USAGE_REMOVE_DELETE,
NULL
 };
 
@@ -237,7 +244,8 @@ static int cmd_device_remove(int argc, char **argv)
 
static const char * const cmd_device_delete_usage[] = {
"btrfs device delete <device>|<devid> [<device>|<devid>...] <path>",
-   "Remove a device from a filesystem",
+   "Remove a device from a filesystem (alias of \"btrfs device remove\")",
+   COMMON_USAGE_REMOVE_DELETE,
NULL
 };
 
-- 
2.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html