Re: help!!! error when mount a btrfs file system

2017-05-22 Thread Qu Wenruo



At 03/16/2017 08:23 PM, 李云甫 wrote:

hi, buddy

I have a file server with a btrfs file system; it has worked well for several
months.

But after the last system reboot, /dev/sdb is no longer mountable.

Below are the details. Is there any advice?


##Version info
Fedora 25 Server
Kernel 4.9.13-201.fc25.x86_64
btrfs-progs v4.6.1

#error messages when mount
mount: wrong fs type, bad option, bad superblock on /dev/sdb,
missing codepage or helper program, or other error

In some cases useful info is found in syslog - try
dmesg | tail or so.

##dmesg |tail
[79570.756871] BTRFS error (device sdb): parent transid verify failed on 
21413888 wanted 755660 found 623605
[79570.762307] BTRFS error (device sdb): bad tree block start 0 21413888


Chunk tree corrupted.


[79570.762345] BTRFS error (device sdb): failed to read chunk tree: -5
[79570.778129] BTRFS error (device sdb): open_ctree failed
[79589.743772] BTRFS error (device sdb): support for check_integrity* not 
compiled in!
[79589.803176] BTRFS error (device sdb): open_ctree failed

##btrfsck
parent transid verify failed on 21413888 wanted 755660 found 623605
parent transid verify failed on 21413888 wanted 755660 found 623605
checksum verify failed on 21413888 found E4E3BDB6 wanted 


E4E3BDB6 is the crc32 of a leaf filled with all-zero data.
And the wanted csum is also 0, which means the whole leaf is all zeros.

Either something went wrong related to discard, or your chunk tree got 
completely corrupted.
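
A minimal sketch to double-check that reading. It assumes the metadata csum
is CRC-32C over everything past the 32-byte csum field, with the usual
0xFFFFFFFF seed and final inversion; those details are assumptions here, so
compare against btrfs-progs if the printed value does not match E4E3BDB6:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Bitwise CRC-32C (Castagnoli, reflected), init 0xFFFFFFFF, final XOR 0xFFFFFFFF. */
static uint32_t crc32c(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;
	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int b = 0; b < 8; b++)
			crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
	}
	return crc ^ 0xFFFFFFFFu;
}

int main(void)
{
	size_t nodesize = 16384;               /* nodesize reported by btrfs-show-super */
	uint8_t *leaf = calloc(1, nodesize);   /* model of an all-zero tree block */

	/* Assumption: the stored csum occupies the first 32 bytes of the block,
	 * and the checksum covers the remaining nodesize - 32 bytes. */
	printf("csum of an all-zero leaf: %08X\n", crc32c(leaf + 32, nodesize - 32));
	free(leaf);
	return 0;
}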



parent transid verify failed on 21413888 wanted 755660 found 623605
Ignoring transid failure
checksum verify failed on 21331968 found E4E3BDB6 wanted 
checksum verify failed on 21331968 found E4E3BDB6 wanted 
checksum verify failed on 21692416 found E4E3BDB6 wanted 
checksum verify failed on 21692416 found E4E3BDB6 wanted 
checksum verify failed on 22888448 found E4E3BDB6 wanted 
checksum verify failed on 22888448 found E4E3BDB6 wanted 
checksum verify failed on 22888448 found E4E3BDB6 wanted 
checksum verify failed on 22888448 found E4E3BDB6 wanted 


At least 4 leaves/nodes of the chunk tree are corrupted.
I assume that's all of your chunk tree.

I would say the chance of recovery is very low.

Thanks,
Qu




bytenr mismatch, want=22888448, have=0
Couldn't read chunk tree
Couldn't open file system

##btrfs-find-root
parent transid verify failed on 21413888 wanted 755660 found 623605
parent transid verify failed on 21413888 wanted 755660 found 623605
parent transid verify failed on 21413888 wanted 755660 found 623605
Ignoring transid failure
Couldn't read chunk tree
ERROR: open ctree failed

##btrfs-show-super -a /dev/sdb
superblock: bytenr=65536, device=/dev/sdb
-
csum                    0xb6f3ccb1 [match]
bytenr                  65536
flags                   0x1
                        ( WRITTEN )
magic                   _BHRfS_M [match]
fsid                    7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f
label                   samba_fs
generation              770740
root                    16187774615552
sys_array_size          355
chunk_root_generation   755799
root_level              1
chunk_root              24331161698304
chunk_root_level        1
log_root                0
log_root_transid        0
log_root_level          0
total_bytes             2396231680
bytes_used              22205028102144
sectorsize              4096
nodesize                16384
leafsize                16384
stripesize              4096
root_dir                6
num_devices             1
compat_flags            0x0
compat_ro_flags         0x0
incompat_flags          0x169
                        ( MIXED_BACKREF |
                          COMPRESS_LZO |
                          BIG_METADATA |
                          EXTENDED_IREF |
                          SKINNY_METADATA )
csum_type               0
csum_size               4
cache_generation        770740
uuid_tree_generation    770740
dev_item.uuid           dd8c8d66-b6f5-48d8-9d5e-6f56b2ad4751
dev_item.fsid           7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f [match]
dev_item.type           0
dev_item.total_bytes    2396231680
dev_item.bytes_used     23274943676416
dev_item.io_align       4096
dev_item.io_width       4096
dev_item.sector_size    4096
dev_item.devid          1
dev_item.dev_group      0
dev_item.seek_speed     0
dev_item.bandwidth      0
dev_item.generation     0
superblock: bytenr=67108864, device=/dev/sdb
-
csum                    0x1692e47f [match]
bytenr                  67108864
flags                   0x1
                        ( WRITTEN )
magic                   _BHRfS_M [match]
fsid                    7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f
label                   samba_fs
generation              770740
root                    16187774615552
sys_array_size          355
chunk_root_generation   755799
root_level              1
chunk_root              24331161698304
chunk_root_level        1
log_root                0
log_root_transid        0
log_root_level          0
total_bytes

Re: help converting btrfs to new writeback error tracking?

2017-05-09 Thread Jeff Layton
On Mon, 2017-05-08 at 11:39 -0700, Liu Bo wrote:
> Hi Jeff,
> 
> On Fri, May 05, 2017 at 04:11:18PM -0400, Jeff Layton wrote:
> > On Fri, 2017-05-05 at 12:21 -0700, Liu Bo wrote:
> > > Hi Jeff,
> > > 
> > > On Thu, May 04, 2017 at 07:26:17AM -0400, Jeff Layton wrote:
> > > > I've been working on set of patches to clean up how writeback errors are
> > > > tracked and handled in the kernel:
> > > > 
> > > > http://marc.info/?l=linux-fsdevel=149304074111261=2
> > > > 
> > > > The basic idea is that rather than having a set of flags that are
> > > > cleared whenever they are checked, we have a sequence counter and error
> > > > that are tracked on a per-mapping basis, and can then use that sequence
> > > > counter to tell whether the error should be reported.
> > > > 
> > > > This changes the way that things like filemap_write_and_wait work.
> > > > Rather than having to ensure that AS_EIO/AS_ENOSPC are not cleared
> > > > inappropriately (and thus losing errors that should be reported), you
> > > > can now tell whether there has been a writeback error since a certain
> > > > point in time, irrespective of whether anyone else is checking for
> > > > errors.
> > > > 
> > > > I've been doing some conversions of the existing code to the new scheme,
> > > > but btrfs has _really_ complicated error handling. I think it could
> > > > probably be simplified with this new scheme, but I could use some help
> > > > here.
> > > > 
> > > > What I think we probably want to do is to sample the error sequence in
> > > > the mapping at well-defined points in time (probably when starting a
> > > > transaction?) and then use that to determine whether writeback errors
> > > > have occurred since then. Is there anyone in the btrfs community who
> > > > could help me here?
> > > > 
> > > 
> > > I went through the patch set and reviewed the btrfs part particular in
> > > [PATCH v3 14/20] fs: retrofit old error reporting API onto new 
> > > infrastructure
> > > 
> > > It looks good to me.
> > > 
> > > In btrfs ->writepage(), it sets PG_error whenever an error
> > > (-EIO/-ENOSPC/-ENOMEM) occurs and it sets mapping's error as well in
> > > end_extent_writepage().  And the special case is the compression code, 
> > > where it
> > > only sets mapping's error when there is any error during processing 
> > > compression
> > > bytes.
> > > 
> > > Similar to ext4, btrfs tracks the IO error by setting mapping's error in
> > > writepage_endio and other places (eg. compression code), and around 
> > > tree-log.c
> > > it's checking BTRFS_ORDERED_IOERR from ordered_extent->flags, which is 
> > > usually
> > > set in writepage_endio and sometimes in some error handling code where it
> > > couldn't call endio.
> > > 
> > > So the conversion in btrfs's fsync() seems to be good enough, did I miss
> > > anything?
> > > 
> > 
> > Many thanks for taking a look:
> > 
> > There are a number of calls in btrfs to filemap_fdatawait_range that
> > check the return code. That function will wait for writeback on all of
> > the pages in the mapping range and return an error if there has been
> > one. Note too that there are also some places that ignore the return
> > code.
> > 
> > These patches change how filemap_fdatawait_range (and some similar
> > functions) work. Before this set, you'd get an error if one had occurred
> > since anyone last checked it. Now, you only get an error there if one
> > occurred since you started waiting. If the failed writeback occurred
> > before that function was called, you won't get an error back.
> > 
> 
> Since all filemap_fdatawait_range() called in btrfs checked the return value, 
> it
> is supposed to catch any errors that are occured from 
> filemap_fdatawrite_range()
> which is called twice by btrfs_fdatawrite_range()[1], so with this set, it's
> possible to fail to detect errors if only calling filemap_fdatawait_range().
> 
> [1]: filemap_fdatawrite_range() needs to be called twice to make sure 
> compressed
> data is flushed.
> 
> > For fsync, it shouldn't matter. You'll get an error back there if one
> > occurred since the last fsync since you're setting it in the mapping.
> > The bigger question is whether other callers in this code do anything
> > with that e

Re: File system is oddly full after kernel upgrade, balance doesn't help

2017-05-08 Thread Andrew E. Mileski

On 2017-01-28 13:15, MegaBrutal wrote:

Hello,

Of course I can't retrieve the data from before the balance, but here
is the data from now:

root@vmhost:~# btrfs fi show /tmp/mnt/curlybrace
Label: 'curlybrace'  uuid: f471bfca-51c4-4e44-ac72-c6cd9ccaf535
 Total devices 1 FS bytes used 752.38MiB
 devid1 size 2.00GiB used 1.90GiB path
/dev/mapper/vmdata--vg-lxc--curlybrace

root@vmhost:~# btrfs fi df /tmp/mnt/curlybrace
Data, single: total=773.62MiB, used=714.82MiB
System, DUP: total=8.00MiB, used=16.00KiB
Metadata, DUP: total=577.50MiB, used=37.55MiB
GlobalReserve, single: total=512.00MiB, used=0.00B
root@vmhost:~# btrfs fi usage /tmp/mnt/curlybrace
Overall:
 Device size:   2.00GiB
 Device allocated:   1.90GiB
 Device unallocated: 103.38MiB
 Device missing: 0.00B
 Used: 789.94MiB
 Free (estimated): 162.18MiB(min: 110.50MiB)
 Data ratio:  1.00
 Metadata ratio:  2.00
 Global reserve: 512.00MiB(used: 0.00B)

Data,single: Size:773.62MiB, Used:714.82MiB
/dev/mapper/vmdata--vg-lxc--curlybrace 773.62MiB

Metadata,DUP: Size:577.50MiB, Used:37.55MiB
/dev/mapper/vmdata--vg-lxc--curlybrace   1.13GiB

System,DUP: Size:8.00MiB, Used:16.00KiB
/dev/mapper/vmdata--vg-lxc--curlybrace  16.00MiB

Unallocated:
/dev/mapper/vmdata--vg-lxc--curlybrace 103.38MiB


So... if I sum the data, metadata, and the global reserve, I see why
only ~170 MB is left. I have no idea, however, why the global reserve
sneaked up to 512 MB for such a small file system, and how I could
resolve this situation. Any ideas?
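
As a quick check of the numbers above (DUP profiles occupy twice their
nominal size on the device):

  allocated   = 773.62 (data) + 2 x 577.50 (metadata) + 2 x 8.00 (system)
              = 1944.62 MiB  ~= 1.90 GiB
  unallocated = 2048.00 - 1944.62 = 103.38 MiB
  free (est.) = 103.38 unallocated + (773.62 - 714.82) unused data space
              = 162.18 MiB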


MegaBrutal


Total amateur here just jumping in, so feel free to ignore me, but what 
caught my eye was the small device size.


I've had issues with BTRFS on small devices (4GiB & 8GiB), forcing me to 
use other filesystems on them (like EXT4, which has a smaller allocation 
size).  Issues being both ENOSPC and miscellaneous other strange errors 
(which may have been fixed by now).


My theory being that the 1GiB data and 256MiB metadata chunk sizes are 
significant on such small devices.


I don't know if there is an official recommended minimum device size, 
but keeping 4GiB or more free seems to work most of the time for my 
usage patterns.


~~AEM
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: help converting btrfs to new writeback error tracking?

2017-05-08 Thread Liu Bo
Hi Jeff,

On Fri, May 05, 2017 at 04:11:18PM -0400, Jeff Layton wrote:
> On Fri, 2017-05-05 at 12:21 -0700, Liu Bo wrote:
> > Hi Jeff,
> > 
> > On Thu, May 04, 2017 at 07:26:17AM -0400, Jeff Layton wrote:
> > > I've been working on set of patches to clean up how writeback errors are
> > > tracked and handled in the kernel:
> > > 
> > > http://marc.info/?l=linux-fsdevel=149304074111261=2
> > > 
> > > The basic idea is that rather than having a set of flags that are
> > > cleared whenever they are checked, we have a sequence counter and error
> > > that are tracked on a per-mapping basis, and can then use that sequence
> > > counter to tell whether the error should be reported.
> > > 
> > > This changes the way that things like filemap_write_and_wait work.
> > > Rather than having to ensure that AS_EIO/AS_ENOSPC are not cleared
> > > inappropriately (and thus losing errors that should be reported), you
> > > can now tell whether there has been a writeback error since a certain
> > > point in time, irrespective of whether anyone else is checking for
> > > errors.
> > > 
> > > I've been doing some conversions of the existing code to the new scheme,
> > > but btrfs has _really_ complicated error handling. I think it could
> > > probably be simplified with this new scheme, but I could use some help
> > > here.
> > > 
> > > What I think we probably want to do is to sample the error sequence in
> > > the mapping at well-defined points in time (probably when starting a
> > > transaction?) and then use that to determine whether writeback errors
> > > have occurred since then. Is there anyone in the btrfs community who
> > > could help me here?
> > > 
> > 
> > I went through the patch set and reviewed the btrfs part particular in
> > [PATCH v3 14/20] fs: retrofit old error reporting API onto new 
> > infrastructure
> > 
> > It looks good to me.
> > 
> > In btrfs ->writepage(), it sets PG_error whenever an error
> > (-EIO/-ENOSPC/-ENOMEM) occurs and it sets mapping's error as well in
> > end_extent_writepage().  And the special case is the compression code, 
> > where it
> > only sets mapping's error when there is any error during processing 
> > compression
> > bytes.
> > 
> > Similar to ext4, btrfs tracks the IO error by setting mapping's error in
> > writepage_endio and other places (eg. compression code), and around 
> > tree-log.c
> > it's checking BTRFS_ORDERED_IOERR from ordered_extent->flags, which is 
> > usually
> > set in writepage_endio and sometimes in some error handling code where it
> > couldn't call endio.
> > 
> > So the conversion in btrfs's fsync() seems to be good enough, did I miss
> > anything?
> > 
> 
> Many thanks for taking a look:
> 
> There are a number of calls in btrfs to filemap_fdatawait_range that
> check the return code. That function will wait for writeback on all of
> the pages in the mapping range and return an error if there has been
> one. Note too that there are also some places that ignore the return
> code.
> 
> These patches change how filemap_fdatawait_range (and some similar
> functions) work. Before this set, you'd get an error if one had occurred
> since anyone last checked it. Now, you only get an error there if one
> occurred since you started waiting. If the failed writeback occurred
> before that function was called, you won't get an error back.
> 

Since all filemap_fdatawait_range() calls in btrfs check the return value, they
are supposed to catch any errors that occurred in filemap_fdatawrite_range(),
which is called twice by btrfs_fdatawrite_range() [1]. So with this set, it's
possible to miss errors when only calling filemap_fdatawait_range().

[1]: filemap_fdatawrite_range() needs to be called twice to make sure compressed
data is flushed.

> For fsync, it shouldn't matter. You'll get an error back there if one
> occurred since the last fsync since you're setting it in the mapping.
> The bigger question is whether other callers in this code do anything
> with that error return.
> 
> If they do, then the next question is: from what point do you want to
> detect errors that have occurred?
> 
> What sort of makes sense to me (in a handwavy way) would be to sample
> the errseq_t in the mapping when you start a transaction, and then check
> vs. that for errors. Then, even if you have parallel transactions going
> on the same inode (is that even possible?) then you can tell whether
> they all succeded or not.
> 
> Thoughts?


Re: help converting btrfs to new writeback error tracking?

2017-05-05 Thread Jeff Layton
On Fri, 2017-05-05 at 12:21 -0700, Liu Bo wrote:
> Hi Jeff,
> 
> On Thu, May 04, 2017 at 07:26:17AM -0400, Jeff Layton wrote:
> > I've been working on set of patches to clean up how writeback errors are
> > tracked and handled in the kernel:
> > 
> > http://marc.info/?l=linux-fsdevel=149304074111261=2
> > 
> > The basic idea is that rather than having a set of flags that are
> > cleared whenever they are checked, we have a sequence counter and error
> > that are tracked on a per-mapping basis, and can then use that sequence
> > counter to tell whether the error should be reported.
> > 
> > This changes the way that things like filemap_write_and_wait work.
> > Rather than having to ensure that AS_EIO/AS_ENOSPC are not cleared
> > inappropriately (and thus losing errors that should be reported), you
> > can now tell whether there has been a writeback error since a certain
> > point in time, irrespective of whether anyone else is checking for
> > errors.
> > 
> > I've been doing some conversions of the existing code to the new scheme,
> > but btrfs has _really_ complicated error handling. I think it could
> > probably be simplified with this new scheme, but I could use some help
> > here.
> > 
> > What I think we probably want to do is to sample the error sequence in
> > the mapping at well-defined points in time (probably when starting a
> > transaction?) and then use that to determine whether writeback errors
> > have occurred since then. Is there anyone in the btrfs community who
> > could help me here?
> > 
> 
> I went through the patch set and reviewed the btrfs part particular in
> [PATCH v3 14/20] fs: retrofit old error reporting API onto new infrastructure
> 
> It looks good to me.
> 
> In btrfs ->writepage(), it sets PG_error whenever an error
> (-EIO/-ENOSPC/-ENOMEM) occurs and it sets mapping's error as well in
> end_extent_writepage().  And the special case is the compression code, where 
> it
> only sets mapping's error when there is any error during processing 
> compression
> bytes.
> 
> Similar to ext4, btrfs tracks the IO error by setting mapping's error in
> writepage_endio and other places (eg. compression code), and around tree-log.c
> it's checking BTRFS_ORDERED_IOERR from ordered_extent->flags, which is usually
> set in writepage_endio and sometimes in some error handling code where it
> couldn't call endio.
> 
> So the conversion in btrfs's fsync() seems to be good enough, did I miss
> anything?
> 

Many thanks for taking a look:

There are a number of calls in btrfs to filemap_fdatawait_range that
check the return code. That function will wait for writeback on all of
the pages in the mapping range and return an error if there has been
one. Note too that there are also some places that ignore the return
code.

These patches change how filemap_fdatawait_range (and some similar
functions) work. Before this set, you'd get an error if one had occurred
since anyone last checked it. Now, you only get an error there if one
occurred since you started waiting. If the failed writeback occurred
before that function was called, you won't get an error back.

For fsync, it shouldn't matter. You'll get an error back there if one
occurred since the last fsync since you're setting it in the mapping.
The bigger question is whether other callers in this code do anything
with that error return.

If they do, then the next question is: from what point do you want to
detect errors that have occurred?

What sort of makes sense to me (in a handwavy way) would be to sample
the errseq_t in the mapping when you start a transaction, and then check
vs. that for errors. Then, even if you have parallel transactions going
on the same inode (is that even possible?) then you can tell whether
they all succeeded or not.

Thoughts?
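
A rough sketch of that idea with stand-in types and names (nothing below is
the real btrfs or errseq_t API): each transaction remembers the error
sequence it sampled at start and compares against it at commit time, so two
transactions overlapping on the same inode each notice only writeback errors
that happened during their own lifetime.

#include <stdio.h>

struct mapping_errs { int err; unsigned seq; };   /* per-inode mapping state */
struct toy_trans    { unsigned since; };          /* per-transaction sample  */

static void trans_start(struct toy_trans *t, const struct mapping_errs *m)
{
	t->since = m->seq;                        /* sample at transaction start */
}

static int trans_commit_check(const struct toy_trans *t,
			      const struct mapping_errs *m)
{
	return (m->seq != t->since) ? m->err : 0; /* any error since we started? */
}

int main(void)
{
	struct mapping_errs mapping = { 0, 0 };
	struct toy_trans t1, t2;

	trans_start(&t1, &mapping);
	mapping.err = -5; mapping.seq++;          /* writeback fails while t1 runs */
	trans_start(&t2, &mapping);               /* t2 starts after the failure   */

	printf("t1 sees %d at commit\n", trans_commit_check(&t1, &mapping)); /* -5 */
	printf("t2 sees %d at commit\n", trans_commit_check(&t2, &mapping)); /*  0 */
	return 0;
}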
-- 
Jeff Layton <jlay...@redhat.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: help converting btrfs to new writeback error tracking?

2017-05-05 Thread Liu Bo
Hi Jeff,

On Thu, May 04, 2017 at 07:26:17AM -0400, Jeff Layton wrote:
> I've been working on set of patches to clean up how writeback errors are
> tracked and handled in the kernel:
> 
> http://marc.info/?l=linux-fsdevel=149304074111261=2
> 
> The basic idea is that rather than having a set of flags that are
> cleared whenever they are checked, we have a sequence counter and error
> that are tracked on a per-mapping basis, and can then use that sequence
> counter to tell whether the error should be reported.
> 
> This changes the way that things like filemap_write_and_wait work.
> Rather than having to ensure that AS_EIO/AS_ENOSPC are not cleared
> inappropriately (and thus losing errors that should be reported), you
> can now tell whether there has been a writeback error since a certain
> point in time, irrespective of whether anyone else is checking for
> errors.
> 
> I've been doing some conversions of the existing code to the new scheme,
> but btrfs has _really_ complicated error handling. I think it could
> probably be simplified with this new scheme, but I could use some help
> here.
> 
> What I think we probably want to do is to sample the error sequence in
> the mapping at well-defined points in time (probably when starting a
> transaction?) and then use that to determine whether writeback errors
> have occurred since then. Is there anyone in the btrfs community who
> could help me here?
>

I went through the patch set and reviewed the btrfs part, in particular
[PATCH v3 14/20] fs: retrofit old error reporting API onto new infrastructure

It looks good to me.

In btrfs ->writepage(), PG_error is set whenever an error
(-EIO/-ENOSPC/-ENOMEM) occurs, and the mapping's error is set as well in
end_extent_writepage().  The special case is the compression code, which only
sets the mapping's error when an error occurs while processing compressed
bytes.

Similar to ext4, btrfs tracks IO errors by setting the mapping's error in
writepage_endio and other places (e.g. the compression code), and around
tree-log.c it checks BTRFS_ORDERED_IOERR in ordered_extent->flags, which is
usually set in writepage_endio and sometimes in error handling code that
couldn't call endio.

So the conversion in btrfs's fsync() seems to be good enough, did I miss
anything?

Thanks,

-liubo
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


help converting btrfs to new writeback error tracking?

2017-05-04 Thread Jeff Layton
I've been working on set of patches to clean up how writeback errors are
tracked and handled in the kernel:

http://marc.info/?l=linux-fsdevel=149304074111261=2

The basic idea is that rather than having a set of flags that are
cleared whenever they are checked, we have a sequence counter and error
that are tracked on a per-mapping basis, and can then use that sequence
counter to tell whether the error should be reported.

This changes the way that things like filemap_write_and_wait work.
Rather than having to ensure that AS_EIO/AS_ENOSPC are not cleared
inappropriately (and thus losing errors that should be reported), you
can now tell whether there has been a writeback error since a certain
point in time, irrespective of whether anyone else is checking for
errors.
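
A simplified model of that scheme (the encoding and names below are
illustrative, not the kernel's errseq_t): the mapping keeps the latest error
plus a counter that bumps whenever a new error is recorded; callers remember
the counter value they last saw instead of clearing flags, so one caller
checking can no longer hide the error from another.

#include <stdio.h>

struct err_track {
	int      err;   /* most recent writeback error, 0 if none    */
	unsigned seq;   /* bumped every time a new error is recorded */
};

static void record_error(struct err_track *t, int err)
{
	t->err = err;
	t->seq++;
}

/* Callers keep the value returned here as their "since" point. */
static unsigned sample(const struct err_track *t)
{
	return t->seq;
}

/* Report an error only if one was recorded after the caller's sample;
 * nothing is cleared, so other callers still see it. */
static int check_since(const struct err_track *t, unsigned since)
{
	return (t->seq != since) ? t->err : 0;
}

int main(void)
{
	struct err_track mapping = { 0, 0 };

	unsigned a = sample(&mapping);   /* caller A (say, one fd doing fsync) */
	unsigned b = sample(&mapping);   /* caller B on the same mapping       */

	record_error(&mapping, -5);      /* writeback fails with -EIO          */

	printf("A sees %d\n", check_since(&mapping, a));   /* -5 */
	printf("B sees %d\n", check_since(&mapping, b));   /* -5, A's check did not consume it */
	return 0;
}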

I've been doing some conversions of the existing code to the new scheme,
but btrfs has _really_ complicated error handling. I think it could
probably be simplified with this new scheme, but I could use some help
here.

What I think we probably want to do is to sample the error sequence in
the mapping at well-defined points in time (probably when starting a
transaction?) and then use that to determine whether writeback errors
have occurred since then. Is there anyone in the btrfs community who
could help me here?

Thanks,
-- 
Jeff Layton <jlay...@redhat.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/9] btrfs-progs: help: Unbind short help description from btrfs

2017-04-16 Thread Qu Wenruo
usage_command_group_short() always binds its description to 'btrfs',
making us unable to use this function in other progs.

This patch makes the short description independent, so callers need to
pass the short description themselves.

Signed-off-by: Qu Wenruo <quwen...@cn.fujitsu.com>
---
 btrfs.c | 12 +++-
 help.c  | 14 +++---
 help.h  |  3 ++-
 3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/btrfs.c b/btrfs.c
index f096e780..b3686c4b 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -24,6 +24,15 @@
 #include "utils.h"
 #include "help.h"
 
+static const char * const btrfs_short_desc[] = {
+   "For an overview of a given command use 'btrfs command --help'",
+   "or 'btrfs [command...] --help --full' to print all available options.",
+   "Any command name can be shortened as far as it stays unambiguous,",
+   "however it is recommended to use full command names in scripts.",
+   "All command groups have their manual page named 'btrfs-'.",
+   NULL
+};
+
 static const char * const btrfs_cmd_group_usage[] = {
"btrfs [--help] [--version]  [...]  []",
NULL
@@ -126,7 +135,8 @@ int main(int argc, char **argv)
if (!prefixcmp(argv[0], "--"))
argv[0] += 2;
} else {
-   usage_command_group_short(&btrfs_cmd_group);
+   usage_command_group_short(&btrfs_cmd_group,
+                             btrfs_short_desc);
exit(1);
}
}
diff --git a/help.c b/help.c
index 19b0d357..13c45ffd 100644
--- a/help.c
+++ b/help.c
@@ -262,7 +262,8 @@ static void usage_command_group_internal(const struct cmd_group *grp, int full,
}
 }
 
-void usage_command_group_short(const struct cmd_group *grp)
+void usage_command_group_short(const struct cmd_group *grp,
+  const char * const *short_desc)
 {
const char * const *usagestr = grp->usagestr;
FILE *outf = stdout;
@@ -298,12 +299,11 @@ void usage_command_group_short(const struct cmd_group *grp)
fprintf(outf, "  %-16s  %s\n", cmd->token, cmd->usagestr[1]);
    }
 
-   fputc('\n', outf);
-   fprintf(stderr, "For an overview of a given command use 'btrfs command 
--help'\n");
-   fprintf(stderr, "or 'btrfs [command...] --help --full' to print all 
available options.\n");
-   fprintf(stderr, "Any command name can be shortened as far as it stays 
unambiguous,\n");
-   fprintf(stderr, "however it is recommended to use full command names in 
scripts.\n");
-   fprintf(stderr, "All command groups have their manual page named 
'btrfs-'.\n");
+   if (short_desc) {
+   fputc('\n', outf);
+   while (*short_desc && **short_desc)
+   fprintf(outf, "%s\n", *short_desc++);
+   }
 }
 
 void usage_command_group(const struct cmd_group *grp, int full, int err)
diff --git a/help.h b/help.h
index 7458e745..9b190fb1 100644
--- a/help.h
+++ b/help.h
@@ -58,7 +58,8 @@ struct cmd_group;
 void usage(const char * const *usagestr) __attribute__((noreturn));
 void usage_command(const struct cmd_struct *cmd, int full, int err);
 void usage_command_group(const struct cmd_group *grp, int all, int err);
-void usage_command_group_short(const struct cmd_group *grp);
+void usage_command_group_short(const struct cmd_group *grp,
+  const char * const *short_desc);
 
void help_unknown_token(const char *arg, const struct cmd_group *grp) __attribute__((noreturn));
void help_ambiguous_token(const char *arg, const struct cmd_group *grp) __attribute__((noreturn));
-- 
2.12.2



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"

2017-04-05 Thread Robert Krig


On 04.04.2017 18:55, Chris Murphy wrote:
> On Tue, Apr 4, 2017 at 10:52 AM, Chris Murphy <li...@colorremedies.com> wrote:
>
>
>> Mounting -o ro,degraded is probably permitted by the file system, but
>> chunks of the file system and certainly your data, will be missing. So
>> it's just a matter of time before copying data off will fail.
> ** Context here is, more than 1 device missing.
>

Thanks you guys for all your help and input.

I've ordered two new drives to backup all my data. I have a cloud backup
in place, but 13TB takes a while to upload :-)
I think I'm gonna abandon btrfs as the main fs for my home server. I'm
just gonna set up a separate LVM volume for storing snapshots and
backups, since I use btrfs on all my single disk machines.
Thanks again everyone.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"

2017-04-04 Thread Chris Murphy
On Mon, Apr 3, 2017 at 10:02 PM, Robert Krig
 wrote:
>
>
> On 03.04.2017 16:25, Robert Krig wrote:
>>
>> I'm gonna run a extensive memory check once I get home, since you
>> mentioned corrupt memory might be an issue here.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
> I ran a memtest over a couple of hours with no errors. Ram seems to be
> fine so far.

Inconclusive. A memtest can take days to expose a problem, and even
that's not conclusive. The list archive has some examples of where
memory testers gave RAM a pass, but doing things like compiling the
kernel would fail.


>
> I've looked at the link you provided. Frankly it looks very scary. (At
> least to me it does)
> But I've just thought of something else.
>
> My storage array is BTRFS Raid1 with 4x8TB Drives.
> Wouldn't it be possible to simply disconnect two of those drives, mount
> with -o degraded and still have access (even if read-only) to all my data?

man mkfs.btrfs

Btrfs raid1 supports only one device missing, no matter how many drives.

Mounting -o ro,degraded is probably permitted by the file system, but
chunks of the file system and certainly your data, will be missing. So
it's just a matter of time before copying data off will fail.

I suggest trying -o ro with all drives, not a degraded mount, and
copying data off. Any failures should be logged. Metadata errors are
logged without paths, whereas data corruption includes the path to the affected
affected file. This is easier than scraping the file system with btrfs
restore.

If you can't mount ro with all drives, or ro,degraded with just one
device missing, you'll need to use btrfs restore which is more
tolerant of missing metadata.


-- 
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"

2017-04-04 Thread Austin S. Hemmelgarn

On 2017-04-04 09:29, Brian B wrote:

On 04/04/2017 12:02 AM, Robert Krig wrote:

My storage array is BTRFS Raid1 with 4x8TB Drives.
Wouldn't it be possible to simply disconnect two of those drives, mount
with -o degraded and still have access (even if read-only) to all my data?

Just jumping on this point: my understanding of BTRFS "RAID1" is that
each file (block?) is randomly assigned to two disks of the array (no
matter how many disks are in the array).  So if you remove two disks,
you will probably have files that were "assigned" to both of those
disks, and will be missing.

In short, you can't remove more than one disk of a BTRFS RAID1 and still
have all of your data.

That understanding is correct.  From a functional perspective, BTRFS 
raid1 is currently a RAID10 implementation with striping happening at a 
very large granularity.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"

2017-04-04 Thread Hugo Mills
On Tue, Apr 04, 2017 at 09:29:11AM -0400, Brian B wrote:
> On 04/04/2017 12:02 AM, Robert Krig wrote:
> > My storage array is BTRFS Raid1 with 4x8TB Drives.
> > Wouldn't it be possible to simply disconnect two of those drives, mount
> > with -o degraded and still have access (even if read-only) to all my data?
> Just jumping on this point: my understanding of BTRFS "RAID1" is that
> each file (block?) is randomly assigned to two disks of the array (no

   Arbitrarily assigned, rather than randomly assigned (there is a
deterministic algorithm for it, but it's wise not to rely on the exact
behaviour of that algorithm, because there are a number of factors
that can alter its behaviour).

> matter how many disks are in the array).  So if you remove two disks,
> you will probably have files that were "assigned" to both of those
> disks, and will be missing.
> 
> In short, you can't remove more than one disk of a BTRFS RAID1 and still
> have all of your data.

   Indeed.

   Hugo.

-- 
Hugo Mills | Some days, it's just not worth gnawing through the
hugo@... carfax.org.uk | straps
http://carfax.org.uk/  |
PGP: E2AB1DE4  |


signature.asc
Description: Digital signature


Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"

2017-04-04 Thread Brian B
On 04/04/2017 12:02 AM, Robert Krig wrote:
> My storage array is BTRFS Raid1 with 4x8TB Drives.
> Wouldn't it be possible to simply disconnect two of those drives, mount
> with -o degraded and still have access (even if read-only) to all my data?
Just jumping on this point: my understanding of BTRFS "RAID1" is that
each file (block?) is randomly assigned to two disks of the array (no
matter how many disks are in the array).  So if you remove two disks,
you will probably have files that were "assigned" to both of those
disks, and will be missing.

In short, you can't remove more than one disk of a BTRFS RAID1 and still
have all of your data.
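
A toy illustration of that point: with two copies per chunk spread over four
devices, chunks end up on each of the six possible device pairs sooner or
later, so pulling any two devices leaves the chunks on exactly that pair with
no surviving copy. (The pair enumeration below is purely illustrative; the
real allocator is driven by free space, not round-robin.)

#include <stdio.h>

int main(void)
{
	enum { NDEV = 4 };
	int removed[NDEV] = { 0, 0, 1, 1 };   /* pretend we disconnect devices 2 and 3 */
	int pairs = 0, dead = 0;

	/* each unordered pair of devices stands in for the chunks stored on it */
	for (int a = 0; a < NDEV; a++) {
		for (int b = a + 1; b < NDEV; b++) {
			pairs++;
			if (removed[a] && removed[b])
				dead++;   /* both copies of these chunks are gone */
		}
	}
	printf("%d of %d device pairs lose both copies\n", dead, pairs);   /* 1 of 6 */
	return 0;
}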



signature.asc
Description: OpenPGP digital signature


Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"

2017-04-03 Thread Robert Krig


On 03.04.2017 16:25, Robert Krig wrote:
>
> I'm gonna run a extensive memory check once I get home, since you
> mentioned corrupt memory might be an issue here.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


I ran a memtest over a couple of hours with no errors. Ram seems to be
fine so far.

I've looked at the link you provided. Frankly it looks very scary. (At
least to me it does)
But I've just thought of something else.

My storage array is BTRFS Raid1 with 4x8TB Drives.
Wouldn't it be possible to simply disconnect two of those drives, mount
with -o degraded and still have access (even if read-only) to all my data?
E.g. I could use the two removed drives as a backup and rebuild my array
from there, since I'm kind of playing with the idea of turning it into an
MD RAID5 and only using btrfs on specific LVM volumes which need it.

The one thing that slightly worries me with this idea is, I don't know
if there is a way to tell which datablocks are on which drives. If I've
understood btrfs raid1 correctly it simply ensures that there is at
least a copy of each block on a different device.

Would my idea work? Or could it be that I can only safely remove one
drive, since the other drives might contain blocks from any of the other
drives?
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"

2017-04-03 Thread Hans van Kranenburg
On 04/03/2017 04:20 PM, Robert Krig wrote:
> 
> 
> On 03.04.2017 16:08, Hans van Kranenburg wrote:
>> On 04/03/2017 12:11 PM, Robert Krig wrote:
>> The corruption is at item 157. Can you attach all of the output, or
>> pastebin it?
>>
> 
> I've attached the entire log of btrfs-debug-tree. This was generated
> with btrfs-progs 4.7.3

Meuh,

item 156 key (23416298414080 EXTENT_ITEM 4096) itemoff 8643 itemsize 53
item 157 key (23416298418176 EXTENT_ITEM 4096) itemoff 8590 itemsize 53

8590 + 53 = 8643.

I don't get what's invalid about that.

"incorrect offsets 8590 1258314415"

	if (btrfs_item_offset_nr(buf, i) !=
	    btrfs_item_end_nr(buf, i + 1)) {
		ret = BTRFS_TREE_BLOCK_INVALID_OFFSETS;
		fprintf(stderr, "incorrect offsets %u %u\n",
			btrfs_item_offset_nr(buf, i),
			btrfs_item_end_nr(buf, i + 1));
		goto fail;
	}

Ah, ok, so the corruption is in item 158, but it's reported as
corruption in item 157.
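
A toy model of the invariant that check enforces (field names simplified from
struct btrfs_item): items are packed right to left in the leaf, so item i has
to start exactly where item i+1 ends, i.e. offset[i] == offset[i+1] + size[i+1].
With the numbers from the dump above:

#include <stdio.h>
#include <stdint.h>

struct toy_item { uint32_t offset; uint32_t size; };

int main(void)
{
	struct toy_item items[] = {
		{ 8643, 53 },                 /* item 156                              */
		{ 8590, 53 },                 /* item 157: 8590 + 53 == 8643, fine     */
		{ 1258314415 - 53, 53 },      /* item 158: a bogus offset, as reported */
	};
	int n = sizeof(items) / sizeof(items[0]);

	for (int i = 0; i + 1 < n; i++) {
		uint32_t end_next = items[i + 1].offset + items[i + 1].size;
		if (items[i].offset != end_next)
			printf("incorrect offsets %u %u (flagged while looking at item %d)\n",
			       items[i].offset, end_next, 156 + i);
	}
	return 0;
}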

There's no really simple tool right now to fix this manually. We can
also try to dd 16kiB of metadata from disk, fix it, and write it back.
We've been doing that before, it's a bit of work, but it can succeed.
Here's more instructions:

https://www.spinics.net/lists/linux-btrfs/msg62459.html

So, if you're the adventurous type...

But then again, if this is really memory failure, there might be other
errors all around the fs, which you didn't hit while reading back the
data yet.

Also note that btrfs does not protect you against this, not even for file
data that gets corrupted in memory before it's written out (the write-out
path is where checksumming happens).

> If it makes a difference, I can try it again with the newest version of
> btrfs-progs?

No, that code hasn't been touched in over 5 years.

-- 
Hans van Kranenburg
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"

2017-04-03 Thread Robert Krig


On 03.04.2017 16:20, Robert Krig wrote:
>
> On 03.04.2017 16:08, Hans van Kranenburg wrote:
>> On 04/03/2017 12:11 PM, Robert Krig wrote:
>> The corruption is at item 157. Can you attach all of the output, or
>> pastebin it?
>>
>
> I've attached the entire log of btrfs-debug-tree. This was generated
> with btrfs-progs 4.7.3
>
> If it makes a difference, I can try it again with the newest version of
> btrfs-progs?


I forgot to mention that btrfs-debug-tree also segfaults with a "memory
access error"

I'm gonna run an extensive memory check once I get home, since you
mentioned corrupt memory might be an issue here.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"

2017-04-03 Thread Robert Krig


On 03.04.2017 16:08, Hans van Kranenburg wrote:
> On 04/03/2017 12:11 PM, Robert Krig wrote:
> The corruption is at item 157. Can you attach all of the output, or
> pastebin it?
>


I've attached the entire log of btrfs-debug-tree. This was generated
with btrfs-progs 4.7.3

If it makes a difference, I can try it again with the newest version of
btrfs-progs?
btrfs-progs v4.7.3
leaf 38666170826752 items 199 free space 1506 generation 1248226 owner 2
fs uuid 8c4f8e26-3442-463f-ad8a-668dfef02593
chunk uuid 1f04f64e-0ec8-4b39-83d9-a2df75179d3e
item 0 key (23416295448576 EXTENT_ITEM 36864) itemoff 16230 itemsize 53
extent refs 1 gen 671397 flags DATA
extent data backref root 5 objectid 4959957 offset 0 count 1
item 1 key (23416295485440 EXTENT_ITEM 8192) itemoff 16177 itemsize 53
extent refs 1 gen 972749 flags DATA
extent data backref root 5 objectid 7328099 offset 0 count 1
item 2 key (23416295493632 EXTENT_ITEM 12288) itemoff 16124 itemsize 53
extent refs 1 gen 797708 flags DATA
extent data backref root 5 objectid 5842103 offset 1966080 
count 1
item 3 key (23416295505920 EXTENT_ITEM 8192) itemoff 16071 itemsize 53
extent refs 1 gen 1244513 flags DATA
extent data backref root 44107 objectid 28528 offset 974848 
count 1
item 4 key (23416295514112 EXTENT_ITEM 8192) itemoff 16034 itemsize 37
extent refs 1 gen 625327 flags DATA
shared data backref parent 38666872045568 count 1
item 5 key (23416295522304 EXTENT_ITEM 16384) itemoff 15997 itemsize 37
extent refs 1 gen 625327 flags DATA
shared data backref parent 38666872045568 count 1
item 6 key (23416295538688 EXTENT_ITEM 49152) itemoff 15944 itemsize 53
extent refs 1 gen 585321 flags DATA
extent data backref root 5 objectid 4742401 offset 393216 count 
1
item 7 key (23416295587840 EXTENT_ITEM 8192) itemoff 15907 itemsize 37
extent refs 1 gen 625327 flags DATA
shared data backref parent 38666872045568 count 1
item 8 key (23416295596032 EXTENT_ITEM 4096) itemoff 15854 itemsize 53
extent refs 1 gen 625327 flags DATA
extent data backref root 5 objectid 1123021 offset 6029312 
count 1
item 9 key (23416295600128 EXTENT_ITEM 4096) itemoff 15801 itemsize 53
extent refs 1 gen 975337 flags DATA
extent data backref root 5 objectid 7334929 offset 0 count 1
item 10 key (23416295604224 EXTENT_ITEM 57344) itemoff 15748 itemsize 53
extent refs 1 gen 572974 flags DATA
extent data backref root 5 objectid 4430156 offset 0 count 1
item 11 key (23416295661568 EXTENT_ITEM 106496) itemoff 15695 itemsize 
53
extent refs 1 gen 585319 flags DATA
extent data backref root 5 objectid 4742398 offset 2490368 
count 1
item 12 key (23416295768064 EXTENT_ITEM 4096) itemoff 15642 itemsize 53
extent refs 1 gen 795227 flags DATA
extent data backref root 5 objectid 5769382 offset 12288 count 1
item 13 key (23416295772160 EXTENT_ITEM 4096) itemoff 15589 itemsize 53
extent refs 1 gen 795227 flags DATA
extent data backref root 5 objectid 5769383 offset 4096 count 1
item 14 key (23416295776256 EXTENT_ITEM 4096) itemoff 15536 itemsize 53
extent refs 1 gen 585370 flags DATA
extent data backref root 5 objectid 4742594 offset 1310720 
count 1
item 15 key (23416295780352 EXTENT_ITEM 8192) itemoff 15499 itemsize 37
extent refs 1 gen 625327 flags DATA
shared data backref parent 32477101621248 count 1
item 16 key (23416295788544 EXTENT_ITEM 151552) itemoff 15446 itemsize 
53
extent refs 1 gen 992062 flags DATA
extent data backref root 5 objectid 7458028 offset 0 count 1
item 17 key (23416295940096 EXTENT_ITEM 4096) itemoff 15393 itemsize 53
extent refs 1 gen 1027477 flags DATA
extent data backref root 5 objectid 7508879 offset 4096 count 1
item 18 key (23416295944192 EXTENT_ITEM 4096) itemoff 15340 itemsize 53
extent refs 1 gen 1023977 flags DATA
extent data backref root 5 objectid 7496365 offset 20480 count 1
item 19 key (23416295948288 EXTENT_ITEM 36864) itemoff 15287 itemsize 53
extent refs 1 gen 516177 flags DATA
extent data backref root 5 objectid 3897818 offset 12976128 
count 1
item 20 key (23416295985152 EXTENT_ITEM 45056) itemoff 15234 itemsize 53
extent refs 1 gen 444976 flags DATA
extent data backref root 5 objectid 3591929 offset 12320768 
count 1
item 21 key 

Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"

2017-04-03 Thread Hans van Kranenburg
On 04/03/2017 03:50 PM, Robert Krig wrote:
> 
> 
> On 03.04.2017 12:11, Robert Krig wrote:
>> Hi guys, I seem to have run into a spot of trouble with my btrfs partition.
>>
>> I've got 4 x 8TB in a RAID1 BTRFS configuration.
>>
>> I'm running Debian Jessie 64 Bit, 4.9.0-0.bpo.2-amd64 kernel. Btrfs
>> progs version v4.7.3
>>
>> Server has 8GB of Ram.
>>
>>
>> I was running duperemove using a hashfile, which seemed to have run out
>> space and aborted. Then I tried a balance operation, with -dusage
>> progressively set to 0 1 5 15 30 50, which then aborted, I presume that
>> this caused the fs to mount readonly. I only noticed it somewhat later.
>>
>> I've since rebooted, and I can mount the filesystem OK, but after some
>> time (I presume caused by reads or writes) it once again switches to
>> readonly.
>>
>> I tried unmounting/remounting again and running a scrub, but the scrub
>> aborts after some time.
>>
>>
> 
> 
> I've compiled the newest btrfs-tools version 4.10.2
> 
> This is what I get when running a btrfsck -p /dev/sda
> 
> Checking filesystem on /dev/sda
> UUID: 8c4f8e26-3442-463f-ad8a-668dfef02593
> incorrect offsets 8590 1258314415
> bad block 38666170826752
> ERROR: errors found in extent allocation tree or chunk allocation
> Speicherzugriffsfehler
> 
> For the non-german speakers: Speicherzugriffsfehler = Memory Access Error
> 
> Dmesg shows this:
> 
> Apr 03 15:47:05 atlas kernel: btrfs[9140]: segfault at 9476b99e ip
> 0044c459 sp 7fff556b4b10 error 4 in
> btrfs[40+9d000]

That's probably because the tool does not verify if the numbers in the
fields make sense before using them.


-- 
Hans van Kranenburg
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"

2017-04-03 Thread Hans van Kranenburg
On 04/03/2017 12:11 PM, Robert Krig wrote:
> Hi guys, I seem to have run into a spot of trouble with my btrfs partition.
> 
> I've got 4 x 8TB in a RAID1 BTRFS configuration.
> 
> I'm running Debian Jessie 64 Bit, 4.9.0-0.bpo.2-amd64 kernel. Btrfs
> progs version v4.7.3
> 
> Server has 8GB of Ram.
> 
> 
> I was running duperemove using a hashfile, which seemed to have run out
> space and aborted. Then I tried a balance operation, with -dusage
> progressively set to 0 1 5 15 30 50, which then aborted, I presume that
> this caused the fs to mount readonly. I only noticed it somewhat later.

The balance probably did not cause the issue, but it ran across the
invalid metadata page while digging around in the filesystem, and then
choked on it.

> I've since rebooted, and I can mount the filesystem OK, but after some
> time (I presume caused by reads or writes) it once again switches to
> readonly.
> 
> I tried unmounting/remounting again and running a scrub, but the scrub
> aborts after some time.
> 
> 
> Here is the output from the kernel when the partition crashes:
> 
> Apr 03 11:32:57 atlas kernel: BTRFS info (device sda): The free space
> cache file (37732863967232) is invalid. skip it
> Apr 03 11:33:46 atlas kernel: BTRFS critical (device sda): corrupt leaf,
> slot offset bad: block=38666170826752, root=1, slot=157
> [...]

Note: The root=1 is a lie? Looking at the output of btrfs-debug-tree
below, this is definitely a tree block of tree 2, not 1. I have seen
this more often, but not looked at the code yet. Maybe some bug in
assembling the error message?

> I tried running a btrfs-debug-tree -b 38666170826752 /dev/sda
> 
> btrfs-progs
> v4.7.3
> 
> 
> leaf 38666170826752 items 199 free space 1506 generation 1248226 owner
> 2 
>  
> 
> fs uuid
> 8c4f8e26-3442-463f-ad8a-668dfef02593  
> 
> 
> chunk uuid
> 1f04f64e-0ec8-4b39-83d9-a2df75179d3e  
>  
> 
> item 0 key (23416295448576 EXTENT_ITEM 36864) itemoff 16230
> itemsize
> 53   
> 
> extent refs 1 gen 671397 flags
> DATA  
>  
> 
> extent data backref root 5 objectid 4959957 offset 0
> count
> 1 
>  
> 
> [...]

The corruption is at item 157. Can you attach all of the output, or
pastebin it?

> this goes on and on.  I can provide the entire output if thats helpful.

Yes. The corruption is in item 157, starting at the itemoff value. This is
the offset of the item data in the metadata page.
See https://btrfs.wiki.kernel.org/index.php/On-disk_Format#Leaf_Node

> Any ideas on what I could do to fix the partition? Is it fixable, or is
> it a lost cause?

Memory corruption, not on disk corruption.

So, either a bitflip, or garbage which ended up on this memory location
for whatever reason or a bug in whatever part of the kernel, a pointer
in another module gone wonky, etc, which we might learn more about after
seeing more of the output.


-- 
Hans van Kranenburg
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"

2017-04-03 Thread Robert Krig


On 03.04.2017 12:11, Robert Krig wrote:
> Hi guys, I seem to have run into a spot of trouble with my btrfs partition.
>
> I've got 4 x 8TB in a RAID1 BTRFS configuration.
>
> I'm running Debian Jessie 64 Bit, 4.9.0-0.bpo.2-amd64 kernel. Btrfs
> progs version v4.7.3
>
> Server has 8GB of Ram.
>
>
> I was running duperemove using a hashfile, which seemed to have run out
> space and aborted. Then I tried a balance operation, with -dusage
> progressively set to 0 1 5 15 30 50, which then aborted, I presume that
> this caused the fs to mount readonly. I only noticed it somewhat later.
>
> I've since rebooted, and I can mount the filesystem OK, but after some
> time (I presume caused by reads or writes) it once again switches to
> readonly.
>
> I tried unmounting/remounting again and running a scrub, but the scrub
> aborts after some time.
>
>


I've compiled the newest btrfs-tools version 4.10.2

This is what I get when running a btrfsck -p /dev/sda

Checking filesystem on /dev/sda
UUID: 8c4f8e26-3442-463f-ad8a-668dfef02593
incorrect offsets 8590 1258314415
bad block 38666170826752
ERROR: errors found in extent allocation tree or chunk allocation
Speicherzugriffsfehler

For the non-german speakers: Speicherzugriffsfehler = Memory Access Error

Dmesg shows this:

Apr 03 15:47:05 atlas kernel: btrfs[9140]: segfault at 9476b99e ip
0044c459 sp 7fff556b4b10 error 4 in
btrfs[40+9d000]



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Need some help: "BTRFS critical (device sda): corrupt leaf, slot offset bad: block"

2017-04-03 Thread Robert Krig
Hi guys, I seem to have run into a spot of trouble with my btrfs partition.

I've got 4 x 8TB in a RAID1 BTRFS configuration.

I'm running Debian Jessie 64 Bit, 4.9.0-0.bpo.2-amd64 kernel. Btrfs
progs version v4.7.3

Server has 8GB of Ram.


I was running duperemove using a hashfile, which seemed to have run out
space and aborted. Then I tried a balance operation, with -dusage
progressively set to 0 1 5 15 30 50, which then aborted, I presume that
this caused the fs to mount readonly. I only noticed it somewhat later.

I've since rebooted, and I can mount the filesystem OK, but after some
time (I presume caused by reads or writes) it once again switches to
readonly.

I tried unmounting/remounting again and running a scrub, but the scrub
aborts after some time.


Here is the output from the kernel when the partition crashes:

Apr 03 11:32:57 atlas kernel: BTRFS info (device sda): The free space
cache file (37732863967232) is invalid. skip it
Apr 03 11:33:46 atlas kernel: BTRFS critical (device sda): corrupt leaf,
slot offset bad: block=38666170826752, root=1, slot=157
Apr 03 11:33:46 atlas kernel: [ cut here ]
Apr 03 11:33:46 atlas kernel: WARNING: CPU: 0 PID: 17810 at
/home/zumbi/linux-4.9.13/fs/btrfs/extent-tree.c:6961
__btrfs_free_extent.isra.69+0x152/0xd60 [b
Apr 03 11:33:46 atlas kernel: BTRFS: Transaction aborted (error -5)
Apr 03 11:33:46 atlas kernel: Modules linked in: xt_multiport
iptable_filter ip_tables x_tables binfmt_misc cpufreq_userspace
cpufreq_conservative cpufreq_
Apr 03 11:33:46 atlas kernel:  ppdev lp parport autofs4 btrfs xor
raid6_pq dm_mod md_mod fuse sg sd_mod ahci libahci libata crc32c_intel
scsi_mod fan therm
Apr 03 11:33:46 atlas kernel: CPU: 0 PID: 17810 Comm: mc Not tainted
4.9.0-0.bpo.2-amd64 #1 Debian 4.9.13-1~bpo8+1
Apr 03 11:33:46 atlas kernel: Hardware name: ASUS All Series/H87M-E,
BIOS 0703 10/30/2013
Apr 03 11:33:46 atlas kernel:   97d29cd5
b8ab4bb53a50 
Apr 03 11:33:46 atlas kernel:  97a778a4 154c080b2000
b8ab4bb53aa8 8908ad438b40
Apr 03 11:33:46 atlas kernel:  890951b96000 
89086c3d4000 97a7791f
Apr 03 11:33:46 atlas kernel: Call Trace:
Apr 03 11:33:46 atlas kernel:  [] ? dump_stack+0x5c/0x77
Apr 03 11:33:46 atlas kernel:  [] ? __warn+0xc4/0xe0
Apr 03 11:33:46 atlas kernel:  [] ?
warn_slowpath_fmt+0x5f/0x80
Apr 03 11:33:46 atlas kernel:  [] ?
__btrfs_free_extent.isra.69+0x152/0xd60 [btrfs]
Apr 03 11:33:46 atlas kernel:  [] ?
__btrfs_run_delayed_refs+0x466/0x1360 [btrfs]
Apr 03 11:33:46 atlas kernel:  [] ?
set_extent_buffer_dirty+0x64/0xb0 [btrfs]
Apr 03 11:33:46 atlas kernel:  [] ?
btrfs_run_delayed_refs+0x8f/0x2b0 [btrfs]
Apr 03 11:33:46 atlas kernel:  [] ?
btrfs_should_end_transaction+0x3f/0x60 [btrfs]
Apr 03 11:33:46 atlas kernel:  [] ?
btrfs_truncate_inode_items+0x63a/0xde0 [btrfs]
Apr 03 11:33:46 atlas kernel:  [] ?
btrfs_evict_inode+0x4a2/0x5f0 [btrfs]
Apr 03 11:33:46 atlas kernel:  [] ? evict+0xb6/0x180
Apr 03 11:33:46 atlas kernel:  [] ?
do_unlinkat+0x148/0x300
Apr 03 11:33:46 atlas kernel:  [] ?
system_call_fast_compare_end+0xc/0x9b
Apr 03 11:33:46 atlas kernel: ---[ end trace 2a45c2819ff7b785 ]---
Apr 03 11:33:46 atlas kernel: BTRFS: error (device sda) in
__btrfs_free_extent:6961: errno=-5 IO failure
Apr 03 11:33:46 atlas kernel: BTRFS info (device sda): forced readonly
Apr 03 11:33:46 atlas kernel: BTRFS: error (device sda) in
btrfs_run_delayed_refs:2967: errno=-5 IO failure
Apr 03 11:33:50 atlas kernel: BTRFS warning (device sda): failed setting
block group ro, ret=-30
Apr 03 11:33:50 atlas kernel: BTRFS warning (device sda): failed setting
block group ro, ret=-30
Apr 03 11:33:52 atlas kernel: BTRFS warning (device sda): failed setting
block group ro, ret=-30
Apr 03 11:33:53 atlas kernel: BTRFS warning (device sda): Skipping
commit of aborted transaction.
Apr 03 11:33:53 atlas kernel: BTRFS: error (device sda) in
cleanup_transaction:1850: errno=-5 IO failure
Apr 03 11:33:53 atlas kernel: BTRFS info (device sda): delayed_refs has
NO entry
Apr 03 11:33:54 atlas kernel: BTRFS warning (device sda): failed setting
block group ro, ret=-30



I tried running a btrfs-debug-tree -b 38666170826752 /dev/sda

btrfs-progs
v4.7.3  
  

leaf 38666170826752 items 199 free space 1506 generation 1248226 owner
2   
   

fs uuid
8c4f8e26-3442-463f-ad8a-668dfef02593
  

chunk uuid
1f04f64e-0ec8-4b39-83d9-a2df75179d3e
   

item 0 key (23416295448576 EXTENT_ITEM 36864) itemoff 16230
itemsize
53 

Re: help : "bad tree block start" -> btrfs forced readonly

2017-03-17 Thread Lionel Bouton
Hi,

some news from the coal mine...

On 17/03/2017 at 11:03, Lionel Bouton wrote:
> [...]
> I'm considering trying to use a 4 week old snapshot of the device to
> find out if it was corrupted or not instead. It will still be a pain if
> it works but rsync for less than a month of data is at least an order of
> magnitude faster than a full restore.

btrfs check -p /dev/sdb is running on this 4 week old snapshot. The
extents check passed without any error, it is currently checking the
free space (and it's just done while I was writing this and is doing fs
roots).

I'm not sure of the list of checks it performs. I assume the free
space^H... fs roots can't be much longer than the rest (on a ~13TB of
20TB used filesystem with ~ 10 million files and half a dozen subvolumes).
It took less than an hour to check extents. I'll give it another hour
and stop it if it's not done: it's already passing stages that the live
data couldn't get to.

I may be wrong, but I suspect Ceph is innocent of any wrongdoing here:
I think there's a high probability that if Ceph could corrupt its data
in our configuration, the snapshot would have been corrupted too (most of
its data is shared with the live data). I wonder if QEMU or the VM
kernel managed to transform IO timeouts (which clearly happened below
Ceph and were passed to the VM in many instances) into garbage reads
which ended in garbage writes. If it wasn't QEMU and it happened in the
kernel, this was with 4.1.15, so it might be a since-corrected kernel bug
in either the block or fs layers. I'm not especially ecstatic at the
prospect of testing this behavior again, but I will automate more Ceph
snapshots in the future (and the VM is now on 4.9.6).

Best regards,

Lionel
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: help : "bad tree block start" -> btrfs forced readonly

2017-03-17 Thread Lionel Bouton
On 17/03/2017 at 10:51, Roman Mamedov wrote:
> On Fri, 17 Mar 2017 10:27:11 +0100
> Lionel Bouton <lionel-subscript...@bouton.name> wrote:
>
>> Hi,
>>
>>> On 17/03/2017 at 09:43, Hans van Kranenburg wrote:
>>> btrfs-debug-tree -b 3415463870464
>> Here is what it gives me back :
>>
>> btrfs-debug-tree -b 3415463870464 /dev/sdb
>> btrfs-progs v4.6.1
>> checksum verify failed on 3415463870464 found A85405B7 wanted 01010101
>> checksum verify failed on 3415463870464 found A85405B7 wanted 01010101
>> bytenr mismatch, want=3415463870464, have=72340172838076673
>> ERROR: failed to read 3415463870464
>>
>> Is there a way to remove part of the tree and keep the rest ? It could
>> help minimize the time needed to restore data.
> If you are able to experiment with writable snapshots, you could try using
> "btrfs-corrupt-block" to kill the bad block, and see what btrfsck makes out of
> the rest. In a similar case I got little to no damage to the overall FS.
> http://www.spinics.net/lists/linux-btrfs/msg53061.html
>
I've launched btrfs check in read-only mode :

btrfs check -p /dev/sdb
Checking filesystem on /dev/sdb
UUID: dbbde1f0-d8a0-4c7c-a7b8-17237e98e525
checksum verify failed on 3415463755776 found A85405B7 wanted 01010101
checksum verify failed on 3415463755776 found A85405B7 wanted 01010101
bytenr mismatch, want=3415463755776, have=72340172838076673
checksum verify failed on 3415464001536 found A85405B7 wanted 01010101
checksum verify failed on 3415464001536 found A85405B7 wanted 01010101
bytenr mismatch, want=3415464001536, have=72340172838076673
checksum verify failed on 3415464640512 found A85405B7 wanted 01010101
checksum verify failed on 3415464640512 found A85405B7 wanted 01010101
bytenr mismatch, want=3415464640512, have=72340172838076673

This goes on for pages... I probably missed some output and then there
are lots of errors like this one :

ref mismatch on [3415470456832 16384] extent item 1, found 0
Backref 3415470456832 root 3420 not referenced back 0x268013d0
Incorrect global backref count on 3415470456832 found 1 wanted 0
backpointer mismatch on [3415470456832 16384]
owner ref check failed [3415470456832 16384]

...

Followed by lots of this :

ref mismatch on [11010388205568 278528] extent item 1, found 0
checksum verify failed on 3415464869888 found A85405B7 wanted 01010101
checksum verify failed on 3415464869888 found A85405B7 wanted 01010101
bytenr mismatch, want=3415464869888, have=72340172838076673
Incorrect local backref count on 11010388205568 root 257 owner 7487206
offset 0 found 0 wanted 1 back 0x72335670
Backref disk bytenr does not match extent record, bytenr=11010388205568,
ref bytenr=0
backpointer mismatch on [11010388205568 278528]
owner ref check failed [11010388205568 278528]

...

I stopped there : am I correct in thinking that it will take ages to try
to salvage this without any guarantee that I'll get a substantial amount
of the 10 million files on this filesystem ?

I'm considering trying to use a 4 week old snapshot of the device to
find out if it was corrupted or not instead. It will still be a pain if
it works but rsync for less than a month of data is at least an order of
magnitude faster than a full restore.

Lionel


Re: help : "bad tree block start" -> btrfs forced readonly

2017-03-17 Thread Roman Mamedov
On Fri, 17 Mar 2017 10:27:11 +0100
Lionel Bouton <lionel-subscript...@bouton.name> wrote:

> Hi,
> 
> Le 17/03/2017 à 09:43, Hans van Kranenburg a écrit :
> > btrfs-debug-tree -b 3415463870464
> 
> Here is what it gives me back :
> 
> btrfs-debug-tree -b 3415463870464 /dev/sdb
> btrfs-progs v4.6.1
> checksum verify failed on 3415463870464 found A85405B7 wanted 01010101
> checksum verify failed on 3415463870464 found A85405B7 wanted 01010101
> bytenr mismatch, want=3415463870464, have=72340172838076673
> ERROR: failed to read 3415463870464
> 
> Is there a way to remove part of the tree and keep the rest ? It could
> help minimize the time needed to restore data.

If you are able to experiment with writable snapshots, you could try using
"btrfs-corrupt-block" to kill the bad block, and see what btrfsck makes out of
the rest. In a similar case I got little to no damage to the overall FS.
http://www.spinics.net/lists/linux-btrfs/msg53061.html

-- 
With respect,
Roman
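
For anyone wanting to try that on a similar setup, here is a rough sketch of
the "experiment on a writable copy" idea, assuming the device is a Ceph RBD
image (the pool, image and snapshot names below are made up for illustration):

    # snapshot the image backing the VM disk, then clone the snapshot read-write
    rbd snap create rbd/fileserver-data@before-repair
    rbd snap protect rbd/fileserver-data@before-repair
    rbd clone rbd/fileserver-data@before-repair rbd/fileserver-scratch
    DEV=$(rbd map rbd/fileserver-scratch)

    # poke at the clone only; the original image and its snapshot stay untouched
    btrfs check -p "$DEV"

Whatever btrfsck or btrfs-corrupt-block then does to the clone can simply be
thrown away afterwards with "rbd unmap" and "rbd rm".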


Re: help : "bad tree block start" -> btrfs forced readonly

2017-03-17 Thread Hans van Kranenburg
On 03/17/2017 10:27 AM, Lionel Bouton wrote:
> Hi,
> 
> Le 17/03/2017 à 09:43, Hans van Kranenburg a écrit :
>> btrfs-debug-tree -b 3415463870464
> 
> Here is what it gives me back :
> 
> btrfs-debug-tree -b 3415463870464 /dev/sdb
> btrfs-progs v4.6.1
> checksum verify failed on 3415463870464 found A85405B7 wanted 01010101
> checksum verify failed on 3415463870464 found A85405B7 wanted 01010101
> bytenr mismatch, want=3415463870464, have=72340172838076673
> ERROR: failed to read 3415463870464

So in the place where the checksum is supposed to be stored, it has
01010101, and recomputing the checksum of the garbage results in A85405B7.
Found / wanted is also confusing here, since 01010101 is what it found on
disk, while A85405B7 is what it 'found out' by recomputing.

> Is there a way to remove part of the tree and keep the rest ? It could
> help minimize the time needed to restore data.

No, that's not how it works. Those trees are not file/directory
structure trees.

You can try btrfs-debug-tree <device> and see how far it gets
dumping everything it can find, and then search for 3415463870464 in the
output. Somewhere, there has to be another object (one level higher)
which points to this address. If you find it, you can find out in which
tree the block lives.

-- 
Hans van Kranenburg
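
For example, the dump-and-search Hans describes could look like this (the dump
file name is arbitrary):

    btrfs-debug-tree /dev/sdb > /tmp/tree-dump.txt 2>&1
    # the node one level up references the bad block by its logical address
    grep -n -B1 3415463870464 /tmp/tree-dump.txt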


Re: help : "bad tree block start" -> btrfs forced readonly

2017-03-17 Thread Lionel Bouton
Hi,

Le 17/03/2017 à 09:43, Hans van Kranenburg a écrit :
> btrfs-debug-tree -b 3415463870464

Here is what it gives me back :

btrfs-debug-tree -b 3415463870464 /dev/sdb
btrfs-progs v4.6.1
checksum verify failed on 3415463870464 found A85405B7 wanted 01010101
checksum verify failed on 3415463870464 found A85405B7 wanted 01010101
bytenr mismatch, want=3415463870464, have=72340172838076673
ERROR: failed to read 3415463870464

Is there a way to remove part of the tree and keep the rest ? It could
help minimize the time needed to restore data.

Lionel


Re: help : "bad tree block start" -> btrfs forced readonly

2017-03-17 Thread Hans van Kranenburg
On 03/17/2017 09:11 AM, Lionel Bouton wrote:
> Le 17/03/2017 à 05:32, Lionel Bouton a écrit :
>> Hi,
>>
>> [...]
>> I'll catch some sleep right now (it's 5:28 AM here) but I'll be able to
>> work on this in 3 or 4 hours.
> 
> I woke up to this :
> 
> Mar 17 06:56:30 fileserver kernel: btree_readpage_end_io_hook: 104476
> callbacks suppressed
> Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
> block start 72340172838076673 3415463870464
> Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
> block start 72340172838076673 3415463870464
> Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
> block start 72340172838076673 3415463870464
> Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
> block start 72340172838076673 3415463870464
> Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
> block start 72340172838076673 3415463870464
> Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
> block start 72340172838076673 3415463870464
> Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
> block start 72340172838076673 3415463870464
> Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
> block start 72340172838076673 3415463870464
> Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
> block start 72340172838076673 3415463870464
> Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
> block start 72340172838076673 3415463870464

The error is about a page of metadata (tree block) that is damaged or
has been lost.

Your btrfs is reading the metadata page at location 3415463870464
(virtual address space). Inside the page, the address is stored again as
a method of verification.

The error means that it expected to see metadata items that live in a
block at position 3415463870464 in your filesystem virtual address
space, but instead it encounters data in which the bytes at the location
where that address should be stored translate back to 72340172838076673.

I needed to look at the kernel source code to figure this out, the error
is not very descriptive.

    found_start = btrfs_header_bytenr(eb);
    if (found_start != eb->start) {
            btrfs_err_rl(fs_info, "bad tree block start %llu %llu",
                         found_start, eb->start);
            ret = -EIO;
            goto err;
    }

> and the server was unusable.

The impact depends heavily on what part of the metadata it is, which
tree it's from, how much tree is hidden behind it etc.

You can try btrfs-debug-tree -b 3415463870464 <device> to see if it
outputs any readable information. If this was a metadata page, it would
have at least a corrupted bytenr field, otherwise it's likely not
something in the btrfs metadata format.

> I just moved the client to a read-only backup server and we are trying
> to find out if we can salvage this or if we start the full restore
> procedure.

-- 
Hans van Kranenburg
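
As an aside on that bogus number: 72340172838076673 is 0x0101010101010101,
i.e. the 8-byte bytenr field read back as eight 0x01 bytes, which matches the
"wanted 01010101" in the checksum errors, so the block looks like it was
overwritten with a repeating 0x01 pattern rather than with random garbage.
The conversion is easy to verify from a shell:

    printf '0x%x\n' 72340172838076673
    # prints 0x101010101010101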


Re: help : "bad tree block start" -> btrfs forced readonly

2017-03-17 Thread Lionel Bouton
Le 17/03/2017 à 05:32, Lionel Bouton a écrit :
> Hi,
>
> [...]
> I'll catch some sleep right now (it's 5:28 AM here) but I'll be able to
> work on this in 3 or 4 hours.

I woke up to this :

Mar 17 06:56:30 fileserver kernel: btree_readpage_end_io_hook: 104476
callbacks suppressed
Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
block start 72340172838076673 3415463870464
Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
block start 72340172838076673 3415463870464
Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
block start 72340172838076673 3415463870464
Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
block start 72340172838076673 3415463870464
Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
block start 72340172838076673 3415463870464
Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
block start 72340172838076673 3415463870464
Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
block start 72340172838076673 3415463870464
Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
block start 72340172838076673 3415463870464
Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
block start 72340172838076673 3415463870464
Mar 17 06:56:30 fileserver kernel: BTRFS error (device sdb): bad tree
block start 72340172838076673 3415463870464

and the server was unusable.

I just moved the client to a read-only backup server and we are trying
to find out if we can salvage this or if we start the full restore
procedure.

Help ?

Lionel


help : "bad tree block start" -> btrfs forced readonly

2017-03-16 Thread Lionel Bouton
 fileserver kernel:  [] ? kthread+0xbc/0xe0
Mar 16 23:30:20 fileserver kernel:  [] ?
kthread_create_on_node+0x180/0x180
Mar 16 23:30:20 fileserver kernel:  [] ?
ret_from_fork+0x42/0x70
Mar 16 23:30:20 fileserver kernel:  [] ?
kthread_create_on_node+0x180/0x180
Mar 16 23:30:20 fileserver kernel: ---[ end trace f03445c45d440372 ]---
Mar 16 23:30:20 fileserver kernel: BTRFS: error (device sdb) in
__btrfs_run_delayed_items:1188: errno=-5 IO failure
Mar 16 23:30:20 fileserver kernel: BTRFS info (device sdb): forced readonly
Mar 16 23:30:20 fileserver kernel: BTRFS warning (device sdb): Skipping
commit of aborted transaction.
Mar 16 23:30:20 fileserver kernel: BTRFS: error (device sdb) in
cleanup_transaction:1692: errno=-5 IO failure
Mar 16 23:30:22 fileserver kernel: BTRFS (device sdb): bad tree block
start 72340172838076673 3415463870464
Mar 16 23:30:22 fileserver kernel: BTRFS (device sdb): bad tree block
start 72340172838076673 3415463870464

I removed the failing disk from the cluster and rebooted the server. The
filesystem mounted fine but some time later I got these :

Mar 17 03:49:48 fileserver kernel: BTRFS (device sdb): bad tree block
start 72340172838076673 3415464230912
Mar 17 03:49:48 fileserver kernel: BTRFS (device sdb): bad tree block
start 72340172838076673 3415464230912
Mar 17 03:49:48 fileserver kernel: BTRFS (device sdb): bad tree block
start 72340172838076673 3415464230912

The filesystem didn't remount readonly this time but I installed a new
kernel (4.9.6 with the r1 Gentoo patchset instead of 4.1.15-r1) and
rebooted again. I have a snapshot of the full device at the time of each
reboot if it can help (I can relatively easily make rw copies and work
on them without affecting the ro snapshots) and an earlier one from 4
weeks ago.

Can someone please help me determine if I can save this filesystem and
how ? I suspect there isn't much damage in quantity (there were only a
handful of damaged sectors before the disk was removed). I'm just not
sure how I can check if the internal BTRFS structures are still sound
and won't create a snowball effect destroying much more.

It is still being used in this state in production and I'm trying to
avoid a painful switch to a remote, slow snapshot from yesterday while
beginning a very long recovery from scratch (this is at least a 2-week
procedure, maybe more).
I'll catch some sleep right now (it's 5:28 AM here) but I'll be able to
work on this in 3 or 4 hours.

Best regards,

Lionel


Re: help!!! error when mount a btrfs file system

2017-03-16 Thread Qu Wenruo



At 03/17/2017 01:36 AM, Liu Bo wrote:

On Thu, Mar 16, 2017 at 08:23:05PM +0800, 李云甫 wrote:

hi, buddy

   I have a file server with btrfs file system, it's work well for several 
months.

but after last system reboot, the /dev/sdb become not mountable.

below is the details.   is there any advise?


##Version info
Fedora 25 Server
Kernel 4.9.13-201.fc25.x86_64
btrfs-progs v4.6.1

#error messages when mount
mount: wrong fs type, bad option, bad superblock on /dev/sdb,
missing codepage or helper program, or other error

In some cases useful info is found in syslog - try
dmesg | tail or so.

##dmesg |tail
[79570.756871] BTRFS error (device sdb): parent transid verify failed on 
21413888 wanted 755660 found 623605
[79570.762307] BTRFS error (device sdb): bad tree block start 0 21413888
[79570.762345] BTRFS error (device sdb): failed to read chunk tree: -5
[79570.778129] BTRFS error (device sdb): open_ctree failed
[79589.743772] BTRFS error (device sdb): support for check_integrity* not 
compiled in!
[79589.803176] BTRFS error (device sdb): open_ctree failed



Looks like one node of the chunk tree was zeroed by something. Were you
using -o discard or running fstrim before the reboot?

Thanks,

-liubo


##btrfsck
parent transid verify failed on 21413888 wanted 755660 found 623605
parent transid verify failed on 21413888 wanted 755660 found 623605
checksum verify failed on 21413888 found E4E3BDB6 wanted 
parent transid verify failed on 21413888 wanted 755660 found 623605
Ignoring transid failure
checksum verify failed on 21331968 found E4E3BDB6 wanted 
checksum verify failed on 21331968 found E4E3BDB6 wanted 
checksum verify failed on 21692416 found E4E3BDB6 wanted 
checksum verify failed on 21692416 found E4E3BDB6 wanted 
checksum verify failed on 22888448 found E4E3BDB6 wanted 
checksum verify failed on 22888448 found E4E3BDB6 wanted 
checksum verify failed on 22888448 found E4E3BDB6 wanted 
checksum verify failed on 22888448 found E4E3BDB6 wanted 


I'm afraid that not just one but two chunk tree blocks are zeroed.

You still have a small chance of recovering the chunk tree by using the
backup chunk roots, though.


Would you please paste the output of "btrfs-show-super -f /dev/sdb"?

Thanks,
Qu
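
(The backup roots Qu mentions are printed by the -f flag; a rough sketch of
how they could be used, assuming the btrfs-progs build supports pointing check
at an alternative chunk root, and with the bytenr below being purely
hypothetical:)

    btrfs-show-super -f /dev/sdb | grep backup_chunk_root
    # suppose one of the backups lists backup_chunk_root 24331161681920
    btrfs check --chunk-root 24331161681920 /dev/sdb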


bytenr mismatch, want=22888448, have=0
Couldn't read chunk tree
Couldn't open file system

##btrfs-find-root
parent transid verify failed on 21413888 wanted 755660 found 623605
parent transid verify failed on 21413888 wanted 755660 found 623605
parent transid verify failed on 21413888 wanted 755660 found 623605
Ignoring transid failure
Couldn't read chunk tree
ERROR: open ctree failed

##btrfs-show-super -a /dev/sdb
superblock: bytenr=65536, device=/dev/sdb
-
csum0xb6f3ccb1 [match]
bytenr  65536
flags   0x1
( WRITTEN )
magic   _BHRfS_M [match]
fsid7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f
label   samba_fs
generation  770740
root16187774615552
sys_array_size  355
chunk_root_generation   755799
root_level  1
chunk_root  24331161698304
chunk_root_level1
log_root0
log_root_transid0
log_root_level  0
total_bytes 2396231680
bytes_used  22205028102144
sectorsize  4096
nodesize16384
leafsize16384
stripesize  4096
root_dir6
num_devices 1
compat_flags0x0
compat_ro_flags 0x0
incompat_flags  0x169
( MIXED_BACKREF |
COMPRESS_LZO |
BIG_METADATA |
EXTENDED_IREF |
SKINNY_METADATA )
csum_type   0
csum_size   4
cache_generation770740
uuid_tree_generation770740
dev_item.uuid   dd8c8d66-b6f5-48d8-9d5e-6f56b2ad4751
dev_item.fsid   7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f [match]
dev_item.type   0
dev_item.total_bytes2396231680
dev_item.bytes_used 23274943676416
dev_item.io_align   4096
dev_item.io_width   4096
dev_item.sector_size4096
dev_item.devid  1
dev_item.dev_group  0
dev_item.seek_speed 0
dev_item.bandwidth  0
dev_item.generation 0

superblock: bytenr=67108864, device=/dev/sdb
-
csum0x1692e47f [match]
bytenr  67108864
flags   0x1
( WRITTEN )
magic   _BHRfS_M [match]
fsid7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f
label   samba_fs
generation  770740
root16187774615552
sys_array_size  355
chunk_root_generation   755799
root_level  1
chunk_root  24331161698304
chunk_root_level1
log_root0
log_root_transid0
log_root_level

Re: help!!! error when mount a btrfs file system

2017-03-16 Thread Liu Bo
On Thu, Mar 16, 2017 at 08:23:05PM +0800, 李云甫 wrote:
> hi, buddy
> 
>I have a file server with btrfs file system, it's work well for several 
> months.
> 
> but after last system reboot, the /dev/sdb become not mountable.
> 
> below is the details.   is there any advise?
> 
> 
> ##Version info
> Fedora 25 Server
> Kernel 4.9.13-201.fc25.x86_64
> btrfs-progs v4.6.1
> 
> #error messages when mount
> mount: wrong fs type, bad option, bad superblock on /dev/sdb,
> missing codepage or helper program, or other error
> 
> In some cases useful info is found in syslog - try
> dmesg | tail or so.
> 
> ##dmesg |tail
> [79570.756871] BTRFS error (device sdb): parent transid verify failed on 
> 21413888 wanted 755660 found 623605
> [79570.762307] BTRFS error (device sdb): bad tree block start 0 21413888
> [79570.762345] BTRFS error (device sdb): failed to read chunk tree: -5
> [79570.778129] BTRFS error (device sdb): open_ctree failed
> [79589.743772] BTRFS error (device sdb): support for check_integrity* not 
> compiled in!
> [79589.803176] BTRFS error (device sdb): open_ctree failed
>

Looks like one node of the chunk tree was zeroed by something. Were you
using -o discard or running fstrim before the reboot?

Thanks,

-liubo
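
(Whether discard was in effect is usually visible from the mount options and,
on systemd systems, from the fstrim timer; the mount point below is only
illustrative:)

    findmnt -no OPTIONS /mnt/samba | tr ',' '\n' | grep -i discard
    systemctl status fstrim.timer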

> ##btrfsck 
> parent transid verify failed on 21413888 wanted 755660 found 623605
> parent transid verify failed on 21413888 wanted 755660 found 623605
> checksum verify failed on 21413888 found E4E3BDB6 wanted 
> parent transid verify failed on 21413888 wanted 755660 found 623605
> Ignoring transid failure
> checksum verify failed on 21331968 found E4E3BDB6 wanted 
> checksum verify failed on 21331968 found E4E3BDB6 wanted 
> checksum verify failed on 21692416 found E4E3BDB6 wanted 
> checksum verify failed on 21692416 found E4E3BDB6 wanted 
> checksum verify failed on 22888448 found E4E3BDB6 wanted 
> checksum verify failed on 22888448 found E4E3BDB6 wanted 
> checksum verify failed on 22888448 found E4E3BDB6 wanted 
> checksum verify failed on 22888448 found E4E3BDB6 wanted 
> bytenr mismatch, want=22888448, have=0
> Couldn't read chunk tree
> Couldn't open file system
> 
> ##btrfs-find-root
> parent transid verify failed on 21413888 wanted 755660 found 623605
> parent transid verify failed on 21413888 wanted 755660 found 623605
> parent transid verify failed on 21413888 wanted 755660 found 623605
> Ignoring transid failure
> Couldn't read chunk tree
> ERROR: open ctree failed
> 
> ##btrfs-show-super -a /dev/sdb 
> superblock: bytenr=65536, device=/dev/sdb
> -
> csum  0xb6f3ccb1 [match]
> bytenr65536
> flags 0x1
> ( WRITTEN )
> magic _BHRfS_M [match]
> fsid  7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f
> label samba_fs
> generation770740
> root  16187774615552
> sys_array_size355
> chunk_root_generation 755799
> root_level1
> chunk_root24331161698304
> chunk_root_level  1
> log_root  0
> log_root_transid  0
> log_root_level0
> total_bytes   2396231680
> bytes_used22205028102144
> sectorsize4096
> nodesize  16384
> leafsize  16384
> stripesize4096
> root_dir  6
> num_devices   1
> compat_flags  0x0
> compat_ro_flags   0x0
> incompat_flags0x169
> ( MIXED_BACKREF |
> COMPRESS_LZO |
> BIG_METADATA |
> EXTENDED_IREF |
> SKINNY_METADATA )
> csum_type 0
> csum_size 4
> cache_generation  770740
> uuid_tree_generation  770740
> dev_item.uuid dd8c8d66-b6f5-48d8-9d5e-6f56b2ad4751
> dev_item.fsid 7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f [match]
> dev_item.type 0
> dev_item.total_bytes  2396231680
> dev_item.bytes_used   23274943676416
> dev_item.io_align 4096
> dev_item.io_width 4096
> dev_item.sector_size  4096
> dev_item.devid1
> dev_item.dev_group0
> dev_item.seek_speed   0
> dev_item.bandwidth0
> dev_item.generation   0
> 
> superblock: bytenr=67108864, device=/dev/sdb
> -
> csum  0x1692e47f [match]
> bytenr67108864
> flags 0x1
> ( WRITTEN )
> magic _BHRfS_M [match]
> fsid  7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f
> label samba_fs
> generation770740
> root  16187774615552
> sys_array_size355
> chunk_root_generation 755799
> root_level1
> chunk_root24331161698304
> chunk_root_level  1
> log_root  0
> log_root_transid  0
> log_root_level0
> total_bytes   2396231680
> bytes_used

help!!! error when mount a btrfs file system

2017-03-16 Thread 李云甫
hi, buddy

   I have a file server with btrfs file system, it's work well for several 
months.

but after last system reboot, the /dev/sdb become not mountable.

below is the details.   is there any advise?


##Version info
Fedora 25 Server
Kernel 4.9.13-201.fc25.x86_64
btrfs-progs v4.6.1

#error messages when mount
mount: wrong fs type, bad option, bad superblock on /dev/sdb,
missing codepage or helper program, or other error

In some cases useful info is found in syslog - try
dmesg | tail or so.

##dmesg |tail
[79570.756871] BTRFS error (device sdb): parent transid verify failed on 
21413888 wanted 755660 found 623605
[79570.762307] BTRFS error (device sdb): bad tree block start 0 21413888
[79570.762345] BTRFS error (device sdb): failed to read chunk tree: -5
[79570.778129] BTRFS error (device sdb): open_ctree failed
[79589.743772] BTRFS error (device sdb): support for check_integrity* not 
compiled in!
[79589.803176] BTRFS error (device sdb): open_ctree failed

##btrfsck 
parent transid verify failed on 21413888 wanted 755660 found 623605
parent transid verify failed on 21413888 wanted 755660 found 623605
checksum verify failed on 21413888 found E4E3BDB6 wanted 
parent transid verify failed on 21413888 wanted 755660 found 623605
Ignoring transid failure
checksum verify failed on 21331968 found E4E3BDB6 wanted 
checksum verify failed on 21331968 found E4E3BDB6 wanted 
checksum verify failed on 21692416 found E4E3BDB6 wanted 
checksum verify failed on 21692416 found E4E3BDB6 wanted 
checksum verify failed on 22888448 found E4E3BDB6 wanted 
checksum verify failed on 22888448 found E4E3BDB6 wanted 
checksum verify failed on 22888448 found E4E3BDB6 wanted 
checksum verify failed on 22888448 found E4E3BDB6 wanted 
bytenr mismatch, want=22888448, have=0
Couldn't read chunk tree
Couldn't open file system

##btrfs-find-root
parent transid verify failed on 21413888 wanted 755660 found 623605
parent transid verify failed on 21413888 wanted 755660 found 623605
parent transid verify failed on 21413888 wanted 755660 found 623605
Ignoring transid failure
Couldn't read chunk tree
ERROR: open ctree failed

##btrfs-show-super -a /dev/sdb 
superblock: bytenr=65536, device=/dev/sdb
-
csum0xb6f3ccb1 [match]
bytenr  65536
flags   0x1
( WRITTEN )
magic   _BHRfS_M [match]
fsid7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f
label   samba_fs
generation  770740
root16187774615552
sys_array_size  355
chunk_root_generation   755799
root_level  1
chunk_root  24331161698304
chunk_root_level1
log_root0
log_root_transid0
log_root_level  0
total_bytes 2396231680
bytes_used  22205028102144
sectorsize  4096
nodesize16384
leafsize16384
stripesize  4096
root_dir6
num_devices 1
compat_flags0x0
compat_ro_flags 0x0
incompat_flags  0x169
( MIXED_BACKREF |
COMPRESS_LZO |
BIG_METADATA |
EXTENDED_IREF |
SKINNY_METADATA )
csum_type   0
csum_size   4
cache_generation770740
uuid_tree_generation770740
dev_item.uuid   dd8c8d66-b6f5-48d8-9d5e-6f56b2ad4751
dev_item.fsid   7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f [match]
dev_item.type   0
dev_item.total_bytes2396231680
dev_item.bytes_used 23274943676416
dev_item.io_align   4096
dev_item.io_width   4096
dev_item.sector_size4096
dev_item.devid  1
dev_item.dev_group  0
dev_item.seek_speed 0
dev_item.bandwidth  0
dev_item.generation 0

superblock: bytenr=67108864, device=/dev/sdb
-
csum0x1692e47f [match]
bytenr  67108864
flags   0x1
( WRITTEN )
magic   _BHRfS_M [match]
fsid7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f
label   samba_fs
generation  770740
root16187774615552
sys_array_size  355
chunk_root_generation   755799
root_level  1
chunk_root  24331161698304
chunk_root_level1
log_root0
log_root_transid0
log_root_level  0
total_bytes 2396231680
bytes_used  22205028102144
sectorsize  4096
nodesize16384
leafsize16384
stripesize  4096
root_dir6
num_devices 1
compat_flags0x0
compat_ro_flags 0x0
incompat_flags  0x169
( MIXED_BACKREF |
COMPRESS_LZO |
BIG_METADATA |
EXTENDED_IREF |
SKINNY_METADATA )
csum_type   0
csum_size  

help!!! error when mount a btrfs file system

2017-03-16 Thread 李云甫
hi, buddy

   I have a file server with btrfs file system, it's work well for several 
months.

but after last system reboot, the /dev/sdb become not mountable.

below is the details.   is there any advise?


##Version info
Fedora 25 Server
Kernel 4.9.13-201.fc25.x86_64
btrfs-progs v4.6.1

#error messages when mount
mount: wrong fs type, bad option, bad superblock on /dev/sdb,
missing codepage or helper program, or other error

In some cases useful info is found in syslog - try
dmesg | tail or so.

##dmesg |tail
[79570.756871] BTRFS error (device sdb): parent transid verify failed on 
21413888 wanted 755660 found 623605
[79570.762307] BTRFS error (device sdb): bad tree block start 0 21413888
[79570.762345] BTRFS error (device sdb): failed to read chunk tree: -5
[79570.778129] BTRFS error (device sdb): open_ctree failed
[79589.743772] BTRFS error (device sdb): support for check_integrity* not 
compiled in!
[79589.803176] BTRFS error (device sdb): open_ctree failed

##btrfsck 
parent transid verify failed on 21413888 wanted 755660 found 623605
parent transid verify failed on 21413888 wanted 755660 found 623605
checksum verify failed on 21413888 found E4E3BDB6 wanted 
parent transid verify failed on 21413888 wanted 755660 found 623605
Ignoring transid failure
checksum verify failed on 21331968 found E4E3BDB6 wanted 
checksum verify failed on 21331968 found E4E3BDB6 wanted 
checksum verify failed on 21692416 found E4E3BDB6 wanted 
checksum verify failed on 21692416 found E4E3BDB6 wanted 
checksum verify failed on 22888448 found E4E3BDB6 wanted 
checksum verify failed on 22888448 found E4E3BDB6 wanted 
checksum verify failed on 22888448 found E4E3BDB6 wanted 
checksum verify failed on 22888448 found E4E3BDB6 wanted 
bytenr mismatch, want=22888448, have=0
Couldn't read chunk tree
Couldn't open file system

##btrfs-find-root
parent transid verify failed on 21413888 wanted 755660 found 623605
parent transid verify failed on 21413888 wanted 755660 found 623605
parent transid verify failed on 21413888 wanted 755660 found 623605
Ignoring transid failure
Couldn't read chunk tree
ERROR: open ctree failed

##btrfs-show-super -a /dev/sdb 
superblock: bytenr=65536, device=/dev/sdb
-
csum 0xb6f3ccb1 [match]
bytenr 65536
flags 0x1
( WRITTEN )
magic _BHRfS_M [match]
fsid 7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f
label samba_fs
generation 770740
root 16187774615552
sys_array_size 355
chunk_root_generation 755799
root_level 1
chunk_root 24331161698304
chunk_root_level 1
log_root 0
log_root_transid 0
log_root_level 0
total_bytes 2396231680
bytes_used 22205028102144
sectorsize 4096
nodesize 16384
leafsize 16384
stripesize 4096
root_dir 6
num_devices 1
compat_flags 0x0
compat_ro_flags 0x0
incompat_flags 0x169
( MIXED_BACKREF |
COMPRESS_LZO |
BIG_METADATA |
EXTENDED_IREF |
SKINNY_METADATA )
csum_type 0
csum_size 4
cache_generation 770740
uuid_tree_generation 770740
dev_item.uuid dd8c8d66-b6f5-48d8-9d5e-6f56b2ad4751
dev_item.fsid 7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f [match]
dev_item.type 0
dev_item.total_bytes 2396231680
dev_item.bytes_used 23274943676416
dev_item.io_align 4096
dev_item.io_width 4096
dev_item.sector_size 4096
dev_item.devid 1
dev_item.dev_group 0
dev_item.seek_speed 0
dev_item.bandwidth 0
dev_item.generation 0

superblock: bytenr=67108864, device=/dev/sdb
-
csum 0x1692e47f [match]
bytenr 67108864
flags 0x1
( WRITTEN )
magic _BHRfS_M [match]
fsid 7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f
label samba_fs
generation 770740
root 16187774615552
sys_array_size 355
chunk_root_generation 755799
root_level 1
chunk_root 24331161698304
chunk_root_level 1
log_root 0
log_root_transid 0
log_root_level 0
total_bytes 2396231680
bytes_used 22205028102144
sectorsize 4096
nodesize 16384
leafsize 16384
stripesize 4096
root_dir 6
num_devices 1
compat_flags 0x0
compat_ro_flags 0x0
incompat_flags 0x169
( MIXED_BACKREF |
COMPRESS_LZO |
BIG_METADATA |
EXTENDED_IREF |
SKINNY_METADATA )
csum_type 0
csum_size 4
cache_generation 770740
uuid_tree_generation 770740
dev_item.uuid dd8c8d66-b6f5-48d8-9d5e-6f56b2ad4751
dev_item.fsid 7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f [match]
dev_item.type 0
dev_item.total_bytes 2396231680
dev_item.bytes_used 23274943676416
dev_item.io_align 4096
dev_item.io_width 4096
dev_item.sector_size 4096
dev_item.devid 1
dev_item.dev_group 0
dev_item.seek_speed 0
dev_item.bandwidth 0
dev_item.generation 0

superblock: bytenr=274877906944, device=/dev/sdb
-
csum 0xeb15b24e [match]
bytenr 274877906944
flags 0x1
( WRITTEN )
magic _BHRfS_M [match]
fsid 7f5aecd6-40fe-45ce-90c3-e86bacc4bf6f
label samba_fs
generation 770740
root 16187774615552
sys_array_size 355
chunk_root_generation 755799
root_level 1
chunk_root 24331161698304
chunk_root_level 1
log_root 0

Re: Help understanding autodefrag details

2017-02-13 Thread Austin S. Hemmelgarn

On 2017-02-10 09:21, Peter Zaitsev wrote:

> Hi,
>
> As I have been reading btrfs whitepaper  it speaks about autodefrag in very
> generic terms - once random write in the file is detected it is put in the
> queue to be defragmented.   Yet I could not find any specifics about this
> process described anywhere.
>
> My use case is databases and as such large files (100GB+)so my
> questions are
>
> - is my understanding what defrag queue is based on files not parts of
> files which got fragmented correct ?
Autodefrag is location based within the file, not for the whole file.  I 
forget the exact size of the area around the write it will try to 
defrag, and the maximum size the write can be to trigger it, but the 
selection amounts to the following:
1. Is this write not likely to be followed by a write to the next 
logical address in the file? (I'm not certain exactly what heuristic is 
used to determine this).
2. Is this write small enough to likely cause fragmentation?  (This one 
is a simple threshold test, but I forget the threshold).
3. If both 1 and 2 are true, schedule the area containing the write to 
be defragmented.


> - Is single random write is enough to schedule file for defrag or is there
> some more elaborate math to consider file fragmented and needing
> optimization  ?
I'm not sure.  It depends on whether or not the random write detection 
heuristic that is used has some handling for the first few writes, or 
needs some data from their position to determine the 'randomness' of 
future writes.


> - Is this queue FIFO or is it priority queue where files in more need of
> fragmentation jump in front (or is there some other mechanics ?
I think it's a FIFO queue, but there may be multiple threads servicing 
it, and I think it's smart enough to merge areas that overlap into a 
single operation.


> - Will file to be attempted to be defragmented completely or does defrag
> focuses on the most fragmented areas of the file first ?

AFAIK, autodefrag only defrags the region around where the write happened.


> - Is there any way to view this defrag queue ?
Not that I know of, but in most cases it should be mostly empty, since 
the areas being handled are usually small enough that items get 
processed pretty quick.


> - How are resources allocated to background autodefrag vs resources serving
> foreground user load are controlled
AFAIK, there is no way to manually control this.  It would be kind of 
nice though if autodefrag ran as it's own thread.


> - What are space requirements for defrag ? is it required for the space to
> be available for complete file copy or is it not required ?
Pretty minimal space requirements.  Even regular defrag technically 
doesn't need enough space for the whole file.  Both work with whatever 
amount of space they have, but you obviously get better results with 
more free space.


> - Can defrag handle file which is being constantly written to or is it
> based on the concept what file should be idle for some time and when it is
> going to be defragmented
In my experience, it handles files seeing constant writes just fine, 
even if you're saturating the disk bandwidth (it will just reduce your 
effective bandwidth a small amount).
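
(A small practical addendum to the above: autodefrag is a btrfs mount option,
and its effect on a given file can be watched with filefrag; the paths below
are only illustrative:)

    mount -o remount,autodefrag /var/lib/mysql
    filefrag /var/lib/mysql/ibdata1    # reports how many extents the file has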



Help understanding autodefrag details

2017-02-10 Thread Peter Zaitsev
Hi,

As I have been reading btrfs whitepaper  it speaks about autodefrag in very
generic terms - once random write in the file is detected it is put in the
queue to be defragmented.   Yet I could not find any specifics about this
process described anywhere.

My use case is databases and as such large files (100GB+)so my
questions are

- is my understanding what defrag queue is based on files not parts of
files which got fragmented correct ?

- Is single random write is enough to schedule file for defrag or is there
some more elaborate math to consider file fragmented and needing
optimization  ?

- Is this queue FIFO or is it priority queue where files in more need of
fragmentation jump in front (or is there some other mechanics ?

- Will file to be attempted to be defragmented completely or does defrag
focuses on the most fragmented areas of the file first ?

- Is there any way to view this defrag queue ?

- How are resources allocated to background autodefrag vs resources serving
foreground user load are controlled

- What are space requirements for defrag ? is it required for the space to
be available for complete file copy or is it not required ?

- Can defrag handle file which is being constantly written to or is it
based on the concept what file should be idle for some time and when it is
going to be defragmented

Let me know if you have any information on these

-- 
Peter Zaitsev, CEO, Percona
Tel: +1 888 401 3401 ext 7360   Skype:  peter_zaitsev


Re: File system is oddly full after kernel upgrade, balance doesn't help

2017-01-30 Thread Duncan
lable and will error out, without 
using the global reserve.

So if at any time btrfs reports more than 0 global reserve used, it means 
btrfs thinks it's in pretty serious straits and it's in quite a pickle, 
making non-zero global reserve usage a primary indicator of a filesystem 
in trouble, no matter what else is reported.
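
That usage is visible directly in the btrfs-specific reports, e.g.:

    btrfs filesystem df /mountpoint | grep GlobalReserve
    # prints something like "GlobalReserve, single: total=512.00MiB, used=0.00B"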


So with all that said, you can see that on that 8-gig per device, pair-
device raid1, btrfs has allocated only 512 MiB of metadata on each 
device, of which 232 MiB on each is used, *nominally* leaving 280 MiB 
metadata unused on each device, tho global reserve comes from that.

But, there's only 16 MiB of global reserve, counted only once.  If we 
assume it'd be used equally from each device, that's 8 MiB of global 
reserve on each device subtracted from that 280 MiB nominally free, 
leaving 272 MiB of metadata free, a reasonably healthy filesystem state, 
considering that's more metadata than actually used, plus there's nearly 
4.5 GiB entirely unallocated on each device, that can be allocated to 
data or metadata as needed.

That's quite a contrast compared to yours, a quarter the size, 2 GiB 
instead of 8, and as you have only the single device, the metadata 
defaulted to dup, so it uses twice as much space on the single device.

But the *real* contrast is as you said, your global reserve, an entirely 
unrealistic half a GiB, on a 2 GiB filesystem!

Of course global reserve being accounted single, while your metadata is 
dup, half should come from each side of that dup, so your real metadata 
usage vs. free can be calculated as 577.5 size (per side of the dup) - 
37.5 (normal used), - 256 (half of the global reserve), basically 284 MiB 
of usable metadata space (per side of the dup, but each side should be 
used equally).

Add to that the ~100 MiB unallocated, tho if used for dup metadata you'd 
only have half that usable, and you're not in /horrible/ shape.

But that 512 MiB global reserve, a quarter of the total filesystem size, 
is just killing you.

And unless it has something to do with snapshots/subvolumes, I don't have 
a clue why, or what to do about it.

But here's what I'd try, based on the answer to the question of whether 
you use snapshots/subvolumes (or use any of the btrfs reflink-based dedup 
tools as they have many of the same implications as snapshots, tho the 
scope is of course a bit different), and how many you have if so:

* Snapshots and reflinks are great, but unfortunately, have limited 
scaling ability at this time.  While on normal sized btrfs the limit 
before scaling becomes an issue seems to be a few hundred (under 1000 and 
for most under 500), it /may/ be that on a btrfs as small as your two-
GiB, more than say 10 may be an issue.

As I said, I don't /know/ if it'll help, but if you're over this, I'd 
certainly try reducing the number of snapshots/reflinks to under 10 per 
subvolume/file and see if it helps at all.

* You /may/ be able to try btrfs bal start -musage=<N>, starting with a 
relatively low value (you tried 0; it's a percentage, so try 2, 5, 10.. 
up toward 100%) until you see some results or you get ENOSPC errors.  
However, typical metadata chunks are 256 MiB in size, tho they should be 
smaller on a 2 GiB btrfs, but I'm not sure by how much, and it's 
relatively likely you'll run into ENOSPC errors due to metadata chunks 
larger than half your unallocated space size (dup, so it'll take two 
chunks of the same size) before you get anywhere, even if balancing 
would otherwise help -- which again I'm not even sure it will, as I 
don't know whether it helps with bloated global reserve, or not.

* If the balance ENOSPCs, you may of course try (temporarily) increasing 
the size of the filesystem, possibly by adding a device.  There's 
discussion of that on the wiki.  But I honestly don't know how global 
reserve will behave, because something's clearly going on with it and I 
have no idea what.  For all I know, it'll eat most of the new space 
again, and you'll be in an even worse position, as it won't then let you 
remove the device you added to try to fix the problem.

* Similarly, but perhaps less risky with regard to global reserve size, 
tho definitely being more risky in terms of data safety in case something 
goes wrong (but the data's backed up, right?), you could try doing a 
btrfs balance start -mconvert=single, to reduce the metadata usage from 
dup to single mode.  Tho personally, I'd probably not bother with the 
risk, simply double-checking my backups, then going ahead with the next 
one instead of this one.

* Since in data admin terms, data without a backup is considered to be 
defined by the lack thereof of that backup, as worth less than the time 
and trouble necessary to do it, and that applies even stronger to a still 
under heavy development and not yet fully stable filesystem such as 
btrfs, it's relatively safe to assume you either have a backup, or don't 
really care about the possibility of losing the

Re: File system is oddly full after kernel upgrade, balance doesn't help

2017-01-28 Thread MegaBrutal
Hello,

Of course I can't retrieve the data from before the balance, but here
is the data from now:

root@vmhost:~# btrfs fi show /tmp/mnt/curlybrace
Label: 'curlybrace'  uuid: f471bfca-51c4-4e44-ac72-c6cd9ccaf535
Total devices 1 FS bytes used 752.38MiB
devid1 size 2.00GiB used 1.90GiB path
/dev/mapper/vmdata--vg-lxc--curlybrace

root@vmhost:~# btrfs fi df /tmp/mnt/curlybrace
Data, single: total=773.62MiB, used=714.82MiB
System, DUP: total=8.00MiB, used=16.00KiB
Metadata, DUP: total=577.50MiB, used=37.55MiB
GlobalReserve, single: total=512.00MiB, used=0.00B
root@vmhost:~# btrfs fi usage /tmp/mnt/curlybrace
Overall:
Device size:   2.00GiB
Device allocated:   1.90GiB
Device unallocated: 103.38MiB
Device missing: 0.00B
Used: 789.94MiB
Free (estimated): 162.18MiB(min: 110.50MiB)
Data ratio:  1.00
Metadata ratio:  2.00
Global reserve: 512.00MiB(used: 0.00B)

Data,single: Size:773.62MiB, Used:714.82MiB
   /dev/mapper/vmdata--vg-lxc--curlybrace 773.62MiB

Metadata,DUP: Size:577.50MiB, Used:37.55MiB
   /dev/mapper/vmdata--vg-lxc--curlybrace   1.13GiB

System,DUP: Size:8.00MiB, Used:16.00KiB
   /dev/mapper/vmdata--vg-lxc--curlybrace  16.00MiB

Unallocated:
   /dev/mapper/vmdata--vg-lxc--curlybrace 103.38MiB


So... if I sum the data, metadata, and the global reserve, I see why
only ~170 MB is left. I have no idea, however, why the global reserve
sneaked up to 512 MB for such a small file system, or how I could
resolve this situation. Any ideas?


MegaBrutal



2017-01-28 7:46 GMT+01:00 Duncan <1i5t5.dun...@cox.net>:
> MegaBrutal posted on Fri, 27 Jan 2017 19:45:00 +0100 as excerpted:
>
>> Hi,
>>
>> Not sure if it caused by the upgrade, but I only encountered this
>> problem after I upgraded to Ubuntu Yakkety, which comes with a 4.8
>> kernel.
>> Linux vmhost 4.8.0-34-generic #36-Ubuntu SMP Wed Dec 21 17:24:18 UTC
>> 2016 x86_64 x86_64 x86_64 GNU/Linux
>>
>> This is the 2nd file system which showed these symptoms, so I thought
>> it's more than happenstance. I don't remember what I did with the first
>> one, but I somehow managed to fix it with balance, if I remember
>> correctly, but it doesn't help with this one.
>>
>> FS state before any attempts to fix:
>> Filesystem  1M-blocks   Used Available Use% Mounted on
>> [...]curlybrace  1024   1024 0 100% /tmp/mnt/curlybrace
>>
>> Resized LV, run „btrfs filesystem resize max /tmp/mnt/curlybrace”:
>> [...]curlybrace  2048   1303 0 100% /tmp/mnt/curlybrace
>>
>> Notice how the usage magically jumped up to 1303 MB, and despite the FS
>> size is 2048 MB, the usage is still displayed as 100%.
>>
>> Tried full balance (other options with -dusage had no result):
>> root@vmhost:~# btrfs balance start -v /tmp/mnt/curlybrace
>
>> Starting balance without any filters.
>> ERROR: error during balancing '/tmp/mnt/curlybrace':
>> No space left on device
>
>> No space left on device? How?
>>
>> But it changed the situation:
>> [...]curlybrace  2048   1302   190  88% /tmp/mnt/curlybrace
>>
>> This is still not acceptable. I need to recover at least 50% free space
>> (since I increased the FS to the double).
>>
>> A 2nd balance attempt resulted in this:
>> [...]curlybrace  2048   1302   162  89% /tmp/mnt/curlybrace
>>
>> So... it became slightly worse.
>>
>> What's going on? How can I fix the file system to show real data?
>
> Something seems off, yes, but...
>
> https://btrfs.wiki.kernel.org/index.php/FAQ
>
> Reading the whole thing will likely be useful, but especially 1.3/1.4 and
> 4.6-4.9 discussing the problem of space usage, reporting, and (primarily
> in some of the other space related FAQs beyond the specific ones above)
> how to try and fix it when space runs out, on btrfs.
>
> If you read them before, read them again, because you didn't post the
> btrfs free-space reports covered in 4.7, instead posting what appears to
> be the standard (non-btrfs) df report, which for all the reasons
> explained in the FAQ, is at best only an estimate on btrfs.  That
> estimate is obviously behaving unexpectedly in your case, but without the
> btrfs specific reports, it's nigh impossible to even guess with any
> chance at accuracy what's going on, or how to fix it.
>
> A WAG would be that part of the problem might be that you were into
> global reserve before the resize, so after the filesystem got more space
> to use, the first thing it did was unload that global reserve usage,
> thereby immediately upping apparent usage.  That might explain that

Re: File system is oddly full after kernel upgrade, balance doesn't help

2017-01-27 Thread Duncan
MegaBrutal posted on Fri, 27 Jan 2017 19:45:00 +0100 as excerpted:

> Hi,
> 
> Not sure if it caused by the upgrade, but I only encountered this
> problem after I upgraded to Ubuntu Yakkety, which comes with a 4.8
> kernel.
> Linux vmhost 4.8.0-34-generic #36-Ubuntu SMP Wed Dec 21 17:24:18 UTC
> 2016 x86_64 x86_64 x86_64 GNU/Linux
> 
> This is the 2nd file system which showed these symptoms, so I thought
> it's more than happenstance. I don't remember what I did with the first
> one, but I somehow managed to fix it with balance, if I remember
> correctly, but it doesn't help with this one.
> 
> FS state before any attempts to fix:
> Filesystem  1M-blocks   Used Available Use% Mounted on
> [...]curlybrace  1024   1024 0 100% /tmp/mnt/curlybrace
> 
> Resized LV, run „btrfs filesystem resize max /tmp/mnt/curlybrace”:
> [...]curlybrace  2048   1303 0 100% /tmp/mnt/curlybrace
> 
> Notice how the usage magically jumped up to 1303 MB, and despite the FS
> size is 2048 MB, the usage is still displayed as 100%.
> 
> Tried full balance (other options with -dusage had no result):
> root@vmhost:~# btrfs balance start -v /tmp/mnt/curlybrace

> Starting balance without any filters.
> ERROR: error during balancing '/tmp/mnt/curlybrace':
> No space left on device

> No space left on device? How?
> 
> But it changed the situation:
> [...]curlybrace  2048   1302   190  88% /tmp/mnt/curlybrace
> 
> This is still not acceptable. I need to recover at least 50% free space
> (since I increased the FS to the double).
> 
> A 2nd balance attempt resulted in this:
> [...]curlybrace  2048   1302   162  89% /tmp/mnt/curlybrace
> 
> So... it became slightly worse.
> 
> What's going on? How can I fix the file system to show real data?

Something seems off, yes, but...

https://btrfs.wiki.kernel.org/index.php/FAQ

Reading the whole thing will likely be useful, but especially 1.3/1.4 and 
4.6-4.9 discussing the problem of space usage, reporting, and (primarily 
in some of the other space related FAQs beyond the specific ones above) 
how to try and fix it when space runs out, on btrfs.

If you read them before, read them again, because you didn't post the 
btrfs free-space reports covered in 4.7, instead posting what appears to 
be the standard (non-btrfs) df report, which for all the reasons 
explained in the FAQ, is at best only an estimate on btrfs.  That 
estimate is obviously behaving unexpectedly in your case, but without the 
btrfs specific reports, it's nigh impossible to even guess with any 
chance at accuracy what's going on, or how to fix it.

A WAG would be that part of the problem might be that you were into 
global reserve before the resize, so after the filesystem got more space 
to use, the first thing it did was unload that global reserve usage, 
thereby immediately upping apparent usage.  That might explain that 
initial jump in usage after the resize.  But that's just a WAG.  Without 
at least btrfs filesystem usage, or btrfs filesystem df plus btrfs 
filesystem show, from before the resize, after, and before and after the 
balances, a WAG is what it remains.  And again, without those reports, 
there's no way to say whether balance can be expected to help, or not.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



File system is oddly full after kernel upgrade, balance doesn't help

2017-01-27 Thread MegaBrutal
Hi,

Not sure if it caused by the upgrade, but I only encountered this
problem after I upgraded to Ubuntu Yakkety, which comes with a 4.8
kernel.
Linux vmhost 4.8.0-34-generic #36-Ubuntu SMP Wed Dec 21 17:24:18 UTC
2016 x86_64 x86_64 x86_64 GNU/Linux

This is the 2nd file system which showed these symptoms, so I thought
it's more than happenstance. I don't remember what I did with the
first one, but I somehow managed to fix it with balance, if I remember
correctly, but it doesn't help with this one.

FS state before any attempts to fix:
Filesystem 1M-blocks   Used Available Use%
Mounted on
/dev/mapper/vmdata--vg-lxc--curlybrace  1024   1024 0 100%
/tmp/mnt/curlybrace

Resized LV, run „btrfs filesystem resize max /tmp/mnt/curlybrace”:
/dev/mapper/vmdata--vg-lxc--curlybrace  2048   1303 0 100%
/tmp/mnt/curlybrace

Notice how the usage magically jumped up to 1303 MB, and despite the
FS size is 2048 MB, the usage is still displayed as 100%.

Tried full balance (other options with -dusage had no result):
root@vmhost:~# btrfs balance start -v /tmp/mnt/curlybrace
Dumping filters: flags 0x7, state 0x0, force is off
  DATA (flags 0x0): balancing
  METADATA (flags 0x0): balancing
  SYSTEM (flags 0x0): balancing
WARNING:

Full balance without filters requested. This operation is very
intense and takes potentially very long. It is recommended to
use the balance filters to narrow down the balanced data.
Use 'btrfs balance start --full-balance' option to skip this
warning. The operation will start in 10 seconds.
Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting balance without any filters.
ERROR: error during balancing '/tmp/mnt/curlybrace': No space left on device
There may be more info in syslog - try dmesg | tail

No space left on device? How?

But it changed the situation:
/dev/mapper/vmdata--vg-lxc--curlybrace  2048   1302   190  88%
/tmp/mnt/curlybrace

This is still not acceptable. I need to recover at least 50% free
space (since I increased the FS to the double).

A 2nd balance attempt resulted in this:
/dev/mapper/vmdata--vg-lxc--curlybrace  2048   1302   162  89%
/tmp/mnt/curlybrace

So... it became slightly worse.

What's going on? How can I fix the file system to show real data?


Regards,
MegaBrutal


Re: Help please: BTRFS fs crashed due to bad removal of USB drive, no help from recovery procedures

2016-12-19 Thread Jari Seppälä
ize   4
cache_generation75012
uuid_tree_generation75012
dev_item.uuid   108c02c0-9812-428e-8f90-23bdf88e11bf
dev_item.fsid   82651f91-4989-415b-bd83-ae830f12608c [match]
dev_item.type   0
dev_item.total_bytes536869842944
dev_item.bytes_used 440259313664
dev_item.io_align   0
dev_item.io_width   0
dev_item.sector_size0
dev_item.devid  1
dev_item.dev_group  0
dev_item.seek_speed 0
dev_item.bandwidth  0
dev_item.generation 0


Regards,

Jari

> Regards,
> Xin
>  
>  
> 
> Sent: Monday, December 19, 2016 at 2:32 AM
> From: "Jari Seppälä" <lihamakaroonilaati...@gmail.com>
> To: linux-btrfs@vger.kernel.org
> Cc: "Xin Zhou" <xin.z...@gmx.com>
> Subject: Re: Help please: BTRFS fs crashed due to bad removal of USB drive, 
> no help from recovery procedures
> Xin Zhou <xin.z...@gmx.com> kirjoitti 17.12.2016 kello 22.27:
>> 
>> Hi Jari,
>> 
>> Similar with other file system, btrfs has copies of super blocks.
>> Try to run "man btrfs check", "man btrfs rescue" and related commands for 
>> more details.
>> Regards,
>> Xin
> 
> Hi Xin,
> 
> I did follow all recovery procedures from man and wiki pages. Tools do not 
> help as they think there is no BTRFS fs anymore. However if I try to reformat 
> the device I get:
> 
> btrfs-progs v4.4
> See http://btrfs.wiki.kernel.org for more information.
> /dev/sdb1 appears to contain an existing filesystem (btrfs).
> 
> So, recovery tools seem to think there is no btrfs filesystem. Mkfs seems to 
> think there is.
> 
> What I have tried:
> btrfsck /dev/sdb1
> mount -t btrfs -o ro /dev/sdb1 /mnt/share/
> mount -t btrfs -o ro,recovery /dev/sdb1 /mnt/share/
> mount -t btrfs -o roootflags=recovery,nospace_cache /dev/sdb1 /mnt/share/
> mount -t btrfs -o rootflags=recovery,nospace_cache /dev/sdb1 /mnt/share/
> mount -t btrfs -o rootflags=recovery,nospace_cache,clear_cache /dev/sdb1 
> /mnt/share/
> mount -t btrfs -o ro,rootflags=recovery,nospace_cache,clear_cache /dev/sdb1 
> /mnt/share/
> btrfs restore /dev/sdb1 /target/device
> btrfs rescue zero-log /dev/sdb1
> btrfsck --init-csum-tree /dev/sdb1
> btrfsck --fix-crc /dev/sdb1
> btrfsck --check-data-csum /dev/sdb1
> btrfs rescue chunk-recover /dev/sdb1
> btrfs rescue super-recover /dev/sdb1
> btrfs rescue zero-log /dev/sdb1
> 
> No help whatsoever.
> 
> Jari
> 
>> 
>> 
>> 
>> Sent: Saturday, December 17, 2016 at 2:06 AM
>> From: "Jari Seppälä" <lihamakaroonilaati...@gmail.com>
>> To: linux-btrfs@vger.kernel.org
>> Subject: Help please: BTRFS fs crashed due to bad removal of USB drive, no 
>> help from recovery procedures
>> Syslog tells:
>> [ 135.446222] BTRFS error (device sdb1): system chunk array too small 0 < 97
>> [ 135.446260] BTRFS error (device sdb1): superblock contains fatal errors
>> [ 135.462544] BTRFS error (device sdb1): open_ctree failed
>> 
>> What have been done:
>> * All "btrfs rescue" options
>> 
>> Info on system
>> * fs on external SSD via USB
>> * kernel 4.9.0 (tried with 4.8.13)
>> * btrfs-tools 4.4
>> * Mythbuntu (Ubuntu) 16.04.1 LTS with latest fixes 2012-12-16
>> 
>> Any help appreciated. Around 300G of TV recordings on the drive, which of 
>> course will eventually come as replays.
>> 
>> Jari
>> --
>> *** Jari Seppälä
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at 
>> http://vger.kernel.org/majordomo-info.html[http://vger.kernel.org/majordomo-info.html]
> 

--
*** Jari Seppälä




Re: Help please: BTRFS fs crashed due to bad removal of USB drive, no help from recovery procedures

2016-12-19 Thread Xin Zhou
Hi Jari,

The message shows:
> [ 135.446260] BTRFS error (device sdb1): superblock contains fatal errors
 
So according to this info, before trying to run the repair / rescue procedures, would 
you like to show the status of superblock copies 0, 1 and 2?
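
A minimal sketch of how those copies can be dumped, assuming a btrfs-progs build that 
still ships btrfs-show-super (newer releases expose the same data via 
"btrfs inspect-internal dump-super"):

  # print all three superblock copies (64KiB, 64MiB and 256GiB offsets)
  btrfs-show-super -a /dev/sdb1
  # or, with newer btrfs-progs:
  btrfs inspect-internal dump-super -a /dev/sdb1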

Regards,
Xin
 
 

Sent: Monday, December 19, 2016 at 2:32 AM
From: "Jari Seppälä" <lihamakaroonilaati...@gmail.com>
To: linux-btrfs@vger.kernel.org
Cc: "Xin Zhou" <xin.z...@gmx.com>
Subject: Re: Help please: BTRFS fs crashed due to bad removal of USB drive, no 
help from recovery procedures
Xin Zhou <xin.z...@gmx.com> kirjoitti 17.12.2016 kello 22.27:
>
> Hi Jari,
>
> Similar to other file systems, btrfs keeps copies of its super block.
> Try to run "man btrfs check", "man btrfs rescue" and related commands for 
> more details.
> Regards,
> Xin

Hi Xin,

I did follow all recovery procedures from the man and wiki pages. The tools do not help, 
as they think there is no BTRFS fs anymore. However, if I try to reformat the 
device I get:

btrfs-progs v4.4
See http://btrfs.wiki.kernel.org for more information.
/dev/sdb1 appears to contain an existing filesystem (btrfs).

So, the recovery tools seem to think there is no btrfs filesystem. Mkfs seems to 
think there is.

What I have tried:
btrfsck /dev/sdb1
mount -t btrfs -o ro /dev/sdb1 /mnt/share/
mount -t btrfs -o ro,recovery /dev/sdb1 /mnt/share/
mount -t btrfs -o roootflags=recovery,nospace_cache /dev/sdb1 /mnt/share/
mount -t btrfs -o rootflags=recovery,nospace_cache /dev/sdb1 /mnt/share/
mount -t btrfs -o rootflags=recovery,nospace_cache,clear_cache /dev/sdb1 
/mnt/share/
mount -t btrfs -o ro,rootflags=recovery,nospace_cache,clear_cache /dev/sdb1 
/mnt/share/
btrfs restore /dev/sdb1 /target/device
btrfs rescue zero-log /dev/sdb1
btrfsck --init-csum-tree /dev/sdb1
btrfsck --fix-crc /dev/sdb1
btrfsck --check-data-csum /dev/sdb1
btrfs rescue chunk-recover /dev/sdb1
btrfs rescue super-recover /dev/sdb1
btrfs rescue zero-log /dev/sdb1

No help whatsoever.

Jari

>
>
>
> Sent: Saturday, December 17, 2016 at 2:06 AM
> From: "Jari Seppälä" <lihamakaroonilaati...@gmail.com>
> To: linux-btrfs@vger.kernel.org
> Subject: Help please: BTRFS fs crashed due to bad removal of USB drive, no 
> help from recovery procedures
> Syslog tells:
> [ 135.446222] BTRFS error (device sdb1): system chunk array too small 0 < 97
> [ 135.446260] BTRFS error (device sdb1): superblock contains fatal errors
> [ 135.462544] BTRFS error (device sdb1): open_ctree failed
>
> What has been done:
> * All "btrfs rescue" options
>
> Info on system
> * fs on external SSD via USB
> * kernel 4.9.0 (tried with 4.8.13)
> * btrfs-tools 4.4
> * Mythbuntu (Ubuntu) 16.04.1 LTS with latest fixes 2012-12-16
>
> Any help appreciated. Around 300G of TV recordings on the drive, which of 
> course will eventually come as replays.
>
> Jari
> --
> *** Jari Seppälä
>

--
*** Jari Seppälä
 


Re: Help please: BTRFS fs crashed due to bad removal of USB drive, no help from recovery procedures

2016-12-19 Thread Jari Seppälä
Xin Zhou <xin.z...@gmx.com> kirjoitti 17.12.2016 kello 22.27:
> 
> Hi Jari,
>  
> Similar to other file systems, btrfs keeps copies of its super block.
> Try to run "man btrfs check", "man btrfs rescue" and related commands for 
> more details.
> Regards,
> Xin

Hi Xin,

I did follow all recovery procedures from the man and wiki pages. The tools do not help 
as they think there is no BTRFS fs anymore. However, if I try to reformat the 
device I get: 

 btrfs-progs v4.4
 See http://btrfs.wiki.kernel.org for more information.
 /dev/sdb1 appears to contain an existing filesystem (btrfs).

So, the recovery tools seem to think there is no btrfs filesystem. Mkfs seems to 
think there is. 

What I have tried:
btrfsck /dev/sdb1
mount -t btrfs -o ro /dev/sdb1 /mnt/share/
mount -t btrfs -o ro,recovery /dev/sdb1 /mnt/share/
mount -t btrfs -o roootflags=recovery,nospace_cache /dev/sdb1 /mnt/share/
mount -t btrfs -o rootflags=recovery,nospace_cache /dev/sdb1 /mnt/share/
mount -t btrfs -o rootflags=recovery,nospace_cache,clear_cache /dev/sdb1 
/mnt/share/
mount -t btrfs -o ro,rootflags=recovery,nospace_cache,clear_cache /dev/sdb1 
/mnt/share/
btrfs restore /dev/sdb1 /target/device
btrfs rescue zero-log /dev/sdb1
btrfsck --init-csum-tree /dev/sdb1
btrfsck --fix-crc /dev/sdb1
btrfsck --check-data-csum /dev/sdb1
btrfs rescue chunk-recover /dev/sdb1
btrfs rescue super-recover /dev/sdb1
btrfs rescue zero-log /dev/sdb1

No help whatsoever.

Jari
  
>  
>  
> 
> Sent: Saturday, December 17, 2016 at 2:06 AM
> From: "Jari Seppälä" <lihamakaroonilaati...@gmail.com>
> To: linux-btrfs@vger.kernel.org
> Subject: Help please: BTRFS fs crashed due to bad removal of USB drive, no 
> help from recovery procedures
> Syslog tells:
> [ 135.446222] BTRFS error (device sdb1): system chunk array too small 0 < 97
> [ 135.446260] BTRFS error (device sdb1): superblock contains fatal errors
> [ 135.462544] BTRFS error (device sdb1): open_ctree failed
> 
> What has been done:
> * All "btrfs rescue" options
> 
> Info on system
> * fs on external SSD via USB
> * kernel 4.9.0 (tried with 4.8.13)
> * btrfs-tools 4.4
> * Mythbuntu (Ubuntu) 16.04.1 LTS with latest fixes 2012-12-16
> 
> Any help appreciated. Around 300G of TV recordings on the drive, which of 
> course will eventually come as replays.
> 
> Jari
> --
> *** Jari Seppälä
> 

--
*** Jari Seppälä



Re: Help please: BTRFS fs crashed due to bad removal of USB drive, no help from recovery procedures

2016-12-17 Thread Xin Zhou


Hi Jari,
 
Similar to other file systems, btrfs keeps copies of its super block.
Try to run "man btrfs check", "man btrfs rescue" and related commands for more 
details.
Regards,
Xin
 
 

Sent: Saturday, December 17, 2016 at 2:06 AM
From: "Jari Seppälä" <lihamakaroonilaati...@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Help please: BTRFS fs crashed due to bad removal of USB drive, no help 
from recovery procedures
Syslog tells:
[ 135.446222] BTRFS error (device sdb1): system chunk array too small 0 < 97
[ 135.446260] BTRFS error (device sdb1): superblock contains fatal errors
[ 135.462544] BTRFS error (device sdb1): open_ctree failed

What has been done:
* All "btrfs rescue" options

Info on system
* fs on external SSD via USB
* kernel 4.9.0 (tried with 4.8.13)
* btrfs-tools 4.4
* Mythbuntu (Ubuntu) 16.04.1 LTS with latest fixes 2012-12-16

Any help appreciated. Around 300G of TV recordings on the drive, which of 
course will eventually come as replays.

Jari
--
*** Jari Seppälä



Help please: BTRFS fs crashed due to bad removal of USB drive, no help from recovery procedures

2016-12-17 Thread Jari Seppälä
Syslog tells:
[  135.446222] BTRFS error (device sdb1): system chunk array too small 0 < 97
[  135.446260] BTRFS error (device sdb1): superblock contains fatal errors
[  135.462544] BTRFS error (device sdb1): open_ctree failed

What has been done:
* All "btrfs rescue" options

Info on system
* fs on external SSD via USB
* kernel 4.9.0 (tried with 4.8.13)
* btrfs-tools 4.4
* Mythbuntu (Ubuntu) 16.04.1 LTS with latest fixes 2012-12-16

Any help appreciated. Around 300G of TV recordings on the drive, which of 
course will eventually come as replays.

Jari
--
*** Jari Seppälä



Re: Help with stack trace

2016-11-25 Thread Timofey Titovets
So it's a btrfs problem:
I caught the hang again with 4.8.7, and I can't reproduce it when the ES data is stored on ext4.
Trace from 4.8.7:
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel: INFO: task
btrfs-transacti:4143 blocked for more than 120 seconds.
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:   Not tainted 4.8.0-1-amd64 #1
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel: btrfs-transacti D
9dd15e0d8180 0  4143  2 0x
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  9dd954a97100
9dd15a7b80c0 920e5e15 9dd956ffbe08
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  9dd956ffc000
9dd9553091f0 9dd955309000 9dd9553091f0
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  
9dd954a97100 925eb4d1 9dca41f6e550
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel: Call Trace:
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
try_to_del_timer_sync+0x55/0x80
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
schedule+0x31/0x80
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
wait_current_trans.isra.21+0xcd/0x110 [btrfs]
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
wake_atomic_t_function+0x60/0x60
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
start_transaction+0x273/0x4b0 [btrfs]
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
transaction_kthread+0x77/0x200 [btrfs]
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
btrfs_cleanup_transaction+0x590/0x590 [btrfs]
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
kthread+0xcd/0xf0
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
ret_from_fork+0x1f/0x40
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
kthread_create_on_node+0x190/0x190
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel: INFO: task htop:12776
blocked for more than 120 seconds.
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:   Not tainted 4.8.0-1-amd64 #1
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel: htopD
9dd95d898180 0 12776  1 0x0004
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  9dd84d02d0c0
9dd959c9a0c0 9dd183ed3e00 0041
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  9dd84d05
9dd84d04fdf8 9dca4319cc68 9dca4319cc80
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  
9dd84d04fd90 925eb4d1 9dd84d02d0c0
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel: Call Trace:
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
schedule+0x31/0x80
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
rwsem_down_read_failed+0xf8/0x150
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
call_rwsem_down_read_failed+0x14/0x30
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
down_read+0x1c/0x30
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
proc_pid_cmdline_read+0xae/0x540
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
vfs_read+0x90/0x130
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
SyS_read+0x52/0xc0
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
system_call_fast_compare_end+0xc/0x96
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel: INFO: task iotop:12785
blocked for more than 120 seconds.
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:   Not tainted 4.8.0-1-amd64 #1
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel: iotop   D
9dd15e158180 0 12785  1 0x0004
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  9dd855a92100
9dd15a7ba140 9dd9546e1c00 7ff86ac74000
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  9dd8549cc000
9dd8549cbdf8 9dca4319cc68 9dca4319cc80
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  
9dd8549cbd90 925eb4d1 9dd855a92100
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel: Call Trace:
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
schedule+0x31/0x80
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
rwsem_down_read_failed+0xf8/0x150
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
call_rwsem_down_read_failed+0x14/0x30
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
down_read+0x1c/0x30
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
proc_pid_cmdline_read+0xae/0x540
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
vfs_read+0x90/0x130
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
SyS_read+0x52/0xc0
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  [] ?
system_call_fast_compare_end+0xc/0x96
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel: INFO: task java:18198
blocked for more than 120 seconds.
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:   Not tainted 4.8.0-1-amd64 #1
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel: javaD
9dd95d898180 0 18198  1 0x0100
Nov 25 14:09:30 msq-k1-srv-ids-01 kernel:  9dd660e1a140
9dd959c9a0c0 9dd660e1a140 

Help with stack trace

2016-11-24 Thread Timofey Titovets
Hi, I use btrfs as storage for the root filesystem and the data of ElasticSearch
servers, and I've hit a strange bug where the servers hang.
But I only get this stack trace if I start Elastic.

Debian 8 x64
Linux msq-k1-srv-ids-02 4.8.0-1-amd64 #1 SMP Debian 4.8.5-1
(2016-10-28) x86_64 GNU/Linux
I also caught it on Debian Linux 4.7.6.
btrfs-progs v4.7.3

btrfs check doesn't find any errors, so I think maybe this is some kind
of race condition?

Stack trace:
[  365.619814] INFO: task kworker/u480:1:205 blocked for more than 120 seconds.
[  365.619891]   Not tainted 4.8.0-1-amd64 #1
[  365.619926] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  365.619984] kworker/u480:1  D 888d7bb18180 0   205  2 0x
[  365.620103] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[  365.620158]  888d7235a000 888d74b820c0 c03c68fb
888d6b9d3a58
[  365.620227]  888d6b9d4000 887e12709508 ff00
888d7235a000
[  365.620292]  888d7235a000 888d6b9d3a70 aabeb4e1
887e127094a0
[  365.620358] Call Trace:
[  365.620416]  [] ? btrfs_get_token_32+0x6b/0x130 [btrfs]
[  365.620475]  [] ? schedule+0x31/0x80
[  365.620542]  [] ? btrfs_tree_read_lock+0xd5/0x120 [btrfs]
[  365.620597]  [] ? wake_atomic_t_function+0x60/0x60
[  365.620666]  [] ?
btrfs_read_lock_root_node+0x2f/0x40 [btrfs]
[  365.620742]  [] ? btrfs_search_slot+0x756/0x9f0 [btrfs]
[  365.620817]  [] ? btrfs_buffer_uptodate+0x4b/0x70 [btrfs]
[  365.620889]  [] ?
generic_bin_search.constprop.37+0x9b/0x210 [btrfs]
[  365.620971]  [] ?
btrfs_lookup_file_extent+0x4a/0x70 [btrfs]
[  365.621049]  [] ? __btrfs_drop_extents+0x164/0xdd0 [btrfs]
[  365.621105]  [] ? kmem_cache_alloc+0xbc/0x530
[  365.621176]  [] ?
insert_reserved_file_extent.constprop.64+0xb4/0x330 [btrfs]
[  365.621263]  [] ? start_transaction+0x95/0x4b0 [btrfs]
[  365.621336]  [] ?
btrfs_finish_ordered_io+0x307/0x680 [btrfs]
[  365.621394]  [] ? check_preempt_curr+0x50/0x90
[  365.621467]  [] ?
btrfs_scrubparity_helper+0xd1/0x2d0 [btrfs]
[  365.621524]  [] ? process_one_work+0x160/0x410
[  365.621570]  [] ? worker_thread+0x4d/0x480
[  365.621614]  [] ? process_one_work+0x410/0x410
[  365.621662]  [] ? kthread+0xcd/0xf0
[  365.621704]  [] ? ret_from_fork+0x1f/0x40
[  365.621748]  [] ? kthread_create_on_node+0x190/0x190
[  365.621799] INFO: task kworker/u480:2:1467 blocked for more than 120 seconds.
[  365.621852]   Not tainted 4.8.0-1-amd64 #1
[  365.621886] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  365.621942] kworker/u480:2  D 888d7bd98180 0  1467  2 0x
[  365.622032] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[  365.622085]  888d6bab50c0 888d74b8d040 c03c68fb
888d6a1f7a58
[  365.622151]  888d6a1f8000 887e12709508 ff00
888d6bab50c0
[  365.622217]  888d6bab50c0 888d6a1f7a70 aabeb4e1
887e127094a0
[  365.622283] Call Trace:
[  365.622334]  [] ? btrfs_get_token_32+0x6b/0x130 [btrfs]
[  365.622387]  [] ? schedule+0x31/0x80
[  365.622453]  [] ? btrfs_tree_read_lock+0xd5/0x120 [btrfs]
[  365.622506]  [] ? wake_atomic_t_function+0x60/0x60
[  365.622575]  [] ?
btrfs_read_lock_root_node+0x2f/0x40 [btrfs]
[  365.624240]  [] ? btrfs_search_slot+0x756/0x9f0 [btrfs]
[  365.625855]  [] ? swiotlb_map_sg_attrs+0x6a/0x130
[  365.627491]  [] ?
btrfs_lookup_file_extent+0x4a/0x70 [btrfs]
[  365.629124]  [] ? __btrfs_drop_extents+0x164/0xdd0 [btrfs]
[  365.630692]  [] ? kmem_cache_alloc+0xbc/0x530
[  365.632279]  [] ?
insert_reserved_file_extent.constprop.64+0xb4/0x330 [btrfs]
[  365.633759]  [] ? start_transaction+0x95/0x4b0 [btrfs]
[  365.635211]  [] ?
btrfs_finish_ordered_io+0x307/0x680 [btrfs]
[  365.636647]  [] ? check_preempt_curr+0x50/0x90
[  365.638095]  [] ?
btrfs_scrubparity_helper+0xd1/0x2d0 [btrfs]
[  365.639526]  [] ? process_one_work+0x160/0x410
[  365.640959]  [] ? worker_thread+0x4d/0x480
[  365.642361]  [] ? process_one_work+0x410/0x410
[  365.643769]  [] ? kthread+0xcd/0xf0
[  365.645072]  [] ? ret_from_fork+0x1f/0x40
[  365.646349]  [] ? kthread_create_on_node+0x190/0x190
[  365.647669] INFO: task btrfs-transacti:4130 blocked for more than
120 seconds.
[  365.648981]   Not tainted 4.8.0-1-amd64 #1
[  365.650270] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[  365.651585] btrfs-transacti D 888d7bc18180 0  4130  2 0x
[  365.652928]  888d6a269000 888d71ec aa6e6f39
888d6d38be08
[  365.654270]  888d6d38c000 888d720eb9f0 888d720eb800
888d720eb9f0
[  365.655612]   888d6a269000 aabeb4e1
888d6b57e3a0
[  365.656951] Call Trace:
[  365.658183]  [] ? try_to_del_timer_sync+0x59/0x80
[  365.659411]  [] ? schedule+0x31/0x80
[  365.660680]  [] ?
wait_current_trans.isra.21+0xcd/0x110 [btrfs]
[  365.662001]  [] ? wake_atomic_t_function+0x60/0x60
[  365.663249]  [] ? 

Re: Help repairing a partition

2016-10-21 Thread Chris Murphy
On Fri, Oct 21, 2016 at 12:36 AM, Suvayu Ali
<fatkasuvayu+li...@gmail.com> wrote:

> I had upgraded to 4.7.3 to test this issue:
>
>   https://bugzilla.redhat.com/show_bug.cgi?id=1372910
>
> It hadn't helped, but I didn't have time to debug it any further.
> Since the Fedora 23 repos have 4.4.1, I guess downgrading is easier
> for me.

Better is to go to http://koji.fedoraproject.org/ and type in
btrfs-progs for the package, and find the most recent x.y-1.z version
- right now that's 4.7.3, although 4.8.1 is probably OK also - it has
no new features, mainly just a pile of bug fixes, which might be
useful. So that'd be either:
btrfs-progs-4.8.1-2.fc26
or
btrfs-progs-4.7.3-1.fc26

And rpmbuild --rebuild them for F23 and then install. I would not
downgrade to 4.4.1 - it's not that it's bad, it's just a waste of time
if it can't help fix the problem, which was very likely caused by the
older progs you have.
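
A rough sketch of that rebuild (the exact file names are illustrative; use whatever 
src.rpm you downloaded from koji):

  # rebuild the koji source RPM on the F23 machine
  rpmbuild --rebuild btrfs-progs-4.7.3-1.fc26.src.rpm
  # install the resulting binary package (built with the local dist tag)
  dnf install ~/rpmbuild/RPMS/x86_64/btrfs-progs-4.7.3-1.fc23.x86_64.rpm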


>
> Thanks for the pointer to the changelog; under 4.7.2 it mentions not
> to repair with 4.7.1, so I'll try `btrfs check --repair` after the
> downgrade.

No. The older the progs the less safe the repair is. And this
particular problem you have probably needs a newer progs to fix it
anyway. So you need to go newer not older. That's pretty much always
the case with Btrfs.


>
>>> followed by this summary:
>>>
>>> checking csums
>>> checking root refs
>>> checking quota groups
>>> Counts for qgroup id: 0/257 are different
>>> our:referenced 7746465792 referenced compressed 7746465792
>>> disk:   referenced 7746461696 referenced compressed 7746461696
>>> diff:   referenced 4096 referenced compressed 4096
>>> our:exclusive 7746465792 exclusive compressed 7746465792
>>> disk:   exclusive 7746461696 exclusive compressed 7746461696
>>> diff:   exclusive 4096 exclusive compressed 4096
>>> Counts for qgroup id: 0/259 are different
>>> our:referenced 135641784320 referenced compressed 135641784320
>>> disk:   referenced 135633862656 referenced compressed 135633862656
>>> diff:   referenced 7921664 referenced compressed 7921664
>>> our:exclusive 135641784320 exclusive compressed 135641784320
>>> disk:   exclusive 135633862656 exclusive compressed 135633862656
>>> diff:   exclusive 7921664 exclusive compressed 7921664
>>> found 167864082432 bytes used err is 0
>>> total csum bytes: 161187492
>>> total tree bytes: 2021015552
>>> total fs tree bytes: 1725759488
>>> total extent tree bytes: 86228992
>>> btree space waste bytes: 386160897
>>> file data blocks allocated: 1269363683328
>>>  referenced 164438126592
>>>
>>> How do I repair this?
>>
>> Yeah good question. I can't tell from the message whether different
>> counts is a bad thing, or if it's just a notification, or what. Yet
>> again btrfs-progs does not help the user make informed decisions, it's
>> really frustrating. I think that part can be ignored though for now,
>> and see if btrfs check --repair can fix the problem now that you have
>> a backup.
>
> Indeed, I have never been this confused about a file system before.
>
> I tried repairing after the downgrade to 4.4.1, it says "Couldn't open
> file system"!  Mounting now works without errors, I can also r/w files
> as normal; go figure!

Oh shit. That's hilarious. I'm not even going to edit what I wrote above.

Anyway, it looks like you have quotas enabled. There are a number of
quota related bug fixes in progs newer than 4.4, so you really ought
to use something newer, and if it breaks then it's a bug and needs a
good bug report write up so it can get fixed.
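
A minimal sketch of recomputing the qgroup counters, assuming quotas stay enabled and 
that a rescan is what's wanted here (the mount point is illustrative):

  # recompute qgroup accounting from scratch (can take a while on big filesystems)
  btrfs quota rescan /mnt
  # check whether the rescan is still running
  btrfs quota rescan -s /mnt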

In the meantime I would be wary with this file system if it's the only
backup copy. (Actually I feel that way no matter the file system.) I'd
make sure btrfs check with progs 4.7.3 or 4.8.1 come up clean (i.e.
err is 0 is generally a good sign), and that a scrub also comes up
clean with no errors: either 'btrfs scrub start ' and then later
check with 'btrfs scrub status' or use -BR flag to not background and
show stats after completion.
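
A sketch of those two ways of running it (the mount point is illustrative):

  # background scrub; check on it later
  btrfs scrub start /mnt
  btrfs scrub status /mnt
  # or stay in the foreground (-B) and print raw per-device stats (-R) when done
  btrfs scrub start -BR /mnt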



-- 
Chris Murphy


Re: Help repairing a partition

2016-10-21 Thread Suvayu Ali
Hi Chris,

Thanks for your response :).

On 21 October 2016 at 05:18, Chris Murphy <li...@colorremedies.com> wrote:
> On Thu, Oct 20, 2016 at 3:20 PM, Suvayu Ali <fatkasuvayu+li...@gmail.com> 
> wrote:
>> Hi,
>>
>> (please CC me in replies, I'm not subscribed)
>>
>> I'm using kernel 4.7.7-100.fc23 with btrfs-progs v4.7.1.
>>
>> I had my /home, /var, and /opt as subvolumes in a btrfs partition.
>> Last night btrfs failed, and I was unable to mount it normally
>> (leading to boot failures).  The journal had messages like this:
>>
>>   BTRFS: open_ctree failed
>>   BTRFS error: super_total_bytes ... mismatch with fs_devices total_rw_bytes
>>   BTRFS error: failed to read chunk tree: -22
>>
>> Finally I managed to mount it manually like this (after making a dd
>> image of the partition):
>>
>>   # mount -t btrfs -o ro,recovery,nospace_cache /dev/sdb2 /mnt
>>
>> and managed to recover my data. Initially "btrfs check" yielded a few
>>
>>   parent transid verify failed on 101679726592 wanted 822619 found 822617
>>
>> and
>>
>>   checksum verify failed on 101756387328 found 78C8A0BC wanted B7C59D79
>>
>> however after backing up my data, I mounted without the "-o ro" (I got
>> a transid related message, but it did mount).  "btrfs check" now spits
>> out a whole bunch of:
>>
>>   Incorrect local backref count on 202118008832 root 259 owner 178928
>> offset 41181184 found 2 wanted 7 back 0x55713fbbf150
>>   Incorrect global backref count on 202118008832 found 2 wanted 7
>>   backpointer mismatch on [202118008832 376832]
>
>
> This is a known problem with btrfs-progs 4.7.1 it should not be used.
> https://btrfs.wiki.kernel.org/index.php/Changelog#btrfs-progs_4.7.1_.28Aug_2016.29
>
> Upgrade to 4.7.3 or 4.8.1 is advised.

I had upgraded to 4.7.3 to test this issue:

  https://bugzilla.redhat.com/show_bug.cgi?id=1372910

It hadn't helped, but I didn't have time to debug it any further.
Since the Fedora 23 repos have 4.4.1, I guess downgrading is easier
for me.

Thanks for the pointer to the changelog; under 4.7.2 it mentions not
to repair with 4.7.1, so I'll try `btrfs check --repair` after the
downgrade.

>> followed by this summary:
>>
>> checking csums
>> checking root refs
>> checking quota groups
>> Counts for qgroup id: 0/257 are different
>> our:referenced 7746465792 referenced compressed 7746465792
>> disk:   referenced 7746461696 referenced compressed 7746461696
>> diff:   referenced 4096 referenced compressed 4096
>> our:exclusive 7746465792 exclusive compressed 7746465792
>> disk:   exclusive 7746461696 exclusive compressed 7746461696
>> diff:   exclusive 4096 exclusive compressed 4096
>> Counts for qgroup id: 0/259 are different
>> our:referenced 135641784320 referenced compressed 135641784320
>> disk:   referenced 135633862656 referenced compressed 135633862656
>> diff:   referenced 7921664 referenced compressed 7921664
>> our:exclusive 135641784320 exclusive compressed 135641784320
>> disk:   exclusive 135633862656 exclusive compressed 135633862656
>> diff:   exclusive 7921664 exclusive compressed 7921664
>> found 167864082432 bytes used err is 0
>> total csum bytes: 161187492
>> total tree bytes: 2021015552
>> total fs tree bytes: 1725759488
>> total extent tree bytes: 86228992
>> btree space waste bytes: 386160897
>> file data blocks allocated: 1269363683328
>>  referenced 164438126592
>>
>> How do I repair this?
>
> Yeah good question. I can't tell from the message whether different
> counts is a bad thing, or if it's just a notification, or what. Yet
> again btrfs-progs does not help the user make informed decisions, it's
> really frustrating. I think that part can be ignored though for now,
> and see if btrfs check --repair can fix the problem now that you have
> a backup.

Indeed, I have never been this confused about a file system before.

I tried repairing after the downgrade to 4.4.1, it says "Couldn't open
file system"!  Mounting now works without errors, I can also r/w files
as normal; go figure!

Cheers,

-- 
Suvayu

Open source is the future. It sets us free.


Re: Help repairing a partition

2016-10-20 Thread Chris Murphy
On Thu, Oct 20, 2016 at 3:20 PM, Suvayu Ali <fatkasuvayu+li...@gmail.com> wrote:
> Hi,
>
> (please CC me in replies, I'm not subscribed)
>
> I'm using kernel 4.7.7-100.fc23 with btrfs-progs v4.7.1.
>
> I had my /home, /var, and /opt as subvolumes in a btrfs partition.
> Last night btrfs failed, and I was unable to mount it normally
> (leading to boot failures).  The journal had messages like this:
>
>   BTRFS: open_ctree failed
>   BTRFS error: super_total_bytes ... mismatch with fs_devices total_rw_bytes
>   BTRFS error: failed to read chunk tree: -22
>
> Finally I managed to mount it manually like this (after making a dd
> image of the partition):
>
>   # mount -t btrfs -o ro,recovery,nospace_cache /dev/sdb2 /mnt
>
> and managed to recover my data. Initially "btrfs check" yielded a few
>
>   parent transid verify failed on 101679726592 wanted 822619 found 822617
>
> and
>
>   checksum verify failed on 101756387328 found 78C8A0BC wanted B7C59D79
>
> however after backing up my data, I mounted without the "-o ro" (I got
> a transid related message, but it did mount).  "btrfs check" now spits
> out a whole bunch of:
>
>   Incorrect local backref count on 202118008832 root 259 owner 178928
> offset 41181184 found 2 wanted 7 back 0x55713fbbf150
>   Incorrect global backref count on 202118008832 found 2 wanted 7
>   backpointer mismatch on [202118008832 376832]


This is a known problem with btrfs-progs 4.7.1 it should not be used.
https://btrfs.wiki.kernel.org/index.php/Changelog#btrfs-progs_4.7.1_.28Aug_2016.29

Upgrade to 4.7.3 or 4.8.1 is advised.




>
> followed by this summary:
>
> checking csums
> checking root refs
> checking quota groups
> Counts for qgroup id: 0/257 are different
> our:referenced 7746465792 referenced compressed 7746465792
> disk:   referenced 7746461696 referenced compressed 7746461696
> diff:   referenced 4096 referenced compressed 4096
> our:exclusive 7746465792 exclusive compressed 7746465792
> disk:   exclusive 7746461696 exclusive compressed 7746461696
> diff:   exclusive 4096 exclusive compressed 4096
> Counts for qgroup id: 0/259 are different
> our:referenced 135641784320 referenced compressed 135641784320
> disk:   referenced 135633862656 referenced compressed 135633862656
> diff:   referenced 7921664 referenced compressed 7921664
> our:exclusive 135641784320 exclusive compressed 135641784320
> disk:   exclusive 135633862656 exclusive compressed 135633862656
> diff:   exclusive 7921664 exclusive compressed 7921664
> found 167864082432 bytes used err is 0
> total csum bytes: 161187492
> total tree bytes: 2021015552
> total fs tree bytes: 1725759488
> total extent tree bytes: 86228992
> btree space waste bytes: 386160897
> file data blocks allocated: 1269363683328
>  referenced 164438126592
>
> How do I repair this?

Yeah good question. I can't tell from the message whether different
counts is a bad thing, or if it's just a notification, or what. Yet
again btrfs-progs does not help the user make informed decisions, it's
really frustrating. I think that part can be ignored though for now,
and see if btrfs check --repair can fix the problem now that you have
a backup.


-- 
Chris Murphy


Help repairing a partition

2016-10-20 Thread Suvayu Ali
Hi,

(please CC me in replies, I'm not subscribed)

I'm using kernel 4.7.7-100.fc23 with btrfs-progs v4.7.1.

I had my /home, /var, and /opt as subvolumes in a btrfs partition.
Last night btrfs failed, and I was unable to mount it normally
(leading to boot failures).  The journal had messages like this:

  BTRFS: open_ctree failed
  BTRFS error: super_total_bytes ... mismatch with fs_devices total_rw_bytes
  BTRFS error: failed to read chunk tree: -22

Finally I managed to mount it manually like this (after making a dd
image of the partition):

  # mount -t btrfs -o ro,recovery,nospace_cache /dev/sdb2 /mnt

and managed to recover my data. Initially "btrfs check" yielded a few

  parent transid verify failed on 101679726592 wanted 822619 found 822617

and

  checksum verify failed on 101756387328 found 78C8A0BC wanted B7C59D79

however after backing up my data, I mounted without the "-o ro" (I got
a transid related message, but it did mount).  "btrfs check" now spits
out a whole bunch of:

  Incorrect local backref count on 202118008832 root 259 owner 178928
offset 41181184 found 2 wanted 7 back 0x55713fbbf150
  Incorrect global backref count on 202118008832 found 2 wanted 7
  backpointer mismatch on [202118008832 376832]

followed by this summary:

checking csums
checking root refs
checking quota groups
Counts for qgroup id: 0/257 are different
our:referenced 7746465792 referenced compressed 7746465792
disk:   referenced 7746461696 referenced compressed 7746461696
diff:   referenced 4096 referenced compressed 4096
our:exclusive 7746465792 exclusive compressed 7746465792
disk:   exclusive 7746461696 exclusive compressed 7746461696
diff:   exclusive 4096 exclusive compressed 4096
Counts for qgroup id: 0/259 are different
our:referenced 135641784320 referenced compressed 135641784320
disk:   referenced 135633862656 referenced compressed 135633862656
diff:   referenced 7921664 referenced compressed 7921664
our:exclusive 135641784320 exclusive compressed 135641784320
disk:   exclusive 135633862656 exclusive compressed 135633862656
diff:   exclusive 7921664 exclusive compressed 7921664
found 167864082432 bytes used err is 0
total csum bytes: 161187492
total tree bytes: 2021015552
total fs tree bytes: 1725759488
total extent tree bytes: 86228992
btree space waste bytes: 386160897
file data blocks allocated: 1269363683328
 referenced 164438126592

How do I repair this?  Any thoughts and guidance would be greatly
appreciated.  I am not well versed with all the btrfs commands and
utilities, so I hope I have managed to provide all the right
information.

Thanks,

PS: I see that it now it mounts normally as well!  As in, with default
fstab options, so I guess I can boot.  I would still like to repair
the errors.

-- 
Suvayu

Open source is the future. It sets us free.


Re: Some help with the code.

2016-09-09 Thread David Sterba
On Tue, Sep 06, 2016 at 04:22:25PM +0100, Tomasz Kusmierz wrote:
> This is predominantly for maintainers:
> 
> I've noticed that there is a lot of code for btrfs ... and after few
> glimpses I've noticed that there are occurrences which beg for some
> refactoring to make it less of a pain to maintain.
> 
> I'm speaking of occurrences where:
> - within a function there are multiple checks for a null pointer, and for
> whatever is hanging on the end of that pointer, before finally calling a
> function, passing the pointer to it, and watching it perform the
> same checks just to deallocate the stuff on the end of the pointer.

Can you please point me to an example? If it's a bad pattern it would be
worth cleaning up.

> - single line functions ... called only in two places

That might not always be useless, as the function name tells us what it
does, not how, so it's a form of self-documenting code. If the function
body is some common code construct, it would be harder to grep for it.

But I understand what you mean. This could be also a leftover from some
broader changes that removed calls, reduced function size to the
one line.

> and so on.
> 
> I know that you guys are busy, but maintaining code that is only
> growing must be a pain.

Depends. Standalone features bring a lot of new code, but it's
separated. A random sample of patches from recent releases tells me that
net line growth is spread across many patches that add just a few lines
(e.g. enhanced tests, more helpers).

https://btrfs.wiki.kernel.org/index.php/Contributors#Statistics

Doing broader cleanups is good when done from time to time, as it tends
to interfere with other patches, so it's more a matter of scheduling
when to do it. The beginning or end of the particular development cycle
are good candidates.

Reducing size should be done in a way that does not make the code less
readable, which is a somewhat subjective metric but should be sorted out when
patches (or samples) are posted. That said, cleanups and refactoring
patches are welcome.


Some help with the code.

2016-09-06 Thread Tomasz Kusmierz
This is predominantly for maintainers:

I've noticed that there is a lot of code for btrfs ... and after few
glimpses I've noticed that there are occurrences which beg for some
refactoring to make it less of a pain to maintain.

I'm speaking of occurrences where:
- within a function there are multiple checks for a null pointer, and for
whatever is hanging on the end of that pointer, before finally calling a
function, passing the pointer to it, and watching it perform the
same checks just to deallocate the stuff on the end of the pointer.
- single line functions ... called only in two places

and so on.

I know that you guys are busy, but maintaining code that is only
growing must be a pain.


[PATCH 12/13] btrfs-progs: mkfs: help and usage now to stdout

2016-08-23 Thread David Sterba
Signed-off-by: David Sterba 
---
 mkfs.c | 38 +++---
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/mkfs.c b/mkfs.c
index f063323903dc..ef0b099a58d7 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -344,31 +344,31 @@ static int create_data_reloc_tree(struct 
btrfs_trans_handle *trans,
 
 static void print_usage(int ret)
 {
-   fprintf(stderr, "usage: mkfs.btrfs [options] dev [ dev ... ]\n");
-   fprintf(stderr, "options:\n");
-   fprintf(stderr, "\t-A|--alloc-start START  the offset to start the 
FS\n");
-   fprintf(stderr, "\t-b|--byte-count SIZEtotal number of bytes in the 
FS\n");
-   fprintf(stderr, "\t-d|--data PROFILE   data profile, raid0, raid1, 
raid5, raid6, raid10, dup or single\n");
-   fprintf(stderr, "\t-f|--force  force overwrite of existing 
filesystem\n");
-   fprintf(stderr, "\t-l|--leafsize SIZE  deprecated, alias for 
nodesize\n");
-   fprintf(stderr, "\t-L|--label LABELset a label\n");
-   fprintf(stderr, "\t-m|--metadata PROFILE   metadata profile, values 
like data profile\n");
-   fprintf(stderr, "\t-M|--mixed  mix metadata and data 
together\n");
-   fprintf(stderr, "\t-n|--nodesize SIZE  size of btree nodes\n");
-   fprintf(stderr, "\t-s|--sectorsize SIZEmin block allocation (may 
not mountable by current kernel)\n");
-   fprintf(stderr, "\t-r|--rootdir DIRthe source directory\n");
-   fprintf(stderr, "\t-K|--nodiscard  do not perform whole device 
TRIM\n");
-   fprintf(stderr, "\t-O|--features LIST  comma separated list of 
filesystem features, use '-O list-all' to list features\n");
-   fprintf(stderr, "\t-U|--uuid UUID  specify the filesystem 
UUID\n");
-   fprintf(stderr, "\t-q|--quiet  no messages except 
errors\n");
-   fprintf(stderr, "\t-V|--versionprint the mkfs.btrfs version 
and exit\n");
+   printf("usage: mkfs.btrfs [options] dev [ dev ... ]\n");
+   printf("options:\n");
+   printf("\t-A|--alloc-start START  the offset to start the FS\n");
+   printf("\t-b|--byte-count SIZEtotal number of bytes in the FS\n");
+   printf("\t-d|--data PROFILE   data profile, raid0, raid1, raid5, 
raid6, raid10, dup or single\n");
+   printf("\t-f|--force  force overwrite of existing 
filesystem\n");
+   printf("\t-l|--leafsize SIZE  deprecated, alias for nodesize\n");
+   printf("\t-L|--label LABELset a label\n");
+   printf("\t-m|--metadata PROFILE   metadata profile, values like data 
profile\n");
+   printf("\t-M|--mixed  mix metadata and data together\n");
+   printf("\t-n|--nodesize SIZE  size of btree nodes\n");
+   printf("\t-s|--sectorsize SIZEmin block allocation (may not 
mountable by current kernel)\n");
+   printf("\t-r|--rootdir DIRthe source directory\n");
+   printf("\t-K|--nodiscard  do not perform whole device TRIM\n");
+   printf("\t-O|--features LIST  comma separated list of filesystem 
features, use '-O list-all' to list features\n");
+   printf("\t-U|--uuid UUID  specify the filesystem UUID\n");
+   printf("\t-q|--quiet  no messages except errors\n");
+   printf("\t-V|--versionprint the mkfs.btrfs version and 
exit\n");
exit(ret);
 }
 
 static void print_version(void) __attribute__((noreturn));
 static void print_version(void)
 {
-   fprintf(stderr, "mkfs.btrfs, part of %s\n", PACKAGE_STRING);
+   printf("mkfs.btrfs, part of %s\n", PACKAGE_STRING);
exit(0);
 }
 
-- 
2.7.1



Re: Pointers to mirroring partitions (w/ encryption?) help?

2016-06-04 Thread Andrei Borzenkov
04.06.2016 20:31, B. S. writes:
>>>
>>> Yeah, when it comes to FDE, you either have to make your peace with
>>> trusting the manufacturer, or you can't. If you are going to boot
>>> your system with a traditional boot loader, an unencrypted partition
>>> is mandatory.
>>
>> No, it is not with grub2 that supports LUKS (and geli in *BSD world). Of
>> course initial grub image must be written outside of encrypted area and
>> readable by firmware.
> 
> Good to know. Do you have a link to a how to on such?
> 

As long as you use grub-install and grub-mkconfig this "just works", in
the sense that they both detect the encrypted container and add the necessary
drivers and other steps to access it. The only manual setup is to add

GRUB_ENABLE_CRYPTODISK=y

to /etc/default/grub.
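
A minimal sketch of the whole sequence (config path and target device are illustrative 
and distro-dependent):

  # /etc/default/grub
  GRUB_ENABLE_CRYPTODISK=y

  # regenerate the config and reinstall the boot image
  grub-mkconfig -o /boot/grub/grub.cfg
  grub-install /dev/sda    # BIOS install; a UEFI install takes no device argument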

You will need to enter the LUKS password twice - once in GRUB, once in
the kernel (there is no interface for passing the passphrase from the bootloader
to the Linux kernel). Some suggest including the passphrase in the initrd (on
the assumption that it is encrypted already anyway); there are patches to
support the use of an external keyfile in grub as well.




Re: Pointers to mirroring partitions (w/ encryption?) help?

2016-06-04 Thread Andrei Borzenkov
04.06.2016 22:05, Chris Murphy writes:
...
>>
>> Yeah, when it comes to FDE, you either have to make your peace with
>> trusting the manufacturer, or you can't. If you are going to boot your
>> system with a traditional boot loader, an unencrypted partition is
>> mandatory.
> 
> /boot can be encrypted, GRUB supports this, but I'm unaware of any
> installer that does.

openSUSE supports installation on a LUKS-encrypted /boot. The installer has
some historical limitations regarding how the encrypted container can be
set up, but the bootloader part should be OK (including Secure Boot support).

> The ESP can't be encrypted.
> 

It should be possible if you use hardware encryption (SED).

> http://dustymabe.com/2015/07/06/encrypting-more-boot-joins-the-party/
> 
> It's vaguely possible for the SED variety of drive to support fully
> encrypted everything, including the ESP. The problem is we don't have
> OPAL support on Linux at all anywhere. And for some inexplicable
> reason, the TCG hasn't commissioned a free UEFI application for
> managing the keys and unlocking the drive in the preboot environment.
> For now, it seems, such support has to already be in the firmware.
> 




Re: Pointers to mirroring partitions (w/ encryption?) help?

2016-06-04 Thread Chris Murphy
On Fri, Jun 3, 2016 at 7:39 PM, Justin Brown  wrote:
> Here's some thoughts:
>
>> Assume a CD sized (680MB) /boot
>
> Some distros carry patches for grub that allow booting from Btrfs

Upstream GRUB has had Btrfs support for a long time. There's been no
need for distros to carry separate patches for years. The exception is
openSUSE, where they have a healthy set of patches for supporting the
discovery of and boot of read only snapshots created by snapper. Those
patches are not merged upstream, I'm not sure if they will be.


>, so
> no separate /boot file system is required. (Fedora does not; Ubuntu --
> and therefore probably all Debians -- does.)

The problem on Fedora is that they depend on grubby to modify the
grub.cfg. And grubby gets confused when the kernel/initramfs are
located on a Btrfs subvolume other than the top level. And Fedora's
installer only installs the system onto a subvolume (specifically,
every mount point defined in the installer becomes a subvolume if you
use Btrfs). So it's stuck being unable to support /boot if it's on
Btrfs.



>
>> perhaps a 200MB (?) sized EFI partition
>
> Way bigger than necessary. It should only be 1-2MiB, and IIRC 2MiB
> might be the max UEFI allows.

You're confusing the ESP with BIOSBoot. The minimum size for 512 byte
sector drives per Microsoft's technotes is 100MiB. Most OEMs use
something between 100MiB and 300MiB. Apple creates a 200MB ESP even
though they don't use it for booting, rather just to stage firmware
updates.

The UEFI spec itself doesn't say how big the ESP should be. 200MiB is
sane for 512 byte drives. It needs to be 260MiB minimum for 4Kn
drives, because the minimum number of FAT32 allocation units, at 4096
bytes each, requires a 260MiB minimum volume.




>
>> The additional problem is most articles reference FDE (Full Disk Encryption) 
>> - but that doesn't seem to be prudent. e.g. Unencrypted /boot. So having 
>> problems finding concise links on the topics, -FDE -"Full Disk Encryption".
>
> Yeah, when it comes to FDE, you either have to make your peace with
> trusting the manufacturer, or you can't. If you are going to boot your
> system with a traditional boot loader, an unencrypted partition is
> mandatory.

/boot can be encrypted, GRUB supports this, but I'm unaware of any
installer that does. The ESP can't be encrypted.

http://dustymabe.com/2015/07/06/encrypting-more-boot-joins-the-party/

It's vaguely possible for the SED variety of drive to support fully
encrypted everything, including the ESP. The problem is we don't have
OPAL support on Linux at all anywhere. And for some inexplicable
reason, the TCG hasn't commissioned a free UEFI application for
managing the keys and unlocking the drive in the preboot environment.
For now, it seems, such support has to already be in the firmware.




-- 
Chris Murphy


Re: Pointers to mirroring partitions (w/ encryption?) help?

2016-06-04 Thread B. S.


On 06/04/2016 03:46 AM, Andrei Borzenkov wrote:

04.06.2016 04:39, Justin Brown writes:

Here's some thoughts:


Assume a CD sized (680MB) /boot


Some distros carry patches for grub that allow booting from Btrfs,
so no separate /boot file system is required. (Fedora does not;
Ubuntu -- and therefore probably all Debians -- does.)



Which grub (or which Fedora) do you mean? btrfs support is upstream
since 2010.

There are restrictions, in particular RAID levels support (RAID5/6 are
not implemented).


Good to know / be reminded of (such specifics) - thanks.


perhaps a 200MB (?) sized EFI partition


Way bigger than necessary. It should only be 1-2MiB, and IIRC 2MiB
might be the max UEFI allows.



You may want to review recent discussion on systemd regarding systemd
boot (a.k.a. gummiboot) which wants to have ESP mounted as /boot.

UEFI mandates support for FAT32 on ESP so max size should be whatever
max size FAT32 has.
...



The additional problem is most articles reference FDE (Full Disk
Encryption) - but that doesn't seem to be prudent. e.g. Unencrypted
/boot. So having problems finding concise links on the topics, -FDE
-"Full Disk Encryption".


Yeah, when it comes to FDE, you either have to make your peace with
trusting the manufacturer, or you can't. If you are going to boot
your system with a traditional boot loader, an unencrypted partition
is mandatory.


No, it is not with grub2 that supports LUKS (and geli in *BSD world). Of
course initial grub image must be written outside of encrypted area and
readable by firmware.


Good to know. Do you have a link to a how to on such?


That being said, we live in a world with UEFI Secure
Boot. While your EFI parition must be unencrypted vfat, you can sign
the kernels (or shims), and the UEFI can be configured to only boot
signed executables, including only those signed by your own key. Some
distros already provide this feature, including using keys probably
already trusted by the default keystore.



UEFI Secure Boot is rather orthogonal to the question of disk encryption.


Perhaps, but not orthogonal to the OP question.

In the end, the OP is about all this 'stuff' landing at once, the 
majority btrfs centric, and a call for help finding the end of the 
string to pull on in a linear way. e.g., as pointed out, most articles 
premising FDE, which is not in play per OP. The OP requesting pointers 
to good concise how to links.



Re: Pointers to mirroring partitions (w/ encryption?) help?

2016-06-04 Thread Andrei Borzenkov
04.06.2016 04:39, Justin Brown writes:
> Here's some thoughts:
> 
>> Assume a CD sized (680MB) /boot
> 
> Some distros carry patches for grub that allow booting from Btrfs,
> so no separate /boot file system is required. (Fedora does not;
> Ubuntu -- and therefore probably all Debians -- does.)
> 

Which grub (or which Fedora) do you mean? btrfs support is upstream
since 2010.

There are restrictions, in particular RAID levels support (RAID5/6 are
not implemented).

>> perhaps a 200MB (?) sized EFI partition
> 
> Way bigger than necessary. It should only be 1-2MiB, and IIRC 2MiB 
> might be the max UEFI allows.
> 

You may want to review recent discussion on systemd regarding systemd
boot (a.k.a. gummiboot) which wants to have ESP mounted as /boot.

UEFI mandates support for FAT32 on ESP so max size should be whatever
max size FAT32 has.

...
> 
>> The additional problem is most articles reference FDE (Full Disk
>> Encryption) - but that doesn't seem to be prudent. e.g. Unencrypted
>> /boot. So having problems finding concise links on the topics, -FDE
>> -"Full Disk Encryption".
> 
> Yeah, when it comes to FDE, you either have to make your peace with 
> trusting the manufacturer, or you can't. If you are going to boot
> your system with a traditional boot loader, an unencrypted partition
> is mandatory.

No, it is not with grub2 that supports LUKS (and geli in *BSD world). Of
course initial grub image must be written outside of encrypted area and
readable by firmware.

> That being said, we live in a world with UEFI Secure
> Boot. While your EFI parition must be unencrypted vfat, you can sign
> the kernels (or shims), and the UEFI can be configured to only boot
> signed executables, including only those signed by your own key. Some
> distros already provide this feature, including using keys probably
> already trusted by the default keystore.
> 

UEFI Secure Boot is rather orthogonal to the question of disk encryption.



Re: Pointers to mirroring partitions (w/ encryption?) help?

2016-06-03 Thread B. S.
r, an unencrypted partition
is mandatory. That being said, we live in a world with UEFI Secure
Boot.


Another learning curve (UEFI) to swallow at the same time as all the 
other here. Current install is the first time it has occurred to me to 
try to incorporate SecureBoot, UEFI, crypt, and all such 'goodness' on a 
fresh (raw) install. Debian is bringing apt-secure along for the ride on 
me, too.



While your EFI partition must be unencrypted vfat, you can sign
the kernels (or shims), and the UEFI can be configured to only boot
signed executables, including only those signed by your own key. Some
distros already provide this feature, including using keys probably
already trusted by the default keystore.


mirror subvolumes (or it inherently comes along for the ride?)


Yes, that is correct. Just to give some more background: the data
and metadata profiles control "mirroring," and they are set at the
file system level. Subvolumes live entirely within one file system,
so whatever profile is set in the FS applies to subvolumes.


Gotcha, thus your dup observation.

However ... the question was aimed at a crypto sda3, thus containing @, 
and probably @home, sda4 created ... how might one kick in to (btrfs) 
mirror sda3 in sda4, including @ and @home.


I would guess, from your comment, once one adds sda4 to the sda3 set, 
all that (sda3) profile applies, gets applied to sda4, and all the btrfs 
magic goodness ... just happens.


Particularly after running balance to force all that goodness to happen 
at once / now, rather than upon next write.



So, I could take an HD, create partitions as above (how? e.g. Set
up encryption / btrfs mirror volumes), then clonezilla (?)
partitions from a current machine in.


Are you currently using Btrfs? If so, use Btrfs' `send` and
`receive` commands.


Yeah. Ick. :-) Have had better luck in the past just cloning or mounting 
and cp -a. Likely, my lack of experience was the issue.


In any case, here, the question was pointed at a new install.

> That should be a lot friendlier to your SSD. (I'll

take this opportunity to say that you need to consider the `discard`
mount *and* `/etc/crypttab` options. Discard -- or scheduling
`fstrim` -- is extremely important to maintain optimal performance of
a SSD, but there are some privacy trade-offs on encrypted systems.)
If not, then `cp -a` or similar will work.


SSD not yet in play here, but I do take your point. I had to work 
through all that on the SSD I do have, so I do know to peek at such 
whenever an SSD comes into play.


Didn't know about the /etc/crypttab options, thanks for that. Heck, 
hadn't gotten as far as knowing there was an /etc/crypttab.


- thus, I think part of my OP question is what all am I attempting to 
swallow in one go on a fresh install here? I get dmcrypt and uefi is 
involved, so I can start to break down the googling into component 
pieces. Thus, I think, the request for an appropriate link. Something 
that does it all on a fresh install in one go would be good, 
particularly if it identifies the major sub-topics, and has 'links to 
more info'.



Obviously, you'll have to
get your boot mechanism and file system identifiers updated in
addition to `/etc/crypttab` described above.

Lastly, strongly consider `autodefrag` and possibly setting some
highly violatile -- but *unimportant* -- directories to `nodatacow`
via purging and `chattr +C`. (I do this for ~/.cache and
/var/cache.)


Yep, autodefrag is in the mount options. I have a number of home systems 
running btrfs for some years now. Started with Kubuntu 12.04 LTS (since 
running hwe kernels to get later btrfs tools), and a couple of 14.04's. 
GB rsyncs and mondoarchives fly all about the house in cascading 
archives, nightly.


A recent 4TB HD failure is part of the reason for the OP questions. A 
scrub at the time revealed many failures, and dealing with that and 
figuring out which files to fetch from secondary archives was a 
challenge. BUT, FANTASTICALLY, for the first time (unlike in pre-btrfs days), at 
least btrfs / something specifically identified WHICH files were 
botched. I wasn't left wondering what botched file would reveal itself 
months from now ... after it had cascaded to all backups!


Having been bitten, and facing a new install, thought I'd better OP.


Yet not looking to put in a 2nd HD


If you change your mind and decide on a backup device, or even if
you just want local backup snapshots, one of the best snapshot
managers is btrfs-sxbackup (no association with the FS project).


Thank you for that!

Thus far, keeping only the OS on / and mondoarchiving it nightly, and 
rsync'ing /everythingelse seems to be doing the job. Perhaps even 
keeping the 'after the failure' complexity level down.



On Fri, Jun 3, 2016 at 3:30 PM, B. S. <bs27...@gmail.com> wrote:

Hallo. I'm continuing on sinking in to btrfs, so pointers to
concise help articles appreciated. I've got a couple new home
systems, so perhaps it's time

Re: Pointers to mirroring partitions (w/ encryption?) help?

2016-06-03 Thread Justin Brown
Here's some thoughts:

> Assume a CD sized (680MB) /boot

Some distros carry patches for grub that allow booting from Btrfs, so
no separate /boot file system is required. (Fedora does not; Ubuntu --
and therefore probably all Debians -- does.)

> perhaps a 200MB (?) sized EFI partition

Way bigger than necessary. It should only be 1-2MiB, and IIRC 2MiB
might be the max UEFI allows.

>  then creates another partition for mirroring, later. IIUC, btrfs add device 
> /dev/sda4 / is appropriate, then. Then running a balance seems recommended.

Don't do this. It's not going to provide any additional protection
that you can't do in a smarter way. If you only have one device and
want data duplication, just use the `dup` data profile (settable via
`balance`). In fact, by default Btrfs uses the `dup` profile for
metadata (and `single` for data). You'll get all the data integrity
benefits with `dup`.
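
A minimal sketch of that conversion (assuming a kernel/progs combination that allows 
the dup profile for data on a single-device filesystem; the mount point is illustrative):

  # convert existing data chunks to the dup profile in place
  btrfs balance start -dconvert=dup /
  # verify the resulting profiles
  btrfs filesystem df /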

One of the best features, and initially most confusing things, about Btrfs is
how much is done "within" a file system. (There is a certain "the
Btrfs way" to it.)

> Confusing, however, is having those (both) partitions encrypted. Seems some 
> work is needed beforehand. But I've never done encryption.

(This is moot if you go with `dup`.) It's actually quite easy with
every major distro. If we're talking about a fresh install, the distro
installer probably has full support for passphrase-based dm-crypt LUKS
encryption, including multiple volumes sharing a passphrase. An
existing install should be convertible without much trouble. It's
usually just a matter of setting up the container with `cryptsetup`,
populating `/etc/crypttab`, possibly adding crypto modules to your
initrd and/or updating settings, and rebuilding the initrd. (I have
first-hand experience doing this on a Fedora install recently; it
took about half an hour even though I knew nothing about Fedora's `dracut`
initrd generator tool.)
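
A minimal sketch of that conversion, with made-up device names and assuming
the data has already been copied off (luksFormat destroys whatever is on
the partition):

$ cryptsetup luksFormat /dev/sda3
$ cryptsetup luksOpen /dev/sda3 cryptroot
$ mkfs.btrfs -L root /dev/mapper/cryptroot
# ...restore the backed-up data into the new filesystem...
# /etc/crypttab entry (name, source device, key, options):
#   cryptroot  UUID=<uuid of /dev/sda3>  none  luks
$ dracut -f        # Fedora; Debian/Ubuntu equivalent: update-initramfs -u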

If you do need multiple encrypted file systems, simply use the same
passphrase for all volumes (but never do this by cloning the LUKS
headers). You'll only need to enter it once at boot.

> The additional problem is most articles reference FDE (Full Disk Encryption) 
> - but that doesn't seem to be prudent. e.g. Unencrypted /boot. So having 
> problems finding concise links on the topics, -FDE -"Full Disk Encryption".

Yeah, when it comes to FDE, you either have to make your peace with
trusting the manufacturer, or you can't. If you are going to boot your
system with a traditional boot loader, an unencrypted partition is
mandatory. That being said, we live in a world with UEFI Secure Boot.
While your EFI partition must be unencrypted vfat, you can sign the
kernels (or shims), and the UEFI can be configured to only boot signed
executables, including only those signed by your own key. Some distros
already provide this feature, including using keys probably already
trusted by the default keystore.
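
A very rough sketch of the manual route (key names are made up, and distro
tooling such as shim/MOK enrollment usually wraps most of this for you):

$ openssl req -new -x509 -newkey rsa:2048 -nodes -days 3650 \
      -subj "/CN=my kernel signing key/" -keyout MOK.key -out MOK.crt
$ sbsign --key MOK.key --cert MOK.crt --output vmlinuz.signed vmlinuz
$ openssl x509 -in MOK.crt -outform DER -out MOK.der
$ mokutil --import MOK.der   # key gets enrolled via shim's MokManager on next boot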

> mirror subvolumes (or it inherently comes along for the ride?)

Yes, that is correct. Just to give some more background: the data and
metadata profiles control "mirroring," and they are set at the file
system level. Subvolumes live entirely within one file system, so
whatever profile is set in the FS applies to subvolumes.
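
If you want to see which profiles a given filesystem is using (mount point
assumed), this is enough; the Data/Metadata/System lines name the active
profile, and every subvolume under that mount shares it:

$ btrfs filesystem df /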

> So, I could take an HD, create partitions as above (how? e.g. Set up 
> encryption / btrfs mirror volumes), then clonezilla (?) partitions from a 
> current machine in.

Are you currently using Btrfs? If so, use Btrfs' `send` and `receive`
commands. That should be a lot friendlier to your SSD. (I'll take this
opportunity to say that you need to consider the `discard` mount *and*
`/etc/crypttab` options. Discard -- or scheduling `fstrim` -- is
extremely important for maintaining optimal performance of an SSD, but
there are some privacy trade-offs on encrypted systems.) If not, then
`cp -a` or similar will work. Obviously, you'll have to get your boot
mechanism and file system identifiers updated in addition to
`/etc/crypttab` described above.
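
A rough sketch of both points, with made-up paths:

$ btrfs subvolume snapshot -r /mnt/old/rootvol /mnt/old/rootvol_ro
$ btrfs send /mnt/old/rootvol_ro | btrfs receive /mnt/new
$ fstrim -v /mnt/new                 # one-shot trim of the new filesystem
$ systemctl enable fstrim.timer      # periodic trim, if your distro ships the unit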

Lastly, strongly consider `autodefrag` and possibly setting some
highly volatile -- but *unimportant* -- directories to `nodatacow`
via purging and `chattr +C`. (I do this for ~/.cache and /var/cache.)
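
The purge-and-recreate dance, sketched for ~/.cache (chattr +C only affects
files created after the flag is set, hence the copy back in):

$ mv ~/.cache ~/.cache.old
$ mkdir ~/.cache
$ chattr +C ~/.cache
$ cp -a ~/.cache.old/. ~/.cache/     # optional; the new copies are nodatacow
$ rm -rf ~/.cache.old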

> Yet not looking to put in a 2nd HD

If you change your mind and decide on a backup device, or even if you
just want local backup snapshots, one of the best snapshot managers is
btrfs-sxbackup (no association with the FS project).

On Fri, Jun 3, 2016 at 3:30 PM, B. S. <bs27...@gmail.com> wrote:
> Hallo. I'm continuing on sinking in to btrfs, so pointers to concise help
> articles appreciated. I've got a couple new home systems, so perhaps it's
> time to investigate encryption, and given the bit rot I've seen here,
> perhaps time to mirror volumes so the wonderful btrfs self-healing
> facilities can be taken advantage of.
>
> Problem with today's hard drives, a quick look at C

Pointers to mirroring partitions (w/ encryption?) help?

2016-06-03 Thread B. S.
Hallo. I'm continuing on sinking in to btrfs, so pointers to concise 
help articles appreciated. I've got a couple new home systems, so 
perhaps it's time to investigate encryption, and given the bit rot I've 
seen here, perhaps time to mirror volumes so the wonderful btrfs 
self-healing facilities can be taken advantage of.


Problem with today's hard drives, a quick look at Canada Computer shows 
the smallest drives 500GB, 120GB SSDs, far more than the 20GB or so an 
OS needs. Yet not looking to put in a 2nd HD, either. It feels like 
mirroring volumes makes sense.


(EFI [partitions] also seem to be sticking their fingers in here.)

Assume a CD sized (680MB) /boot, and perhaps a 200MB (?) sized EFI 
partition, it seems to me one sets up / as usual (less complex install), 
then creates another partition for mirroring, later. IIUC, btrfs add 
device /dev/sda4 / is appropriate, then. Then running a balance seems 
recommended.


Confusing, however, is having those (both) partitions encrypted. Seems 
some work is needed beforehand. But I've never done encryption. I have 
come across https://github.com/gebi/keyctl_keyscript, so I understand 
there will be gotchas to deal with - later. But not there yet, and not 
real sure how to start.


The additional problem is most articles reference FDE (Full Disk 
Encryption) - but that doesn't seem to be prudent. e.g. Unencrypted 
/boot. So having problems finding concise links on the topics, -FDE 
-"Full Disk Encryption".


Any good links to concise instructions on building / establishing 
encrypted btrfs mirror volumes? dm_crypt seems to be the basis, and I'm not 
looking to add LVM - it seems an unnecessary extra layer of complexity.


It also feels like I could mkfs.btrfs /dev/sda3 /dev/sda4, then mirror 
subvolumes (or it inherently comes along for the ride?) - so my 
confusion level increases. Especially if encryption is added to the mix.


So, I could take an HD, create partitions as above (how? e.g. Set up 
encryption / btrfs mirror volumes), then clonezilla (?) partitions from 
a current machine in. I assume mounting a live cd then cp -a from old 
disk partition to new disk partition won't 'just work'. (?)


Article suggestions?


Re: Help ! "btrfs check" looping recursive

2016-04-15 Thread Swâmi Petaramesh
Hi there,

Thanks for your reply Duncan !

On 15/04/2016 02:24, Duncan wrote:
> Swâmi Petaramesh posted on Thu, 14 Apr 2016 18:56:29 +0200 as excerpted:
> 
>> It seems that i have a "btrfs check" process that’s stuck in an infinite
>> recursive loop…

> Given the prompt above, you're running from parted-magic, but that 
> doesn't tell us the btrfs-progs or kernel versions unless we look it up.

True, I forgot to specify this.

This FS is from a machine that currently runs : 4.5.0-1-ARCH

As I had a couple of "dead" files (KDE session files in
~/.config/session) that showed "" for all their attributes and
couldn’t be accessed nor deleted, I ran "btrfs check" from a reasonably
recent live Parted Magic, which has :

- Kernel : 4.3.2
- BTRFS tools : 4.1.2

> So kernel and btrfs-progs version?  Also, btrfs filesystem show output 
> might be useful.

Taken from the currently running machine (as in the end I chose to
abort the "btrfs check" using ^C):

# btrfs fi sh
Label: 'LINUX'  uuid: 13c87f57-3a85-4daf-a4bf-ba777407c169
Total devices 1 FS bytes used 268.07GiB
devid    1 size 334.50GiB used 294.54GiB path /dev/mapper/VGZ-LINUX


# btrfs fi df /
Data, single: total=289.46GiB, used=264.17GiB
System, DUP: total=32.00MiB, used=56.00KiB
Metadata, single: total=5.01GiB, used=3.90GiB
GlobalReserve, single: total=512.00MiB, used=0.00B


# btrfs fi us /
Overall:
    Device size:                 334.50GiB
    Device allocated:            294.54GiB
    Device unallocated:           39.96GiB
    Device missing:                  0.00B
    Used:                        268.07GiB
    Free (estimated):             65.26GiB  (min: 45.28GiB)
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:              512.00MiB  (used: 0.00B)

Data,single: Size:289.46GiB, Used:264.17GiB
   /dev/mapper/VGZ-LINUX     289.46GiB

Metadata,single: Size:5.01GiB, Used:3.90GiB
   /dev/mapper/VGZ-LINUX       5.01GiB

System,DUP: Size:32.00MiB, Used:56.00KiB
   /dev/mapper/VGZ-LINUX      64.00MiB

Unallocated:
   /dev/mapper/VGZ-LINUX      39.96GiB


# df -h /
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/VGZ-LINUX  335G  269G   66G  81% /

> Btrfs-progs version in particular, since the recursive nature of this 
> loop is very obviously a bug.


I hope I gave all the necessary information now.

TIA and best regards.

ॐ

-- 
Swâmi Petaramesh  PGP 9076E32E


Re: Help ! "btrfs check" looping recursive

2016-04-14 Thread Duncan
Swâmi Petaramesh posted on Thu, 14 Apr 2016 18:56:29 +0200 as excerpted:

> It seems that i have a "btrfs check" process that’s stuck in an infinite
> recursive loop…
> 
> How could I end this without breaking my filesystem ?

...

> root@PartedMagic:~# btrfs check --repair /dev/VGZ/LINUX
> enabling repair mode [...]


[Just a btrfs user and list regular myself, not a btrfs dev and not at a 
level to specifically answer the question.]

Given the prompt above, you're running from parted-magic, but that 
doesn't tell us the btrfs-progs or kernel versions unless we look it up.

So kernel and btrfs-progs version?  Also, btrfs filesystem show output 
might be useful.

(Tho in this specific context, kernel version isn't as useful as normal, 
since unlike many btrfs commands that simply call kernel code to do the 
real work, check code is all userspace.  But it can't hurt to post it.  
Similarly, btrfs fi df to complement btrfs fi show, or btrfs fi usage to 
output the same information as both, would in other contexts be useful, 
but they require a mounted filesystem, not something you can really even 
try with check running.)

Btrfs-progs version in particular, since the recursive nature of this 
loop is very obviously a bug.  If it's a current progs version, the bug 
may have been recently introduced.  If it's a dated version, the bug may 
have already been fixed.  (Either way, it may be that someone else will 
recognize the bug and tell you to try a later/earlier version, or if not, 
you very well may prompt a new patch, possibly after some further 
debugging.)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Help ! "btrfs check" looping recursive

2016-04-14 Thread Swâmi Petaramesh
Hi folks,

It seems that i have a "btrfs check" process that’s stuck in an infinite
recursive loop…

How could I end this without breaking my filesystem ?

Help much needed & appreciated…

TIA.

Kind regards.


root@PartedMagic:~# btrfs check --repair /dev/VGZ/LINUX
enabling repair mode
Checking filesystem on /dev/VGZ/LINUX
UUID: 13c87f57-3a85-4daf-a4bf-ba777407c169
checking extents
Fixed 0 roots.
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
Deleting bad dir index [2188127,96,2152] root 267
Deleting bad dir index [2188127,96,2155] root 267
Deleting bad dir index [2188127,96,2152] root 40298
Deleting bad dir index [2188127,96,2155] root 40298
Deleting bad dir index [2188127,96,2152] root 40761
Deleting bad dir index [2188127,96,2155] root 40761
reset isize for dir 2188127 root 40815
Trying to rebuild inode:8089093
Can't determint the filetype for inode 8089093, assume it is a normal file
Can't get file name for inode 8089093, using '8089093' as fallback
Can't get file type for inode 8089093, using FILE as fallback
Moving file '8089093' to 'lost+found' dir since it has no valid backref
Fixed the nlink of inode 8089093
Trying to rebuild inode:8089098
Can't determint the filetype for inode 8089098, assume it is a normal file
Can't get file name for inode 8089098, using '8089098' as fallback
Can't get file type for inode 8089098, using FILE as fallback
Moving file '8089098' to 'lost+found' dir since it has no valid backref
Fixed the nlink of inode 8089098
Can't get file name for inode 8089093, using '8089093' as fallback
Moving file '8089093.8089093' to 'lost+found' dir since it has no valid
backref
Fixed the nlink of inode 8089093
root 40815 inode 8089093 errors 10, odd dir item
Can't get file name for inode 8089098, using '8089098' as fallback
Moving file '8089098.8089098' to 'lost+found' dir since it has no valid
backref
Fixed the nlink of inode 8089098
root 40815 inode 8089098 errors 10, odd dir item
Can't get file name for inode 8089093, using '8089093' as fallback
Moving file '8089093.8089093.8089093' to 'lost+found' dir since it has
no valid backref
Fixed the nlink of inode 8089093
root 40815 inode 8089093 errors 10, odd dir item
Can't get file name for inode 8089098, using '8089098' as fallback
Moving file '8089098.8089098.8089098' to 'lost+found' dir since it has
no valid backref
Fixed the nlink of inode 8089098
root 40815 inode 8089098 errors 10, odd dir item
Deleting bad dir index [2188127,96,2152] root 40815
Deleting bad dir index [2188127,96,2155] root 40815
reset isize for dir 2188127 root 40815
Can't get file name for inode 8089093, using '8089093' as fallback
Moving file '8089093.8089093.8089093.8089093' to 'lost+found' dir since
it has no valid backref
Fixed the nlink of inode 8089093
Can't get file name for inode 8089098, using '8089098' as fallback
Moving file '8089098.8089098.8089098.8089098' to 'lost+found' dir since
it has no valid backref
Fixed the nlink of inode 8089098
Can't get file name for inode 8089093, using '8089093' as fallback
Moving file '8089093.8089093.8089093.8089093.8089093' to 'lost+found'
dir since it has no valid backref
Fixed the nlink of inode 8089093
Can't get file name for inode 8089098, using '8089098' as fallback
Moving file '8089098.8089098.8089098.8089098.8089098' to 'lost+found'
dir since it has no valid backref
Fixed the nlink of inode 8089098
Deleting bad dir index [2188127,96,2152] root 40869
Deleting bad dir index [2188127,96,2155] root 40869
Can't get file name for inode 8089093, using '8089093' as fallback
Moving file '8089093.8089093.8089093.8089093.8089093.8089093' to
'lost+found' dir since it has no valid backref
Fixed the nlink of inode 8089093
Can't get file name for inode 8089098, using '8089098' as fallback
Moving file '8089098.8089098.8089098.8089098.8089098.8089098' to
'lost+found' dir since it has no valid backref
Fixed the nlink of inode 8089098
Can't get file name for inode 8089093, using '8089093' as fallback
Moving file '8089093.8089093.8089093.8089093.8089093.8089093.8089093' to
'lost+found' dir since it has no valid backref
Fixed the nlink of inode 8089093
Can't get file name for inode 8089098, using '8089098' as fallback
Moving file '8089098.8089098.8089098.8089098.8089098.8089098.8089098' to
'lost+found' dir since it has no valid backref
Fixed the nlink of inode 8089098
Can't get file name for inode 8089093, using '8089093' as fallback
Moving file
'8089093.8089093.8089093.8089093.8089093.8089093.8089093.8089093' to
'lost+found' dir since it has no valid backref
Fixed the nlink of inode 8089093
Can't get file name for inode 8089098, using '8089098' as fallback
Moving file
'8089098.8089098.8089098.8089098.8089098.8089098.8089098.8089098' to
'lost+found' dir since it has no valid backref
Fixed the nlink of inode 8089098
Deleting bad dir index [2188127,96,2152] root 40905
Deleting bad dir index [2188127,96,2155] root 40905
Can't get file name for inode 

Re: Re: unable to mount btrfs partition, please help :(

2016-03-20 Thread Chris Murphy
On Sun, Mar 20, 2016 at 1:31 PM, Patrick Tschackert <killing-t...@gmx.de> wrote:
> My raid is done with the scrub now, this is what i get:
>
> $ cat /sys/block/md0/md/mismatch_cnt
> 311936608

I think this is an assembly problem. Read errors don't result in
mismatch counts. An md mismatch count happens when there's a mismatch
between data strip and parity strip(s). So this is a lot of
mismatches.

I think you need to take this problem to the linux-raid@ list, I don't
think anyone on this list is going to be able to help with this
portion of the problem. I'm only semi-literate with this, and you need
to find out why there are so many mismatches and confirm whether the
array is being assembled correctly.

In your writeup for the list you can include the URL for the first
post to this list. I wouldn't repeat any of the VM crashing stuff
because it's not really relevant. You'll need to include the kernel
you were using at the time of the problem, the kernel you're using for
the scrub, the version of mdadm, and all the device metadata (-E for
each device) and the array (-D), and smartctl -A for each device (you
could put smartctl -x for each drive into a file and then put the file
up somewhere like dropbox or google drive, or individually pastebin
them if you can keep it all separate, -x is really verbose but
sometimes contains read error information) to show bad sectors.
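
A rough way to gather all of that in one pass (device names taken from
earlier in this thread; adjust to match your array):

$ for d in /dev/sd[a-k]; do mdadm -E "$d"; done > mdadm-examine.txt
$ mdadm -D /dev/md0 > mdadm-detail.txt
$ for d in /dev/sd[a-k]; do smartctl -x "$d" > "smart-${d##*/}.txt"; done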


The summary line is basically: this was working, after a VM crash
followed by shutdown -r now, the Btrfs filesystem won't mount. A drive
was faulty and rebuilt with a spare. You just did a check scrub and
have all these errors in mismatch_cnt. The question is: how to confirm
the array is properly assembled? Because that's too many errors, and
the file system on that array will not mount. Further complicating
matters is even after rebuild you have another drive that has some
read errors. Those weren't being fixed this whole time (during rebuild
for example) likely because of the timeout vs SCT ERC
misconfiguration; otherwise they would have been fixed.


>
> I also attached my dmesg output to this mail. Here's an excerpt:
> [12235.372901] sd 7:0:0:0: [sdh] tag#15 FAILED Result: hostbyte=DID_OK 
> driverbyte=DRIVER_SENSE
> [12235.372906] sd 7:0:0:0: [sdh] tag#15 Sense Key : Medium Error [current] 
> [descriptor]
> [12235.372909] sd 7:0:0:0: [sdh] tag#15 Add. Sense: Unrecovered read error - 
> auto reallocate failed
> [12235.372913] sd 7:0:0:0: [sdh] tag#15 CDB: Read(16) 88 00 00 00 00 00 af b2 
> bb 48 00 00 05 40 00 00
> [12235.372916] blk_update_request: I/O error, dev sdh, sector 2947727304
> [12235.372941] ata8: EH complete
> [12266.856747] ata8.00: exception Emask 0x0 SAct 0x7fff SErr 0x0 action 
> 0x0
> [12266.856753] ata8.00: irq_stat 0x4008
> [12266.856756] ata8.00: failed command: READ FPDMA QUEUED
> [12266.856762] ata8.00: cmd 60/40:d8:08:17:b5/05:00:af:00:00/40 tag 27 ncq 
> 688128 in
>  res 41/40:00:18:1b:b5/00:00:af:00:00/40 Emask 0x409 (media error) 
> [12266.856765] ata8.00: status: { DRDY ERR }
> [12266.856767] ata8.00: error: { UNC }
> [12266.858112] ata8.00: configured for UDMA/133

What do you get for
smartctl -x /dev/sdh


I see this too:
[11440.088441] ata8.00: status: { DRDY }
[11440.088443] ata8.00: failed command: READ FPDMA QUEUED
[11440.088447] ata8.00: cmd 60/40:c8:e8:bc:15/05:00:ab:00:00/40 tag 25
ncq 688128 in
 res 50/00:00:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)

That's weird. You have several other identical model drives, so I
doubt this is some sort of NCQ incompatibility with this model drive,
no other drive is complaining like this. So I wonder if there's just
something wrong with this drive aside from the bad sectors (?) I can't
really tell but it's suspicious.



> If I understand correctly, my /dev/sdh drive is having trouble.
> Could this be the problem? Should I set the drive to failed and rebuild on a 
> spare disk?

You need to really slow down and understand the problem first. Every
data loss case I've ever come across with md/mdadm raid6 was user
induced because they changed too much stuff too fast without
consulting people who know better. They got impatient. So I suggest
going to the linux-raid@ list and asking there what's going on. The
less you change the better because most of the changes md/mdadm does
are irreversible.


-- 
Chris Murphy


Re: unable to mount btrfs partition, please help :(

2016-03-20 Thread Chris Murphy
On Sun, Mar 20, 2016 at 6:19 AM, Martin Steigerwald  wrote:
> On Sonntag, 20. März 2016 10:18:26 CET Patrick Tschackert wrote:
>> > I think in retrospect the safe way to do these kinds of Virtual Box
>> > updates, which require kernel module updates, would have been to
>> > shutdown the VM and stop the array. *shrug*
>>
>>
>> After this, I think I'll just do away with the virtual machine on this host,
>> as the app contained in that vm can also run on the host. I tried to be
>> fancy, and it seems to needlessly complicate things.
>
> I am not completely sure and I have no exact reference anymore, but I think I
> read more than once about fs benchmarks running faster in Virtualbox than on
> the physical system, which may point at an at least incomplete fsync()
> implementation for writing into Virtualbox image files.
>
> I never found any proof of this, nor did I specifically seek to research it.
> So it may be true or not.

Sure but that would only affect the guest's file system, the one
inside the VDI. It's the host managed filesystem that's busted.

-- 
Chris Murphy


Re: unable to mount btrfs partition, please help :(

2016-03-20 Thread Chris Murphy
On Sun, Mar 20, 2016 at 3:18 AM, Patrick Tschackert  wrote:
> Thanks for answering again!
> So, first of all I installed a newer kernel from the backports as per 
Nicholas D Steeves' suggestion:
>
> $ apt-get install -t jessie-backports linux-image-4.3.0-0.bpo.1-amd64
>
> After rebooting:
> $ uname -a
> Linux vmhost 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.5-1~bpo8+1 (2016-02-23) 
> x86_64 GNU/Linux
>
> But the problem with mounting the filesystem persists :(
>
>> OK I went back and read this again: host is managing the md raid5, the
>> guest is writing Btrfs to an "encrypted container" but what is that? A
>> LUKS encrypted LVM LV that's directly used by Virtual Box as a raw
>> device? It's hard to say what layer broke this. But the VM crashing is
>> in effect like a power failure, and it's an open question (for me) how
>> this setup deals with barriers. A shutdown -r now should still cleanly
>> stop the array so I wouldn't expect there to be an array problem but
>> then you also report a device failure. Bad luck.
>
> The host is managing an md raid 6 (/dev/md0), and I had an encrypted volume 
> (via cryptsetup) on top of that device.
> The host mounted the btrfs filesystem contained in that volume, and the VM 
> wrote to the filesystem as well using a virtualbox shared folder.

OK well to me the VM doesn't seem related offhand. Ultimately it's
only the host writing to the filesystem, even for the shared folder.
The guest VM has no direct access to do Btrfs writes; it's something
like a network-like shared folder.


> After this, I think I'll just do away with the virtual machine on this host, 
> as the app contained in that vm can also run on the host.
> I tried to be fancy, and it seems to needlessly complicate things.

virt-manager or gnome-boxes work better, although you lose shared
folder, you'll have to come up with a work around, like using NFS.


> $ for i in /sys/class/scsi_generic/*/device/timeout; do echo 120 > "$i"; done
> (I know this isn't persistent across reboots...)

Correct.


-- 
Chris Murphy


Re: unable to mount btrfs partition, please help :(

2016-03-20 Thread Martin Steigerwald
On Sonntag, 20. März 2016 10:18:26 CET Patrick Tschackert wrote:
> > I think in retrospect the safe way to do these kinds of Virtual Box
> > updates, which require kernel module updates, would have been to
> > shutdown the VM and stop the array. *shrug*
> 
>  
> After this, I think I'll just do away with the virtual machine on this host,
> as the app contained in that vm can also run on the host. I tried to be
> fancy, and it seems to needlessly complicate things.

I am not completely sure and I have no exact reference anymore, but I think I 
read more than once about fs benchmarks running faster in Virtualbox than on 
the physical system, which may point at an at least incomplete fsync() 
implementation for writing into Virtualbox image files.

I never found any proof of this, nor did I specifically seek to research it. 
So it may be true or not.

Thanks,
-- 
Martin


Re: unable to mount btrfs partition, please help :(

2016-03-20 Thread Patrick Tschackert
Thanks for answering, I already upgraded to a backports kernel as mentioned 
here:
https://mail-archive.com/linux-btrfs@vger.kernel.org/msg51748.html

I now have

$ uname -a
Linux vmhost 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.5-1~bpo8+1 (2016-02-23) 
x86_64 GNU/Linux

As I wrote here 
https://mail-archive.com/linux-btrfs@vger.kernel.org/msg51748.html
the problem still persists :(
 
Cheers,
Patrick

Sent: Sunday, 20 March 2016 at 13:11
From: "Martin Steigerwald" <mar...@lichtvoll.de>
To: "Chris Murphy" <li...@colorremedies.com>
Cc: "Patrick Tschackert" <killing-t...@gmx.de>, "Btrfs BTRFS" 
<linux-btrfs@vger.kernel.org>
Subject: Re: unable to mount btrfs partition, please help :(
On Samstag, 19. März 2016 19:34:55 CET Chris Murphy wrote:
> >>> $ uname -a
> >>> Linux vmhost 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u4
> >>> (2016-02-29) x86_64 GNU/Linux
> >>
> >>This is old. You should upgrade to something newer, ideally 4.5 but
> >>4.4.6 is good also, and then oldest I'd suggest is 4.1.20.
> >>
> > Shouldn't I be able to get the newest kernel by executing "apt-get update
> > && apt-get dist-upgrade"? That's what I ran just now, and it doesn't
> > install a newer kernel. Do I really have to manually upgrade to a newer
> > one?
> I'm not sure. You might do a list search for debian, as I know debian
> users are using newer kernels that they didn't build themselves.

Try a backport¹ kernel. Add backports and do

apt-cache search linux-image

I use 4.3 backport kernel successfully on two server VMs which use BTRFS.

[1] http://backports.debian.org/

Thx,
--
Martin


Re: unable to mount btrfs partition, please help :(

2016-03-20 Thread Martin Steigerwald
On Samstag, 19. März 2016 19:34:55 CET Chris Murphy wrote:
> >>> $ uname -a
> >>> Linux vmhost 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u4
> >>> (2016-02-29) x86_64 GNU/Linux
> >>
> >>This is old. You should upgrade to something newer, ideally 4.5 but
> >>4.4.6 is good also, and then oldest I'd suggest is 4.1.20.
> >>
> > Shouldn't I be able to get the newest kernel by executing "apt-get update
> > && apt-get dist-upgrade"? That's what I ran just now, and it doesn't
> > install a newer kernel. Do I really have to manually upgrade to a newer
> > one?
> I'm not sure. You might do a list search for debian, as I know debian
> users are using newer kernels that they didn't build themselves.

Try a backport¹ kernel. Add backports and do 

apt-cache search linux-image 

I use 4.3 backport kernel successfully on two server VMs which use BTRFS.

[1] http://backports.debian.org/

Thx,
-- 
Martin


Re: unable to mount btrfs partition, please help :(

2016-03-20 Thread Patrick Tschackert
Thanks for answering again!
So, first of all I installed a newer kernel from the backports as per Nicholas 
D Steeves' suggestion:

$ apt-get install -t jessie-backports linux-image-4.3.0-0.bpo.1-amd64

After rebooting:
$ uname -a
Linux vmhost 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.5-1~bpo8+1 (2016-02-23) 
x86_64 GNU/Linux

But the problem with mounting the filesystem persists :(

> OK I went back and read this again: host is managing the md raid5, the
> guest is writing Btrfs to an "encrypted container" but what is that? A
> LUKS encrypted LVM LV that's directly used by Virtual Box as a raw
> device? It's hard to say what layer broke this. But the VM crashing is
> in effect like a power failure, and it's an open question (for me) how
> this setup deals with barriers. A shutdown -r now should still cleanly
> stop the array so I wouldn't expect there to be an array problem but
> then you also report a device failure. Bad luck.
 
The host is managing an md raid 6 (/dev/md0), and I had an encrypted volume 
(via cryptsetup) on top of that device.
The host mounted the btrfs filesystem contained in that volume, and the VM 
wrote to the filesystem as well using a virtualbox shared folder.
The vm then crashed, but I shut down the host with "shutdown -r now".
After the reboot, one disk of the array was no longer present, but I managed to 
rebuild/restore using a spare disk. The RAID now seems to be healthy.

> I think in retrospect the safe way to do these kinds of Virtual Box
> updates, which require kernel module updates, would have been to
> shutdown the VM and stop the array. *shrug*
 
After this, I think I'll just do away with the virtual machine on this host, as 
the app contained in that vm can also run on the host.
I tried to be fancy, and it seems to needlessly complicate things.
 
> These drives are technically not suitable for use in any kind of raid
> except linear and raid 0 (which have no redundancy so they aren't
> really raid). You'd have to dig up drive specs, assuming they're
> published, to see what the recovery times are for the drive models
> when a bad sector is encountered. But it's typical for such drives to
> exceed 30 seconds for recovery, with some drives reported to have 2+
> minute recoveries. To properly configure them, you'll have to increase
> the kernel's SCSI command timer to at least 120 to make sure there's
> sufficient time to wait for the drive to explicitly spit back a read
> error to the kernel. Otherwise, the kernel gives up after 30 seconds,
> and resets the link to the drive, and any possibility of fixing up the
> bad sector via the raid read error fixup mechanism is thwarted. It's
> really common, the linux-raid@ list has many of these kinds of threads
> with this misconfiguration as the source problem.
 
> For the first listing of drives yes. And 120 second delays might be
> too long for your use case, but that's the reality.

> You should change the command timer for the drives that do not support
> configurable SCT ERC. And then do a scrub check. And then check both
> cat /sys/block/mdX/md/mismatch_cnt, which ideally should be 0, and
> also check kernel messages for libata read errors.

So I did this:
 
$ cat /sys/block/md0/md/mismatch_cnt
0

$ for i in /sys/class/scsi_generic/*/device/timeout; do echo 120 > "$i"; done
(I know this isn't persistent across reboots...)

$ echo check > /sys/block/md0/md/sync_action

$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] 
md0 : active raid6 sda[0] sdf[12](S) sdg[11](S) sdj[9] sdh[7] sdi[6] sdk[10] 
sde[4] sdd[3] sdc[2] sdb[1]
  20510948416 blocks super 1.2 level 6, 64k chunk, algorithm 2 [9/9] 
[UUUUUUUUU]
  [>]  check =  1.0% (30812476/2930135488) 
finish=340.6min speed=141864K/sec
  
unused devices: <none>

So the raid is currently doing a scrub, which will take a few hours.

> Hmm not good. See this similar thread.
> http://www.spinics.net/lists/linux-btrfs/msg51711.html

> backups in all superblocks have the same chunk_root, no alternative
> chunk root to try.

> So at the moment I think it's worth trying a newer kernel version and
> mounting normally; then mounting with -o recovery; then - recovery,ro.

> If that doesn't work, you're best off waiting for a developer to give
> advice on the next step;  'btrfs rescue chunk-recover' seems most
> appropriate but again someone else a while back had success with
> zero-log, but it's hard to say if the two cases are really similar and
> maybe that person just got lucky. Both of those change the file system
> in irreversible ways, that's why I suggest waiting or asking on IRC.

Thanks again for taking the time to answer. I'll wait while my RAID is doing 
the scrub, maybe a dev will answer (like you said).
The friendly people on IRC couldn't help and sent me here.

Re: unable to mount btrfs partition, please help :(

2016-03-19 Thread Duncan
Patrick Tschackert posted on Sat, 19 Mar 2016 23:15:33 +0100 as excerpted:

> I'm growing increasingly desperate, can anyone help me?

No need to be desperate.  As the sysadmin's rule of backups states, 
simple form, you either have at least one level of backup, or you are by 
your (in)action defining the data not backed up as worth less than the 
time, hassle and resources necessary to do that backup.

Therefore, there are only two possibilities:

1) You have a backup.  No sweat.  You can use it if you need to, so no 
desperation needed.

2) You don't have a backup.  No sweat.  By not having a backup, your 
actions defined the data at risk as worth less than the time, hassle and 
resources necessary for that backup, so if you lose the data, you can 
still be happy, because you saved what you defined as of most importance, 
the time, resources and hassle of doing that backup.

Since you saved what you yourself defined by your own actions as of most 
value to you, either way, you have what was most valuable to you and can 
thus be happy to have the valuable stuff, even if you lost what was 
therefore much more trivial.

There are no other possibilities.  Your words might lie.  Your actions 
don't.  Either way, you saved the valuable stuff and thus have no reason 
to be desperate.


And of course, btrfs, while stabilizing, is not yet fully stable and 
mature, and while stable enough to be potentially suitable for those who 
have tested backups or are only using it with trivial data they can 
afford to lose anyway, if they don't have backups, it's certainly not to 
the level of stability of the more mature filesystems the above sysadmin's 
rule of backups was designed for.  So that rule applies even MORE 
strongly to btrfs than it does to more mature and stable filesystems.  
(FWIW, there's a more complex version of the rule that takes relative 
risk into account and covers multiple levels of backup where either the 
risk is high enough or the data valuable enough to warrant it, but the 
simple form just says if you don't have at least one backup, you are by 
that lack of backup defining the data at risk as not worth the time and 
trouble to do it.)

And there's no way that not knowing the btrfs status changes that either, 
because if you didn't know the status, it can only be because you didn't 
care enough about the reliability of the filesystem you were entrusting 
your data to, to care about researching it.  After all, both the btrfs 
wiki and the kernel btrfs option stress the need for backups if you're 
choosing btrfs, as does this list, repeatedly.  So the only way someone 
couldn't know is if they didn't care enough to /bother/ to know, which 
again defines the data stored on the filesystem as of only trivial value, 
worth so little it's not worth researching a new filesystem you plan on 
storing it on.

So there's no reason to be desperate.  It'll only stress you out and 
increase your blood pressure.  Either you considered the data valuable 
enough to have a backup, or you didn't.  There is no third option.  And 
either way, it's not worth stressing out over, because you either have 
that backup and thus don't need to stress, or you yourself defined the 
data as trivial by not having it.

> $ uname -a Linux vmhost 3.16.0-4-amd64 #1 SMP Debian
> 3.16.7-ckt20-1+deb8u4 (2016-02-29) x86_64 GNU/Linux
> 
> $ btrfs --version btrfs-progs v4.4

As CMurphy says, that's an old kernel, not really supported by the 
list.   With btrfs still stabilizing, the code is still changing pretty 
fast, and old kernels are known buggy kernels.  The list focuses on the 
mainline kernel and its two primary tracks, LTS kernel series and current 
kernel series.  On the current kernel track, the last two kernels are 
best supported.  With 4.5 just out, that's 4.5 and 4.4.

On the LTS track, the two latest LTS kernel series are recommended, with 
4.4 being the latest LTS kernel, and 4.1 being the one previous to that.  
However, 3.18 was the one previous to that and has been reasonably 
stable, so while the two latest LTS series remain recommended, we're 
still trying to support 3.18 too, for those who need that far back.

But 3.16 is previous to that and is really too far back to be practically 
supported well by the list, as btrfs really is still stabilizing and our 
focus is forward, not backward.  That doesn't mean we won't try to 
support it, it simply means that when there's a problem, the first 
recommendation, as you've seen, is likely to be try a newer kernel.

Of course various distros do offer support for btrfs on older kernels and 
we recognize that.  However, our focus is on mainline, and we don't track 
what patches the various distros have backported and what patches they 
haven't, so we're not in a particularly good position to provide support 
for them, at least back further than the mainline kernels we support.  If 
you wish to use btrfs on such old kernels, then, our recommendation is to 
g

Re: unable to mount btrfs partition, please help :(

2016-03-19 Thread Nicholas D Steeves
On 19 March 2016 at 21:34, Chris Murphy  wrote:
> On Sat, Mar 19, 2016 at 5:35 PM, Patrick Tschackert  
> wrote:
 $ uname -a
 Linux vmhost 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u4
 (2016-02-29) x86_64 GNU/Linux
>>>This is old. You should upgrade to something newer, ideally 4.5 but
>>>4.4.6 is good also, and then oldest I'd suggest is 4.1.20.
>>
>> Shouldn't I be able to get the newest kernel by executing "apt-get update && 
>> apt-get dist-upgrade"?
>> That's what I ran just now, and it doesn't install a newer kernel. Do I 
>> really have to manually upgrade to a newer one?
>
> I'm not sure. You might do a list search for debian, as I know debian
> users are using newer kernels that they didn't build themselves.
>
>
>> On top of the sticky situation i'm already in, i'm not sure if I trust 
>> myself manually building a new kernel. Should I?

If you enable Debian backports, which I assume you have since you're
running the version of btrfs-progs that was backported without a
warning not to use it with old kernels...well, if backports are
enabled then you can try:

apt-get install -t jessie-backports linux-image-4.3.0-0.bpo.1-amd64

linux-4.3.x was a complete mess on my laptop (a Thinkpad X220,
which is quite well supported), and I'm not sure if it was driver-related or
btrfs-related.  I actually started tracking linux-4.4 at rc1, it was
so bad.

If you don't want to try building your own kernel, I'd file a bug
report against linux-image-amd64 asking for a backport of linux-4.4,
which is in Stretch/testing; I'm surprised it hasn't been backported
yet...  The only issue I remember is an error message when booting, I
think because the microcode interface changed between 4.3.x and 4.4.x.
Installing microcode-related packages from backports is how I think I
worked around this.

Alternatively, if you want to build your own kernel you might be able
to install linux-image from backports, download and untar linux-4.1.x
somewhere, and then copy the config from /boot/config-4.3* to
somedir/linux-4.1.x/.config.

I uploaded two scripts to github that I've been using for ages to
track the upstream LTS kernel branch that Debian didn't choose.  You
can find them here:

https://github.com/sten0/lts-convenience

All those syncs and btrfs sub sync lines are there because I always
seem to run into strange issues with adding and removing snapshots.

Cheers,
Nicholas


Re: unable to mount btrfs partition, please help :(

2016-03-19 Thread Chris Murphy
On Sat, Mar 19, 2016 at 5:35 PM, Patrick Tschackert  wrote:
> Hi Chris,
>
> thank you for answering so quickly!
>
>> Try 'btrfs check' without any options first.
> $ btrfs check /dev/mapper/storage
> checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
> checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
> bytenr mismatch, want=36340960788480, have=4530277753793296986
> Couldn't read chunk tree
> Couldn't open file system
>
>> To me it seems the problem is instigated by lower layers either not
>> completing critical writes at the time of the power failure, or didn't
>> rebuild correctly.
>
> There wasn't a power failure, a VM crashed whilst writing to the btrfs 
> filesys.

OK I went back and read this again: host is managing the md raid5, the
guest is writing Btrfs to an "encrypted container" but what is that? A
LUKS encrypted LVM LV that's directly used by Virtual Box as a raw
device? It's hard to say what layer broke this. But the VM crashing is
in effect like a power failure, and it's an open question (for me) how
this setup deals with barriers. A shutdown -r now should still cleanly
stop the array so I wouldn't expect there to be an array problem but
then you also report a device failure. Bad luck.

I think in retrospect the safe way to do these kinds of Virtual Box
updates, which require kernel module updates, would have been to
shutdown the VM and stop the array. *shrug*


>
>> You should check the SCT ERC setting on each drive with 'smartctl -l
>> scterc /dev/sdX' and also the kernel command timer setting with 'cat
>> /sys/block/sdX/device/timeout' also for each device. The SCT ERC value
>> must be less than the command timer. It's a common misconfiguration
>> with raid setups.
>
> $ smartctl -l scterc /dev/sda (sdb, sdc, sde, sdg)
> gives me
>
> smartctl 6.4 2014-10-07 r4002 [x86_64-linux-3.16.0-4-amd64] (local build)
> Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
>
> SCT Error Recovery Control command not supported

These drives are technically not suitable for use in any kind of raid
except linear and raid 0 (which have no redundancy so they aren't
really raid). You'd have to dig up drive specs, assuming they're
published, to see what the recovery times are for the drive models
when a bad sector is encountered. But it's typical for such drives to
exceed 30 seconds for recovery, with some drives reported to have 2+
minute recoveries. To properly configure them, you'll have to increase
the kernel's SCSI command timer to at least 120 to make sure there's
sufficient time to wait for the drive to explicitly spit back a read
error to the kernel. Otherwise, the kernel gives up after 30 seconds,
and resets the link to the drive, and any possibility of fixing up the
bad sector via the raid read error fixup mechanism is thwarted. It's
really common, the linux-raid@ list has many of these kinds of threads
with this misconfiguration as the source problem.




>
> while
> $ smartctl -l scterc /dev/sdf (sdh, sdi, sdj, sdk)
> gives me
>
> smartctl 6.4 2014-10-07 r4002 [x86_64-linux-3.16.0-4-amd64] (local build)
> Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
>
> SCT Error Recovery Control:
>Read: 70 (7.0 seconds)
>   Write: 70 (7.0 seconds)

These drives are suitable for raid out of the box.


>
> $ cat /sys/block/sdX/device/timeout
> gives me "30" for every device
>
> Does that mean my settings for the device timeouts are wrong?

For the first listing of drives yes. And 120 second delays might be
too long for your use case, but that's the reality.

You should change the command timer for the drives that do not support
configurable SCT ERC. And then do a scrub check. And then check both
cat /sys/block/mdX/md/mismatch_cnt, which ideally should be 0, and
also check kernel messages for libata read errors.
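
For example, something along these lines (drive list taken from the smartctl
output above; the timeout setting does not survive a reboot):

$ for d in sda sdb sdc sde sdg; do echo 120 > /sys/block/$d/device/timeout; done
$ echo check > /sys/block/md0/md/sync_action
$ cat /proc/mdstat                        # wait for the check to finish
$ cat /sys/block/md0/md/mismatch_cnt
$ dmesg | grep -i ata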


>
>> After that's fixed you should do a scrub, and I'm thinking it's best
>> to do only a check, which means 'echo check >
>> /sys/block/mdX/md/sync_action' rather than issuing repair which
>> assumes data strips are correct and parity strips are wrong and
>> rebuilds all parity strips.
>
> I don't quite understand, I thought a scrub could only be done on a mounted 
> filesys?

You have two scrubs. There's a Btrfs scrub. And an md scrub. I'm
referring to the latter.


> Do you reall mean executing the command "echo check > 
> /sys/block/md0/md/sync_action"? At the moment it says "idle" in that file.
> Also, the btrfs filesys sits in an encrypted container, so the setup looks 
> like this:
>
> /dev/md0 (this is the Raid device)
> /dev/mapper/storage (after cryptsetup luksOpen, this is where filesys should 
> be mounted from)
> /media/storage (i always mounted the filesystem into this folder by executing 
> "mount /dev/mapper/storage /media/storage")
>
> Apologies if I didn't make that clear enough in my initial email

Ok so the host is writing Btrfs to 

Re: unable to mount btrfs partition, please help :(

2016-03-19 Thread Patrick Tschackert
8    level: 3
                backup_fs_root:         24022070902784  gen: 1322968    level: 3
                backup_dev_root:        24014655901696  gen: 1275381    level: 2
                backup_csum_root:       24022070956032  gen: 1322968    level: 4
                backup_total_bytes:     21003208163328
                backup_bytes_used:      17670808895488
                backup_num_devices:     1

        backup 1:
                backup_tree_root:       24022114037760  gen: 1322968    level: 2
                backup_chunk_root:      36340959809536  gen: 1275381    level: 2
                backup_extent_root:     24022186385408  gen: 1322969    level: 3
                backup_fs_root:         24022186381312  gen: 1322969    level: 3
                backup_dev_root:        24014655901696  gen: 1275381    level: 2
                backup_csum_root:       24022186536960  gen: 1322969    level: 4
                backup_total_bytes:     21003208163328
                backup_bytes_used:      17670826078208
                backup_num_devices:     1

        backup 2:
                backup_tree_root:       24022309593088  gen: 1322969    level: 2
                backup_chunk_root:      36340959809536  gen: 1275381    level: 2
                backup_extent_root:     24022337949696  gen: 1322970    level: 3
                backup_fs_root:         24022337937408  gen: 1322970    level: 3
                backup_dev_root:        24014655901696  gen: 1275381    level: 2
                backup_csum_root:       24022337990656  gen: 1322970    level: 4
                backup_total_bytes:     21003208163328
                backup_bytes_used:      17670866358272
                backup_num_devices:     1

        backup 3:
                backup_tree_root:       24021840482304  gen: 1322966    level: 2
                backup_chunk_root:      36340959809536  gen: 1275381    level: 2
                backup_extent_root:     24021883957248  gen: 1322967    level: 3
                backup_fs_root:         24021883949056  gen: 1322967    level: 3
                backup_dev_root:        24014655901696  gen: 1275381    level: 2
                backup_csum_root:       24021884100608  gen: 1322967    level: 4
                backup_total_bytes:     21003208163328
                backup_bytes_used:      17670630260736
                backup_num_devices:     1


On Sun, Mar 20, 2016 at 12:02 AM, Chris Murphy <li...@colorremedies.com> wrote:
> On Sat, Mar 19, 2016 at 4:15 PM, Patrick Tschackert <killing-t...@gmx.de> 
> wrote:
>
>> I'm growing increasingly desperate, can anyone help me? I'm thinking
>> of trying one or more of the following, but would like an informed
>> opinion:
>> 1) btrfs check --fix-crc
>> 2) btrfs-check --init-csum-tree
>> 3) btrfs rescue chunk-recover
>> 4) btrfs-check --repair
>> 5) btrfs rescue zero-log
>
> None of the above. Try 'btrfs check' without any options first.
>
> To me it seems the problem is instigated by lower layers either not
> completing critical writes at the time of the power failure, or didn't
> rebuild correctly.
>
> You should check the SCT ERC setting on each drive with 'smartctl -l
> scterc /dev/sdX' and also the kernel command timer setting with 'cat
> /sys/block/sdX/device/timeout' also for each device. The SCT ERC value
> must be less than the command timer. It's a common misconfiguration
> with raid setups.
>
> After that's fixed you should do a scrub, and I'm thinking it's best
> to do only a check, which means 'echo check >
> /sys/block/mdX/md/sync_action' rather than issuing repair which
> assumes data strips are correct and parity strips are wrong and
> rebuilds all parity strips.
>
>
>>
>> $ uname -a
>> Linux vmhost 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u4
>> (2016-02-29) x86_64 GNU/Linux
>
> This is old. You should upgrade to something newer, ideally 4.5 but
> 4.4.6 is good also, and then oldest I'd suggest is 4.1.20.
>
>>
>> $ btrfs --version
>> btrfs-progs v4.4
>
> Good.
>
>> $ btrfs fi show
>> Label: none uuid: 9868d803-78d1-40c3-b1ee-a4ce3363df87
>> Total devices 1 FS bytes used 16.07TiB
>> devid 1 size 19.10TiB used 16.27TiB path /dev/mapper/storage
>>
>> excerpt from DMESG:
>> [ 151.970916] BTRFS: device fsid 9868d803-78d1-40c3-b1ee-a4ce3363df87
>> devid 1 transid 1322969 /dev/dm-0
>> [ 163.105784] BTRFS info (device dm-0): disk space caching is enabled
>> [ 165.304968] BTRFS: bad tree block start 4530277753793296986 36340960788480
>> [ 165.305233] BTRFS: bad tree block start 4530277753793296986 36340960788480
>> [ 165.305281] BTRFS: failed to read chunk tree on dm-0
>> [ 165.331407] BTRFS: open_ctree failed
>
> Yeah this isn't a good message typically. There's one surpri

Re: unable to mount btrfs partition, please help :(

2016-03-19 Thread Chris Murphy
On Sat, Mar 19, 2016 at 4:15 PM, Patrick Tschackert <killing-t...@gmx.de> wrote:

> I'm growing increasingly desperate, can anyone help me? I'm thinking
> of trying one or more of the following, but would like an informed
> opinion:
> 1) btrfs check --fix-crc
> 2) btrfs-check --init-csum-tree
> 3) btrfs rescue chunk-recover
> 4) btrfs-check --repair
> 5) btrfs rescue zero-log

None of the above. Try 'btrfs check' without any options first.

To me it seems the problem is instigated by lower layers either not
completing critical writes at the time of the power failure, or didn't
rebuild correctly.

You should check the SCT ERC setting on each drive with 'smartctl -l
scterc /dev/sdX' and also the kernel command timer setting with 'cat
/sys/block/sdX/device/timeout' also for each device. The SCT ERC value
must be less than the command timer. It's a common misconfiguration
with raid setups.

After that's fixed you should do a scrub, and I'm thinking it's best
to do only a check, which means 'echo check >
/sys/block/mdX/md/sync_action' rather than issuing repair which
assumes data strips are correct and parity strips are wrong and
rebuilds all parity strips.


>
> $ uname -a
> Linux vmhost 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u4
> (2016-02-29) x86_64 GNU/Linux

This is old. You should upgrade to something newer, ideally 4.5 but
4.4.6 is good also, and then oldest I'd suggest is 4.1.20.

>
> $ btrfs --version
> btrfs-progs v4.4

Good.

> $ btrfs fi show
> Label: none uuid: 9868d803-78d1-40c3-b1ee-a4ce3363df87
> Total devices 1 FS bytes used 16.07TiB
> devid 1 size 19.10TiB used 16.27TiB path /dev/mapper/storage
>
> excerpt from DMESG:
> [ 151.970916] BTRFS: device fsid 9868d803-78d1-40c3-b1ee-a4ce3363df87
> devid 1 transid 1322969 /dev/dm-0
> [ 163.105784] BTRFS info (device dm-0): disk space caching is enabled
> [ 165.304968] BTRFS: bad tree block start 4530277753793296986 36340960788480
> [ 165.305233] BTRFS: bad tree block start 4530277753793296986 36340960788480
> [ 165.305281] BTRFS: failed to read chunk tree on dm-0
> [ 165.331407] BTRFS: open_ctree failed

Yeah this isn't a good message typically. There's one surprising (to
me) case where someone had luck getting this fixed with btrfs-zero-log
which is unexpected. But I think it's very premature to make changes
to the file system until you have more information.

What do you get for
btrfs-find-root /dev/mdX
btrfs-show-super -fa /dev/mdX


-- 
Chris Murphy


unable to mount btrfs partition, please help :(

2016-03-19 Thread Patrick Tschackert
Hi,

Apologies if this eMail reaches the mailing list multiple times, I can't seem 
to get through to the mailing list, so I'm sending it through a different 
account now...

I'm having problems mounting my BTRFS filesystem. Here's what happened:

My BTRFS filesystem sits in an encrypted container on a linux software
RAID 6. VirtualBox crashed (while writing to the filesystem, I
presume), and I rebooted the machine with "shutdown -r now", because a
reboot was necessary due to upgraded VirtualBox drivers.
When the system was running again, I couldn't mount the filesystem.
This is what I did:

$ cryptsetup luksOpen /dev/md0 storage (this worked fine)
$ mount /dev/mapper/storage /media/storage
mount: wrong fs type, bad option, bad superblock on /dev/mapper/storage,
missing codepage or helper program, or other error

In some cases useful info is found in syslog - try
dmesg | tail or so.

I then saw that one of the RAID's disks was no longer present in the
array, so I started a rebuild/recover by executing mdadm --run. The
RAID rebuilt itself using one of the spare disks. After the rebuild,
the problem persists, I cannot mount my file system. Mounting with
options "ro" and/or "recovery" makes no difference. I am unable to do
a backup of the metadata:

$ btrfs-image -c9 /dev/mapper/storage ~/btrfs_img
checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
bytenr mismatch, want=36340960788480, have=4530277753793296986
Couldn't read chunk tree
Open ctree failed
create failed (Success)

I'm growing increasingly desperate, can anyone help me? I'm thinking
of trying one or more of the following, but would like an informed
opinion:
1) btrfs check --fix-crc
2) btrfs-check --init-csum-tree
3) btrfs rescue chunk-recover
4) btrfs-check --repair
5) btrfs rescue zero-log

Here is various info about my system as it is now, including the info
requested on https://btrfs.wiki.kernel.org/index.php/Btrfs_mailing_list.
The full DMESG is attached to this eMail.

$ btrfs restore -D /dev/mapper/storage /media/rest
checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
bytenr mismatch, want=36340960788480, have=4530277753793296986
Couldn't read chunk tree
Could not open root, trying backup super
checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
bytenr mismatch, want=36340960788480, have=4530277753793296986
Couldn't read chunk tree
Could not open root, trying backup super
checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
bytenr mismatch, want=36340960788480, have=4530277753793296986
Couldn't read chunk tree
Could not open root, trying backup super

$ btrfs check --readonly /dev/mapper/storage
checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
bytenr mismatch, want=36340960788480, have=4530277753793296986
Couldn't read chunk tree
Couldn't open file system

$ btrfs-show-super /dev/mapper/storage
superblock: bytenr=65536, device=/dev/mapper/storage
-
csum 0xf3887f83 [match]
bytenr 65536
flags 0x1
( WRITTEN )
magic _BHRfS_M [match]
fsid 9868d803-78d1-40c3-b1ee-a4ce3363df87
label
generation 1322969
root 24022309593088
sys_array_size 97
chunk_root_generation 1275381
root_level 2
chunk_root 36340959809536
chunk_root_level 2
log_root 0
log_root_transid 0
log_root_level 0
total_bytes 21003208163328
bytes_used 17670843191296
sectorsize 4096
nodesize 4096
leafsize 4096
stripesize 4096
root_dir 6
num_devices 1
compat_flags 0x0
compat_ro_flags 0x0
incompat_flags 0x1
( MIXED_BACKREF )
csum_type 0
csum_size 4
cache_generation 1322969
uuid_tree_generation 1322969
dev_item.uuid c1123f55-46ce-4931-8722-7387fee07608
dev_item.fsid 9868d803-78d1-40c3-b1ee-a4ce3363df87 [match]
dev_item.type 0
dev_item.total_bytes 21003208163328
dev_item.bytes_used 17886424858624
dev_item.io_align 4096
dev_item.io_width 4096
dev_item.sector_size 4096
dev_item.devid 1
dev_item.dev_group 0
dev_item.seek_speed 0
dev_item.bandwidth 0
dev_item.generation 0

$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active (auto-read-only) raid6 sda[0] sdg[11](S) sdf[12](S)
sdj[9] sdh[7] sdi[6] sdk[10] sde[4] sdd[3] sdc[2] sdb[1]
20510948416 blocks super 1.2 level 6, 64k chunk, algorithm 2
[9/9] [U]

unused devices: 

$ mdadm -D /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Sat Jun 14 18:47:44 2014
Raid Level : raid6
Array Size : 20510948416 (19560.77 GiB 21003.21 GB)
Used Dev Size : 2930135488 (2794.40 GiB 3000.46 GB)
Raid Devices : 9
Total Devices : 11
Persistence : Superblock is persistent

Update Time : Sat Mar 19 13:5

Re: [PATCH v2 RESEND] btrfs: maintain consistency in logging to help debugging

2016-03-19 Thread David Sterba
On Thu, Mar 10, 2016 at 12:22:58PM +0800, Anand Jain wrote:
> Optional Label may or may not be set, or it might be set at some time
> later. However while debugging to search through the kernel logs the
> scripts would need the logs to be consistent, so logs search key words
> shouldn't depend on the optional variables, instead fsid is better.

I think the label is useful information, as it's set by the user. So
if I'm looking at the log, I'll recognize the labels, not the device or
fsid.

It would be better to show all of them, i.e. label, fsid, device and
transid. The line will get longer, but I hope that's OK.

Proposed order of the fields:
- device PATH
- devid ID
- fsid UUID
- transid TID

> - if (disk_super->label[0]) {
> - printk(KERN_INFO "BTRFS: device label %s ", 
> disk_super->label);
> - } else {
> - printk(KERN_INFO "BTRFS: device fsid %pU ", 
> disk_super->fsid);
> - }
> -
> - printk(KERN_CONT "devid %llu transid %llu %s\n", devid, 
> transid, path);
> + printk(KERN_INFO "BTRFS: device fsid %pU devid %llu transid 
> %llu %s\n",
> + disk_super->fsid, devid, 
> transid, path);


Re: [PATCH v2 RESEND] btrfs: maintain consistency in logging to help debugging

2016-03-19 Thread Anand Jain




On 03/17/2016 12:18 AM, David Sterba wrote:

On Thu, Mar 10, 2016 at 12:22:58PM +0800, Anand Jain wrote:

Optional Label may or may not be set, or it might be set at some time
later. However while debugging to search through the kernel logs the
scripts would need the logs to be consistent, so logs search key words
shouldn't depend on the optional variables, instead fsid is better.


I think the label is a useful information, as it's set by the user. So
if I'm looking to the log, I'll recognize the labels, not the device or
fsid.

It would be better to show all of them, ie. label, fsid, device and
transid. The line will get longer, but I hope it's ok.

Proposed order of the fields:
- device PATH
- devid ID
- fsid UUID
- transid TID


(I am not too particular about the below; it's just my opinion.)

The patch titled in the ML:
  Btrfs: fix fs logging for multi device

would prefix
  BTRFS: :
to most of the logs in dmesg.

So I guess if we have the following, then
  BTRFS: :  (*)
is better.

For end users, I hope we provide all those requisites through the
btrfs-progs CLI, so they wouldn't have to review dmesg. Further,
'btrfs fi show' provides the FSID-to-label mapping. So I hope the
next targeted group, the troubleshooters, will be familiar with the
FSID, and they could do

 dmesg | grep "BTRFS: :"

to filter the logs down to the one btrfs filesystem they want to
troubleshoot (as there may be more than one btrfs filesystem in the system).
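
A rough sketch of that kind of filtering, assuming the filesystem is mounted at /mnt and with <fsid> standing in for the uuid printed by 'btrfs fi show' (the exact log prefix will depend on the final form of the patch):

 $ btrfs filesystem show /mnt
 $ dmesg | grep "BTRFS" | grep "<fsid>"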

[*] Maybe in the future.
 (There is a bug where we might fail to know / assemble the right set of
 devices as per the last assembled volume; to fix this, it's better to
 create a new device UUID for the replace target device instead of
 copying the device UUID of the source device (a bit vague as of now). If
 this is successful, then the device UUID will be useful to printk here.)

Thanks, Anand


-   if (disk_super->label[0]) {
-   printk(KERN_INFO "BTRFS: device label %s ", 
disk_super->label);
-   } else {
-   printk(KERN_INFO "BTRFS: device fsid %pU ", 
disk_super->fsid);
-   }
-
-   printk(KERN_CONT "devid %llu transid %llu %s\n", devid, 
transid, path);
+   printk(KERN_INFO "BTRFS: device fsid %pU devid %llu transid %llu 
%s\n",
+   disk_super->fsid, devid, 
transid, path);





[PATCH v2 RESEND] btrfs: maintain consistency in logging to help debugging

2016-03-09 Thread Anand Jain
Optional Label may or may not be set, or it might be set at some time
later. However while debugging to search through the kernel logs the
scripts would need the logs to be consistent, so logs search key words
shouldn't depend on the optional variables, instead fsid is better.

Signed-off-by: Anand Jain 
---
v2: fix commit log

 fs/btrfs/volumes.c | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index dc2db98..af176d6 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -998,13 +998,8 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, 
void *holder,
 
ret = device_list_add(path, disk_super, devid, fs_devices_ret);
if (ret > 0) {
-   if (disk_super->label[0]) {
-   printk(KERN_INFO "BTRFS: device label %s ", 
disk_super->label);
-   } else {
-   printk(KERN_INFO "BTRFS: device fsid %pU ", 
disk_super->fsid);
-   }
-
-   printk(KERN_CONT "devid %llu transid %llu %s\n", devid, 
transid, path);
+   printk(KERN_INFO "BTRFS: device fsid %pU devid %llu transid 
%llu %s\n",
+   disk_super->fsid, devid, 
transid, path);
ret = 0;
}
if (!ret && fs_devices_ret)
-- 
2.4.1



Re: BTRFS Raid 6 corruption - please help with restore

2016-03-02 Thread Chris Murphy
On Wed, Mar 2, 2016 at 11:42 AM, Stuart Gittings <gitting...@gmail.com> wrote:
> All devices are present.  Btrfs fi show is listed below and shows they are 
> all there.  I'm afraid btrfs dev scan does not help


What do you get for 'btrfs check'? (Do not use --repair yet.)




-- 
Chris Murphy


Re: BTRFS Raid 6 corruption - please help with restore

2016-03-02 Thread Chris Murphy
On Wed, Mar 2, 2016 at 3:47 AM, Stuart Gittings <gitting...@gmail.com> wrote:
> Hi - I have some corruption on a 12 drive Raid 6 volume.  Here's the
> basics - if someone could help with restore it would save me a ton of
> time (and some data loss - I have critical data backed up, but not
> all).
>
> stuart@debian:~$ uname -a
> Linux debian 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-7~bpo8+1
> (2016-01-19) x86_64 GNU/Linux
>
> stuart@debian:~$ sudo btrfs --version
> btrfs-progs v4.4
>
>  sudo btrfs fi sh
> Label: none  uuid: 7f994e11-e146-4dee-80f0-c16ac3073e91
> Total devices 12 FS bytes used 14.25TiB
> devid1 size 2.73TiB used 167.14GiB path /dev/sdc
> devid2 size 5.46TiB used 1.75TiB path /dev/sdd
> devid3 size 5.46TiB used 1.75TiB path /dev/sde
> devid4 size 2.73TiB used 167.14GiB path /dev/sdn
> devid5 size 5.46TiB used 1.75TiB path /dev/sdf
> devid6 size 2.73TiB used 1.75TiB path /dev/sdm
> devid9 size 2.73TiB used 1.75TiB path /dev/sdj
> devid   10 size 2.73TiB used 1.75TiB path /dev/sdi
> devid   11 size 2.73TiB used 1.75TiB path /dev/sdg
> devid   13 size 2.73TiB used 1.75TiB path /dev/sdl
> devid   14 size 2.73TiB used 1.75TiB path /dev/sdk
> devid   15 size 2.73TiB used 1.75TiB path /dev/sdh
>
> sudo mount -t btrfs -oro,recover /dev/sdc /data
> mount: wrong fs type, bad option, bad superblock on /dev/sdc,
>missing codepage or helper program, or other error
>
>In some cases useful info is found in syslog - try
>dmesg | tail or so.
>
> dmesg:
>
> [ 5642.118303] BTRFS info (device sdc): enabling auto recovery
> [ 5642.118313] BTRFS info (device sdc): disk space caching is enabled
> [ 5642.118316] BTRFS: has skinny extents
> [ 5642.130145] btree_readpage_end_io_hook: 39 callbacks suppressed
> [ 5642.130148] BTRFS (device sdc): bad tree block start
> 13629298965300190098 47255853072384
> [ 5642.130759] BTRFS (device sdc): bad tree block start
> 10584834564968318131 47255853105152
> [ 5642.131289] BTRFS (device sdc): bad tree block start
> 2775635947161390306 47255853121536
> [ 5644.730012] BTRFS: bdev /dev/sdc errs: wr 1664846, rd 210656, flush
> 18054, corrupt 0, gen 0
> [ 5644.801291] BTRFS (device sdc): bad tree block start
> 8578409561856120450 47254279438336
> [ 5644.801304] BTRFS (device sdc): bad tree block start
> 18087369170870825197 47254279454720
> [ 5644.831199] BTRFS (device sdc): bad tree block start
> 9721403008164124267 47254277718016
> [ 5644.842763] BTRFS (device sdc): bad tree block start
> 18087369170870825197 47254279454720
> [ 5644.891992] BTRFS (device sdc): bad tree block start
> 17582844917171188859 47254194176000
> [ 5644.951366] BTRFS (device sdc): bad tree block start
> 3962496226683925584 47254278586368
> [ 5645.097168] BTRFS (device sdc): bad tree block start
> 17049293152820168762 47255619846144
> [ 5646.159819] BTRFS: Failed to read block groups: -5
> [ 5646.215905] BTRFS: open_ctree failed
> stuart@debian:~$
>
> Finally:
>  sudo btrfs restore /dev/sdc /backup
> checksum verify failed on 47255853072384 found 70F58CCA wanted AE18D5BC
> checksum verify failed on 47255853072384 found 70F58CCA wanted AE18D5BC
> checksum verify failed on 47255853072384 found 805B1FF7 wanted B76A652F
> checksum verify failed on 47255853072384 found 70F58CCA wanted AE18D5BC
> bytenr mismatch, want=47255853072384, have=13629298965300190098
> Couldn't read chunk tree
> Could not open root, trying backup super
> warning, device 3 is missing
> warning, device 2 is missing
> warning, device 5 is missing
> warning, device 4 is missing
> bytenr mismatch, want=47255851761664, have=47255851958272
> Couldn't read chunk root
> Could not open root, trying backup super
> warning, device 3 is missing
> warning, device 2 is missing
> warning, device 5 is missing
> warning, device 4 is missing
> bytenr mismatch, want=47255851761664, have=47255851958272
> Couldn't read chunk root
> Could not open root, trying backup super
>


Well, there appear to be too many devices missing; I count four. What
does 'btrfs fi show' look like? If there are missing devices, try
'btrfs dev scan' and then 'btrfs fi show' again and see if it changes.
I don't think much can be done if there really are four missing
devices.
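
A minimal sketch of that sequence (neither command modifies the filesystem):

$ sudo btrfs device scan
$ sudo btrfs filesystem show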

-- 
Chris Murphy


BTRFS Raid 6 corruption - please help with restore

2016-03-02 Thread Stuart Gittings
Hi - I have some corruption on a 12 drive Raid 6 volume.  Here's the
basics - if someone could help with restore it would save me a ton of
time (and some data loss - I have critical data backed up, but not
all).

stuart@debian:~$ uname -a
Linux debian 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-7~bpo8+1
(2016-01-19) x86_64 GNU/Linux

stuart@debian:~$ sudo btrfs --version
btrfs-progs v4.4

 sudo btrfs fi sh
Label: none  uuid: 7f994e11-e146-4dee-80f0-c16ac3073e91
Total devices 12 FS bytes used 14.25TiB
devid1 size 2.73TiB used 167.14GiB path /dev/sdc
devid2 size 5.46TiB used 1.75TiB path /dev/sdd
devid3 size 5.46TiB used 1.75TiB path /dev/sde
devid4 size 2.73TiB used 167.14GiB path /dev/sdn
devid5 size 5.46TiB used 1.75TiB path /dev/sdf
devid6 size 2.73TiB used 1.75TiB path /dev/sdm
devid9 size 2.73TiB used 1.75TiB path /dev/sdj
devid   10 size 2.73TiB used 1.75TiB path /dev/sdi
devid   11 size 2.73TiB used 1.75TiB path /dev/sdg
devid   13 size 2.73TiB used 1.75TiB path /dev/sdl
devid   14 size 2.73TiB used 1.75TiB path /dev/sdk
devid   15 size 2.73TiB used 1.75TiB path /dev/sdh

sudo mount -t btrfs -oro,recover /dev/sdc /data
mount: wrong fs type, bad option, bad superblock on /dev/sdc,
   missing codepage or helper program, or other error

   In some cases useful info is found in syslog - try
   dmesg | tail or so.

dmesg:

[ 5642.118303] BTRFS info (device sdc): enabling auto recovery
[ 5642.118313] BTRFS info (device sdc): disk space caching is enabled
[ 5642.118316] BTRFS: has skinny extents
[ 5642.130145] btree_readpage_end_io_hook: 39 callbacks suppressed
[ 5642.130148] BTRFS (device sdc): bad tree block start
13629298965300190098 47255853072384
[ 5642.130759] BTRFS (device sdc): bad tree block start
10584834564968318131 47255853105152
[ 5642.131289] BTRFS (device sdc): bad tree block start
2775635947161390306 47255853121536
[ 5644.730012] BTRFS: bdev /dev/sdc errs: wr 1664846, rd 210656, flush
18054, corrupt 0, gen 0
[ 5644.801291] BTRFS (device sdc): bad tree block start
8578409561856120450 47254279438336
[ 5644.801304] BTRFS (device sdc): bad tree block start
18087369170870825197 47254279454720
[ 5644.831199] BTRFS (device sdc): bad tree block start
9721403008164124267 47254277718016
[ 5644.842763] BTRFS (device sdc): bad tree block start
18087369170870825197 47254279454720
[ 5644.891992] BTRFS (device sdc): bad tree block start
17582844917171188859 47254194176000
[ 5644.951366] BTRFS (device sdc): bad tree block start
3962496226683925584 47254278586368
[ 5645.097168] BTRFS (device sdc): bad tree block start
17049293152820168762 47255619846144
[ 5646.159819] BTRFS: Failed to read block groups: -5
[ 5646.215905] BTRFS: open_ctree failed
stuart@debian:~$

Finally:
 sudo btrfs restore /dev/sdc /backup
checksum verify failed on 47255853072384 found 70F58CCA wanted AE18D5BC
checksum verify failed on 47255853072384 found 70F58CCA wanted AE18D5BC
checksum verify failed on 47255853072384 found 805B1FF7 wanted B76A652F
checksum verify failed on 47255853072384 found 70F58CCA wanted AE18D5BC
bytenr mismatch, want=47255853072384, have=13629298965300190098
Couldn't read chunk tree
Could not open root, trying backup super
warning, device 3 is missing
warning, device 2 is missing
warning, device 5 is missing
warning, device 4 is missing
bytenr mismatch, want=47255851761664, have=47255851958272
Couldn't read chunk root
Could not open root, trying backup super
warning, device 3 is missing
warning, device 2 is missing
warning, device 5 is missing
warning, device 4 is missing
bytenr mismatch, want=47255851761664, have=47255851958272
Couldn't read chunk root
Could not open root, trying backup super

Thanks in advance to anyone who might be able to suggest ideas.

Stuart


Re: Can you help explain these OOM crashes?

2016-02-25 Thread Al Viro
On Thu, Feb 25, 2016 at 02:42:51PM -0500, Chris Mason wrote:
> Al, any ideas why get_anon_bdev is doing an atomic allocation here?
> 
>   if (ida_pre_get(&unnamed_dev_ida, GFP_ATOMIC) == 0)

Because the set() callback of sget() runs under sb_lock - it must be atomic
wrt scanning the list of superblocks in search of a match.  And get_anon_bdev()
is called from such callbacks...

In principle, we could change the locking rules for the case when the test
callback is NULL, except that it's also called from ns_set_super(), which *does*
come along with a non-NULL test() (see mount_ns()), so that really doesn't
help...


Re: Can you help explain these OOM crashes?

2016-02-25 Thread Chris Mason
On Thu, Feb 25, 2016 at 11:20:29AM -0800, Marc MERLIN wrote:
> Which kind of RAM am I missing? :)
> 
> Thanks,
> Marc
> 
> [46320.200703] btrfs: page allocation failure: order:1, mode:0x2204020
> [46320.221174] CPU: 7 PID: 12576 Comm: btrfs Not tainted 
> 4.4.2-amd64-i915-volpreempt-20160213bc1 #3
> [46320.249161] Hardware name: System manufacturer System Product Name/P8H67-M 
> PRO, BIOS 3904 04/27/2013
> [46320.277878]   8801cdb636f0 8134ae0a 
> 
> [46320.301563]  8801cdb63788 81124ab6 0086 
> 0086
> [46320.325248]  88021f5f5e00 fffe 8801cdb63750 
> 8108c770
> [46320.348911] Call Trace:
> [46320.357508]  [] dump_stack+0x44/0x55
> [46320.374222]  [] warn_alloc_failed+0x114/0x12c
> [46320.393259]  [] ? __wake_up+0x44/0x4b
> [46320.410229]  [] __alloc_pages_nodemask+0x7cb/0x84c
> [46320.430671]  [] kmem_getpages+0x5c/0x137
> [46320.448328]  [] fallback_alloc+0x109/0x1b1
> [46320.466472]  [] cache_alloc_node+0x123/0x130
> [46320.486219]  [] kmem_cache_alloc+0xa4/0x14f
> [46320.504600]  [] ida_pre_get+0x32/0xb6
> [46320.521395]  [] get_anon_bdev+0x1f/0xc8

Al, any ideas why get_anon_bdev is doing an atomic allocation here?

if (ida_pre_get(&unnamed_dev_ida, GFP_ATOMIC) == 0)

[ rest of the oom below for reference ]

-chris

> [46320.538780]  [] btrfs_init_fs_root+0x104/0x14e
> [46320.557889]  [] btrfs_get_fs_root+0xb7/0x1bf
> [46320.576480]  [] create_pending_snapshot+0x65e/0xb09
> [46320.596850]  [] create_pending_snapshots+0x72/0x8e
> [46320.616946]  [] ? create_pending_snapshots+0x72/0x8e
> [46320.637533]  [] btrfs_commit_transaction+0x3a5/0x921
> [46320.658117]  [] btrfs_mksubvol+0x2f4/0x408
> [46320.676044]  [] ? wake_up_atomic_t+0x2c/0x2c
> [46320.694626]  [] 
> btrfs_ioctl_snap_create_transid+0x148/0x17a
> [46320.716984]  [] btrfs_ioctl_snap_create_v2+0xc7/0x110
> [46320.737714]  [] btrfs_ioctl+0x545/0x2630
> [46320.755071]  [] ? 
> mem_cgroup_charge_statistics.isra.23+0x33/0x69
> [46320.778689]  [] ? __lru_cache_add+0x23/0x44
> [46320.796851]  [] ? 
> lru_cache_add_active_or_unevictable+0x2d/0x6b
> [46320.820156]  [] ? set_pte_at+0x9/0xd
> [46320.836407]  [] ? handle_mm_fault+0x4f0/0xf06
> [46320.854969]  [] ? do_mmap+0x2de/0x327
> [46320.871422]  [] do_vfs_ioctl+0x3a1/0x414
> [46320.888953]  [] ? __audit_syscall_entry+0xc0/0xe4
> [46320.908531]  [] ? do_audit_syscall_entry+0x60/0x62
> [46320.928487]  [] SyS_ioctl+0x57/0x79
> [46320.944553]  [] entry_SYSCALL_64_fastpath+0x16/0x75
> [46320.964603] Mem-Info:
> [46320.972173] active_anon:40431 inactive_anon:129104 isolated_anon:0
> [46320.972173]  active_file:414564 inactive_file:956231 isolated_file:0
> [46320.972173]  unevictable:1220 dirty:227385 writeback:4016 unstable:0
> [46320.972173]  slab_reclaimable:46015 slab_unreclaimable:67059
> [46320.972173]  mapped:11769 shmem:2295 pagetables:2380 bounce:0
> [46320.972173]  free:13404 free_pcp:1790 free_cma:0
> [46321.081552] Node 0 DMA free:15888kB min:20kB low:24kB high:28kB 
> active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB 
> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB 
> managed:15892kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB 
> slab_reclaimable:0kB slab_unreclaimable:4kB kernel_stack:0kB pagetables:0kB 
> unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB 
> writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
> [46321.209469] lowmem_reserve[]: 0 3201 7672 7672
> [46321.223638] Node 0 DMA32 free:27816kB min:4640kB low:5800kB high:6960kB 
> active_anon:72844kB inactive_anon:205672kB active_file:725732kB 
> inactive_file:1484744kB unevictable:1524kB isolated(anon):0kB 
> isolated(file):0kB present:3362068kB managed:3283032kB mlocked:1524kB 
> dirty:206660kB writeback:189832kB mapped:18112kB shmem:3456kB 
> slab_reclaimable:75980kB slab_unreclaimable:103932kB kernel_stack:4800kB 
> pagetables:3932kB unstable:0kB bounce:0kB free_pcp:2160kB local_pcp:380kB 
> free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> [46321.370242] lowmem_reserve[]: 0 0 4471 4471
> [46321.383748] Node 0 Normal free:14396kB min:6480kB low:8100kB high:9720kB 
> active_anon:88268kB inactive_anon:310756kB active_file:1001176kB 
> inactive_file:1915152kB unevictable:3356kB isolated(anon):0kB 
> isolated(file):0kB present:4708352kB managed:4578512kB mlocked:120259087644kB 
> dirty:274088kB writeback:255988kB mapped:28380kB shmem:5736kB 
> slab_reclaimable:108288kB slab_unreclaimable:163908kB kernel_stack:7312kB 
> pagetables:5540kB unstable:0kB bounce:0kB free_pcp:2468kB local_pcp:688kB 
> free_cma:2628kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> [46321.534455] lowmem_reserve[]: 0 0 0 0
> [46321.546480] Node 0 DMA: 0*4kB 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB 
> (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15888kB
> [46321.588226] Node 0 DMA32: 6462*4kB (UME) 1397*8kB (UM) 9*16kB (U) 

Can you help explain these OOM crashes?

2016-02-25 Thread Marc MERLIN
Which kind of RAM am I missing? :)

Thanks,
Marc

[46320.200703] btrfs: page allocation failure: order:1, mode:0x2204020
[46320.221174] CPU: 7 PID: 12576 Comm: btrfs Not tainted 
4.4.2-amd64-i915-volpreempt-20160213bc1 #3
[46320.249161] Hardware name: System manufacturer System Product Name/P8H67-M 
PRO, BIOS 3904 04/27/2013
[46320.277878]   8801cdb636f0 8134ae0a 

[46320.301563]  8801cdb63788 81124ab6 0086 
0086
[46320.325248]  88021f5f5e00 fffe 8801cdb63750 
8108c770
[46320.348911] Call Trace:
[46320.357508]  [] dump_stack+0x44/0x55
[46320.374222]  [] warn_alloc_failed+0x114/0x12c
[46320.393259]  [] ? __wake_up+0x44/0x4b
[46320.410229]  [] __alloc_pages_nodemask+0x7cb/0x84c
[46320.430671]  [] kmem_getpages+0x5c/0x137
[46320.448328]  [] fallback_alloc+0x109/0x1b1
[46320.466472]  [] cache_alloc_node+0x123/0x130
[46320.486219]  [] kmem_cache_alloc+0xa4/0x14f
[46320.504600]  [] ida_pre_get+0x32/0xb6
[46320.521395]  [] get_anon_bdev+0x1f/0xc8
[46320.538780]  [] btrfs_init_fs_root+0x104/0x14e
[46320.557889]  [] btrfs_get_fs_root+0xb7/0x1bf
[46320.576480]  [] create_pending_snapshot+0x65e/0xb09
[46320.596850]  [] create_pending_snapshots+0x72/0x8e
[46320.616946]  [] ? create_pending_snapshots+0x72/0x8e
[46320.637533]  [] btrfs_commit_transaction+0x3a5/0x921
[46320.658117]  [] btrfs_mksubvol+0x2f4/0x408
[46320.676044]  [] ? wake_up_atomic_t+0x2c/0x2c
[46320.694626]  [] btrfs_ioctl_snap_create_transid+0x148/0x17a
[46320.716984]  [] btrfs_ioctl_snap_create_v2+0xc7/0x110
[46320.737714]  [] btrfs_ioctl+0x545/0x2630
[46320.755071]  [] ? 
mem_cgroup_charge_statistics.isra.23+0x33/0x69
[46320.778689]  [] ? __lru_cache_add+0x23/0x44
[46320.796851]  [] ? 
lru_cache_add_active_or_unevictable+0x2d/0x6b
[46320.820156]  [] ? set_pte_at+0x9/0xd
[46320.836407]  [] ? handle_mm_fault+0x4f0/0xf06
[46320.854969]  [] ? do_mmap+0x2de/0x327
[46320.871422]  [] do_vfs_ioctl+0x3a1/0x414
[46320.888953]  [] ? __audit_syscall_entry+0xc0/0xe4
[46320.908531]  [] ? do_audit_syscall_entry+0x60/0x62
[46320.928487]  [] SyS_ioctl+0x57/0x79
[46320.944553]  [] entry_SYSCALL_64_fastpath+0x16/0x75
[46320.964603] Mem-Info:
[46320.972173] active_anon:40431 inactive_anon:129104 isolated_anon:0
[46320.972173]  active_file:414564 inactive_file:956231 isolated_file:0
[46320.972173]  unevictable:1220 dirty:227385 writeback:4016 unstable:0
[46320.972173]  slab_reclaimable:46015 slab_unreclaimable:67059
[46320.972173]  mapped:11769 shmem:2295 pagetables:2380 bounce:0
[46320.972173]  free:13404 free_pcp:1790 free_cma:0
[46321.081552] Node 0 DMA free:15888kB min:20kB low:24kB high:28kB 
active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB 
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB 
managed:15892kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB 
slab_reclaimable:0kB slab_unreclaimable:4kB kernel_stack:0kB pagetables:0kB 
unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB 
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[46321.209469] lowmem_reserve[]: 0 3201 7672 7672
[46321.223638] Node 0 DMA32 free:27816kB min:4640kB low:5800kB high:6960kB 
active_anon:72844kB inactive_anon:205672kB active_file:725732kB 
inactive_file:1484744kB unevictable:1524kB isolated(anon):0kB 
isolated(file):0kB present:3362068kB managed:3283032kB mlocked:1524kB 
dirty:206660kB writeback:189832kB mapped:18112kB shmem:3456kB 
slab_reclaimable:75980kB slab_unreclaimable:103932kB kernel_stack:4800kB 
pagetables:3932kB unstable:0kB bounce:0kB free_pcp:2160kB local_pcp:380kB 
free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[46321.370242] lowmem_reserve[]: 0 0 4471 4471
[46321.383748] Node 0 Normal free:14396kB min:6480kB low:8100kB high:9720kB 
active_anon:88268kB inactive_anon:310756kB active_file:1001176kB 
inactive_file:1915152kB unevictable:3356kB isolated(anon):0kB 
isolated(file):0kB present:4708352kB managed:4578512kB mlocked:120259087644kB 
dirty:274088kB writeback:255988kB mapped:28380kB shmem:5736kB 
slab_reclaimable:108288kB slab_unreclaimable:163908kB kernel_stack:7312kB 
pagetables:5540kB unstable:0kB bounce:0kB free_pcp:2468kB local_pcp:688kB 
free_cma:2628kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[46321.534455] lowmem_reserve[]: 0 0 0 0
[46321.546480] Node 0 DMA: 0*4kB 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 
1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15888kB
[46321.588226] Node 0 DMA32: 6462*4kB (UME) 1397*8kB (UM) 9*16kB (U) 0*32kB 
0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 37168kB
[46321.628535] Node 0 Normal: 4296*4kB (UMEC) 880*8kB (UMEC) 130*16kB (UMC) 
9*32kB (C) 1*64kB (C) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 
26656kB
[46321.672956] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 
hugepages_size=2048kB
[46321.699161] 1267593 total pagecache pages
[46321.712115] 933 pages in swap 

[PATCH v2 03/13] btrfs: maintain consistency in logging to help debugging

2016-02-12 Thread Anand Jain
Optional Label may or may not be set, or it might be set at some time
later. However while debugging to search through the kernel logs the
scripts would need the logs to be consistent, so logs search key words
shouldn't depend on the optional variables, instead fsid is better.

Signed-off-by: Anand Jain 
---
v2: fix commit log

 fs/btrfs/volumes.c | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 9860b10..36108e9 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1080,13 +1080,8 @@ int btrfs_scan_one_device(const char *path, fmode_t 
flags, void *holder,
 
ret = device_list_add(path, disk_super, devid, fs_devices_ret);
if (ret > 0) {
-   if (disk_super->label[0]) {
-   printk(KERN_INFO "BTRFS: device label %s ", 
disk_super->label);
-   } else {
-   printk(KERN_INFO "BTRFS: device fsid %pU ", 
disk_super->fsid);
-   }
-
-   printk(KERN_CONT "devid %llu transid %llu %s\n", devid, 
transid, path);
+   printk(KERN_INFO "BTRFS: device fsid %pU devid %llu transid 
%llu %s\n",
+   disk_super->fsid, devid, 
transid, path);
ret = 0;
}
if (!ret && fs_devices_ret)
-- 
2.7.0



Re: btrfs check help

2015-11-29 Thread Qu Wenruo



Vincent Olivier wrote on 2015/11/27 06:25 -0500:



On Nov 26, 2015, at 10:03 PM, Vincent Olivier <vinc...@up4.com> wrote:



On Nov 25, 2015, at 8:44 PM, Qu Wenruo <quwen...@cn.fujitsu.com> wrote:



Vincent Olivier wrote on 2015/11/25 11:51 -0500:

I should probably point out that there is 64GB of RAM on this machine and it's
a dual Xeon processor (LGA2011-3) system. Also, there is only Btrfs served via
Samba, and the kernel panic was caused by Btrfs (as per what I remember from the
log on the screen just before I rebooted) and happened in the middle of the
night when zero (0) clients were connected.

You will find below the full “btrfs check” log for each device in the order it 
is listed by “btrfs fi show”.


There is really no need to do such thing, as btrfs is able to manage multiple 
device, calling btrfsck on any of them is enough as long as it's not hugely 
damaged.



Can I get a strong confirmation that I should run with the “--repair” option on
each device? Thanks.


YES.

Inode nbytes fix is *VERY* safe as long as it's the only error.

Although it's not that convincing since the inode nbytes fix code is written by 
myself and authors always tend to believe their codes are good
But at least, some other users with more complicated problem(with inode nbytes 
error) fixed it.

The last decision is still on you anyway.


I will do it on the first device from the “fi show” output and report.



OK, this doesn't look good. I ran --repair and check again and it looks even
worse. Please help.


[root@3dcpc5 ~]# btrfs check --repair /dev/sdk
enabling repair mode
Checking filesystem on /dev/sdk
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents
Fixed 0 roots.
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
reset nbytes for ino 1341670 root 5
reset nbytes for ino 1341670 root 11406


As mentioned by the other guys, the inode nbytes error seems to be fixed.

But to make sure: is the inode a directory or a normal file?


warning line 3653


Seems to be an unexpected warning.
The subvolume root seems to be shared by another subvolume.

It may be a corner case for the inode nbytes repair code.
But it seems to do no harm yet.


checking csums
checking root refs
found 19343374874998 bytes used err is 0
total csum bytes: 18863243900
total tree bytes: 27413118976
total fs tree bytes: 4455694336
total extent tree bytes: 3077373952
btree space waste bytes: 2882193883
file data blocks allocated: 19461564538880
  referenced 20155355832320





root@3dcpc5 ~]# btrfs check /dev/sdk
Checking filesystem on /dev/sdk
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents
checking free space cache
block group 53328591454208 has wrong amount of free space
failed to load free space cache for block group 53328591454208
block group 53329665196032 has wrong amount of free space
failed to load free space cache for block group 53329665196032
Wanted offset 58836887044096, found 58836887011328
Wanted offset 58836887044096, found 58836887011328
cache appears valid but isnt 58836887011328
Wanted offset 60505481887744, found 60505481805824
Wanted offset 60505481887744, found 60505481805824
cache appears valid but isnt 60505481805824
Wanted bytes 16384, found 81920 for off 60979001966592
Wanted bytes 1073725440, found 81920 for off 60979001966592
cache appears valid but isnt 60979001950208
Wanted offset 61297908056064, found 61297908006912
Wanted offset 61297908056064, found 61297908006912
cache appears valid but isnt 61297903271936
Wanted bytes 32768, found 16384 for off 61711301296128
Wanted bytes 1066319872, found 16384 for off 61711301296128
cache appears valid but isnt 61711293874176
There is no free space entry for 62691824041984-62691824058368
There is no free space entry for 62691824041984-62692693901312
cache appears valid but isnt 62691620159488
There is no free space entry for 63723505205248-63723505221632
There is no free space entry for 63723505205248-63724559794176
cache appears valid but isnt 63723486052352
Wanted bytes 32768, found 16384 for off 64746920902656
Wanted bytes 914849792, found 16384 for off 64746920902656
cache appears valid but isnt 64746762010624
There is no free space entry for 65770368401408-65770368434176
There is no free space entry for 65770368401408-6577710720
cache appears valid but isnt 65770037968896
Wanted offset 66758954270720, found 66758954221568
Wanted offset 66758954270720, found 66758954221568
cache appears valid but isnt 66758954188800
block group 70204591702016 has wrong amount of free space
failed to load free space cache for block group 70204591702016
block group 70205665443840 has wrong amount of free space
failed to load free space cache for block group 70205665443840
block group 70206739185664 has wrong amount of free space
failed to load free space cache for block group 70206739185664
Wanted offset 70216543715328, found 70216543698944
Wanted offset 70216543715328, found 70216543698944
cache appears va

Re: btrfs check help

2015-11-27 Thread Henk Slager
My experience/interpretation of the two checks is that it is OK; see
some more comments inserted below. Hopefully a developer of
btrfs-progs can comment in more detail.

> [root@3dcpc5 ~]# btrfs check --repair /dev/sdk
> enabling repair mode
> Checking filesystem on /dev/sdk
> UUID: 6a742786-070d-4557-9e67-c73b84967bf5
> checking extents
> Fixed 0 roots.
> checking free space cache
> cache and super generation don't match, space cache will be invalidated
This might be there because of the crash earlier, but a cache
invalidation should not be a problem.
> checking fs roots
> reset nbytes for ino 1341670 root 5
> reset nbytes for ino 1341670 root 11406
At least the nbytes error seems to be fixed.
> warning line 3653
> checking csums
> checking root refs
> found 19343374874998 bytes used err is 0
> total csum bytes: 18863243900
> total tree bytes: 27413118976
> total fs tree bytes: 4455694336
> total extent tree bytes: 3077373952
> btree space waste bytes: 2882193883
> file data blocks allocated: 19461564538880
>  referenced 20155355832320

The second, read-only check partly can't deal with the just-invalidated
space cache, I think (I assume you haven't mounted and/or used the
filesystem read-write in between). But even if the space cache
weren't touched by the --repair check, my experience is that those
errors, like this one in dmesg on my system:
[38018.645187] BTRFS info (device sdi): The free space cache file
(6258971115520) is invalid. skip it
will disappear over time as the filesystem is filled/used.
This particular error is from a backup fs where one disk had gone bad.
A btrfs replace still worked, and just after that I saw many of those
errors, but now, after a few weeks, they are mostly gone. I did not
explicitly unmount or check --repair the fs; I just had to reboot the
system for another reason.
Your kernel and tools are new enough; you probably want to have a look at
the 'Space cache control' options on the wiki:
https://btrfs.wiki.kernel.org/index.php/Mount_options
before you decide what to do.
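
As one minimal sketch of those options, mounting with clear_cache throws away and rebuilds the free space cache (the mount point and device here are assumptions based on this thread):

umount /mnt
mount -o clear_cache /dev/sdk /mnt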

> root@3dcpc5 ~]# btrfs check /dev/sdk
> Checking filesystem on /dev/sdk
> UUID: 6a742786-070d-4557-9e67-c73b84967bf5
> checking extents
> checking free space cache
> block group 53328591454208 has wrong amount of free space
> failed to load free space cache for block group 53328591454208
> block group 53329665196032 has wrong amount of free space
> failed to load free space cache for block group 53329665196032
> Wanted offset 58836887044096, found 58836887011328
> Wanted offset 58836887044096, found 58836887011328
> cache appears valid but isnt 58836887011328
> Wanted offset 60505481887744, found 60505481805824
> Wanted offset 60505481887744, found 60505481805824
> cache appears valid but isnt 60505481805824
> Wanted bytes 16384, found 81920 for off 60979001966592
> Wanted bytes 1073725440, found 81920 for off 60979001966592
> cache appears valid but isnt 60979001950208
> Wanted offset 61297908056064, found 61297908006912
> Wanted offset 61297908056064, found 61297908006912
> cache appears valid but isnt 61297903271936
> Wanted bytes 32768, found 16384 for off 61711301296128
> Wanted bytes 1066319872, found 16384 for off 61711301296128
> cache appears valid but isnt 61711293874176
> There is no free space entry for 62691824041984-62691824058368
> There is no free space entry for 62691824041984-62692693901312
> cache appears valid but isnt 62691620159488
> There is no free space entry for 63723505205248-63723505221632
> There is no free space entry for 63723505205248-63724559794176
> cache appears valid but isnt 63723486052352
> Wanted bytes 32768, found 16384 for off 64746920902656
> Wanted bytes 914849792, found 16384 for off 64746920902656
> cache appears valid but isnt 64746762010624
> There is no free space entry for 65770368401408-65770368434176
> There is no free space entry for 65770368401408-6577710720
> cache appears valid but isnt 65770037968896
> Wanted offset 66758954270720, found 66758954221568
> Wanted offset 66758954270720, found 66758954221568
> cache appears valid but isnt 66758954188800
> block group 70204591702016 has wrong amount of free space
> failed to load free space cache for block group 70204591702016
> block group 70205665443840 has wrong amount of free space
> failed to load free space cache for block group 70205665443840
> block group 70206739185664 has wrong amount of free space
> failed to load free space cache for block group 70206739185664
> Wanted offset 70216543715328, found 70216543698944
> Wanted offset 70216543715328, found 70216543698944
> cache appears valid but isnt 70216537079808
> Wanted offset 71025067474944, found 71025067409408
> Wanted offset 71025067474944, found 71025067409408
> cache appears valid but isnt 71025064673280
> Wanted offset 71455641354240, found 71455641337856
> Wanted offset 71455641354240, found 71455641337856
> cache appears valid but isnt 71455635144704
> block group 71662867316736 has wrong amount of free space
> failed to load free space 

Re: btrfs check help

2015-11-27 Thread Vincent Olivier

> On Nov 26, 2015, at 10:03 PM, Vincent Olivier <vinc...@up4.com> wrote:
> 
>> 
>> On Nov 25, 2015, at 8:44 PM, Qu Wenruo <quwen...@cn.fujitsu.com> wrote:
>> 
>> 
>> 
>> Vincent Olivier wrote on 2015/11/25 11:51 -0500:
>>> I should probably point out that there is 64GB of RAM on this machine and 
>>> it’s a dual Xeon processor (LGA2011-3) system. Also, there is only Btrfs 
>>> served via Samba and the kernel panic was caused Btrfs (as per what I 
>>> remember from the log on the screen just before I rebooted) and happened in 
>>> the middle of the night when zero (0) client was connected.
>>> 
>>> You will find below the full “btrfs check” log for each device in the order 
>>> it is listed by “btrfs fi show”.
>> 
>> There is really no need to do such thing, as btrfs is able to manage 
>> multiple device, calling btrfsck on any of them is enough as long as it's 
>> not hugely damaged.
>> 
>>> 
>>> Can I get a strong confirmation that I should run with the “--repair” option 
>>> on each device? Thanks.
>> 
>> YES.
>> 
>> Inode nbytes fix is *VERY* safe as long as it's the only error.
>> 
>> Although it's not that convincing since the inode nbytes fix code is written 
>> by myself and authors always tend to believe their codes are good
>> But at least, some other users with more complicated problem(with inode 
>> nbytes error) fixed it.
>> 
>> The last decision is still on you anyway.
> 
> I will do it on the first device from the “fi show” output and report.


OK, this doesn't look good. I ran --repair and check again and it looks even
worse. Please help.


[root@3dcpc5 ~]# btrfs check --repair /dev/sdk
enabling repair mode
Checking filesystem on /dev/sdk
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents
Fixed 0 roots.
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
reset nbytes for ino 1341670 root 5
reset nbytes for ino 1341670 root 11406
warning line 3653
checking csums
checking root refs
found 19343374874998 bytes used err is 0
total csum bytes: 18863243900
total tree bytes: 27413118976
total fs tree bytes: 4455694336
total extent tree bytes: 3077373952
btree space waste bytes: 2882193883
file data blocks allocated: 19461564538880
 referenced 20155355832320





root@3dcpc5 ~]# btrfs check /dev/sdk
Checking filesystem on /dev/sdk
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents
checking free space cache
block group 53328591454208 has wrong amount of free space
failed to load free space cache for block group 53328591454208
block group 53329665196032 has wrong amount of free space
failed to load free space cache for block group 53329665196032
Wanted offset 58836887044096, found 58836887011328
Wanted offset 58836887044096, found 58836887011328
cache appears valid but isnt 58836887011328
Wanted offset 60505481887744, found 60505481805824
Wanted offset 60505481887744, found 60505481805824
cache appears valid but isnt 60505481805824
Wanted bytes 16384, found 81920 for off 60979001966592
Wanted bytes 1073725440, found 81920 for off 60979001966592
cache appears valid but isnt 60979001950208
Wanted offset 61297908056064, found 61297908006912
Wanted offset 61297908056064, found 61297908006912
cache appears valid but isnt 61297903271936
Wanted bytes 32768, found 16384 for off 61711301296128
Wanted bytes 1066319872, found 16384 for off 61711301296128
cache appears valid but isnt 61711293874176
There is no free space entry for 62691824041984-62691824058368
There is no free space entry for 62691824041984-62692693901312
cache appears valid but isnt 62691620159488
There is no free space entry for 63723505205248-63723505221632
There is no free space entry for 63723505205248-63724559794176
cache appears valid but isnt 63723486052352
Wanted bytes 32768, found 16384 for off 64746920902656
Wanted bytes 914849792, found 16384 for off 64746920902656
cache appears valid but isnt 64746762010624
There is no free space entry for 65770368401408-65770368434176
There is no free space entry for 65770368401408-6577710720
cache appears valid but isnt 65770037968896
Wanted offset 66758954270720, found 66758954221568
Wanted offset 66758954270720, found 66758954221568
cache appears valid but isnt 66758954188800
block group 70204591702016 has wrong amount of free space
failed to load free space cache for block group 70204591702016
block group 70205665443840 has wrong amount of free space
failed to load free space cache for block group 70205665443840
block group 70206739185664 has wrong amount of free space
failed to load free space cache for block group 70206739185664
Wanted offset 70216543715328, found 70216543698944
Wanted offset 70216543715328, found 70216543698944
cache appears valid b

Re: btrfs check help

2015-11-27 Thread Chris Murphy
On Fri, Nov 27, 2015 at 4:25 AM, Vincent Olivier  wrote:
>
> [root@3dcpc5 ~]# btrfs check --repair /dev/sdk
> enabling repair mode
> Checking filesystem on /dev/sdk
> UUID: 6a742786-070d-4557-9e67-c73b84967bf5
> checking extents
> Fixed 0 roots.
> checking free space cache
> cache and super generation don't match, space cache will be invalidated
> checking fs roots
> reset nbytes for ino 1341670 root 5
> reset nbytes for ino 1341670 root 11406
> warning line 3653


I'm not sure what that last line means.



> root@3dcpc5 ~]# btrfs check /dev/sdk
> Checking filesystem on /dev/sdk
> UUID: 6a742786-070d-4557-9e67-c73b84967bf5
> checking extents
> checking free space cache
> block group 53328591454208 has wrong amount of free space
> failed to load free space cache for block group 53328591454208
> block group 53329665196032 has wrong amount of free space
> failed to load free space cache for block group 53329665196032
> Wanted offset 58836887044096, found 58836887011328
> Wanted offset 58836887044096, found 58836887011328
> cache appears valid but isnt 58836887011328
> Wanted offset 60505481887744, found 60505481805824
> Wanted offset 60505481887744, found 60505481805824
> cache appears valid but isnt 60505481805824
> Wanted bytes 16384, found 81920 for off 60979001966592
> Wanted bytes 1073725440, found 81920 for off 60979001966592
> cache appears valid but isnt 60979001950208
> Wanted offset 61297908056064, found 61297908006912
> Wanted offset 61297908056064, found 61297908006912
> cache appears valid but isnt 61297903271936
> Wanted bytes 32768, found 16384 for off 61711301296128
> Wanted bytes 1066319872, found 16384 for off 61711301296128
> cache appears valid but isnt 61711293874176
> There is no free space entry for 62691824041984-62691824058368
> There is no free space entry for 62691824041984-62692693901312
> cache appears valid but isnt 62691620159488
> There is no free space entry for 63723505205248-63723505221632
> There is no free space entry for 63723505205248-63724559794176
> cache appears valid but isnt 63723486052352
> Wanted bytes 32768, found 16384 for off 64746920902656
> Wanted bytes 914849792, found 16384 for off 64746920902656
> cache appears valid but isnt 64746762010624
> There is no free space entry for 65770368401408-65770368434176
> There is no free space entry for 65770368401408-6577710720
> cache appears valid but isnt 65770037968896
> Wanted offset 66758954270720, found 66758954221568
> Wanted offset 66758954270720, found 66758954221568
> cache appears valid but isnt 66758954188800
> block group 70204591702016 has wrong amount of free space
> failed to load free space cache for block group 70204591702016
> block group 70205665443840 has wrong amount of free space
> failed to load free space cache for block group 70205665443840
> block group 70206739185664 has wrong amount of free space
> failed to load free space cache for block group 70206739185664
> Wanted offset 70216543715328, found 70216543698944
> Wanted offset 70216543715328, found 70216543698944
> cache appears valid but isnt 70216537079808
> Wanted offset 71025067474944, found 71025067409408
> Wanted offset 71025067474944, found 71025067409408
> cache appears valid but isnt 71025064673280
> Wanted offset 71455641354240, found 71455641337856
> Wanted offset 71455641354240, found 71455641337856
> cache appears valid but isnt 71455635144704
> block group 71662867316736 has wrong amount of free space
> failed to load free space cache for block group 71662867316736
> block group 71663941058560 has wrong amount of free space
> failed to load free space cache for block group 71663941058560
> There is no free space entry for 72725872967680-72725872984064
> There is no free space entry for 72725872967680-72726945464320
> cache appears valid but isnt 72725871722496
> block group 73207981801472 has wrong amount of free space
> failed to load free space cache for block group 73207981801472
> found 19343374940534 bytes used err is -22
> total csum bytes: 18863243900
> total tree bytes: 27413184512
> total fs tree bytes: 4455727104
> total extent tree bytes: 3077406720
> btree space waste bytes: 2882234096
> file data blocks allocated: 19461573357568
>  referenced 20155367563264


Except for the "bytes used err is -22", I think this is just
acknowledging that the space caches are invalid, i.e. not a surprise.
They should get rebuilt at mount time; depending on the size of the file
system, it might take a while (?).


-- 
Chris Murphy


Re: btrfs check help

2015-11-26 Thread Vincent Olivier

> On Nov 25, 2015, at 8:44 PM, Qu Wenruo  wrote:
> 
> 
> 
> Vincent Olivier wrote on 2015/11/25 11:51 -0500:
>> I should probably point out that there is 64GB of RAM on this machine and 
>> it’s a dual Xeon processor (LGA2011-3) system. Also, there is only Btrfs 
>> served via Samba and the kernel panic was caused Btrfs (as per what I 
>> remember from the log on the screen just before I rebooted) and happened in 
>> the middle of the night when zero (0) client was connected.
>> 
>> You will find below the full “btrfs check” log for each device in the order 
>> it is listed by “btrfs fi show”.
> 
> There is really no need to do such thing, as btrfs is able to manage multiple 
> device, calling btrfsck on any of them is enough as long as it's not hugely 
> damaged.
> 
>> 
>> Can I get a strong confirmation that I should run with the “--repair” option 
>> on each device? Thanks.
> 
> YES.
> 
> Inode nbytes fix is *VERY* safe as long as it's the only error.
> 
> Although it's not that convincing since the inode nbytes fix code is written 
> by myself and authors always tend to believe their codes are good
> But at least, some other users with more complicated problem(with inode 
> nbytes error) fixed it.
> 
> The last decision is still on you anyway.

I will do it on the first device from the “fi show” output and report.

Thanks,

Vincent



Re: btrfs check help

2015-11-25 Thread Vincent Olivier
I should probably point out that there is 64GB of RAM on this machine and it's
a dual Xeon processor (LGA2011-3) system. Also, there is only Btrfs served via
Samba, and the kernel panic was caused by Btrfs (as per what I remember from the
log on the screen just before I rebooted) and happened in the middle of the
night when zero (0) clients were connected.

You will find below the full “btrfs check” log for each device in the order it 
is listed by “btrfs fi show”.

Can I get a strong confirmation that I should run with the “--repair” option on
each device? Thanks.

Vincent


Checking filesystem on /dev/sdk
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [o]
checking free space cache [.]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong

found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
referenced 20138885959680
Checking filesystem on /dev/sdp
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [O]
checking free space cache [o]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong

found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
referenced 20138885959680
Checking filesystem on /dev/sdi
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [.]
checking free space cache [.]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong

found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
referenced 20138885959680
Checking filesystem on /dev/sdq
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [o]
checking free space cache [o]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong

found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
referenced 20138885959680
Checking filesystem on /dev/sdh
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [o]
checking free space cache [.]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong

found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
referenced 20138885959680
Checking filesystem on /dev/sdm
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [O]
checking free space cache [.]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong

found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
referenced 20138885959680
Checking filesystem on /dev/sdj
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [.]
checking free space cache [.]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong

found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
referenced 20138885959680
Checking filesystem on /dev/sdo
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [O]
checking free space cache [.]
checking fs roots [o]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong

found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 2881050910
file data blocks allocated: 19445786390528
referenced 20138885959680
Checking filesystem on /dev/sdg
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents [o]
checking free space cache [o]
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong

found 19328980191604 bytes used err is 1
total csum bytes: 18849205856
total tree bytes: 27393392640
total fs tree bytes: 4452958208
total extent tree bytes: 3075571712
btree space waste bytes: 

Re: btrfs check help

2015-11-25 Thread Henk Slager
[...]
> Can I get a strong confirmation that I should run with the “--repair” option on 
> each device? Thanks.
>
> Vincent
>
>
> Checking filesystem on /dev/sdk
> UUID: 6a742786-070d-4557-9e67-c73b84967bf5
> checking extents [o]
> checking free space cache [.]
> root 5 inode 1341670 errors 400, nbytes wrong
> root 11406 inode 1341670 errors 400, nbytes wrong
[...]

I just remember that I have seen this kind of error before; luckily, I
found the btrfs check output (August 2015) in some backup of an old
snapshot. In my case it was on a raid5 fs from November 2013: 7 small
txt files (all several hundred bytes), and the 7 errors are repeated for
about 10 snapshots. I did a 'find . -inum' with the reported inode numbers
to find the files. Two of the 7 were still in the latest/actual subvol and I
just recreated them.
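
A minimal sketch of that lookup, using the inode number reported earlier in this thread (the mount point is an assumption; inode numbers are per subvolume on btrfs, so run it under each subvolume that reports the error):

$ find /mnt -inum 1341670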

The errors from the older snapshots are still there, as far as I
remember from the last btrfs check I did (with kernel 4.3.0, tools
4.3.x). The fs was converted to raid10 three months ago. As I also got
other fake errors (as in this:
https://www.mail-archive.com/linux-btrfs%40vger.kernel.org/msg48325.html
), I won't run a repair until I see proof that this 'errors 400,
nbytes wrong' is a risk for file-server stability.
I also see that on an archive clone fs with these 10 old snapshots
(created via send|receive), there is no error.

In your case, it is likely just one small file in the root volume (5) and the
same allocation in the other subvol (11406), so maybe you can fix this
the way I did and not run '--repair'.


Re: btrfs check help

2015-11-24 Thread Austin S Hemmelgarn

On 2015-11-24 12:06, Vincent Olivier wrote:

Hi,

Woke up this morning with a kernel panic (for which I do not have details).
Please find below the output of btrfs check. Is this normal? What should I do?
Arch Linux 4.2.5. Btrfs-utils 4.3.1. 17x4TB RAID10.
You get bonus points for being on a reasonably up-to-date kernel and 
userspace :)


This is actually a pretty tame check result for a filesystem that's been 
through kernel panic. I think everything listed here is safe for check 
to fix, but I would suggest waiting until the devs provide opinions 
before actually running with --repair.  I would also suggest comparing 
results between the different devices in the FS, if things are 
drastically different, you may have issues that check can't fix on it's own.
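For example, something along these lines could capture a read-only check per
device for comparison (device names are just placeholders, and the filesystem
should be unmounted while check runs):

for dev in /dev/sdj /dev/sdk /dev/sdo; do
   btrfs check "$dev" > "check-${dev##*/}.log" 2>&1
done
diff check-sdj.log check-sdk.log        # identical output is a good sign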

> [root@3dcpc5 ~]# btrfs check /dev/sdk
> Checking filesystem on /dev/sdk
> UUID: 6a742786-070d-4557-9e67-c73b84967bf5
> checking extents
> checking free space cache
> checking fs roots

These next two lines are errors, but I'm not 100% certain if it's safe
to have check fix them:

> root 5 inode 1341670 errors 400, nbytes wrong
> root 11406 inode 1341670 errors 400, nbytes wrong

This next one is also an error, and I am fairly certain that it's safe
to have check fix as long as the number at the end is not too big.

> found 19328809638262 bytes used err is 1

The rest is just reference info:

> total csum bytes: 18849042724
> total tree bytes: 27389886464
> total fs tree bytes: 4449746944
> total extent tree bytes: 3075457024
> btree space waste bytes: 2880474254

The only other thing I know that's worth mentioning is that if the
numbers on these next two lines don't match, you may be missing some
writes from right before the crash.

> file data blocks allocated: 19430708535296
> referenced 20123773407232
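If the eventual advice is to go ahead, the invocation itself would presumably
just be the following, run against the unmounted filesystem (one member device
should be enough, since check opens the whole multi-device fs; the mount point
here is made up):

umount /mnt/pool
btrfs check /dev/sdk                    # read-only pass first, to confirm the errors
btrfs check --repair /dev/sdk           # only after the read-only pass and a current backup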








Re: btrfs check help

2015-11-24 Thread Hugo Mills
On Tue, Nov 24, 2015 at 03:28:28PM -0500, Austin S Hemmelgarn wrote:
> On 2015-11-24 12:06, Vincent Olivier wrote:
> >Hi,
> >
> >Woke up this morning with a kernel panic (for which I do not have details). 
> >Please find below the output for btrfs check. Is this normal? What should I 
> >do? Arch Linux 4.2.5. Btrfs-utils 4.3.1. 17x4TB RAID10.
> You get bonus points for being on a reasonably up-to-date kernel and
> userspace :)
> 
> This is actually a pretty tame check result for a filesystem that's
> been through kernel panic. I think everything listed here is safe
> for check to fix, but I would suggest waiting until the devs provide
> opinions before actually running with --repair.  I would also
> suggest comparing results between the different devices in the FS,
> if things are drastically different, you may have issues that check
> can't fix on it's own.
> >[root@3dcpc5 ~]# btrfs check /dev/sdk
> >Checking filesystem on /dev/sdk
> >UUID: 6a742786-070d-4557-9e67-c73b84967bf5
> >checking extents
> >checking free space cache
> >checking fs roots
> These next two lines are errors, but I'm not 100% certain if it's
> safe to have check fix them:
> >root 5 inode 1341670 errors 400, nbytes wrong
> >root 11406 inode 1341670 errors 400, nbytes wrong

   I think so, yes.

> This next one is also an error, and I am fairly certain that it's
> safe to have check fix as long as the number at the end is not too
> big.
> >found 19328809638262 bytes used err is 1

   Agreed.

   Hugo.

> The rest is just reference info
> >total csum bytes: 18849042724
> >total tree bytes: 27389886464
> >total fs tree bytes: 4449746944
> >total extent tree bytes: 3075457024
> >btree space waste bytes: 2880474254
> The only other thing I know that's worth mentioning is that if the
> numbers on these next two lines don't match, you may be missing some
> writes from right before the crash.
> >file data blocks allocated: 19430708535296
> >referenced 20123773407232

-- 
Hugo Mills | Great films about cricket: Umpire of the Rising Sun
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4  |




btrfs check help

2015-11-24 Thread Vincent Olivier
Hi,

Woke up this morning with a kernel panic (for which I do not have details). 
Please find below the output for btrfs check. Is this normal? What should I do? 
Arch Linux 4.2.5. Btrfs-utils 4.3.1. 17x4TB RAID10.

Regards,

Vincent

[root@3dcpc5 ~]# btrfs check /dev/sdk
Checking filesystem on /dev/sdk
UUID: 6a742786-070d-4557-9e67-c73b84967bf5
checking extents
checking free space cache
checking fs roots
root 5 inode 1341670 errors 400, nbytes wrong
root 11406 inode 1341670 errors 400, nbytes wrong
found 19328809638262 bytes used err is 1
total csum bytes: 18849042724
total tree bytes: 27389886464
total fs tree bytes: 4449746944
total extent tree bytes: 3075457024
btree space waste bytes: 2880474254
file data blocks allocated: 19430708535296
referenced 20123773407232


Re: corrupted RAID1: unsuccessful recovery / help needed

2015-10-30 Thread Duncan
Lukas Pirl posted on Fri, 30 Oct 2015 10:43:41 +1300 as excerpted:

> If there is one subvolume that contains all other (read only) snapshots
> and there is insufficient storage to copy them all separately:
> Is there an elegant way to preserve those when moving the data across
> disks?

AFAIK, no elegant way without a writable mount.

Tho I'm not sure, btrfs send, to a btrfs elsewhere using receive, may 
work, since you did specify read-only snapshots, which is what send 
normally works with in order to avoid changes to the snapshot while 
it's sending it.  My own use-case doesn't involve either snapshots or 
send/receive, however, so I'm not sure if send can work with a read-only 
filesystem or not, but I think its normal method of operation is to 
create those read-only snapshots itself, which of course would require a 
writable filesystem, so I'm guessing it won't work unless you can 
convince it to use the read-only mounts as-is.

The less elegant way would involve manual deduplication.  Copy one 
snapshot, then another, and dedup what hasn't changed between the two, 
then add a third and dedup again. ...  Depending on the level of dedup 
(file vs block level) and the level of change in your filesystem, this 
should ultimately take about the same level of space as a full backup 
plus a series of incrementals.
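As an illustration only (duperemove is just one block-level dedup option, the
target is assumed to be another btrfs, and all the paths and dates below are
made up):

cp -a /mnt/old/.snapshots/2015-08-01 /mnt/new/           # first snapshot, copied in full
cp -a /mnt/old/.snapshots/2015-08-08 /mnt/new/           # second snapshot
duperemove -dr /mnt/new/2015-08-01 /mnt/new/2015-08-08   # dedup what the two share
cp -a /mnt/old/.snapshots/2015-08-15 /mnt/new/           # add a third, then dedup again
duperemove -dr /mnt/new/2015-08-*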


Meanwhile, this does reinforce the point that snapshots don't replace 
full backups, that being the reason I don't use them here, since if the 
filesystem goes bad, it'll very likely take all the snapshots with it.

Snapshots do tend to be pretty convenient, arguably /too/ convenient and 
near-zero-cost to make, as people then tend to just do scheduled 
snapshots, without thinking about their overhead and maintenance costs on 
the filesystem, until they already have problems.  I'm not sure if you 
are a regular list reader and have thus seen my normal spiel on btrfs 
snapshot scaling and recommended limits to avoid problems or not, so if 
not, here's a slightly condensed version...

Btrfs has scaling issues that appear when trying to manage too many 
snapshots.  These tend to appear first in tools like balance and check, 
where time to process a filesystem goes up dramatically as the number of 
snapshots increases, to the point where it can become entirely 
impractical to manage at all somewhere near the 100k snapshots range, and 
is already dramatically affecting runtime at 10k snapshots.

As a result, I recommend keeping per-subvol snapshots to 250-ish, which 
will allow snapshotting four subvolumes while still keeping total 
filesystem snapshots to 1000, or eight subvolumes at a filesystem total 
of 2000 snapshots, levels where the scaling issues should remain well 
within control.  And 250-ish snapshots per subvolume is actually very 
reasonable even with half-hour scheduled snapshotting, provided a 
reasonable scheduled snapshot thinning program is also implemented, 
cutting say to hourly after six hours, six-hourly after a day, 12 hourly 
after 2 days, daily after a week, and weekly after four weeks to a 
quarter (13 weeks).  Out beyond a quarter or two, certainly within a 
year, longer term backups to other media should be done, and snapshots 
beyond that can be removed entirely, freeing up the space the old 
snapshots kept locked down and helping to keep the btrfs healthy and 
functioning well within its practical scalability limits.
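A very rough sketch of the thinning idea, assuming date-named read-only
snapshots all live under one directory (the path, the naming scheme and the
13-week cutoff are all assumptions):

cutoff=$(date -d '13 weeks ago' +%Y%m%d)     # GNU date
for snap in /mnt/pool/.snapshots/*; do
   stamp=${snap##*/}                         # e.g. 20150801-1230
   stamp=${stamp%%-*}                        # keep the YYYYMMDD part
   if [ "$stamp" -lt "$cutoff" ]; then
      btrfs subvolume delete "$snap"         # frees the space the old snapshot pinned
   fi
done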

Because a balance that takes a month to complete because it's dealing 
with a few hundred k snapshots is in practice (for most people) not 
worthwhile to do at all, and also in practice, a year or even six months 
out, are you really going to care about the precise half-hour snapshot, 
or is the next daily or weekly snapshot going to be just as good, and a 
whole lot easier to find among a couple hundred snapshots than hundreds 
of thousands?

If you have far too many snapshots, perhaps this sort of thinning 
strategy will as well allow you to copy and dedup only key snapshots, say 
weekly plus daily for the last week, doing the backup thing manually, as 
well, modifying the thinning strategy accordingly if necessary to get it 
to fit.  Tho using the copy and dedup strategy above will still require 
at least double the full space of a single copy, plus the space necessary 
for each deduped snapshot copy you keep, since the dedup occurs after the 
copy.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: corrupted RAID1: unsuccessful recovery / help needed

2015-10-30 Thread Duncan
Lukas Pirl posted on Fri, 30 Oct 2015 10:43:41 +1300 as excerpted:

> Is e.g. "balance" also influenced by the userspace tools or does
> the kernel do the actual work?

btrfs balance is done "online", that is, on the (writable-)mounted 
filesystem, and the kernel does the real work.  It's the tools that work 
on the unmounted filesystem, btrfs check, btrfs restore, btrfs rescue, 
etc, where the userspace code does the real work, and thus where being 
current and having all the latest userspace fixes is vital.

If you can't mount writable, you can't balance.
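In other words, roughly (mount point and device are placeholders):

btrfs balance start /mnt/pool        # online: needs a writable mount, the kernel does the work
umount /mnt/pool
btrfs check /dev/sdc                 # offline: userspace walks the trees itself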

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: corrupted RAID1: unsuccessful recovery / help needed

2015-10-30 Thread Hugo Mills
On Fri, Oct 30, 2015 at 10:58:47AM +, Duncan wrote:
> Lukas Pirl posted on Fri, 30 Oct 2015 10:43:41 +1300 as excerpted:
> 
> > If there is one subvolume that contains all other (read only) snapshots
> > and there is insufficient storage to copy them all separately:
> > Is there an elegant way to preserve those when moving the data across
> > disks?

   If they're read-only snapshots already, then yes:

# $sent deliberately starts empty and stays unquoted: it accumulates
# one -c<subvol> clone-source flag per snapshot already sent
sent=
for sub in *; do
   btrfs send $sent $sub | btrfs receive /where/ever
   sent="$sent -c$sub"
done

   That will preserve the shared extents between the subvols on the
receiving FS.

   If they're not read-only, then snapshotting each one again as RO
before sending would be the approach, but if your FS is itself RO,
that's not going to be possible, and you need to look at Duncan's
email.
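i.e. something like this, run on the writable source fs (the ro/ directory
name is arbitrary):

mkdir ro
for sub in *; do
   [ "$sub" = ro ] && continue
   btrfs subvolume snapshot -r "$sub" "ro/$sub"   # read-only snapshot of each subvol
done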

   Hugo.

> AFAIK, no elegant way without a writable mount.
> 
> Tho I'm not sure, btrfs send, to a btrfs elsewhere using receive, may 
> work, since you did specify read-only snapshots, which is what send 
> normally works with in order to avoid changes to the snapshot while 
> it's sending it.  My own use-case doesn't involve either snapshots or 
> send/receive, however, so I'm not sure if send can work with a read-only 
> filesystem or not, but I think its normal method of operation is to 
> create those read-only snapshots itself, which of course would require a 
> writable filesystem, so I'm guessing it won't work unless you can 
> convince it to use the read-only mounts as-is.
> 
> The less elegant way would involve manual deduplication.  Copy one 
> snapshot, then another, and dedup what hasn't changed between the two, 
> then add a third and dedup again. ...  Depending on the level of dedup 
> (file vs block level) and the level of change in your filesystem, this 
> should ultimately take about the same level of space as a full backup 
> plus a series of incrementals.
> 
> 
> Meanwhile, this does reinforce the point that snapshots don't replace 
> full backups, that being the reason I don't use them here, since if the 
> filesystem goes bad, it'll very likely take all the snapshots with it.
> 
> Snapshots do tend to be pretty convenient, arguably /too/ convenient and 
> near-zero-cost to make, as people then tend to just do scheduled 
> snapshots, without thinking about their overhead and maintenance costs on 
> the filesystem, until they already have problems.  I'm not sure if you 
> are a regular list reader and have thus seen my normal spiel on btrfs 
> snapshot scaling and recommended limits to avoid problems or not, so if 
> not, here's a slightly condensed version...
> 
> Btrfs has scaling issues that appear when trying to manage too many 
> snapshots.  These tend to appear first in tools like balance and check, 
> where time to process a filesystem goes up dramatically as the number of 
> snapshots increases, to the point where it can become entirely 
> impractical to manage at all somewhere near the 100k snapshots range, and 
> is already dramatically affecting runtime at 10k snapshots.
> 
> As a result, I recommend keeping per-subvol snapshots to 250-ish, which 
> will allow snapshotting four subvolumes while still keeping total 
> filesystem snapshots to 1000, or eight subvolumes at a filesystem total 
> of 2000 snapshots, levels where the scaling issues should remain well 
> within control.  And 250-ish snapshots per subvolume is actually very 
> reasonable even with half-hour scheduled snapshotting, provided a 
> reasonable scheduled snapshot thinning program is also implemented, 
> cutting say to hourly after six hours, six-hourly after a day, 12 hourly 
> after 2 days, daily after a week, and weekly after four weeks to a 
> quarter (13 weeks).  Out beyond a quarter or two, certainly within a 
> year, longer term backups to other media should be done, and snapshots 
> beyond that can be removed entirely, freeing up the space the old 
> snapshots kept locked down and helping to keep the btrfs healthy and 
> functioning well within its practical scalability limits.
> 
> Because a balance that takes a month to complete because it's dealing 
> with a few hundred k snapshots is in practice (for most people) not 
> worthwhile to do at all, and also in practice, a year or even six months 
> out, are you really going to care about the precise half-hour snapshot, 
> or is the next daily or weekly snapshot going to be just as good, and a 
> whole lot easier to find among a couple hundred snapshots than hundreds 
> of thousands?
> 
> If you have far too many snapshots, perhaps this sort of thinning 
> strategy will as well allow you to copy and dedup only key snapshots, say 
> weekly plus daily for the last week, doing the backup thing manually, as 
> well, modifying the thinning strategy accordingly if necessary to get it 
> to fit.  Tho using the copy and dedup strategy above will still require 
> at least double the full space of a single copy, plus the space necessary 
> for each deduped snapshot 
