Re: [markfasheh/duperemove] Why blocksize is limited to 1MB?

2017-01-02 Thread Xin Zhou
Hi,

Before doing that, probably one way to think about it could be:
what would be the probability of two 100M blocks generating the same hash and being
treated as identical?
Thanks,
Xin
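
For a rough sense of those numbers, here is a small back-of-the-envelope sketch in Python using the birthday bound; the 8 TiB data size and the 128-bit digest width are illustrative assumptions, not duperemove's actual configuration:

# Birthday-bound estimate of an accidental collision between hashes of
# blocks that are NOT identical. Volume size and digest width are
# illustrative assumptions, not duperemove's actual configuration.
from math import expm1

def collision_probability(volume_bytes, block_bytes, digest_bits):
    n = volume_bytes // block_bytes          # number of hashed blocks
    space = 2 ** digest_bits                 # number of possible digests
    # P(any collision) ~= 1 - exp(-n*(n-1) / (2*space))
    return -expm1(-n * (n - 1) / (2.0 * space))

volume = 8 * 1024 ** 4                       # 8 TiB of hashed data
for block in (128 * 1024, 1024 ** 2, 100 * 1024 ** 2):
    p = collision_probability(volume, block, digest_bits=128)
    print("blocksize %12d bytes: ~%d blocks, collision probability ~ %.3e"
          % (block, volume // block, p))

Even with the smallest blocksize the accidental-collision probability stays astronomically small, so in practice the argument against huge blocks is more about dedupe granularity than hash safety.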
 
 

Sent: Monday, January 02, 2017 at 4:32 AM
From: "Peter Becker" <floyd@gmail.com>
To: "Xin Zhou" <xin.z...@gmx.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [markfasheh/duperemove] Why blocksize is limited to 1MB?
> 1M is already a little bit too big in size.

Not in my usecase :)

Is it right that this isn't a limit in btrfs? So I can patch this and try 100M.
The reason is that I must dedupe the whole 8 TB in less than a day,
but with a 128K or 1M blocksize it will take a week.

I don't know why adding extents takes so long.
I/O during adding extents is less than 4MB/s, and CPU (dual core) and
memory (8 GB) usage are less than 20%, on bare metal.

2017-01-01 5:38 GMT+01:00 Xin Zhou <xin.z...@gmx.com>:
> Hi,
>
> In general, the larger the block / chunk size is, the less dedup can be 
> achieved.
> 1M is already a little bit too big in size.
>
> Thanks,
> Xin
>
>
>
>
> Sent: Friday, December 30, 2016 at 12:28 PM
> From: "Peter Becker" <floyd@gmail.com>
> To: linux-btrfs <linux-btrfs@vger.kernel.org>
> Subject: [markfasheh/duperemove] Why blocksize is limited to 1MB?
> Hello, I have an 8 TB volume with multiple files of hundreds of GB each.
> I am trying to dedupe this because the first hundred GB of many files are identical.
> With a 128KB blocksize and the nofiemap and lookup-extents=no options, it will
> take more than a week (dedupe only, already hashed). So I tried -b
> 100M but this returned an error: "Blocksize is bounded ...".
>
> The reason is that the blocksize is limited to
>
> #define MAX_BLOCKSIZE (1024U*1024)
>
> But I can't find any description of why.


Re: [markfasheh/duperemove] Why blocksize is limited to 1MB?

2016-12-31 Thread Xin Zhou
Hi,

In general, the larger the block / chunk size is, the less dedup can be 
achieved.
1M is already a little bit too big in size.

Thanks,
Xin
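
As a toy illustration of that point (the sizes below are made-up round numbers, loosely mirroring the shared-prefix scenario in the original message quoted below): fixed-size block dedupe can only match whole, aligned blocks, so duplicated regions smaller than one block, or the unaligned tail of a larger region, are missed entirely.

# Toy model: a duplicated region only dedupes in whole blocksize units,
# so anything smaller than one block (or the tail of a region) is missed.
# The region sizes are made-up examples, not measurements.
def dedupable_bytes(duplicate_region_bytes, block_bytes):
    return (duplicate_region_bytes // block_bytes) * block_bytes

regions = {
    "100 GB shared file prefix": 100 * 10 ** 9,
    "60 MB duplicated chunk": 60 * 1024 ** 2,
}
for name, size in regions.items():
    for block in (128 * 1024, 1024 ** 2, 100 * 1024 ** 2):
        found = dedupable_bytes(size, block)
        print("%-26s blocksize %12d: %d of %d bytes dedupable"
              % (name, block, found, size))

At a 100M blocksize the 60 MB duplicate is not found at all, which is the sense in which a larger block / chunk size achieves less dedup.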

 
 

Sent: Friday, December 30, 2016 at 12:28 PM
From: "Peter Becker" 
To: linux-btrfs 
Subject: [markfasheh/duperemove] Why blocksize is limited to 1MB?
Hello, I have an 8 TB volume with multiple files of hundreds of GB each.
I am trying to dedupe this because the first hundred GB of many files are identical.
With a 128KB blocksize and the nofiemap and lookup-extents=no options, it will
take more than a week (dedupe only, already hashed). So I tried -b
100M but this returned an error: "Blocksize is bounded ...".

The reason is that the blocksize is limited to

#define MAX_BLOCKSIZE (1024U*1024)

But I can't find any description of why.


Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system during the snapshot receive

2016-12-26 Thread Xin Zhou
>Unless there are bugs that
>would show up in other situations as well (or an out-of-space condition
>is triggered that would likewise show up in other situations with a
>similar amount of data/metadata written),
 
That is exactly where some bugs come from.
For simple cases, it is OK to assume the send/receive always succeeds.
And if it errors out, one assumes the delete always succeeds, the file system is
in a consistent state,
and good luck with the data.



Xin




Sent: Sunday, December 25, 2016 at 7:52 PM
From: Duncan <1i5t5.dun...@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system 
during the snapshot receive
Xin Zhou posted on Mon, 26 Dec 2016 03:36:09 +0100 as excerpted:

> One interesting thing to investigate might be the btrfs send / receive
> result, under a disruptive network environment. If the connection breaks
> in the middle of transfer (at different phase, maybe), see what could be
> the file system status.

Btrfs send, sends from a read-only snapshot, so the sending filesystem
shouldn't be harmed no matter what happens to send.

Btrfs receive does all its work in a new subvolume (basically a snapshot
in an incremental send, tho I'm not sure it's a full snapshot in the
technical sense), modifying the files therein using standard calls used
in other contexts as well, so absent bugs that should appear in those
other contexts too if they exist, the worst damage that a receive should
be able to do is an unfaithful replay of the send stream, such that an
appropriate copy of the sent snapshot doesn't appear on the receiver.

Which means even in the case of error, cleanup is as simple as deleting
the aborted/incompletely-received subvolume. Unless there are bugs that
would show up in other situations as well (or an out-of-space condition
is triggered that would likewise show up in other situations with a
similar amount of data/metadata written), there should be no effects
outside that received subvolume.
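
A minimal sketch of that cleanup step, assuming the send stream was first saved to a file; the paths are placeholders, and only the standard btrfs receive and btrfs subvolume delete commands are used:

# Sketch: replay a saved send stream and, if receive fails, delete the
# partial subvolume it left behind. All paths are placeholders.
import subprocess
import sys

stream_file = "/mnt/backup/root-2016-12-19.send"        # placeholder
receive_dir = "/mnt/backup/btrfsreceive/root/.part"     # placeholder
partial_subvol = receive_dir + "/root-2016-12-19"       # what receive creates

with open(stream_file, "rb") as stream:
    result = subprocess.run(["btrfs", "receive", receive_dir], stdin=stream)

if result.returncode != 0:
    print("receive failed; deleting the partial subvolume", file=sys.stderr)
    subprocess.run(["btrfs", "subvolume", "delete", partial_subvol], check=False)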

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman



Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system during the snapshot receive

2016-12-26 Thread Xin Zhou

That is one way to diagnose the issue in the data path.
If ssh could guarantee data transfer and retry, then those data protection
companies would not need a whole team to handle send / receive for remote
data backup.

In your case, if the connection load is very light, then the issue could be
somewhere else.

Xin 
 

Sent: Monday, December 26, 2016 at 3:04 AM
From: "Giuseppe Della Bianca" <b...@adria.it>
To: "Xin Zhou" <xin.z...@gmx.com>, "Btrfs BTRFS" <linux-btrfs@vger.kernel.org>
Subject: Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system 
during the snapshot receive
Hi.

I agree with Duncan, and I add:

- For remote transfers, ssh is used.
ssh is designed to ensure the integrity of the data.
- Remote transfers use Gigabit Ethernet; it is never congested.
- I had the same problems with a local btrfs receive.
- The script currently has 907 lines of code, many of which are there to ensure the
detection and display of btrfs tools errors.
- The script stops executing when the btrfs tools return an error code.
- It is not possible that the script fails to display error messages or ignores the
error codes of the btrfs tools.

An example of today:

(2016-12-26 10:53:51) Start btrfsManage
. . . Start managing SEND ' / ' filesystem ' root ' snapshot in ' /dev/sda2 '

Sending ' root-2016-12-04_18:13:57.35 ' source snapshot to ' btrfsreceive ' 
subvolume
. . . btrfs send -p 
/tmp/tmp.xJWkEN1U23/btrfssnapshot/root/root-2016-12-03_18:07:09.34 
/tmp/tmp.xJWkEN1U23/btrfssnapshot/root/root-2016-12-04_18:13:57.35 | btrfs 
receive /tmp/tmp.pWWKP4vfAy/btrfsreceive/root/.part/
. . . At subvol 
/tmp/tmp.xJWkEN1U23/btrfssnapshot/root/root-2016-12-04_18:13:57.35
. . . ERROR: truncate usr/share/locale/it/LC_MESSAGES/kio4.mo failed: Read-only 
file system
. . . At snapshot root-2016-12-04_18:13:57.35
. . . _EC_ERR_ 1
. . . _EC_ERR_ 141

(2016-12-26 10:54:28) End btrfsManage
. . . End managing SEND ' / ' filesystem ' root ' snapshot in ' /dev/sda2 '
WITH ERRORS
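
For what it is worth, an exit status of 141 from a shell pipeline normally means the writer was killed by SIGPIPE (128 + 13), i.e. btrfs send died because btrfs receive on the other side of the pipe had already exited, here apparently after the truncate error on the read-only filesystem. A rough sketch of driving the same pipeline while capturing both exit codes separately (the three paths are shortened placeholders):

# Sketch: run `btrfs send -p parent child | btrfs receive target` and report
# both exit codes. In a shell, 141 = 128 + SIGPIPE(13), i.e. the sender was
# killed because the receiver closed the pipe early. Paths are placeholders.
import subprocess

parent = "/tmp/btrfssnapshot/root/root-2016-12-03"
child = "/tmp/btrfssnapshot/root/root-2016-12-04"
target = "/tmp/btrfsreceive/root/.part/"

send = subprocess.Popen(["btrfs", "send", "-p", parent, child],
                        stdout=subprocess.PIPE)
recv = subprocess.Popen(["btrfs", "receive", target], stdin=send.stdout)
send.stdout.close()              # so send sees SIGPIPE if receive exits early
recv_rc = recv.wait()
send_rc = send.wait()            # negative value = killed by that signal number

print("send exit:", send_rc, "receive exit:", recv_rc)
if send_rc != 0 or recv_rc != 0:
    raise SystemExit("send/receive pipeline failed")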


Checking filesystem on /dev/sda2
UUID: 44f1de7e-a65b-41ce-8ff4-20f7ed83e106
checking extents
ref mismatch on [62408097792 16384] extent item 0, found 1
Backref 62408097792 parent 1060 root 1060 not found in extent tree
backpointer mismatch on [62408097792 16384]
owner ref check failed [62408097792 16384]
ref mismatch on [77565509632 16384] extent item 0, found 1
Backref 77565509632 parent 1060 root 1060 not found in extent tree
backpointer mismatch on [77565509632 16384]
]zac[
Backref 77826916352 parent 1060 root 1060 not found in extent tree
backpointer mismatch on [77826916352 16384]
owner ref check failed [77826916352 16384]
ref mismatch on [77853933568 16384] extent item 0, found 1
Backref 77853933568 parent 1060 root 1060 not found in extent tree
backpointer mismatch on [77853933568 16384]
owner ref check failed [77853933568 16384]
checking free space cache
checking fs roots
warning line 3822
checking csums
checking root refs
found 135128678400 bytes used err is 0
total csum bytes: 126946572
total tree bytes: 5132206080
total fs tree bytes: 4744757248
total extent tree bytes: 240795648
btree space waste bytes: 914832832
file data blocks allocated: 3311786532864
referenced 703616266240



It is likely that mine is a special case.

But a special case can, with a code change somewhere else, become a problem
for many.

It's not nice to say, but it seems I have to hope that my problem becomes a
problem of many.

Meanwhile, I'll find my own workaround for a probably serious btrfs bug.


Thank you.

Gdb


> Hi,
>
> You can probably try to use "-v" to enable more verbose output.
> A quick look at the send / receive code suggests it is a little bit risky.
> It seems to lack specific error handling and, in most cases, returns the
> same error code. I think it might be helpful if, when a transfer succeeds, the
> command printed the transfer id, source / dest, and a specific "success"
> string.
> Such output could help the script figure out whether a transfer really
> succeeded.
>
> The code is relatively new to me; I did not see retry logic in the stream
> handling, please correct me if I am wrong about this. So, I am not quite
> sure about the transfer behavior if the system is subject to network issues
> under heavy workload, in which packet loss or connection issues are not rare.
>
> Since the test mentioned at the beginning deletes the snapshots after a
> transfer, while most users keep the intermediate snapshots even in cascading
> transfers, the current btrfs and cmds probably still work for regular
> users.
>
> Thanks,
> Xin
>
>


Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system during the snapshot receive

2016-12-25 Thread Xin Zhou
Hi,
For free software with open source code, that is quite good.
Most commercial products have very robust error handling in transport, to
guarantee no corruption due to transfer issues.
One interesting thing to investigate might be the btrfs send / receive result
under a disruptive network environment.
If the connection breaks in the middle of a transfer (at different phases, maybe),
see what the file system status could be.

Thanks,
Xin

 
 

Sent: Sunday, December 25, 2016 at 2:57 PM
From: Duncan <1i5t5.dun...@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system 
during the snapshot receive
Xin Zhou posted on Sat, 24 Dec 2016 21:15:40 +0100 as excerpted:

> The code is relatively new to me; I did not see retry logic in the stream
> handling, please correct me if I am wrong about this.
> So, I am not quite sure about the transfer behavior if the system is
> subject to network issues under heavy workload,
> in which packet loss or connection issues are not rare.

As you likely know, I'm not a dev, just a list regular and btrfs user
myself, and I'm not particularly familiar with send/receive as I don't
use it myself, but...

AFAIK, the send and receive sides are specifically designed to be
separate and to work with STDOUT/STDIN, so it's possible with STDOUT
redirection to "send" to a local file instead of directly to receive, and
then to replay that file on the other end by cat-ing it to receive.

As such, transfer behavior isn't really a factor at the btrfs layer,
since handling problems in the transfer layer is the responsibility of
whatever method the user is using to do that transfer, and the user is
presumed to use a transfer method with whatever reliability guarantees
they deem necessary. So network behavior isn't really a factor at the
btrfs level as that's the transfer layer and btrfs isn't worrying about
that, simply assuming it to have the necessary reliability.
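
A minimal sketch of that separation, assuming the stream is captured to a file and the user adds their own integrity check on top; the checksum step here is the user's transfer-layer responsibility, not something btrfs does:

# Sketch: capture a send stream to a file together with a checksum, so the
# transfer (scp, rsync, ...) can be verified before the stream is replayed
# with `btrfs receive -f <file> <target-dir>`. Paths are placeholders.
import hashlib
import subprocess

snapshot = "/mnt/snapshots/root/root-2016-12-19"     # a read-only snapshot
stream_path = "/var/tmp/root-2016-12-19.send"

sha = hashlib.sha256()
proc = subprocess.Popen(["btrfs", "send", snapshot], stdout=subprocess.PIPE)
with open(stream_path, "wb") as out:
    for chunk in iter(lambda: proc.stdout.read(1 << 20), b""):
        sha.update(chunk)
        out.write(chunk)
if proc.wait() != 0:
    raise SystemExit("btrfs send failed")

with open(stream_path + ".sha256", "w") as f:
    f.write(sha.hexdigest() + "\n")
# After copying both files to the destination and re-checking the digest,
# the receiver replays the stream with: btrfs receive -f <file> <target-dir>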

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman



Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system during the snapshot receive

2016-12-23 Thread Xin Zhou

Hi,

Would you like to show the "btrfs send/receive" commands the script is using,
including all the parameters,
and how the script waits for completion of a transfer?

From the beginning of the thread, it seems the transfer tests are going
through different network environments.
 
Thanks,
Xin

Sent: Friday, December 23, 2016 at 9:48 AM
From: b...@adria.it
To: "Xin Zhou" <xin.z...@gmx.com>
Cc: "Btrfs BTRFS" <linux-btrfs@vger.kernel.org>
Subject: Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system 
during the snapshot receive
Yes.

It is through the btrfs-tools error messages that the script printed that I
realized the filesystem was corrupted.


P.S. The various messages that you see in the working examples of the script are
emitted directly by the btrfs tools.


Gdb

Xin Zhou <xin.z...@gmx.com>:

> Hi,
>
> Does the script check the transfer status, and is there a transfer returns an
> error code?
> Thanks,
> Xin
>  
>  
>
> Sent: Thursday, December 22, 2016 at 11:28 PM
> From: "Giuseppe Della Bianca" <b...@adria.it>
> To: "Btrfs BTRFS" <linux-btrfs@vger.kernel.org>
> Subject: Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system
> during the snapshot receive
> (synthetic resend)
>
> Hi.
>
> Is possible that there are transfers, cancellations and other, at the same
> time, but not in the same subvolume.
>
> My script checks that there are no transfers in progress on the same
> subvolume.
>
> Is possible that the same subvolume is mounted several times (temporary
> mount
> at the beginning, and unmount at the end, in my script).
>
>
> Thanks for all.
>
>
> P.S. Sorry for my bad English.
>
>
> Gdb
>
>
> In data mercoledì 21 dicembre 2016 23:14:44, Xin Zhou ha scritto:
> > Hi,
> > Racing condition can happen, if running multiple transfers to the same
> > destination. Would you like to tell how many transfers are the scripts
> > running at a time to a specific hdd?
> >
> > Thanks,
> > Xin
> >
> >
> > Sent: Wednesday, December 21, 2016 at 1:11 PM
> > From: "Chris Murphy" <li...@colorremedies.com>
> > To: No recipient address
> > Cc: "Giuseppe Della Bianca" <b...@adria.it>, "Xin Zhou"
> <xin.z...@gmx.com>,
> > "Btrfs BTRFS" <linux-btrfs@vger.kernel.org> Subject: Re: [CORRUPTION
> > FILESYSTEM] Corrupted and unrecoverable file system during the snapshot
> > receive
> > On Wed, Dec 21, 2016 at 2:09 PM, Chris Murphy <li...@colorremedies.com>
> wrote:
> > > What about CONFIG_BTRFS_FS_CHECK_INTEGRITY? And then using check_int
> > > mount option?
> >
> > This slows things down, and in that case it might avoid the problem if
> > it's the result of a race condition.
> >
> > --
> > Chris Murphy
>





This mail has been sent using Alpikom webmail system
http://www.alpikom.it
 


Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system during the snapshot receive

2016-12-23 Thread Xin Zhou
Hi,

Does the script check the transfer status, and is there a transfer that returns an
error code?
Thanks,
Xin
 
 

Sent: Thursday, December 22, 2016 at 11:28 PM
From: "Giuseppe Della Bianca" <b...@adria.it>
To: "Btrfs BTRFS" <linux-btrfs@vger.kernel.org>
Subject: Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system 
during the snapshot receive
(synthetic resend)

Hi.

It is possible that there are transfers, cancellations and other operations at the
same time, but not in the same subvolume.

My script checks that there are no transfers in progress on the same
subvolume.

It is possible that the same subvolume is mounted several times (temporary mount
at the beginning, and unmount at the end, in my script).


Thanks for all.


P.S. Sorry for my bad English.


Gdb


In data mercoledì 21 dicembre 2016 23:14:44, Xin Zhou ha scritto:
> Hi,
> Racing condition can happen, if running multiple transfers to the same
> destination. Would you like to tell how many transfers are the scripts
> running at a time to a specific hdd?
>
> Thanks,
> Xin
>
>
> Sent: Wednesday, December 21, 2016 at 1:11 PM
> From: "Chris Murphy" <li...@colorremedies.com>
> To: No recipient address
> Cc: "Giuseppe Della Bianca" <b...@adria.it>, "Xin Zhou" <xin.z...@gmx.com>,
> "Btrfs BTRFS" <linux-btrfs@vger.kernel.org> Subject: Re: [CORRUPTION
> FILESYSTEM] Corrupted and unrecoverable file system during the snapshot
> receive
> On Wed, Dec 21, 2016 at 2:09 PM, Chris Murphy <li...@colorremedies.com>
wrote:
> > What about CONFIG_BTRFS_FS_CHECK_INTEGRITY? And then using check_int
> > mount option?
>
> This slows things down, and in that case it might avoid the problem if
> it's the result of a race condition.
>
> --
> Chris Murphy



Re: btrfs_log2phys: cannot lookup extent mapping

2016-12-22 Thread Xin Zhou
Hi,
If the change of on-disk format between versions is precisely documented,
it is plausible to create a utility to convert the old volume to a new one,
trigger the workflow, upgrade the kernel, and boot up to mount the new
volume.
Currently, the btrfs wiki shows only partial content of the on-disk format.
Thanks,
Xin
 
 

Sent: Wednesday, December 21, 2016 at 6:50 AM
From: "David Hanke" 
To: linux-btrfs@vger.kernel.org
Subject: Re: btrfs_log2phys: cannot lookup extent mapping
Hi Duncan,

Thank you for your reply. If I've emailed the wrong list, please let me
know. What I hear you saying, in short, is that btrfs is not yet fully
stable but current 4.x versions may work better. I'm willing to upgrade,
but I'm told that the upgrade process may result in total failure, and
I'm not sure I can trust the contents of the volume either way. Given
that, it seems I must backup the backup, erase and start over. What
would you do?

Thank you,

David


On 12/20/16 17:24, Duncan wrote:
> David Hanke posted on Tue, 20 Dec 2016 09:52:25 -0600 as excerpted:
>
>> I've been using a btrfs-based volume for backups, but lately the
>> system's been filling the syslog with errors like "btrfs_log2phys:
>> cannot lookup extent mapping for 7129125486592" at the rate of hundreds
>> per second. (Please see output below for more details.) Despite the
>> errors, the files I've looked at appear to be written and read
>> successfully.
>>
>> I'm wondering if the contents of the volume are trustworthy and whether
>> this problem is resolvable without backing up, erasing and starting
>> over?
>>
>> Thank you!
>>
>> David
>>
>>
>> # uname -a
>> Linux backup2 3.0.101.RNx86_64.3 #1 SMP Wed Apr 1 16:02:14 PDT 2015
>> x86_64 GNU/Linux
>>
>> # btrfs --version
>> Btrfs v3.17.3
> FWIW...
>
> [TL;DR: see the four bottom line choices, at the bottom.]
>
> This is the upstream btrfs development and discussion list for a
> filesystem that's still stabilizing (that is, not fully stable and
> mature) and that remains under heavy development and bug fixing. As
> such, list focus is heavily forward looking, with an extremely strong
> recommendation to use current kernels (and to a lesser extent btrfs
> userspace) if you're going to be running btrfs, as these have all the
> latest bugfixes.
>
> Put a different way, the general view and strong recommendation of the
> list is that because btrfs is still under heavy development, with bug
> fixes, some more major than others, every kernel cycle, while we
> recognize that choosing to run old and stale^H^Hble kernels and userspace
> is a legitimate choice on its own, that choice of stability over support
> for the latest and greatest, is viewed as incompatible with choosing to
> run a still under heavy development filesystem. Choosing one OR the
> other is strongly recommended.
>
> For list purposes, we recommend and best support the last two kernel
> release series in two tracks, LTS/long-term-stable, or current release
> track. On the LTS track, that's the LTS 4.4 and 4.1 series. On the
> current track, 4.9 is the latest release, so 4.9 and 4.8 are best
> supported.
>
> Meanwhile, it's worth keeping in mind that the experimental label and
> accompanying extremely strong "eat your babies" level warnings weren't
> peeled off until IIRC 3.12 or so, meaning anything before that is not
> only ancient history in list terms, but also still labeled as "eat your
> babies" level experimental. Why anyone choosing to run an ancient eat-
> your-babies level experimental version of a filesystem that's now rather
> more stable and mature, tho not yet fully stabilized, is beyond me. If
> they're interested in newer filesystems, running newer and less buggy
> versions is reasonable; if they're interested in years-stale level of
> stability, then running such filesystems, especially when still labeled
> eat-your-babies level experimental back then, seems an extremely odd
> choice indeed.
>
> Of course, on-list we do recognize that various distros did and do offer
> support at some level for older than list-recommended version btrfs, in
> part because they backport fixes from newer versions. However, because
> we're forward development focused we don't track what patches these
> distros may or may not have backported and thus aren't in a good position
> to provide good support for them. Instead, users choosing to use such
> kernels are generally asked to choose between upgrading to something
> reasonably supportable on-list if they wish to go that route, or referred
> back to their distros for the support they're in a far better position to
> offer, since they know what they've backported and what they haven't,
> while we don't.
>
> As for btrfs userspace, the way btrfs works, during normal runtime,
> userspace primarily calls the kernel to do the real work, so userspace
> version isn't as big a deal unless you're trying to use a feature only
> supported by newer versions, except that if it's /too/ old, the 

Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system during the snapshot receive

2016-12-21 Thread Xin Zhou
Hi,
A race condition can happen if running multiple transfers to the same
destination.
Would you like to tell how many transfers the scripts are running at a time to
a specific hdd?

Thanks,
Xin
 

Sent: Wednesday, December 21, 2016 at 1:11 PM
From: "Chris Murphy" <li...@colorremedies.com>
To: No recipient address
Cc: "Giuseppe Della Bianca" <b...@adria.it>, "Xin Zhou" <xin.z...@gmx.com>, 
"Btrfs BTRFS" <linux-btrfs@vger.kernel.org>
Subject: Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system 
during the snapshot receive
On Wed, Dec 21, 2016 at 2:09 PM, Chris Murphy <li...@colorremedies.com> wrote:
> What about CONFIG_BTRFS_FS_CHECK_INTEGRITY? And then using check_int
> mount option?

This slows things down, and in that case it might avoid the problem if
it's the result of a race condition.

--
Chris Murphy


Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system during the snapshot receive

2016-12-20 Thread Xin Zhou
Hi,

The system seems to be running some customized scripts that continuously back up data
from an NVMe drive to HDDs.
If the 3 HDD backup storages have the same btrfs config, and there is a bug
in the btrfs code,
they should all fail after the same operation sequence.

Otherwise, probably one of the HDDs has an issue, or there is a bug in a
layer below btrfs.

For the customized script, it might be helpful to check the file system
consistency after each transfer.
That might be useful to figure out which step generates a corruption, and whether
there is error propagation.

Regards,
Xin
 
 

Sent: Monday, December 19, 2016 at 10:55 AM
From: "Giuseppe Della Bianca" <b...@adria.it>
To: "Xin Zhou" <xin.z...@gmx.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system 
during the snapshot receive
a concrete example


SNAPSHOT

/dev/nvme0n1p2 on /tmp/tmp.X3vU6dLLVI type btrfs 
(rw,relatime,ssd,space_cache,subvolid=5,subvol=/)

btrfsManage SNAPSHOT /

(2016-12-19 19:44:00) Start btrfsManage
. . . Start managing SNAPSHOT ' / ' filesystem ' root ' snapshot

In ' btrfssnapshot ' latest source snapshot ' root-2016-12-18_15:10:01.40 '
. . . date ' 2016-12-18_15:10:01 ' number ' 40 '

Creation ' root-2016-12-19_19:44:00.part ' snapshot from ' root ' subvolume
. . . Create a readonly snapshot of '/tmp/tmp.X3vU6dLLVI/root' in 
'/tmp/tmp.X3vU6dLLVI/btrfssnapshot/root/root-2016-12-19_19:44:00.part'

Renaming ' root-2016-12-19_19:44:00.part ' into ' root-2016-12-19_19:44:00.41 ' 
snapshot

Source snapshot list of ' root ' subvolume
. . . btrfssnapshot/root/root-2016-08-28-12-35-01.1
]zac[
. . . btrfssnapshot/root/root-2016-12-19_19:44:00.41

(2016-12-19 19:44:05) End btrfsManage
. . . End managing SNAPSHOT ' / ' filesystem ' root ' snapshot
CORRECTLY



SEND e RECEIVE

/dev/nvme0n1p2 on /tmp/tmp.o78czE0Bo6 type btrfs 
(rw,relatime,ssd,space_cache,subvolid=5,subvol=/)
/dev/sda2 on /tmp/tmp.XcwqQCKq09 type btrfs 
(rw,relatime,space_cache,subvolid=5,subvol=/)

btrfsManage SEND / /dev/sda2

(2016-12-19 19:47:24) Start btrfsManage
. . . Start managing SEND ' / ' filesystem ' root ' snapshot in ' /dev/sda2 '

Sending ' root-2016-12-19_19:44:00.41 ' source snapshot to ' btrfsreceive ' 
subvolume
. . . btrfs send -p 
/tmp/tmp.o78czE0Bo6/btrfssnapshot/root/root-2016-12-18_15:10:01.40 
/tmp/tmp.o78czE0Bo6/btrfssnapshot/root/root-2016-12-19_19:44:00.41 | btrfs 
receive /tmp/tmp.XcwqQCKq09/btrfsreceive/root/.part/
. . . At subvol 
/tmp/tmp.o78czE0Bo6/btrfssnapshot/root/root-2016-12-19_19:44:00.41
. . . At snapshot root-2016-12-19_19:44:00.41

Creation ' root-2016-12-19_19:44:00.41 ' snapshot from ' 
.part/root-2016-12-19_19:44:00.41 ' subvolume
. . . Create a readonly snapshot of 
'/tmp/tmp.XcwqQCKq09/btrfsreceive/root/.part/root-2016-12-19_19:44:00.41' in 
'/tmp/tmp.XcwqQCKq09/btrfsreceive/root/root-2016-12-19_19:44:00.41'
. . . Delete subvolume (commit): 
'/tmp/tmp.XcwqQCKq09/btrfsreceive/root/.part/root-2016-12-19_19:44:00.41'

Snapshot list in ' /dev/sda2 ' device
. . . btrfsreceive/data_backup/data_backup-2016-12-17_12:07:00.1
. . . btrfsreceive/data_storage/data_storage-2016-12-10_17:05:51.1
. . . btrfsreceive/root/root-2016-08-28-12-35-01.1
]zac[
. . . btrfsreceive/root/root-2016-12-19_19:44:00.41

(2016-12-19 19:48:37) End btrfsManage
. . . End managing SEND ' / ' filesystem ' root ' snapshot in ' /dev/sda2 '
CORRECTLY



> Hi Giuseppe,
>
> Would you like to tell some details about:
> 1. the XYZ snapshot was taken from which subvolume
> 2. where the base (initial) snapshot is stored
> 3. The 3 partitions receives the same snapshot, are they in the same btrfs
> configuration and subvol structure?
>
> Also, would you send the link reports "two files unreadable error" post
> mentioned in step 2? Hope can see the message and figure out if the issue
> first comes from sender or receiver side.
>
> Thanks,
> Xin
>
>
 


Re: Help please: BTRFS fs crashed due to bad removal of USB drive, no help from recovery procedures

2016-12-19 Thread Xin Zhou
Hi Jari,

The message shows:
> [ 135.446260] BTRFS error (device sdb1): superblock contains fatal errors
 
So according to this info, before trying to run the repair / rescue procedures, would
you like to show the status of superblock copies 0, 1 and 2?

Regards,
Xin
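
For reference, a small sketch of dumping all three superblock copies; this assumes a btrfs-progs new enough to provide btrfs inspect-internal dump-super (older versions shipped the equivalent standalone btrfs-show-super tool), and uses the device path from the report:

# Sketch: print the three superblock copies of the device so their
# generation/flags can be compared. Assumes `btrfs inspect-internal
# dump-super` is available (older btrfs-progs used btrfs-show-super).
import subprocess

device = "/dev/sdb1"          # the device from the error report

for copy in (0, 1, 2):
    print("=== superblock copy %d ===" % copy)
    subprocess.run(
        ["btrfs", "inspect-internal", "dump-super", "-s", str(copy), device],
        check=False,          # a damaged copy may make the tool exit non-zero
    )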
 
 

Sent: Monday, December 19, 2016 at 2:32 AM
From: "Jari Seppälä" <lihamakaroonilaati...@gmail.com>
To: linux-btrfs@vger.kernel.org
Cc: "Xin Zhou" <xin.z...@gmx.com>
Subject: Re: Help please: BTRFS fs crashed due to bad removal of USB drive, no 
help from recovery procedures
Xin Zhou <xin.z...@gmx.com> kirjoitti 17.12.2016 kello 22.27:
>
> Hi Jari,
>
> Similar with other file system, btrfs has copies of super blocks.
> Try to run "man btrfs check", "man btrfs rescue" and related commands for 
> more details.
> Regards,
> Xin

Hi Xin,

I did follow all the recovery procedures from the man and wiki pages. The tools do not help,
as they think there is no BTRFS fs anymore. However, if I try to reformat the
device I get:

btrfs-progs v4.4
See http://btrfs.wiki.kernel.org for more information.
/dev/sdb1 appears to contain an existing filesystem (btrfs).

So, the recovery tools seem to think there is no btrfs filesystem. Mkfs seems to
think there is.

What I have tried:
btrfsck /dev/sdb1
mount -t btrfs -o ro /dev/sdb1 /mnt/share/
mount -t btrfs -o ro,recovery /dev/sdb1 /mnt/share/
mount -t btrfs -o roootflags=recovery,nospace_cache /dev/sdb1 /mnt/share/
mount -t btrfs -o rootflags=recovery,nospace_cache /dev/sdb1 /mnt/share/
mount -t btrfs -o rootflags=recovery,nospace_cache,clear_cache /dev/sdb1 
/mnt/share/
mount -t btrfs -o ro,rootflags=recovery,nospace_cache,clear_cache /dev/sdb1 
/mnt/share/
btrfs restore /dev/sdb1 /target/device
btrfs rescue zero-log /dev/sdb1
btrfsck --init-csum-tree /dev/sdb1
btrfsck --fix-crc /dev/sdb1
btrfsck --check-data-csum /dev/sdb1
btrfs rescue chunk-recover /dev/sdb1
btrfs rescue super-recover /dev/sdb1
btrfs rescue zero-log /dev/sdb1

No help whatsoever.

Jari

>
>
>
> Sent: Saturday, December 17, 2016 at 2:06 AM
> From: "Jari Seppälä" <lihamakaroonilaati...@gmail.com>
> To: linux-btrfs@vger.kernel.org
> Subject: Help please: BTRFS fs crashed due to bad removal of USB drive, no 
> help from recovery procedures
> Syslog tells:
> [ 135.446222] BTRFS error (device sdb1): system chunk array too small 0 < 97
> [ 135.446260] BTRFS error (device sdb1): superblock contains fatal errors
> [ 135.462544] BTRFS error (device sdb1): open_ctree failed
>
> What have been done:
> * All "btrfs rescue" options
>
> Info on system
> * fs on external SSD via USB
> * kernel 4.9.0 (tried with 4.8.13)
> * btrfs-tools 4.4
> * Mythbuntu (Ubuntu) 16.04.1 LTS with latest fixes 2012-12-16
>
> Any help appreciated. Around 300G of TV recordings on the drive, which of 
> course will eventually come as replays.
>
> Jari
> --
> *** Jari Seppälä
>

--
*** Jari Seppälä
 


Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system during the snapshot receive

2016-12-18 Thread Xin Zhou

Hi Giuseppe,

Would you like to tell some details about:
1. which subvolume the XYZ snapshot was taken from
2. where the base (initial) snapshot is stored
3. the 3 partitions receiving the same snapshot: are they in the same btrfs
configuration and subvol structure?

Also, would you send the link to the "two files unreadable error" post
mentioned in step 2?
Hopefully we can see the message and figure out whether the issue first comes from the
sender or the receiver side.

Thanks,
Xin
 

Sent: Sunday, December 18, 2016 at 11:59 AM
From: "Giuseppe Della Bianca" 
To: linux-btrfs@vger.kernel.org
Subject: Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system 
during the snapshot receive
> Same problem, this time on a local subvolume.
>
> kernel-4.8.8-100.fc23.x86_64
>
> btrfs-progs v4.8.5
]zac[

I had three filesystem corruptions.

The point at which the problem appeared is similar in all three cases.

Subvolume structure and operations sequence:

btrfsreceive/
btrfsreceive/root/
btrfsreceive/root/.part/

1) Sending XYZ differential snapshot in to ' btrfsreceive/root/.part/ '.
2) Create snapshot from ' btrfsreceive/root/.part/XYZ ' to ' btrfsreceive/root
/XYZ '.
3) Delete snapshot ' btrfsreceive/root/.part/XYZ '.

Always in step 2) I had the "two files unreadable" error (see previous posts), and
one "already existing object" error (see below).

All three times I had to re-create the various partitions from scratch (on
different disks and systems).

Can I help you, in some way, to find the problem?

Or is it useless to continue reporting it?



dic 18 18:29:58 exnetold.gdb.it kernel: [ cut here ]
dic 18 18:29:58 exnetold.gdb.it kernel: WARNING: CPU: 1 PID: 4325 at
fs/btrfs/extent-tree.c:2960 btrfs_run_delayed_refs+0x283/0x2b0 [btrfs]
dic 18 18:29:58 exnetold.gdb.it kernel: BTRFS: Transaction aborted (error -17)
dic 18 18:29:58 exnetold.gdb.it kernel: Modules linked in: fuse xt_CHECKSUM
ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns
nf_conntrack_br
dic 18 18:29:58 exnetold.gdb.it kernel: soundcore acpi_cpufreq tpm_tis
tpm_tis_core tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc ata_generic
nouveau vide
dic 18 18:29:58 exnetold.gdb.it kernel: CPU: 1 PID: 4325 Comm: umount Tainted:
G W 4.8.8-100.fc23.x86_64 #1
dic 18 18:29:58 exnetold.gdb.it kernel: Hardware name: System manufacturer
System Product Name/M2N, BIOS 0902 02/16/2009
dic 18 18:29:58 exnetold.gdb.it kernel: 0286 dd260fac
8ffa0d25bb60 bc3e493e
dic 18 18:29:58 exnetold.gdb.it kernel: 8ffa0d25bbb0 
8ffa0d25bba0 bc0a0ecb
dic 18 18:29:58 exnetold.gdb.it kernel: 0b900049 8ff9e61b40a0
8ffa2da77800 
dic 18 18:29:58 exnetold.gdb.it kernel: Call Trace:
dic 18 18:29:58 exnetold.gdb.it kernel: [] 
dump_stack+0x63/0x85
dic 18 18:29:58 exnetold.gdb.it kernel: [] __warn+0xcb/0xf0
dic 18 18:29:58 exnetold.gdb.it kernel: []
warn_slowpath_fmt+0x5f/0x80
dic 18 18:29:58 exnetold.gdb.it kernel: []
btrfs_run_delayed_refs+0x283/0x2b0 [btrfs]
dic 18 18:29:58 exnetold.gdb.it kernel: [] ?
btrfs_cow_block+0x10c/0x1e0 [btrfs]
dic 18 18:29:58 exnetold.gdb.it kernel: []
commit_cowonly_roots+0xae/0x2e0 [btrfs]
dic 18 18:29:58 exnetold.gdb.it kernel: [] ?
btrfs_run_delayed_refs+0x206/0x2b0 [btrfs]
dic 18 18:29:58 exnetold.gdb.it kernel: [] ?
btrfs_qgroup_account_extents+0x84/0x180 [btrfs]
dic 18 18:29:58 exnetold.gdb.it kernel: []
btrfs_commit_transaction+0x547/0xa40 [btrfs]
dic 18 18:29:58 exnetold.gdb.it kernel: []
btrfs_commit_super+0x8f/0xa0 [btrfs]
dic 18 18:29:58 exnetold.gdb.it kernel: []
close_ctree+0x2db/0x380 [btrfs]
dic 18 18:29:58 exnetold.gdb.it kernel: [] ?
evict_inodes+0x15a/0x180
dic 18 18:29:58 exnetold.gdb.it kernel: []
btrfs_put_super+0x19/0x20 [btrfs]
dic 18 18:29:58 exnetold.gdb.it kernel: []
generic_shutdown_super+0x6f/0xf0
dic 18 18:29:58 exnetold.gdb.it kernel: []
kill_anon_super+0x12/0x20
dic 18 18:29:58 exnetold.gdb.it kernel: []
btrfs_kill_super+0x18/0x110 [btrfs]
dic 18 18:29:58 exnetold.gdb.it kernel: []
deactivate_locked_super+0x43/0x70
dic 18 18:29:58 exnetold.gdb.it kernel: []
deactivate_super+0x5c/0x60
dic 18 18:29:58 exnetold.gdb.it kernel: []
cleanup_mnt+0x3f/0x90
dic 18 18:29:58 exnetold.gdb.it kernel: []
__cleanup_mnt+0x12/0x20
dic 18 18:29:58 exnetold.gdb.it kernel: []
task_work_run+0x7e/0xa0
dic 18 18:29:58 exnetold.gdb.it kernel: []
exit_to_usermode_loop+0xc2/0xd0
dic 18 18:29:58 exnetold.gdb.it kernel: []
syscall_return_slowpath+0xa1/0xb0
dic 18 18:29:58 exnetold.gdb.it kernel: []
entry_SYSCALL_64_fastpath+0xa2/0xa4
dic 18 18:29:58 exnetold.gdb.it kernel: ---[ end trace f7eb2e818f727168 ]---
dic 18 18:29:58 exnetold.gdb.it kernel: BTRFS: error (device sda3) in
btrfs_run_delayed_refs:2960: errno=-17 Object already exists
dic 18 18:29:58 exnetold.gdb.it kernel: BTRFS info (device sda3): forced
readonly
dic 18 18:29:58 exnetold.gdb.it kernel: BTRFS warning (device sda3): Skipping
commit of aborted 

Re: OOM: Better, but still there on

2016-12-17 Thread Xin Zhou
Hi,
The system is supposed to have a special memory reservation for coredump and other
debug info when encountering a panic;
the size seems configurable.
Thanks,
Xin
 
 

Sent: Saturday, December 17, 2016 at 6:44 AM
From: "Tetsuo Handa" 
To: "Nils Holland" , "Michal Hocko" 
Cc: linux-ker...@vger.kernel.org, linux...@kvack.org, "Chris Mason" 
, "David Sterba" , linux-btrfs@vger.kernel.org
Subject: Re: OOM: Better, but still there on
On 2016/12/17 21:59, Nils Holland wrote:
> On Sat, Dec 17, 2016 at 01:02:03AM +0100, Michal Hocko wrote:
>> mount -t tracefs none /debug/trace
>> echo 1 > /debug/trace/events/vmscan/enable
>> cat /debug/trace/trace_pipe > trace.log
>>
>> should help
>> [...]
>
> No problem! I enabled writing the trace data to a file and then tried
> to trigger another OOM situation. That worked, this time without a
> complete kernel panic, but with only my processes being killed and the
> system becoming unresponsive. When that happened, I let it run for
> another minute or two so that in case it was still logging something
> to the trace file, it could continue to do so some time longer. Then I
> rebooted with the only thing that still worked, i.e. by means of magic
> SysRequest.

Under OOM situation, writing to a file on disk unlikely works. Maybe
logging via network ( "cat /debug/trace/trace_pipe > /dev/udp/$ip/$port"
if your are using bash) works better. (I wish we can do it from kernel
so that /bin/cat is not disturbed by delays due to page fault.)

If you can configure netconsole for logging OOM killer messages and
UDP socket for logging trace_pipe messages, udplogger at
https://osdn.net/projects/akari/scm/svn/tree/head/branches/udplogger/
might fit for logging both output with timestamp into a single file.
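
If udplogger is not at hand, a tiny stand-in that just timestamps whatever arrives over UDP could look like the sketch below; the port number and output file name are arbitrary choices:

# Minimal stand-in for a UDP log collector: timestamp each datagram
# (e.g. lines piped to /dev/udp/$ip/$port) and append it to one file.
# The port number and output path are arbitrary.
import datetime
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 6666))

with open("trace-udp.log", "a", buffering=1) as log:
    while True:
        data, addr = sock.recvfrom(65535)
        stamp = datetime.datetime.now().isoformat()
        text = data.decode(errors="replace")
        log.write("%s %s %s" % (stamp, addr[0], text))
        if not text.endswith("\n"):
            log.write("\n")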



Re: Help please: BTRFS fs crashed due to bad removal of USB drive, no help from recovery procedures

2016-12-17 Thread Xin Zhou


Hi Jari,
 
Similar with other file system, btrfs has copies of super blocks.
Try to run "man btrfs check", "man btrfs rescue" and related commands for more 
details.
Regards,
Xin
 
 

Sent: Saturday, December 17, 2016 at 2:06 AM
From: "Jari Seppälä" 
To: linux-btrfs@vger.kernel.org
Subject: Help please: BTRFS fs crashed due to bad removal of USB drive, no help 
from recovery procedures
Syslog tells:
[ 135.446222] BTRFS error (device sdb1): system chunk array too small 0 < 97
[ 135.446260] BTRFS error (device sdb1): superblock contains fatal errors
[ 135.462544] BTRFS error (device sdb1): open_ctree failed

What have been done:
* All "btrfs rescue" options

Info on system
* fs on external SSD via USB
* kernel 4.9.0 (tried with 4.8.13)
* btrfs-tools 4.4
* Mythbuntu (Ubuntu) 16.04.1 LTS with latest fixes 2012-12-16

Any help appreciated. Around 300G of TV recordings on the drive, which of 
course will eventually come as replays.

Jari
--
*** Jari Seppälä



Re: Server hangs when mount BTRFS filesystem.

2016-12-15 Thread Xin Zhou
Hi Кравцов,

From the log message, it seems dm-22 has been running out of space; probably some
checksums did not get committed to disk.
And when trying to repair, it reports checksums missing.


merge_reloc_roots:2426: errno=-28 No space left
Dec 15 00:05:47 OraCI2 kernel: BTRFS warning (device dm-22): Skipping
commit of aborted transaction.
Dec 15 00:05:47 OraCI2 kernel: BTRFS: error (device dm-22) in
cleanup_transaction:1854: errno=-28 No space left
Dec 15 00:05:57 OraCI2 kernel: pending csums is 34287616

...
ERROR: errors found in extent allocation tree or chunk allocation
Fixed 0 roots.
checking free space cache [.]
root 5 inode 28350 errors 1000, some csum missing
root 5 inode 28351 errors 1000, some csum missing

Thanks,
Xin
 
 

Sent: Thursday, December 15, 2016 at 12:58 AM
From: "Кравцов Роман Владимирович" 
To: linux-btrfs@vger.kernel.org
Subject: Server hangs when mount BTRFS filesystem.
Hello.

First, server is hangs when btrfs balance working (see logs below).
After server reset can't mount filesystem.

When trying to execute command

# mount -t btrfs /dev/OraCI2/pes.isuse_bp.stands
/var/lib/docker/db/pes.isuse_bp.stands/pes.isuse_bp.standby.base/

server hangs without any messages and log records.


# btrfs --version
btrfs-progs v4.8.3

# btrfs fi show /dev/mapper/OraCI2-pes.isuse_bp.stands
Label: 'pes.isuse_bp.stands' uuid: ada5d777-565b-48e7-87dc-c58c8ad13466
Total devices 1 FS bytes used 2.24TiB
devid 1 size 3.49TiB used 2.35TiB path
/dev/mapper/OraCI2-pes.isuse_bp.stands



# btrfsck --repair -p /dev/OraCI2/pes.isuse_bp.stands
enabling repair mode
Checking filesystem on /dev/OraCI2/pes.isuse_bp.stands
UUID: ada5d777-565b-48e7-87dc-c58c8ad13466
parent transid verify failed on 2651226128384 wanted 136007 found 136176
parent transid verify failed on 2651226128384 wanted 136007 found 136176
Ignoring transid failure
leaf parent key incorrect 2651226128384
bad block 2651226128384

ERROR: errors found in extent allocation tree or chunk allocation
Fixed 0 roots.
checking free space cache [.]
root 5 inode 28350 errors 1000, some csum missing
root 5 inode 28351 errors 1000, some csum missing
root 5 inode 28354 errors 1000, some csum missing
root 5 inode 28358 errors 1000, some csum missing
root 5 inode 28360 errors 1000, some csum missing
root 5 inode 28361 errors 1000, some csum missing
root 5 inode 28368 errors 1000, some csum missing
root 5 inode 28369 errors 1000, some csum missing
root 5 inode 28370 errors 1000, some csum missing
root 5 inode 28371 errors 1000, some csum missing
root 5 inode 28372 errors 1000, some csum missing
root 5 inode 28373 errors 1000, some csum missing
root 5 inode 28376 errors 1000, some csum missing
root 5 inode 28377 errors 1000, some csum missing
root 5 inode 28378 errors 1000, some csum missing
root 5 inode 28379 errors 1000, some csum missing
root 5 inode 28380 errors 1000, some csum missing
root 5 inode 28381 errors 1000, some csum missing
root 5 inode 28382 errors 1000, some csum missing
root 5 inode 28383 errors 1000, some csum missing
root 5 inode 28384 errors 1000, some csum missing
root 5 inode 28385 errors 1000, some csum missing
root 5 inode 28386 errors 1000, some csum missing
root 5 inode 28387 errors 1000, some csum missing
root 5 inode 28388 errors 1000, some csum missing
root 5 inode 28389 errors 1000, some csum missing
root 5 inode 28390 errors 1000, some csum missing
root 5 inode 28391 errors 1000, some csum missing
root 5 inode 28392 errors 1000, some csum missing
root 5 inode 28393 errors 1000, some csum missing
root 5 inode 28394 errors 1000, some csum missing
root 5 inode 28395 errors 1000, some csum missing
root 5 inode 28396 errors 1000, some csum missing
root 5 inode 55108 errors 1000, some csum missing
root 5 inode 55313 errors 1000, some csum missing
root 5 inode 55314 errors 1000, some csum missing
root 5 inode 55315 errors 1000, some csum missing
root 5 inode 55316 errors 1000, some csum missing
root 5 inode 55317 errors 1000, some csum missing
root 5 inode 55318 errors 1000, some csum missing

checking csums
checking root refs
Recowing metadata block 2651226128384
found 2462630760448 bytes used err is 0
total csum bytes: 2398866488
total tree bytes: 5910593536
total fs tree bytes: 1679392768
total extent tree bytes: 1436450816
btree space waste bytes: 887715010
file data blocks allocated: 459312458981376
referenced 2199769403392
extent buffer leak: start 2651226128384 len 16384


# cat /var/log/messages | grep 'Dec 15 00'
Dec 15 00:02:35 OraCI2 kernel: BTRFS info (device dm-22): found 41156
extents
Dec 15 00:02:35 OraCI2 kernel: BTRFS info (device dm-22): relocating
block group 2568411414528 flags 1
Dec 15 00:02:37 OraCI2 kernel: BTRFS info (device dm-22): found 34939
extents
Dec 15 00:05:47 OraCI2 kernel: use_block_rsv: 20 callbacks suppressed
Dec 15 00:05:47 OraCI2 kernel: [ cut here ]
Dec 15 00:05:47 OraCI2 kernel: WARNING: CPU: 35 PID: 30215 at
fs/btrfs/extent-tree.c:8321 

Re: page allocation stall in kernel 4.9 when copying files from one btrfs hdd to another

2016-12-14 Thread Xin Zhou
Hi,

There is a large amount of dirty data, probably unable to be committed to disk.
And this seems to happen when copying from 7200 rpm to 5600 rpm disks, according
to the previous post.

Probably the I/Os are buffered and pending, unable to finish in time.
It might be helpful to know whether this only happens for specific types of 5600 rpm
disks.

And are these disks in RAID groups? Thanks.
Xin
 
 

Sent: Wednesday, December 14, 2016 at 3:38 AM
From: admin 
To: "Michal Hocko" 
Cc: linux-btrfs@vger.kernel.org, linux-ker...@vger.kernel.org, "David Sterba" 
, "Chris Mason" 
Subject: Re: page allocation stall in kernel 4.9 when copying files from one 
btrfs hdd to another
Hi,

I verified the log files and see no prior oom killer invocation. Unfortunately 
the machine has been rebooted since. Next time it happens, I will also look in 
dmesg.

Thanks,
David Arendt


Michal Hocko – Wed., 14. December 2016 11:31
> Btw. the stall should be preceded by the OOM killer invocation. Could
> you share the OOM report please. I am asking because such an OOM killer
> would be clearly pre-mature as per your meminfo. I am trying to change
> that code and seeing your numbers might help me.
>
> Thanks!
>
> On Wed 14-12-16 11:17:43, Michal Hocko wrote:
> > On Tue 13-12-16 18:11:01, David Arendt wrote:
> > > Hi,
> > >
> > > I receive the following page allocation stall while copying lots of
> > > large files from one btrfs hdd to another.
> > >
> > > Dec 13 13:04:29 server kernel: kworker/u16:8: page allocation stalls for 
> > > 12260ms, order:0, mode:0x2400840(GFP_NOFS|__GFP_NOFAIL)
> > > Dec 13 13:04:29 server kernel: CPU: 0 PID: 24959 Comm: kworker/u16:8 
> > > Tainted: P O 4.9.0 #1
> > [...]
> > > Dec 13 13:04:29 server kernel: Call Trace:
> > > Dec 13 13:04:29 server kernel: [] ? dump_stack+0x46/0x5d
> > > Dec 13 13:04:29 server kernel: [] ? 
> > > warn_alloc+0x111/0x130
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > __alloc_pages_nodemask+0xbe8/0xd30
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > pagecache_get_page+0xe4/0x230
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > alloc_extent_buffer+0x10b/0x400
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > btrfs_alloc_tree_block+0x125/0x560
> >
> > OK, so this is
> > find_or_create_page(mapping, index, GFP_NOFS|__GFP_NOFAIL)
> >
> > The main question is whether this really needs to be NOFS request...
> >
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > read_extent_buffer_pages+0x21f/0x280
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > __btrfs_cow_block+0x141/0x580
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > btrfs_cow_block+0x100/0x150
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > btrfs_search_slot+0x1e9/0x9c0
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > __set_extent_bit+0x512/0x550
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > lookup_inline_extent_backref+0xf5/0x5e0
> > > Dec 13 13:04:34 server kernel: [] ? 
> > > set_extent_bit+0x24/0x30
> > > Dec 13 13:04:34 server kernel: [] ? 
> > > update_block_group.isra.34+0x114/0x380
> > > Dec 13 13:04:34 server kernel: [] ? 
> > > __btrfs_free_extent.isra.35+0xf4/0xd20
> > > Dec 13 13:04:34 server kernel: [] ? 
> > > btrfs_merge_delayed_refs+0x61/0x5d0
> > > Dec 13 13:04:34 server kernel: [] ? 
> > > __btrfs_run_delayed_refs+0x902/0x10a0
> > > Dec 13 13:04:34 server kernel: [] ? 
> > > btrfs_run_delayed_refs+0x90/0x2a0
> > > Dec 13 13:04:34 server kernel: [] ? 
> > > delayed_ref_async_start+0x84/0xa0
> >
> > What would cause the reclaim recursion?
> >
> > > Dec 13 13:04:34 server kernel: Mem-Info:
> > > Dec 13 13:04:34 server kernel: active_anon:20 inactive_anon:34
> > > isolated_anon:0\x0a active_file:7370032 inactive_file:450105
> > > isolated_file:320\x0a unevictable:0 dirty:522748 writeback:189
> > > unstable:0\x0a slab_reclaimable:178255 slab_unreclaimable:124617\x0a
> > > mapped:4236 shmem:0 pagetables:1163 bounce:0\x0a free:38224 free_pcp:241
> > > free_cma:0
> >
> > This speaks for itself. There is a lot of dirty data, basically no
> > anonymous memory and GFP_NOFS cannot do much to reclaim obviously. This
> > is either a configuraion bug as somebody noted down the thread (setting
> > the dirty_ratio) or suboptimality of the btrfs code which might request
> > NOFS even though it is not strictly necessary. This would be more for
> > btrfs developers.
> > --
> > Michal Hocko
> > SUSE Labs
>
> --
> Michal Hocko
> SUSE Labs


Re: page allocation stall in kernel 4.9 when copying files from one btrfs hdd to another

2016-12-13 Thread Xin Zhou
Hi David,

It has the GFP_NOFS flag; according to its definition,
the issue might have happened during the initial disk I/O.

By the way, did you get a chance to dump the meminfo and run "top" before the
system hang?
It seems more info about the system's running state is needed to understand the issue.
Thanks.

Xin

 

Sent: Tuesday, December 13, 2016 at 9:11 AM
From: "David Arendt" 
To: linux-btrfs@vger.kernel.org, linux-ker...@vger.kernel.org
Subject: page allocation stall in kernel 4.9 when copying files from one btrfs 
hdd to another
Hi,

I receive the following page allocation stall while copying lots of
large files from one btrfs hdd to another.

Dec 13 13:04:29 server kernel: kworker/u16:8: page allocation stalls for
12260ms, order:0, mode:0x2400840(GFP_NOFS|__GFP_NOFAIL)
Dec 13 13:04:29 server kernel: CPU: 0 PID: 24959 Comm: kworker/u16:8
Tainted: P O 4.9.0 #1
Dec 13 13:04:29 server kernel: Hardware name: ASUS All Series/H87M-PRO,
BIOS 2102 10/28/2014
Dec 13 13:04:29 server kernel: Workqueue: btrfs-extent-refs
btrfs_extent_refs_helper
Dec 13 13:04:29 server kernel:  813f3a59
81976b28 c90011093750
Dec 13 13:04:29 server kernel: 81114fc1 02400840f39b6bc0
81976b28 c900110936f8
Dec 13 13:04:29 server kernel: 88070010 c90011093760
c90011093710 02400840
Dec 13 13:04:29 server kernel: Call Trace:
Dec 13 13:04:29 server kernel: [] ? dump_stack+0x46/0x5d
Dec 13 13:04:29 server kernel: [] ?
warn_alloc+0x111/0x130
Dec 13 13:04:33 server kernel: [] ?
__alloc_pages_nodemask+0xbe8/0xd30
Dec 13 13:04:33 server kernel: [] ?
pagecache_get_page+0xe4/0x230
Dec 13 13:04:33 server kernel: [] ?
alloc_extent_buffer+0x10b/0x400
Dec 13 13:04:33 server kernel: [] ?
btrfs_alloc_tree_block+0x125/0x560
Dec 13 13:04:33 server kernel: [] ?
read_extent_buffer_pages+0x21f/0x280
Dec 13 13:04:33 server kernel: [] ?
__btrfs_cow_block+0x141/0x580
Dec 13 13:04:33 server kernel: [] ?
btrfs_cow_block+0x100/0x150
Dec 13 13:04:33 server kernel: [] ?
btrfs_search_slot+0x1e9/0x9c0
Dec 13 13:04:33 server kernel: [] ?
__set_extent_bit+0x512/0x550
Dec 13 13:04:33 server kernel: [] ?
lookup_inline_extent_backref+0xf5/0x5e0
Dec 13 13:04:34 server kernel: [] ?
set_extent_bit+0x24/0x30
Dec 13 13:04:34 server kernel: [] ?
update_block_group.isra.34+0x114/0x380
Dec 13 13:04:34 server kernel: [] ?
__btrfs_free_extent.isra.35+0xf4/0xd20
Dec 13 13:04:34 server kernel: [] ?
btrfs_merge_delayed_refs+0x61/0x5d0
Dec 13 13:04:34 server kernel: [] ?
__btrfs_run_delayed_refs+0x902/0x10a0
Dec 13 13:04:34 server kernel: [] ?
btrfs_run_delayed_refs+0x90/0x2a0
Dec 13 13:04:34 server kernel: [] ?
delayed_ref_async_start+0x84/0xa0
Dec 13 13:04:34 server kernel: [] ?
process_one_work+0x11d/0x3b0
Dec 13 13:04:34 server kernel: [] ?
worker_thread+0x42/0x4b0
Dec 13 13:04:34 server kernel: [] ?
process_one_work+0x3b0/0x3b0
Dec 13 13:04:34 server kernel: [] ?
process_one_work+0x3b0/0x3b0
Dec 13 13:04:34 server kernel: [] ?
do_group_exit+0x2e/0xa0
Dec 13 13:04:34 server kernel: [] ? kthread+0xb9/0xd0
Dec 13 13:04:34 server kernel: [] ?
kthread_park+0x50/0x50
Dec 13 13:04:34 server kernel: [] ?
ret_from_fork+0x22/0x30
Dec 13 13:04:34 server kernel: Mem-Info:
Dec 13 13:04:34 server kernel: active_anon:20 inactive_anon:34
isolated_anon:0\x0a active_file:7370032 inactive_file:450105
isolated_file:320\x0a unevictable:0 dirty:522748 writeback:189
unstable:0\x0a slab_reclaimable:178255 slab_unreclaimable:124617\x0a
mapped:4236 shmem:0 pagetables:1163 bounce:0\x0a free:38224 free_pcp:241
free_cma:0
Dec 13 13:04:34 server kernel: Node 0 active_anon:80kB
inactive_anon:136kB active_file:29480128kB inactive_file:1800420kB
unevictable:0kB isolated(anon):0kB isolated(file):1280kB mapped:16944kB
dirty:2090992kB writeback:756kB shmem:0kB writeback_tmp:0kB unstable:0kB
pages_scanned:258821 all_unreclaimable? no
Dec 13 13:04:34 server kernel: DMA free:15868kB min:8kB low:20kB
high:32kB active_anon:0kB inactive_anon:0kB active_file:0kB
inactive_file:0kB unevictable:0kB writepending:0kB present:15976kB
managed:15892kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:24kB
kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB
free_cma:0kB
Dec 13 13:04:34 server kernel: lowmem_reserve[]: 0 3428 32019 32019
Dec 13 13:04:34 server kernel: DMA32 free:116800kB min:2448kB low:5956kB
high:9464kB active_anon:0kB inactive_anon:0kB active_file:3087928kB
inactive_file:191336kB unevictable:0kB writepending:221828kB
present:3590832kB managed:3513936kB mlocked:0kB slab_reclaimable:93252kB
slab_unreclaimable:20520kB kernel_stack:48kB pagetables:212kB bounce:0kB
free_pcp:4kB local_pcp:0kB free_cma:0kB
Dec 13 13:04:34 server kernel: lowmem_reserve[]: 0 0 0 0
Dec 13 13:04:34 server kernel: DMA: 1*4kB (U) 1*8kB (U) 1*16kB (U)
1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U)
1*2048kB (M) 3*4096kB (M) = 15868kB
Dec 13 13:04:34 server kernel: DMA32: 940*4kB (UME) 4006*8kB (UME)
3308*16kB (UME) 791*32kB 

Re: [PATCH] btrfs: fix hole read corruption for compressed inline extents

2016-12-11 Thread Xin Zhou
Hi Zygo,
Since the corruption happens after I/O and checksumming,
would it be possible to add some bug-catcher code in the code path for debug builds,
to help narrow down the issue?
Thanks,
Xin
 
 

Sent: Saturday, December 10, 2016 at 9:16 PM
From: "Zygo Blaxell" 
To: "Roman Mamedov" , "Filipe Manana" 
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: fix hole read corruption for compressed inline 
extents
Ping?

I know at least two people have read this patch, but it hasn't appeared in
the usual integration branches yet, and I've seen no actionable suggestion
to improve it. I've provided two non-overlapping rationales for it.
Is there something else you are looking for?

This patch is a fix for a simple data corruption bug. It (or some
equivalent fix for the same bug) should be on its way to all stable
kernels starting from 2.6.32.

Thanks

On Mon, Nov 28, 2016 at 05:27:10PM +0500, Roman Mamedov wrote:
> On Mon, 28 Nov 2016 00:03:12 -0500
> Zygo Blaxell  wrote:
>
> > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> > index 8e3a5a2..b1314d6 100644
> > --- a/fs/btrfs/inode.c
> > +++ b/fs/btrfs/inode.c
> > @@ -6803,6 +6803,12 @@ static noinline int uncompress_inline(struct 
> > btrfs_path *path,
> > max_size = min_t(unsigned long, PAGE_SIZE, max_size);
> > ret = btrfs_decompress(compress_type, tmp, page,
> > extent_offset, inline_size, max_size);
> > + WARN_ON(max_size > PAGE_SIZE);
> > + if (max_size < PAGE_SIZE) {
> > + char *map = kmap(page);
> > + memset(map + max_size, 0, PAGE_SIZE - max_size);
> > + kunmap(page);
> > + }
> > kfree(tmp);
> > return ret;
> > }
>
> Wasn't this already posted as:
>
> btrfs: fix silent data corruption while reading compressed inline extents
> https://patchwork.kernel.org/patch/9371971/
>
> but you don't indicate that's a V2 or something, and in fact the patch seems
> exactly the same, just the subject and commit message are entirely different.
> Quite confusing.
>
> --
> With respect,
> Roman


Re: btrfs-find-root duration?

2016-12-10 Thread Xin Zhou
Hi Markus, 

Some file systems automatically generate snapshots and create a hidden folder
for recovery,
in case the user accidentally deletes some files.

It seems btrfs also has an autosnap feature,
so if this option was enabled before the deletion,
or snapshots were manually generated on the volume, then it might be
possible to perform a fast recovery.

Regards,
Xin 

Sent: Saturday, December 10, 2016 at 4:12 PM
From: "Markus Binsteiner" 
To: linux-btrfs@vger.kernel.org
Subject: btrfs-find-root duration?
It seems I've accidentally deleted all files in my home directory,
which sits in its own btrfs partition (lvm on luks). Now I'm trying to
find the roots to be able to use btrfs restore later on.

btrfs-find-root seems to be taking ages though. I've run it like so:

btrfs-find-root /dev/mapper/think--big-home -o 5 > roots.txt

After 16 hours, there is still no output, but it's still running
utilizing 100% of one core. Is there any way to gauge how much longer
it'll take? Should there have been output already while it's running?

When I run it without redirecting stdout, I get:

$ btrfs-find-root /dev/mapper/think--big-home -o 5

Superblock doesn't contain generation info for root 5
Superblock doesn't contain the level info for root 5

When I omit the '-o 5', it says:

$ btrfs-find-root /dev/mapper/think--big-home

Superblock thinks the generation is 593818
Superblock thinks the level is 0

Is the latter the way to run it? Did that initially, but that didn't
return any results in a reasonable timeframe either.

The filesystem was created with Debian Jessie, but I'm using Ubuntu (
btrfs-progs v4.7.3 ) to try to restore the files at the moment.

Thanks!


Re: [PATCH 0/6] btrfs dax IO

2016-12-07 Thread Xin Zhou
Hi Liu,
 
From the patch, is the snapshot disabled by disabling COW in the mounting
path?
It seems the create_snapshot() in ioctl.c does not get changed.

I have experience with some similar systems but am a bit new to the btrfs code.
  
Thanks, 
Xin
 
 

Subject: [PATCH 0/6] btrfs dax IO
From: Liu Bo
Date: Wed, 7 Dec 2016 13:45:04 -0800
Cc: Chris Mason, Jan Kara, David Sterba
This is a prelimanary patch set to add dax support for btrfs, with
this we can do normal read/write to dax files and can mmap dax files
to userspace so that applications have the ability to access
persistent memory directly.

Please note that currently this is limited to nocow, i.e. all dax
inodes do not have COW behaviour.

COW:no
multiple device:no
clone/reflink:  no
snapshot:   no
compression:no
checksum:   no

Right now snapshot is disabled while mounting with -odax, but snapshot
can be created without -odax, and writing to a dax file in snapshot
will get -EIO.

Clone/reflink is dealt with as same as snapshot, -EIO will be returned
when writing to shared extents.

This has adopted the latest iomap framework for dax read/write
and dax mmap.

 