Re: Adventures in btrfs raid5 disk recovery

2016-06-27 Thread Chris Murphy
On Mon, Jun 27, 2016 at 5:21 AM, Austin S. Hemmelgarn wrote: > On 2016-06-25 12:44, Chris Murphy wrote: >> >> On Fri, Jun 24, 2016 at 12:19 PM, Austin S. Hemmelgarn >> wrote: >> >>> Well, the obvious major advantage that comes to mind for me to >>>

Re: Strange behavior when replacing device on BTRFS RAID 5 array.

2016-06-27 Thread Steven Haigh
On 28/06/16 03:46, Chris Murphy wrote: > On Mon, Jun 27, 2016 at 11:29 AM, Chris Murphy > wrote: > >> >> Next is to decide to what degree you want to salvage this volume and >> keep using Btrfs raid56 despite the risks > > Forgot to complete this thought. So if you get

Re: Kernel bug during RAID1 replace

2016-06-27 Thread Chris Murphy
On Mon, Jun 27, 2016 at 4:26 PM, Saint Germain wrote: >> > > Thanks for your help. > > Ok here is the log from the mounting, and including btrfs replace > (btrfs replace start -f /dev/sda1 /dev/sdd1 /home): > > BTRFS info (device sdb1): disk space caching is enabled > BTRFS

Re: Kernel bug during RAID1 replace

2016-06-27 Thread Chris Murphy
On Mon, Jun 27, 2016 at 5:03 PM, Saint Germain wrote: >> > > Ok thanks I will begin to make an image with dd. > Do you recommend to use sda or sdb ? Well at the moment you're kinda stuck. I'd leave them together and just get the data off the drive normally with cp -a (or

Re: Kernel bug during RAID1 replace

2016-06-27 Thread Chris Murphy
On Mon, Jun 27, 2016 at 6:00 PM, Chris Murphy wrote: > There are two options since your drives support SCT ERC. > > 1. > smartctl -l scterc,70,70 /dev/sdX ## done for both drives > > That will make sure the drive reports a read error in 7 seconds, well > under the

Re: [PATCH 05/14] Btrfs: warn_on for unaccounted spaces

2016-06-27 Thread Qu Wenruo
At 06/27/2016 09:03 PM, Chris Mason wrote: On 06/27/2016 12:47 AM, Qu Wenruo wrote: Hi Josef, Would you please move this patch to the first of the patchset? It's making bisect quite hard, as it will always stop at this patch, hard to check if it's a regression or existing bug. That's a

Re: Kernel bug during RAID1 replace

2016-06-27 Thread Chris Murphy
On Mon, Jun 27, 2016 at 5:06 PM, Saint Germain wrote: > On Mon, 27 Jun 2016 16:58:37 -0600, Chris Murphy > wrote : > >> On Mon, Jun 27, 2016 at 4:55 PM, Chris Murphy >> wrote: >> >> >> BTRFS info (device sdb1): dev_replace

[PATCH] btrfs: Fix leaking bytes_may_use after hitting EDQUOTA

2016-06-27 Thread Qu Wenruo
If one mounts btrfs with the enospc_debug mount option and hits qgroup limits in btrfs_check_data_free_space(), then at unmount time a kernel warning will be triggered along with a data space info dump. -- [ cut here ] WARNING: CPU: 0 PID: 3875 at fs/btrfs/extent-tree.c:9785

Re: Kernel bug during RAID1 replace

2016-06-27 Thread Chris Murphy
On Mon, Jun 27, 2016 at 6:49 PM, Saint Germain wrote: > > I've tried both options and launched a replace, but I got the same error > (replace is cancelled, kernel bug). > I will leave these options on and attempt a ddrescue on /dev/sda > to /dev/sdd. > Then I will disconnect

Re: Adventures in btrfs raid5 disk recovery

2016-06-27 Thread Chris Murphy
On Mon, Jun 27, 2016 at 3:57 PM, Zygo Blaxell wrote: > On Mon, Jun 27, 2016 at 10:17:04AM -0600, Chris Murphy wrote: > >> It just came up again in a thread over the weekend on linux-raid@. I'm >> going to ask while people are paying attention if a patch to change

Re: Kernel bug during RAID1 replace

2016-06-27 Thread Chris Murphy
On Mon, Jun 27, 2016 at 4:55 PM, Chris Murphy wrote: >> BTRFS info (device sdb1): dev_replace from /dev/sda1 (devid 1) to /dev/sdd1 >> started >> scrub_handle_errored_block: 166 callbacks suppressed >> BTRFS warning (device sdb1): checksum error at logical 93445255168

Re: Kernel bug during RAID1 replace

2016-06-27 Thread Saint Germain
On Mon, 27 Jun 2016 16:58:37 -0600, Chris Murphy wrote : > On Mon, Jun 27, 2016 at 4:55 PM, Chris Murphy > wrote: > > >> BTRFS info (device sdb1): dev_replace from /dev/sda1 (devid 1) > >> to /dev/sdd1 started scrub_handle_errored_block: 166

Re: Kernel bug during RAID1 replace

2016-06-27 Thread Saint Germain
On Mon, 27 Jun 2016 18:00:34 -0600, Chris Murphy wrote : > On Mon, Jun 27, 2016 at 5:06 PM, Saint Germain > wrote: > > On Mon, 27 Jun 2016 16:58:37 -0600, Chris Murphy > > wrote : > > > >> On Mon, Jun 27, 2016 at 4:55 PM,

Re: Kernel bug during RAID1 replace

2016-06-27 Thread Saint Germain
On Mon, 27 Jun 2016 15:42:42 -0600, Chris Murphy wrote : > On Mon, Jun 27, 2016 at 3:36 PM, Saint Germain > wrote: > > Hello, > > > > I am on Debian Jessie with a kernel from backports: > > 4.6.0-0.bpo.1-amd64 > > > > I am also using btrfs-tools

Re: Kernel bug during RAID1 replace

2016-06-27 Thread Saint Germain
On Mon, 27 Jun 2016 16:55:07 -0600, Chris Murphy wrote : > On Mon, Jun 27, 2016 at 4:26 PM, Saint Germain > wrote: > > >> > > > > Thanks for your help. > > > > Ok here is the log from the mounting, and including btrfs replace > > (btrfs replace

Re: Adventures in btrfs raid5 disk recovery

2016-06-27 Thread Zygo Blaxell
On Mon, Jun 27, 2016 at 04:30:23PM -0600, Chris Murphy wrote: > On Mon, Jun 27, 2016 at 3:57 PM, Zygo Blaxell > wrote: > > On Mon, Jun 27, 2016 at 10:17:04AM -0600, Chris Murphy wrote: > > If anything, I want the timeout to be shorter so that upper layers with > >

[PATCH] fstests: btrfs: Regression test for leaking data reserved space

2016-06-27 Thread Qu Wenruo
When btrfs hits EDQUOTA when reserving data space, it will leak already reserved data space. This test case will check it by using the more restrictive enospc_debug mount option to trigger a kernel warning at umount time. Signed-off-by: Qu Wenruo --- tests/btrfs/124 | 73

Re: [BUG] Btrfs scrub sometime recalculate wrong parity in raid5

2016-06-27 Thread Christoph Anton Mitterer
On Mon, 2016-06-27 at 07:35 +0300, Andrei Borzenkov wrote: > The problem is that current implementation of RAID56 puts exactly CoW > data at risk. I.e. writing new (copy of) data may suddenly make old > (copy of) data inaccessible, even though it had been safely committed > to > disk and is now in

Btrfs full balance command fails due to ENOSPC (bug 121071)

2016-06-27 Thread Francesco Turco
Note: I already filed bug 121071 but perhaps I should have written to this mailing list first. I get the ENOSPC error when running a btrfs full balance command for my root partition, even if it seems I have a lot of free/unallocated space. # btrfs filesystem show / Label: none uuid:

Re: Strange behavior when replacing device on BTRFS RAID 5 array.

2016-06-27 Thread Austin S. Hemmelgarn
On 2016-06-27 13:29, Chris Murphy wrote: On Sun, Jun 26, 2016 at 10:02 PM, Nick Austin wrote: On Sun, Jun 26, 2016 at 8:57 PM, Nick Austin wrote: sudo btrfs fi show /mnt/newdata Label: '/var/data' uuid: e4a2eb77-956e-447a-875e-4f6595a5d3ec

Re: Strange behavior when replacing device on BTRFS RAID 5 array.

2016-06-27 Thread Chris Murphy
On Mon, Jun 27, 2016 at 11:29 AM, Chris Murphy wrote: > > Next is to decide to what degree you want to salvage this volume and > keep using Btrfs raid56 despite the risks Forgot to complete this thought. So if you get a backup, and decide you want to fix it, I would see

Re: Bad hard drive - checksum verify failure forces readonly mount

2016-06-27 Thread Chris Murphy
On Mon, Jun 27, 2016 at 12:30 AM, Vasco Almeida wrote: > File system image available at (choose one link) > https://mega.nz/#!AkAEgKyB!RUa7G5xHIygWm0ALx5ZxQjjXNdFYa7lDRHJ_sW0bWLs > https://www.sendspace.com/file/i70cft > Should I file a bug report with that image dump

Re: Strange behavior when replacing device on BTRFS RAID 5 array.

2016-06-27 Thread Chris Murphy
On Sun, Jun 26, 2016 at 10:02 PM, Nick Austin wrote: > On Sun, Jun 26, 2016 at 8:57 PM, Nick Austin wrote: >> sudo btrfs fi show /mnt/newdata >> Label: '/var/data' uuid: e4a2eb77-956e-447a-875e-4f6595a5d3ec >> Total devices 4 FS bytes used

[RFC] Btrfs: add asynchronous compression support in zlib

2016-06-27 Thread Weigang Li
This patch introduces a change in zlib.c to use the new asynchronous compression API (acomp) proposed in cryptodev (work in progress): https://patchwork.kernel.org/patch/9163577/ Now BTRFS can offload the zlib (de)compression to a hardware accelerator engine if an acomp hardware driver is

Re: [PATCH] fstests: btrfs: Regression test for leaking data reserved space

2016-06-27 Thread Eryu Guan
On Tue, Jun 28, 2016 at 09:54:51AM +0800, Qu Wenruo wrote: > When btrfs hits EDQUOTA when reserving data space, it will leak already > reserved data space. > > This test case will check it by using the more restrictive enospc_debug mount > option to trigger a kernel warning at umount time. > >

Re: Adventures in btrfs raid5 disk recovery

2016-06-27 Thread Chris Murphy
On Mon, Jun 27, 2016 at 7:52 PM, Zygo Blaxell wrote: > On Mon, Jun 27, 2016 at 04:30:23PM -0600, Chris Murphy wrote: >> On Mon, Jun 27, 2016 at 3:57 PM, Zygo Blaxell >> wrote: >> > On Mon, Jun 27, 2016 at 10:17:04AM -0600, Chris

Re: Adventures in btrfs raid5 disk recovery

2016-06-27 Thread Zygo Blaxell
On Mon, Jun 27, 2016 at 08:39:21PM -0600, Chris Murphy wrote: > On Mon, Jun 27, 2016 at 7:52 PM, Zygo Blaxell > wrote: > > On Mon, Jun 27, 2016 at 04:30:23PM -0600, Chris Murphy wrote: > >> Btrfs does have something of a work around for when things get slow, > >>

Re: Btrfs full balance command fails due to ENOSPC (bug 121071)

2016-06-27 Thread Chris Murphy
On Mon, Jun 27, 2016 at 12:32 PM, Francesco Turco wrote: > On 2016-06-27 20:18, Chris Murphy wrote: >> If you can grab btrfs-debugfs from >> https://github.com/kdave/btrfs-progs/blob/master/btrfs-debugfs >> >> And then attach the output to the bug report it might be useful for

Re: Btrfs full balance command fails due to ENOSPC (bug 121071)

2016-06-27 Thread Francesco Turco
On 2016-06-27 20:18, Chris Murphy wrote: > If you can grab btrfs-debugfs from > https://github.com/kdave/btrfs-progs/blob/master/btrfs-debugfs > > And then attach the output to the bug report it might be useful for a > developer. But really your case is an odd duck, because there's fully > 14GiB

Re: Bug in 'btrfs filesystem du' ?

2016-06-27 Thread Henk Slager
On Mon, Jun 27, 2016 at 3:33 PM, M G Berberich wrote: > On Monday, 27 June, M G Berberich wrote: >> after a balance ‘btrfs filesystem du’ probably shows false data about >> shared data. > > Oh, I forgot: I have btrfs-progs v4.5.2 and kernel 4.6.2. With

Re: Btrfs full balance command fails due to ENOSPC (bug 121071)

2016-06-27 Thread Chris Murphy
On Mon, Jun 27, 2016 at 11:28 AM, Francesco Turco wrote: > Note: I already filed bug 121071 but perhaps I should have written to > this mailing list first. https://bugzilla.kernel.org/show_bug.cgi?id=121071 It's a good bug report. > Is there anything I can try? Should I

Re: [BUG] Btrfs scrub sometime recalculate wrong parity in raid5

2016-06-27 Thread Duncan
Steven Haigh posted on Mon, 27 Jun 2016 13:21:00 +1000 as excerpted: > I'd also recommend updates to the ArchLinux wiki - as for some reason I > always seem to end up there when searching for a certain topic... Not really btrfs related, but for people using popular search engines, at least,

[PATCH v2 4/4] btrfs/126,127,128: test feature ioctl and sysfs interfaces

2016-06-27 Thread jeffm
From: Jeff Mahoney This tests the exporting of feature information from the kernel via sysfs and ioctl. The first test checks whether the sysfs permissions are correct, if the information exported via sysfs matches what the ioctls are reporting, and if they both match the on-disk

Re: Adventures in btrfs raid5 disk recovery

2016-06-27 Thread Chris Murphy
For what it's worth I found btrfs-map-logical can produce mapping for raid5 (didn't test raid6) by specifying the extent block length. If that's omitted it only shows the device+mapping for the first strip. This example is a 3 disk raid5, with a 128KiB file all in a single extent. [root@f24s ~]#

[PATCH v2 1/4] btrfs/048: extend _filter_btrfs_prop_error to handle additional errors

2016-06-27 Thread jeffm
From: Jeff Mahoney btrfsprogs v4.5.3 changed the formatting of some error messages. This patch extends the filter for btrfs prop to handle those. Signed-off-by: Jeff Mahoney --- common/filter.btrfs | 10 +++--- tests/btrfs/048 | 6 --

[PATCH v2 3/4] btrfs/125: test sysfs exports of allocation and device membership info

2016-06-27 Thread jeffm
From: Jeff Mahoney This tests the sysfs publishing for btrfs allocation and device membership info under a number of different layouts, similar to the btrfs replace test. We test the allocation files only for existence and that they contain numerical values. We test the device

[PATCH v2 2/4] btrfs/124: test global metadata reservation reporting

2016-06-27 Thread jeffm
From: Jeff Mahoney Btrfs can now report the size of the global metadata reservation via ioctl and sysfs. This test confirms that we get sane results on an empty file system. Signed-off-by: Jeff Mahoney --- .gitignore | 1 + common/btrfs

[PATCH v2 0/4] btrfs feature testing + props fix

2016-06-27 Thread jeffm
From: Jeff Mahoney Hi all - Thanks, Eryu, for the review. The btrfs feature testing changes were a patchset I wrote three years ago, and it looks like significant cleanup has happened in the xfstests since then. I'm sorry for the level of the review you had to do for them, but

Re: Btrfs full balance command fails due to ENOSPC (bug 121071)

2016-06-27 Thread Henk Slager
On Mon, Jun 27, 2016 at 9:24 PM, Chris Murphy wrote: > On Mon, Jun 27, 2016 at 12:32 PM, Francesco Turco wrote: >> On 2016-06-27 20:18, Chris Murphy wrote: >>> If you can grab btrfs-debugfs from >>>

Re: Adventures in btrfs raid5 disk recovery

2016-06-27 Thread Henk Slager
On Mon, Jun 27, 2016 at 6:17 PM, Chris Murphy wrote: > On Mon, Jun 27, 2016 at 5:21 AM, Austin S. Hemmelgarn > wrote: >> On 2016-06-25 12:44, Chris Murphy wrote: >>> >>> On Fri, Jun 24, 2016 at 12:19 PM, Austin S. Hemmelgarn >>>

Re: Strange behavior when replacing device on BTRFS RAID 5 array.

2016-06-27 Thread Duncan
Nick Austin posted on Sun, 26 Jun 2016 20:57:32 -0700 as excerpted: > I have a 4 device BTRFS RAID 5 filesystem. > > One of the device members of this file system (sdr) had badblocks, so I > decided to replace it. While the others answered the direct question, there's something potentially

Kernel bug during RAID1 replace

2016-06-27 Thread Saint Germain
Hello, I am on Debian Jessie with a kernel from backports: 4.6.0-0.bpo.1-amd64 I am also using btrfs-tools 4.4.1-1.1~bpo8+1 When trying to replace a RAID1 drive (with btrfs replace start -f /dev/sda1 /dev/sdd1), the operation is cancelled after completing only 5%. I got this error in the

Re: Kernel bug during RAID1 replace

2016-06-27 Thread Chris Murphy
On Mon, Jun 27, 2016 at 3:36 PM, Saint Germain wrote: > Hello, > > I am on Debian Jessie with a kernel from backports: > 4.6.0-0.bpo.1-amd64 > > I am also using btrfs-tools 4.4.1-1.1~bpo8+1 > > When trying to replace a RAID1 drive (with btrfs replace start > -f /dev/sda1

Re: Bad hard drive - checksum verify failure forces readonly mount

2016-06-27 Thread Vasco Almeida
On Sun, 26-06-2016 at 13:54 -0600, Chris Murphy wrote: > On Sun, Jun 26, 2016 at 7:05 AM, Vasco Almeida > wrote: > > I have tried "btrfs check --repair /device" but that does not seem to > > do > > any good. > > http://paste.fedoraproject.org/384960/66945936/ > > It did fix

Re: [PATCH 2/4] fstests: btrfs/124: test global metadata reservation reporting

2016-06-27 Thread Eryu Guan
On Fri, Jun 24, 2016 at 11:08:32AM -0400, je...@suse.com wrote: > From: Jeff Mahoney > > Btrfs can now report the size of the global metadata reservation > via ioctl and sysfs. > > This test confirms that we get sane results on an empty file system. > > ENOTTY and missing

Re: Btrfs full balance command fails due to ENOSPC (bug 121071)

2016-06-27 Thread Hans van Kranenburg
Hi! On 06/27/2016 11:26 PM, Henk Slager wrote: btrfs-debug does not show metadata and system chunks; the balancing problem might come from those. This script does show all chunks: https://github.com/knorrie/btrfs-heatmap/blob/master/show_usage.py Since the existence of python-btrfs, it has

Re: Adventures in btrfs raid5 disk recovery

2016-06-27 Thread Zygo Blaxell
On Mon, Jun 27, 2016 at 10:17:04AM -0600, Chris Murphy wrote: > On Mon, Jun 27, 2016 at 5:21 AM, Austin S. Hemmelgarn > wrote: > > On 2016-06-25 12:44, Chris Murphy wrote: > >> On Fri, Jun 24, 2016 at 12:19 PM, Austin S. Hemmelgarn > >> wrote: > >> >

Bug in 'btrfs filesystem du' ?

2016-06-27 Thread M G Berberich
Hello, after a balance ‘btrfs filesystem du’ probably shows false data about shared data. To reproduce, create a (small) btrfs-filesystem, copy some data in a directory, then ‘cp -a --reflink’ the data. Now all data is shared and ‘btrfs fi du’ shows it correctly. In my case: Total Exclusive

Re: Rescue a single-device btrfs instance with zeroed tree root

2016-06-27 Thread Ivan Shapovalov
On 2016-06-21 at 20:23 +0300, Ivan Shapovalov wrote: > Hello, > > So this is another case of "I lost my partition and do not have > backups". More precisely, _this_ is the backup and it turned out to > be > damaged. > > (The backup was made by partclone.btrfs. Together with a zeroed out > tree

Re: [PATCH 05/14] Btrfs: warn_on for unaccounted spaces

2016-06-27 Thread Chris Mason
On 06/27/2016 12:47 AM, Qu Wenruo wrote: Hi Josef, Would you please move this patch to the first of the patchset? It's making bisect quite hard, as it will always stop at this patch, hard to check if it's a regression or existing bug. That's a good idea. Which workload are you having

Re: Bug in 'btrfs filesystem du' ?

2016-06-27 Thread M G Berberich
On Monday, 27 June, M G Berberich wrote: > after a balance ‘btrfs filesystem du’ probably shows false data about > shared data. Oh, I forgot: I have btrfs-progs v4.5.2 and kernel 4.6.2. Regards, bmg -- M G Berberich | "It makes no difference at all what gets decided today: I am

Re: [PATCH 4/4] fstests: btrfs/126,127,128: test feature ioctl and sysfs interfaces

2016-06-27 Thread Eryu Guan
On Fri, Jun 24, 2016 at 11:08:34AM -0400, je...@suse.com wrote: > From: Jeff Mahoney > > This tests the exporting of feature information from the kernel via > sysfs and ioctl. The first test checks whether the sysfs permissions > are correct, if the information exported via sysfs

Re: [PATCH 2/4] fstests: btrfs/124: test global metadata reservation reporting

2016-06-27 Thread Eryu Guan
On Mon, Jun 27, 2016 at 03:16:47PM +0800, Eryu Guan wrote: > On Fri, Jun 24, 2016 at 11:08:32AM -0400, je...@suse.com wrote: > > From: Jeff Mahoney > > [snip] > > + > > +# get standard environment, filters and checks > > +. ./common/rc > > +. ./common/filter.btrfs > > + > >

Re: [PATCH v2 5/6] fstests: btrfs: test RAID1 device reappear and balanced

2016-06-27 Thread Eryu Guan
On Wed, Jun 22, 2016 at 07:01:54PM +0800, Anand Jain wrote: > > > On 06/21/2016 09:31 PM, Eryu Guan wrote: > > On Wed, Jun 15, 2016 at 04:48:47PM +0800, Anand Jain wrote: > > > From: Anand Jain > > > > > > The test does the following: > > > Initialize a RAID1 with some

Re: Adventures in btrfs raid5 disk recovery

2016-06-27 Thread Austin S. Hemmelgarn
On 2016-06-25 12:44, Chris Murphy wrote: On Fri, Jun 24, 2016 at 12:19 PM, Austin S. Hemmelgarn wrote: Well, the obvious major advantage that comes to mind for me to checksumming parity is that it would let us scrub the parity data itself and verify it. OK but hold on.

Re: [PATCH 3/4] fstests: btrfs/125: test sysfs exports of allocation and device membership info

2016-06-27 Thread Eryu Guan
On Fri, Jun 24, 2016 at 11:08:33AM -0400, je...@suse.com wrote: > From: Jeff Mahoney > > This tests the sysfs publishing for btrfs allocation and device > membership info under a number of different layouts, similar to the > btrfs replace test. We test the allocation files only