Re: RAID-1 refuses to balance large drive

2016-03-25 Thread Duncan
Henk Slager posted on Fri, 25 Mar 2016 15:35:52 +0100 as excerpted: > For the original OP situation, with chunks all filled up with extents > and devices all filled up with chunks, 'integrating' a new 6TB drive > in a 4TB+3TB+2TB raid1 array could probably be done in a bit unusual > way in order

Re: Possible Raid Bug

2016-03-25 Thread Duncan
Chris Murphy posted on Fri, 25 Mar 2016 15:34:11 -0600 as excerpted: > Basically you get one chance to mount rw,degraded and you have to fix > the problem at that time. And you have to balance away any phantom > single chunks that have appeared. For what it's worth it's not the > reboot that

Re: Possible Raid Bug

2016-03-25 Thread Anand Jain
On 03/26/2016 04:09 AM, Alexander Fougner wrote: 2016-03-25 20:57 GMT+01:00 Patrik Lundquist: On 25 March 2016 at 18:20, Stephen Williams wrote: Your information below was very helpful and I was able to recreate the Raid array. However my

[PATCH] btrfs: Cleanup compress_file_range()

2016-03-25 Thread Ashish Samant
Remove unnecessary checks in compress_file_range(). Signed-off-by: Ashish Samant --- fs/btrfs/inode.c | 79 +++- 1 file changed, 38 insertions(+), 41 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c

Re: RAID Assembly with Missing Empty Drive

2016-03-25 Thread Chris Murphy
[let me try keeping the list cc'd] On Fri, Mar 25, 2016 at 7:21 PM, John Marrett wrote: > Chris, > >> Quite honestly I don't understand how Btrfs raid1 volume with two >> missing devices even permits you to mount it degraded,rw in the first >> place. > > I think you missed

Re: RAID Assembly with Missing Empty Drive

2016-03-25 Thread John Marrett
Chris, > Quite honestly I don't understand how Btrfs raid1 volume with two > missing devices even permits you to mount it degraded,rw in the first > place. I think you missed my previous post, it's simple, I patched the kernel to bypass the check for missing devices with rw mounts, I did this

Re: RAID Assembly with Missing Empty Drive

2016-03-25 Thread Chris Murphy
On Fri, Mar 25, 2016 at 4:31 PM, John Marrett wrote: > Continuing with my recovery efforts I've built overlay mounts of each > of the block devices supporting my btrfs filesystem as well as the new > disk I'm trying to introduce. I have patched the kernel to disable the >

Re: RAID Assembly with Missing Empty Drive

2016-03-25 Thread John Marrett
Continuing with my recovery efforts I've built overlay mounts of each of the block devices supporting my btrfs filesystem as well as the new disk I'm trying to introduce. I have patched the kernel to disable the check for multiple missing devices. I then exported the overlayed devices using iSCSI

Re: Possible Raid Bug

2016-03-25 Thread Chris Murphy
On Fri, Mar 25, 2016 at 1:57 PM, Patrik Lundquist wrote: > > Only errors on the device formerly known as /dev/sde, so why won't it > mount degraded,rw? Now I'm stuck like Stephen. > > # btrfs device usage /mnt > /dev/sdb, ID: 1 >Device size: 2.00GiB >

Re: Possible Raid Bug

2016-03-25 Thread Alexander Fougner
2016-03-25 20:57 GMT+01:00 Patrik Lundquist: > On 25 March 2016 at 18:20, Stephen Williams wrote: >> >> Your information below was very helpful and I was able to recreate the >> Raid array. However my initial question still stands - What if the

Re: Possible Raid Bug

2016-03-25 Thread Patrik Lundquist
On 25 March 2016 at 18:20, Stephen Williams wrote: > > Your information below was very helpful and I was able to recreate the > Raid array. However my initial question still stands - What if the > drive dies completely? I work in a Data center and we see this quite a > lot

Re: [PATCH] Btrfs: fix crash/invalid memory access on fsync when using overlayfs

2016-03-25 Thread Filipe Manana
On Fri, Mar 25, 2016 at 6:49 PM, Chris Mason wrote: > On Mon, Mar 21, 2016 at 05:52:44PM +, Filipe Manana wrote: >> On Mon, Mar 21, 2016 at 5:51 PM, Chris Mason wrote: >> > On Mon, Mar 21, 2016 at 05:38:44PM +, fdman...@kernel.org wrote: >> >> From: Filipe

Re: [PATCH] Btrfs: fix crash/invalid memory access on fsync when using overlayfs

2016-03-25 Thread Chris Mason
On Mon, Mar 21, 2016 at 05:52:44PM +, Filipe Manana wrote: > On Mon, Mar 21, 2016 at 5:51 PM, Chris Mason wrote: > > On Mon, Mar 21, 2016 at 05:38:44PM +, fdman...@kernel.org wrote: > >> From: Filipe Manana > >> > >> If the lower or upper directory of an

Re: [PATCH 03/14] Btrfs: always reserve metadata for delalloc extents

2016-03-25 Thread Liu Bo
On Fri, Mar 25, 2016 at 01:25:49PM -0400, Josef Bacik wrote: > There are a few races in the metadata reservation stuff. First we add the > bytes > to the block_rsv well after we've set the bit on the inode saying that we have > space for it and after we've reserved the bytes. So use the normal

Re: [PATCH 14/14] Btrfs: don't do nocow check unless we have to

2016-03-25 Thread Liu Bo
On Fri, Mar 25, 2016 at 01:26:00PM -0400, Josef Bacik wrote: > Before we write into prealloc/nocow space we have to make sure that there are > no > references to the extents we are writing into, which means checking the extent > tree and csum tree in the case of nocow. So we don't want to do the

[PATCH 13/14] Btrfs: don't bother kicking async if there's nothing to reclaim

2016-03-25 Thread Josef Bacik
We do this check when we start the async reclaimer thread, might as well check before we kick it off to save us some cycles. Thanks, Signed-off-by: Josef Bacik --- fs/btrfs/extent-tree.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/btrfs/extent-tree.c

[PATCH 14/14] Btrfs: don't do nocow check unless we have to

2016-03-25 Thread Josef Bacik
Before we write into prealloc/nocow space we have to make sure that there are no references to the extents we are writing into, which means checking the extent tree and csum tree in the case of nocow. So we don't want to do the nocow dance unless we can't reserve data space, since it's a serious

[PATCH 06/14] Btrfs: add tracepoint for adding block groups

2016-03-25 Thread Josef Bacik
I'm writing a tool to visualize the enospc system inside btrfs, I need this tracepoint in order to keep track of the block groups in the system. Thanks, Signed-off-by: Josef Bacik --- fs/btrfs/extent-tree.c | 2 ++ include/trace/events/btrfs.h | 40

[PATCH 05/14] Btrfs: warn_on for unaccounted spaces

2016-03-25 Thread Josef Bacik
These were hidden behind enospc_debug, which isn't helpful as they indicate actual bugs, unlike the rest of the enospc_debug stuff which is really debug information. Thanks, Signed-off-by: Josef Bacik --- fs/btrfs/extent-tree.c | 14 -- 1 file changed, 8

[PATCH 12/14] Btrfs: fix release reserved extents trace points

2016-03-25 Thread Josef Bacik
We were doing trace_btrfs_release_reserved_extent() in pin_down_extent which isn't quite right because we will go through and free that extent later when we unpin, so it messes up apps that are accounting for the reservation space. We were also unconditionally doing it in

[PATCH 04/14] Btrfs: change delayed reservation fallback behavior

2016-03-25 Thread Josef Bacik
We reserve space for the inode update when we first reserve space for writing to a file. However there are lots of ways that we can use this reservation and not have it for subsequent ordered extents. Previously we'd fall through and try to reserve metadata bytes for this, then we'd just steal

[PATCH 10/14] Btrfs: add tracepoints for flush events

2016-03-25 Thread Josef Bacik
We want to track when we're triggering flushing from our reservation code and what flushing is being done when we start flushing. Thanks, Signed-off-by: Josef Bacik --- fs/btrfs/ctree.h | 9 + fs/btrfs/extent-tree.c | 22 ++--

[PATCH 03/14] Btrfs: always reserve metadata for delalloc extents

2016-03-25 Thread Josef Bacik
There are a few races in the metadata reservation stuff. First we add the bytes to the block_rsv well after we've set the bit on the inode saying that we have space for it and after we've reserved the bytes. So use the normal btrfs_block_rsv_add helper for this case. Secondly we can flush

[PATCH 11/14] Btrfs: add fsid to some tracepoints

2016-03-25 Thread Josef Bacik
When tracing enospc problems on a box with multiple file systems mounted I need to be able to differentiate between the two file systems. Most of the important trace points I'm looking at already have an fsid, but the reserved extent trace points do not, so add that to make it possible to figure

[PATCH 08/14] Btrfs: trace pinned extents

2016-03-25 Thread Josef Bacik
Pinned extents are an important metric to keep track of for enospc. Signed-off-by: Josef Bacik --- fs/btrfs/extent-tree.c | 8 1 file changed, 8 insertions(+) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 1673365..26f7a9d 100644 ---

[PATCH 07/14] Btrfs: introduce ticketed enospc infrastructure

2016-03-25 Thread Josef Bacik
Our enospc flushing sucks. It is born from a time where we were early enospc'ing constantly because multiple threads would race in for the same reservation and randomly starve other ones out. So I came up with this solution to block any other reservations from happening while one guy tried to

[PATCH 02/14] Btrfs: fix callers of btrfs_block_rsv_migrate

2016-03-25 Thread Josef Bacik
So btrfs_block_rsv_migrate just unconditionally calls block_rsv_migrate_bytes. Not only this but it unconditionally changes the size of the block_rsv. This isn't a bug strictly speaking, but it makes truncate block rsv's look funny because every time we migrate bytes over its size grows, even

[PATCH 09/14] Btrfs: fix delalloc reservation amount tracepoint

2016-03-25 Thread Josef Bacik
We can sometimes drop the reservation we had for our inode, so we need to remove that amount from to_reserve so that our tracepoint reports a valid amount of space. Signed-off-by: Josef Bacik --- fs/btrfs/extent-tree.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)

[PATCH 00/14] Enospc rework

2016-03-25 Thread Josef Bacik
1) Huge latency spikes. One guy starts flushing, he doesn't wake up until the flushers are finished doing work and then checks to see if he can continue. Meanwhile everybody is backed up waiting for that guy to finish getting his reservation. 2) The flushers flush everything. They have no idea

[PATCH 01/14] Btrfs: add bytes_readonly to the spaceinfo at once

2016-03-25 Thread Josef Bacik
For some reason we're adding bytes_readonly to the space info after we update the space info with the block group info. This creates a tiny race where we could over-reserve space because we haven't yet taken out the bytes_readonly bit. Since we already know this information at the time we call

Re: Possible Raid Bug

2016-03-25 Thread Stephen Williams
Hi Patrik, [root@Xen ~]# uname -r 4.4.5-1-ARCH [root@Xen ~]# pacman -Q btrfs-progs btrfs-progs 4.4.1-1 Your information below was very helpful and I was able to recreate the Raid array. However my initial question still stands - What if the drive dies completely? I work in a Data center and we

Re: [PATCH v8 25/27] btrfs: dedupe: Add support for compression and dedupe

2016-03-25 Thread Chris Mason
On Fri, Mar 25, 2016 at 09:44:31AM +0800, Qu Wenruo wrote: > > > Chris Mason wrote on 2016/03/24 16:35 -0400: > >On Tue, Mar 22, 2016 at 09:35:50AM +0800, Qu Wenruo wrote: > >>From: Wang Xiaoguang > >> > >>The basic idea is also calculate hash before compression, and

Re: [PATCH v8 10/27] btrfs: dedupe: Add basic tree structure for on-disk dedupe method

2016-03-25 Thread Chris Mason
On Fri, Mar 25, 2016 at 09:59:39AM +0800, Qu Wenruo wrote: > > > Chris Mason wrote on 2016/03/24 16:58 -0400: > >Are you storing the entire hash, or just the parts not represented in > >the key? I'd like to keep the on-disk part as compact as possible for > >this part. > > Currently, it's

Re: Possible Raid Bug

2016-03-25 Thread Patrik Lundquist
On Debian Stretch with Linux 4.4.6, btrfs-progs 4.4 in VirtualBox 5.0.16 with 4*2GB VDIs: # mkfs.btrfs -m raid10 -d raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde # mount /dev/sdb /mnt # touch /mnt/test # umount /mnt Everything fine so far. # wipefs -a /dev/sde *reboot* # mount /dev/sdb /mnt

[PATCH] delete obsolete function btrfs_print_tree()

2016-03-25 Thread Holger Hoffstätte
Dan Carpenter's static checker recently found missing IS_ERR handling in print-tree.c:btrfs_print_tree(). While looking into this I found that this function is no longer called anywhere and was moved to btrfs-progs long ago. It can simply be removed. Reported-by: Dan Carpenter

Re: btrfs ways to travel back in time

2016-03-25 Thread Alexander Fougner
2016-03-23 23:31 GMT+01:00 Vytautas D: > >> atime). Also, this might break some configurations not expecting the >> set-default method > > I have never seen this before. Can you extend on this or provide a link so I > can read more about such limitation? >> Ubuntu, for

Re: Possible Raid Bug

2016-03-25 Thread Duncan
Patrik Lundquist posted on Fri, 25 Mar 2016 13:48:08 +0100 as excerpted: > On 25 March 2016 at 12:49, Stephen Williams > wrote: >> >> So catch 22, you need all the drives otherwise it won't let you mount, >> But what happens if a drive dies and the OS doesn't detect it?

Re: RAID-1 refuses to balance large drive

2016-03-25 Thread Henk Slager
On Fri, Mar 25, 2016 at 2:16 PM, Patrik Lundquist wrote: > On 23 March 2016 at 20:33, Chris Murphy wrote: >> >> On Wed, Mar 23, 2016 at 1:10 PM, Brad Templeton wrote: >> > >> > I am surprised to hear it said that having the

scrub: Tree block spanning stripes, ignored

2016-03-25 Thread Ivan P
Hello, using kernel 4.4.5 and btrfs-progs 4.4.1, I today ran a scrub on my 2x1TB btrfs raid1 array and it finished with 36 unrecoverable errors [1], all blaming the treeblock 741942071296. Running "btrfs check --readonly" on one of the devices lists that extent as corrupted [2]. How can I

Re: [PATCH v2] fstests: add btrfs test for fsync after snapshot deletion

2016-03-25 Thread Eryu Guan
On Thu, Mar 24, 2016 at 08:08:36PM +, fdman...@kernel.org wrote: > From: Filipe Manana > > Test that if we delete a snapshot, delete its parent directory, create > another directory with the same name as that parent and then fsync either > the new directory or a file

[PATCH] Btrfs: don't use src fd for printk

2016-03-25 Thread Josef Bacik
The fd we pass in may not be on a btrfs file system, so don't try to do BTRFS_I() on it. Thanks, Signed-off-by: Josef Bacik --- fs/btrfs/ioctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 8bbecda..b0d1345 100644

Re: [PATCH] fstests: add btrfs test for fsync after snapshot deletion

2016-03-25 Thread Eryu Guan
On Fri, Mar 25, 2016 at 12:58:52PM +0100, Holger Hoffstätte wrote: > On 03/25/16 04:53, Eryu Guan wrote: > > Test fails on v4.5 kernel as expected, but I failed to compile btrfs > > after applying this patch, seems btrfs_must_commit_transaction was not > > defined anywhere (I did grep it through

Re: RAID-1 refuses to balance large drive

2016-03-25 Thread Patrik Lundquist
On 23 March 2016 at 20:33, Chris Murphy wrote: > > On Wed, Mar 23, 2016 at 1:10 PM, Brad Templeton wrote: > > > > I am surprised to hear it said that having the mixed sizes is an odd > > case. > > Not odd as in wrong, just uncommon compared to other

Re: Aw: cannot repair raid6 volume rescue zero-log crashed

2016-03-25 Thread Duncan
Jan Koester posted on Fri, 25 Mar 2016 12:02:29 +0100 as excerpted: > with btrfs tools 4.5 i got this message: Unfortunately this isn't going to be a lot of direct help in regard to your specific situation as I'm simply a btrfs using admin and list regular, not a dev, and I don't use btrfs

Re: Possible Raid Bug

2016-03-25 Thread Patrik Lundquist
On 25 March 2016 at 12:49, Stephen Williams wrote: > > So catch 22, you need all the drives otherwise it won't let you mount, > But what happens if a drive dies and the OS doesn't detect it? BTRFS > won't allow you to mount the raid volume to remove the bad disk! Version of

[PATCH v2] fstests: add btrfs test for fsync after snapshot deletion

2016-03-25 Thread fdmanana
From: Filipe Manana Test that if we delete a snapshot, delete its parent directory, create another directory with the same name as that parent and then fsync either the new directory or a file inside the new directory, the fsync succeeds, the fsync log is replayable and

Re: [PATCH] fstests: add btrfs test for fsync after snapshot deletion

2016-03-25 Thread Holger Hoffstätte
On 03/25/16 04:53, Eryu Guan wrote: > Test fails on v4.5 kernel as expected, but I failed to compile btrfs > after applying this patch, seems btrfs_must_commit_transaction was not > defined anywhere (I did grep it through the kernel tree, nothing showed > up), did I miss anything? > >

Possible Raid Bug

2016-03-25 Thread Stephen Williams
Hi, Find instructions on how to recreate below - I have a BTRFS raid 10 setup in Virtualbox (I'm getting to grips with the Filesystem) I have the raid mounted to /mnt like so - [root@Xen ~]# btrfs filesystem show /mnt/ Label: none uuid: ad1d95ee-5cdc-420f-ad30-bd16158ad8cb Total

Re: btrfs_destroy_inode WARN_ON.

2016-03-25 Thread Markus Trippelsdorf
On 2016.03.24 at 18:54 -0400, Dave Jones wrote: > Just hit this on a tree from earlier this morning, v4.5-11140 or so. > > WARNING: CPU: 2 PID: 32570 at fs/btrfs/inode.c:9261 > btrfs_destroy_inode+0x389/0x3f0 [btrfs] > CPU: 2 PID: 32570 Comm: rm Not tainted 4.5.0-think+ #14 > c039baf9

Re: systemd : Timed out waiting for device dev-disk-by…

2016-03-25 Thread Qu Wenruo
Hi, Although I know the post is from almost a year ago, I'm quite interested in the long mount time. Any info about the fs except it's a 12 x 4T raid10? We're investigating such long mount time, but unfortunately, we didn't find a good idea to reproduce it (although we don't have 12