kernel 3.8.8: btrfs still crashes on boot when it can't replay a log

2013-05-16 Thread Marc MERLIN
I've reported this bug a few times over different kernel versions over the last year now, and unfortunately it's still not fixed as of 3.8 (yes, I know 3.9 is out, I'm just about to switch). What happens as far as I know: I have btrfs on top of dmcrypt on an SSD. The SSD on occasion seems to

Re: kernel 3.8.8: btrfs still crashes on boot when it can't replay a log

2013-05-17 Thread Marc MERLIN
On Thu, May 16, 2013 at 08:09:18AM -0700, Marc MERLIN wrote: I've reported this bug a few times over different kernel versions over the last year now, and unfortunately it's still not fixed as of 3.8 (yes, I know 3.9 is out, I'm just about to switch). What happens as far as I know: I have

Re: kernel 3.8.8: btrfs still crashes on boot when it can't replay a log

2013-05-17 Thread Marc MERLIN
hangs (after reboot it was ok) Thanks, Marc On Fri, May 17, 2013 at 08:48:11AM -0700, Marc MERLIN wrote: Sigh, last night my laptop hung again, I don't have a way to know why. When I rebooted with 3.9.2, soon after boot, I started to get this: INFO: task btrfs-transacti:520 blocked for more than

btrfs raid5 recovery with 1 half failed drive, or multiple drives kicked out at the same time.

2013-08-17 Thread Marc MERLIN
I know the raid5 code is still new and being worked on, but I was curious. With md raid5, I can do this: mdadm /dev/md7 --replace /dev/sde1 This is cool because it lets you replace a drive with bad sectors where at least one other drive in the array has bad sectors, and the md layer will read

Re: Ability to free space on a full btrfs filesystem

2013-08-25 Thread Marc MERLIN
On Sun, Aug 25, 2013 at 06:23:20PM +0200, Matthieu Dalstein wrote: Hello, I'm currently experiencing some space issue on btrfs with linux kernel 3.10. My btrfs partition is full and I fail to remove any data to give some free space: Did you snapshot your filesystem? If so, delete some/all

Newer kernels do not oops if log cannot be re-read at mount

2013-09-04 Thread Marc MERLIN
I just wanted to confirm that the crash on unexpected log read during mount is indeed fixed for me, I now just get a log read failed message. I've filed a bug with debian to encourage them to add btrfs-zero-log as a tool in the initrd so that one can have the choice of running this on a non

btrfs race conditions on snapshot delete/create

2013-09-27 Thread Marc MERLIN
I had a cronjob that mistakenly created and deleted snapshots in the same place at the same time. Interesting output I got: /var/local/scr/btrfs_snaps: line 23: 26017 Segmentation fault (core dumped) /sbin/btrfs subvolume delete $sub On ubuntu precise (i.e. super ancient), but I upgraded

Re: How to recover from failing btrffs send | btrfs receive?

2014-02-13 Thread Marc MERLIN
Ok, let me try something else :) Of those who are using btrfs send/receive, has anyone gotten in a state where incrementals will not apply anymore? Thanks, Marc On Wed, Feb 12, 2014 at 06:22:07AM -0800, Marc MERLIN wrote: So, I've been running this for a few weeks, and soon should have

Re: How to recover from failing btrfs send | btrfs receive?

2014-02-16 Thread Marc MERLIN
, Marc MERLIN wrote: So, I've been running this for a few weeks, and soon should have something half decent to share for others to use. Unfortunately, one of my backups is now failing like so: btrfs send -p $src_snap $src_newsnap | btrfs receive $dest_pool/ + btrfs send -p /mnt/btrfs_pool1
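The pipeline quoted above follows the usual pattern of one full transfer followed by incremental ones. A minimal sketch, assuming made-up snapshot paths and a hypothetical helper named send_cmd that is not taken from the script discussed in the thread; it only prints the command so it can be reviewed before being run as root:

```shell
#!/bin/sh
# Hypothetical helper (not from the thread's script): print the
# send/receive pipeline for a full or an incremental transfer
# instead of running it.
# Arguments: destination pool, new snapshot, optional parent snapshot.
send_cmd() {
    dest_pool=$1
    new_snap=$2
    parent_snap=${3:-}
    if [ -n "$parent_snap" ]; then
        # Incremental: only the differences against the parent are sent.
        printf 'btrfs send -p %s %s | btrfs receive %s/\n' \
            "$parent_snap" "$new_snap" "$dest_pool"
    else
        # Initial transfer: the whole snapshot is sent.
        printf 'btrfs send %s | btrfs receive %s/\n' "$new_snap" "$dest_pool"
    fi
}

# Example with made-up paths:
send_cmd /mnt/dest /mnt/src/home_new /mnt/src/home_old
```

Both snapshots passed to btrfs send have to be read-only, and the parent has to exist on the destination already for the incremental form to apply.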

Re: How to recover from failing btrfs send | btrfs receive?

2014-02-16 Thread Marc MERLIN
On Sun, Feb 16, 2014 at 03:38:18PM +0000, Filipe David Manana wrote: On Sun, Feb 16, 2014 at 2:23 PM, Marc MERLIN m...@merlins.org wrote: Hi Filipe, I see you have another fix for btrfs send (attached below), as well as your other patch on Jan 21st (neither are in my 3.12.7). Hi Marc

Re: How to recover from failing btrfs send | btrfs receive?

2014-02-16 Thread Marc MERLIN
On Sun, Feb 16, 2014 at 09:08:57PM +0000, Filipe David Manana wrote: I'll see if I come up with other ways of getting into that issue. If you're collecting them, I found another bug, although it might not matter to most: if I put my laptop in S3 sleep during a send/receive, it reliably breaks

btrfs send ioctl failed with -25: Inappropriate ioctl for device

2014-02-22 Thread Marc MERLIN
On Sun, Feb 16, 2014 at 09:32:32PM -0800, Marc MERLIN wrote: On Sun, Feb 16, 2014 at 09:08:57PM +0000, Filipe David Manana wrote: I'll see if I come up with other ways of getting into that issue. If you're collecting them, I found another bug, although it might not matter to most: if I put

btrfs userland interface isn't 32/64bit clean (breaks lsattr and btrfs send)

2014-02-23 Thread Marc MERLIN
I was trying to make sense out of this: gargamel:~# lsattr lsattr: Inappropriate ioctl for device While reading flags on ./satapmtool lsattr: Inappropriate ioctl for device While reading flags on ./usbreset As well as the btrfs send issue I reported: gargamel:/mnt/btrfs_pool1# btrfs send

3.13.5 kernel hangs some processes with btrfs

2014-02-23 Thread Marc MERLIN
Does someone know how I can debug further why this is hanging? It seems that accessing a certain directory on one of my btrfs filesystems causes this. The rest of my system seems ok, as long as I'm not touching this filesystem. Is this a bug, or a performance problem? [ 1930.287192] INFO: task

Re: 3.13.5 kernel hangs some processes with btrfs

2014-02-23 Thread Marc MERLIN
On Sun, Feb 23, 2014 at 10:14:26PM -0800, Marc MERLIN wrote: Does someone know how I can debug further why this is hanging? It seems that accessing a certain directory on one of my btrfs filesystems causes this. The rest of my system seems ok, as long as I'm not touching this filesystem

Re: 3.13.5 kernel hangs some processes with btrfs

2014-02-23 Thread Marc MERLIN
On Mon, Feb 24, 2014 at 02:27:46PM +0800, Wang Shilong wrote: Note that it says running for 5 seconds, but it started 4H ago. Any idea what's going on here? What is dmesg output? Did it output something like Skip abort transaction? Also what is your mount option? did you enable nodatasum

Re: 3.13.5 kernel hangs some processes with btrfs

2014-02-23 Thread Marc MERLIN
On Mon, Feb 24, 2014 at 06:42:30AM +0000, Duncan wrote: I believe there's a fix coming (a cancel that blows away the tracking file if it finds it and no actual running scrub is the most obvious fix), but meanwhile, see the /var/lib/btrfs/scrub.status.* files. That's where scrub state is

Re: 3.13.5 kernel hangs some processes with btrfs

2014-02-24 Thread Marc MERLIN
On Mon, Feb 24, 2014 at 07:29:58AM +0000, Duncan wrote: But I'm still seeing these, albeit less often. Any idea what they could be linked to? (I have a btrfs send/receive going right now, it could be hanging /mnt/btrfs_pool1 in a way that affects smbd, but the array feels ok otherwise,

Re: 3.14rc3 kernel also hangs some processes with btrfs

2014-02-24 Thread Marc MERLIN
On Mon, Feb 24, 2014 at 09:35:19AM -0800, Marc MERLIN wrote: On Mon, Feb 24, 2014 at 07:29:58AM +0000, Duncan wrote: But I'm still seeing these, albeit less often. Any idea what they could be linked to? (I have a btrfs send/receive going right now, it could be hanging /mnt/btrfs_pool1

3.14.0rc3: did not find backref in send_root

2014-02-24 Thread Marc MERLIN
I got this during a btrfs send: BTRFS error (device dm-2): did not find backref in send_root. inode=22672, offset=524288, disk_byte=1490517954560 found extent=1490517954560 I'll try a scrub when I've finished my backup, but is there anything I can run on the file I've found from the inode?

Re: btrfs userland interface isn't 32/64bit clean (breaks lsattr and btrfs send)

2014-02-24 Thread Marc MERLIN
On Mon, Feb 24, 2014 at 08:43:44AM +0000, Duncan wrote: Hugo Mills posted on Mon, 24 Feb 2014 08:29:38 +0000 as excerpted: On Mon, Feb 24, 2014 at 06:32:14AM +0000, Duncan wrote: This is a known issue. There's patches in the pipeline for 32-bit userspace on a 64-bit kernel, already.

Re: 3.14.0rc3: btrfs send ioctl failed with -5: Input/output error

2014-02-25 Thread Marc MERLIN
On Tue, Feb 25, 2014 at 03:50:15PM +0800, Wang Shilong wrote: Hi Marc, This seems a regression which has been fixed by the following commit (only pushed into btrfs-next): https://git.kernel.org/cgit/linux/kernel/git/josef/btrfs-next.git/commit/?id=1334bebe71bebbca47b3b92f25511ea980fdeab8

Re: 3.14.0rc3: did not find backref in send_root

2014-02-25 Thread Marc MERLIN
On Wed, Feb 26, 2014 at 11:38:30AM +0800, Wang Shilong wrote: Hi Marc, On 02/26/2014 01:30 AM, Marc MERLIN wrote: On Tue, Feb 25, 2014 at 03:50:15PM +0800, Wang Shilong wrote: Hi Marc, This seems a regression which has been fixed by the following commit (only pushed into btrfs-next

Re: 3.14.0rc3: did not find backref in send_root

2014-02-26 Thread Marc MERLIN
On Wed, Feb 26, 2014 at 03:51:37PM +0800, Wang Shilong wrote: I've applied your patch from https://git.kernel.org/cgit/linux/kernel/git/josef/btrfs-next.git/commit/?id=1334bebe71bebbca47b3b92f25511ea980fdeab8 I can confirm this fixed the btrfs send error on my server, thank you. At snapshot

3.14.0-rc3 btrfs scrub is preventing my laptop from going to sleep

2014-02-27 Thread Marc MERLIN
This does not happen consistently, but sometimes: PM: Preparing system for mem sleep Freezing user space processes ... (...) Freezing of tasks failed after 20.002 seconds (1 tasks refusing to freeze, wq_busy=0): btrfs D 88017639c800 0 12239 12224 0x0084

Re: 3.14.0-rc3 btrfs scrub is preventing my laptop from going to sleep

2014-02-27 Thread Marc MERLIN
On Thu, Feb 27, 2014 at 11:06:56AM -0800, Marc MERLIN wrote: This does not happen consistently, but sometimes: PM: Preparing system for mem sleep Freezing user space processes ... (...) Freezing of tasks failed after 20.002 seconds (1 tasks refusing to freeze, wq_busy=0): btrfs

Re: 3.14.0-rc3 btrfs scrub is preventing my laptop from going to sleep

2014-03-01 Thread Marc MERLIN
On Fri, Feb 28, 2014 at 09:09:37PM -0800, Marc MERLIN wrote: On Fri, Feb 28, 2014 at 09:18:06AM +0800, Wang Shilong wrote: Could you run the following command when scrub is blocked, we can know more why scrub is blocked here. # echo w > /proc/sysrq-trigger # dmesg Yes, there you go

Re: 3.14.0-rc3 btrfs scrub is preventing my laptop from going to sleep

2014-03-02 Thread Marc MERLIN
On Mon, Mar 03, 2014 at 11:17:51AM +0800, Wang Shilong wrote: Hi Marc, On 03/01/2014 11:22 PM, Marc MERLIN wrote: On Fri, Feb 28, 2014 at 09:09:37PM -0800, Marc MERLIN wrote: On Fri, Feb 28, 2014 at 09:18:06AM +0800, Wang Shilong wrote: Could you run the following command when scrub

Re: 3.14.0-rc3 btrfs scrub is preventing my laptop from going to sleep

2014-03-03 Thread Marc MERLIN
On Mon, Mar 03, 2014 at 02:50:33PM +0800, Wang Shilong wrote: Here's the log of failure: http://marc.merlins.org/tmp/btrfs_nofreeze2.txt Unfortunately, i could not reproduce this problem here. It should not be the problem that i addressed before, there is not deadlock here. try the

Re: 3.14.0-rc3 btrfs scrub is preventing my laptop from going to sleep

2014-03-03 Thread Marc MERLIN
On Mon, Mar 03, 2014 at 12:09:11PM -0500, Josef Bacik wrote: Ok I lied I just went ahead and did it, please let me know if this fixes it This looked promising, but I still have the problem. PM: Syncing filesystems ... done. PM: Preparing system for mem sleep Freezing user space processes ...

Re: 3.14.0-rc3 btrfs scrub is preventing my laptop from going to sleep

2014-03-03 Thread Marc MERLIN
On Mon, Mar 03, 2014 at 05:18:33PM -0500, Josef Bacik wrote: Maybe it will work if we cancel the scrub as opposed to pausing it, but of course it's not ideal. Is that the next step? Sigh I thought the PM stuff called freeze_fs() but I think that was just tuxonice. I don't have a quick fix

Re: [Repost] Is BTRFS bedup maintained ?

2014-03-05 Thread Marc MERLIN
On Wed, Mar 05, 2014 at 06:24:40PM +0100, Swâmi Petaramesh wrote: Hello, (Not having received a single answer, I repost this...) I got your post, and posted myself about bedup not working at all for me, and got no answer either. As far as I can tell, it's entirely unmaintained and was likely

3.14.0-rc3: btrfs send/receive blocks btrfs IO on other devices (near deadlocks)

2014-03-12 Thread Marc MERLIN
I have a file server with 4 cpu cores and 5 btrfs devices: Label: btrfs_boot uuid: e4c1daa8-9c39-4a59-b0a9-86297d397f3b Total devices 1 FS bytes used 48.92GiB devid 1 size 79.93GiB used 73.04GiB path /dev/mapper/cryptroot Label: varlocalspace uuid:

Re: 3.14.0-rc3: btrfs send/receive blocks btrfs IO on other devices (near deadlocks)

2014-03-13 Thread Marc MERLIN
/receive that is taking a btrfs-wide lock? Or btrfs scrub maybe? Thanks, Marc On Wed, Mar 12, 2014 at 08:18:08AM -0700, Marc MERLIN wrote: I have a file server with 4 cpu cores and 5 btrfs devices: Label: btrfs_boot uuid: e4c1daa8-9c39-4a59-b0a9-86297d397f3b Total devices 1 FS bytes used

Re: discard synchronous on most SSDs?

2014-03-13 Thread Marc MERLIN
On Thu, Mar 13, 2014 at 09:39:02PM -0600, Chris Murphy wrote: On Mar 13, 2014, at 8:11 PM, Marc MERLIN m...@merlins.org wrote: On Sun, Mar 09, 2014 at 11:33:50AM +0000, Hugo Mills wrote: discard is, except on the very latest hardware, a synchronous command (it's a limitation of the SATA

Re: discard synchronous on most SSDs?

2014-03-14 Thread Marc MERLIN
On Fri, Mar 14, 2014 at 12:07:54PM +0000, Duncan wrote: Marc MERLIN posted on Thu, 13 Mar 2014 22:17:50 -0700 as excerpted: On Thu, Mar 13, 2014 at 09:39:02PM -0600, Chris Murphy wrote: On Mar 13, 2014, at 8:11 PM, Marc MERLIN m...@merlins.org wrote: On Sun, Mar 09, 2014 at 11:33

Re: discard synchronous on most SSDs?

2014-03-14 Thread Marc MERLIN
On Fri, Mar 14, 2014 at 08:46:09PM +0000, Holger Hoffstätte wrote: On Fri, 14 Mar 2014 15:57:41 -0400, Martin K. Petersen wrote: So right now I'm afraid we don't have a good way for a user to determine whether a device supports queued trims or not. Mount with discard, unpack kernel tree,

Re: discard synchronous on most SSDs?

2014-03-16 Thread Marc MERLIN
On Sat, Mar 15, 2014 at 11:26:27AM +0000, Duncan wrote: Chris Samuel posted on Sat, 15 Mar 2014 17:48:56 +1100 as excerpted: $ sudo smartctl --identify /dev/sdb | fgrep 'Trim bit in DATA SET MANAGEMENT' 169 0 1 Trim bit in DATA SET MANAGEMENT command supported $

Re: discard synchronous on most SSDs?

2014-03-16 Thread Marc MERLIN
On Sun, Mar 16, 2014 at 12:22:05PM -0400, Martin K. Petersen wrote: queued trim, not even a prototype. I went out and bought a 840 EVO this morning because the general lazyweb opinion seemed to indicate that this drive supports queued trim. Well, it doesn't. At least not in the 120GB version:

How to handle a RAID5 arrawy with a failing drive?

2014-03-16 Thread Marc MERLIN
I just created this array: polgara:/mnt/btrfs_backupcopy# btrfs fi show Label: backupcopy uuid: 7d8e1197-69e4-40d8-8d86-278d275af896 Total devices 10 FS bytes used 220.32GiB devid 1 size 465.76GiB used 25.42GiB path /dev/dm-0 devid 2 size 465.76GiB used 25.40GiB path

Re: How to handle a RAID5 arrawy with a failing drive?

2014-03-16 Thread Marc MERLIN
On Sun, Mar 16, 2014 at 05:12:10PM -0600, Chris Murphy wrote: On Mar 16, 2014, at 4:55 PM, Chris Murphy li...@colorremedies.com wrote: Then use btrfs replace start. Looks like in 3.14rc6 replace isn't yet supported. I get dev_replace cannot yet handle RAID5/RAID6. When I do: btrfs

Re: How to handle a RAID5 arrawy with a failing drive?

2014-03-16 Thread Marc MERLIN
On Sun, Mar 16, 2014 at 05:23:25PM -0600, Chris Murphy wrote: On Mar 16, 2014, at 5:17 PM, Marc MERLIN m...@merlins.org wrote: - but no matter how I remove the faulty drive, there is no rebuild on a new drive procedure that works yet Correct? I'm not sure. From what I've read we

Re: How to handle a RAID5 arrawy with a failing drive?

2014-03-16 Thread Marc MERLIN
On Sun, Mar 16, 2014 at 07:06:23PM -0600, Chris Murphy wrote: On Mar 16, 2014, at 6:51 PM, Marc MERLIN m...@merlins.org wrote: polgara:/mnt/btrfs_backupcopy# btrfs device delete /dev/mapper/crypt_sde1 `pwd` ERROR: error removing the device '/dev/mapper/crypt_sde1' - Invalid

Re: How to handle a RAID5 arrawy with a failing drive?

2014-03-16 Thread Marc MERLIN
On Sun, Mar 16, 2014 at 08:56:35PM -0600, Chris Murphy wrote: polgara:/mnt/btrfs_backupcopy# btrfs device delete /dev/mapper/crypt_sde1 `pwd` ERROR: error removing the device '/dev/mapper/crypt_sde1' - Invalid argument You didn't specify a mount point, is the reason for that error.

Re: How to handle a RAID5 arrawy with a failing drive?

2014-03-17 Thread Marc MERLIN
On Sun, Mar 16, 2014 at 11:12:43PM -0600, Chris Murphy wrote: On Mar 16, 2014, at 9:44 PM, Marc MERLIN m...@merlins.org wrote: On Sun, Mar 16, 2014 at 08:56:35PM -0600, Chris Murphy wrote: If I add a device, isn't it going to grow my raid to make it bigger instead of trying

Re: How to handle a RAID5 arrawy with a failing drive? - raid5 mostly works, just no rebuilds

2014-03-19 Thread Marc MERLIN
On Tue, Mar 18, 2014 at 09:02:07AM +0000, Duncan wrote: First just a note that you hijacked Mr Manana's patch thread. Replying (...) I did, I use mutt, I know about in Reply-To, I was tired, I screwed up, sorry, and there was no undo :) Since you don't have to worry about the data I'd suggest

Re: How to handle a RAID5 arrawy with a failing drive? - raid5 mostly works, just no rebuilds

2014-03-19 Thread Marc MERLIN
On Wed, Mar 19, 2014 at 12:32:55AM -0600, Chris Murphy wrote: On Mar 19, 2014, at 12:09 AM, Marc MERLIN m...@merlins.org wrote: 7) you can remove a drive from an array, add files, and then if you plug the drive in, it apparently gets auto sucked in back in the array

btrfs-rmw-2: page allocation failure: order:1, mode:0x8020

2014-03-19 Thread Marc MERLIN
My server died last night during a btrfs send/receive to a btrfs raid5 array. Here are the logs. Is this anything known or with a possible workaround? Thanks, Marc btrfs-rmw-2: page allocation failure: order:1, mode:0x8020 CPU: 1 PID: 12499 Comm: btrfs-rmw-2 Not tainted

Re: btrfs-rmw-2: page allocation failure: order:1, mode:0x8020

2014-03-19 Thread Marc MERLIN
On Wed, Mar 19, 2014 at 12:20:08PM -0400, Chris Mason wrote: On 03/19/2014 11:45 AM, Marc MERLIN wrote: My server died last night during a btrfs send/receive to a btrfs raid5 array. Here are the logs. Is this anything known or with a possible workaround? Thanks, Marc btrfs-rmw-2: page

Re: How to handle a RAID5 arrawy with a failing drive? - raid5 mostly works, just no rebuilds

2014-03-19 Thread Marc MERLIN
On Wed, Mar 19, 2014 at 10:53:33AM -0600, Chris Murphy wrote: Yes, although it's limited, you apparently only lose new data that was added after you went into degraded mode and only if you add another drive where you write more data. In real life this shouldn't be too common, even if it is

Re: btrfs-rmw-2: page allocation failure: order:1, mode:0x8020

2014-03-19 Thread Marc MERLIN
On Thu, Mar 20, 2014 at 12:13:36AM +0000, Chris Mason wrote: Should I double it? For now, I have the copy running again, and it's been going for 8 hours without failure on the old kernel but of course that doesn't mean my 2TB copy will complete without hitting the bug again. Sorry, I

Re: btrfs scrub process prevents system suspend

2014-03-20 Thread Marc MERLIN
On Thu, Mar 20, 2014 at 11:30:33AM -0400, Josef Bacik wrote: Yeah there's a way to make suspend run commands while it goes down, you'll want to make it do btrfs scrub cancel on your btrfs fses. If you search the archives you'll see we've covered this recently and the guy posted the script
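The script mentioned in that reply is not reproduced in this listing. As a rough sketch of the idea only, assuming hypothetical helper names (btrfs_mounts, scrub_cancel_cmds) and a mounts table in /proc/mounts format, a suspend hook could work out which commands to issue like this; the commands are printed rather than executed:

```shell
#!/bin/sh
# Hypothetical helper: list btrfs mount points from a mounts table in
# /proc/mounts format (device, mount point, fstype, ...), one per line.
# Mount points containing spaces (escaped as \040) are not handled.
btrfs_mounts() {
    awk '$3 == "btrfs" { print $2 }' "${1:-/proc/mounts}" | sort -u
}

# Print (not run) one cancel command per btrfs mount point.
# A real suspend hook would execute these as root before sleeping.
scrub_cancel_cmds() {
    btrfs_mounts "$1" | while read -r mnt; do
        printf 'btrfs scrub cancel %s\n' "$mnt"
    done
}

# With no argument, the live /proc/mounts table is used.
scrub_cancel_cmds
```

If one filesystem is mounted in several places, this prints a cancel command for each mount point; the extra ones would simply report that no scrub is running. How the hook gets called on suspend depends on the distribution's power management setup and is left out here.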

Re: Understanding btrfs and backups = automatic snapshot script

2014-03-20 Thread Marc MERLIN
On Sun, Mar 16, 2014 at 10:42:24PM -0700, Marc MERLIN wrote: On Thu, Mar 06, 2014 at 09:33:24PM +, Duncan wrote: However, best snapshot management practice does progressive snapshot thinning, so you never have more than a few hundred snapshots to manage at once. Think of it this way

btrfs deadlock (3.14 kernel)

2014-03-21 Thread Marc MERLIN
After 1.5 days of running, the machine I was doing btrfs receive on got stuck with this (note, the traces are not all the same). The machine is not dead, but any IO that goes through btrfs seems dead. If you want Sysrq-W, let me know. Is there anything I can try to unwedge or prevent this

Send/Receive howto and script for others to use (was Re: Is anyone using btrfs send/receive)

2014-03-21 Thread Marc MERLIN
On Wed, Jan 08, 2014 at 12:02:06AM -0800, Marc MERLIN wrote: On Tue, Jan 07, 2014 at 10:53:29AM +0000, Hugo Mills wrote: You need to move /mnt/btrfs_pool2/tmp_read_only_new to a different name as well. The send stream contains the name of the subvolume it wants to create, so it's trying

Re: btrfs deadlock (3.14 kernel)

2014-03-21 Thread Marc MERLIN
On Fri, Mar 21, 2014 at 02:24:45PM -0400, Josef Bacik wrote: Is there anything I can try to unwedge or prevent this problem next time I try? Sysrq+w would be nice so I can see what everybody is doing. Thanks, Sure thing. There you go http://marc.merlins.org/tmp/sysreq-w-btrfs.txt (too big

Re: btrfs deadlock (3.14 kernel)

2014-03-22 Thread Marc MERLIN
On Fri, Mar 21, 2014 at 05:51:23PM -0700, Marc MERLIN wrote: On Fri, Mar 21, 2014 at 02:24:45PM -0400, Josef Bacik wrote: Is there anything I can try to unwedge or prevent this problem next time I try? Sysrq+w would be nice so I can see what everybody is doing. Thanks, Sure thing

Re: Send/Receive howto and script for others to use (was Re: Is anyone using btrfs send/receive)

2014-03-22 Thread Marc MERLIN
On Sat, Mar 22, 2014 at 09:44:05PM +0200, Brendan Hide wrote: Hi, Marc Feel free to use ideas from my own script. Some aspects in my script are more mature and others are frankly pathetic. ;) There are also quite a lot of TODOs throughout my script that aren't likely to get the urgent

btrfs send/receive still gets out of sync in 3.14.0

2014-03-22 Thread Marc MERLIN
After deleting a huge directory tree in my /home subvolume, syncing snapshots now fails with: ERROR: rmdir o1952777-157-0 failed. No such file or directory Error line 156 with status 1 DIE: Code dump: 153 if [[ -n $init ]]; then 154 btrfs send $src_newsnap | $ssh btrfs receive

Re: Send/Receive howto and script for others to use (was Re: Is anyone using btrfs send/receive)

2014-03-22 Thread Marc MERLIN
Please consider adding a blank line between quotes, it makes them just a bit more readable :) On Sat, Mar 22, 2014 at 11:02:24PM +0200, Brendan Hide wrote: - it doesn't create writeable snapshots on the destination in case you want to use the copy as a live filesystem One of the issues with

Re: btrfs deadlock (3.14 kernel)

2014-03-22 Thread Marc MERLIN
On Fri, Mar 21, 2014 at 11:47:18PM -0700, Marc MERLIN wrote: On Fri, Mar 21, 2014 at 05:51:23PM -0700, Marc MERLIN wrote: On Fri, Mar 21, 2014 at 02:24:45PM -0400, Josef Bacik wrote: Is there anything I can try to unwedge or prevent this problem next time I try? Sysrq+w would

ERROR: error during balancing '.' - No space left on device

2014-03-23 Thread Marc MERLIN
legolas:/mnt/btrfs_pool2# btrfs balance . ERROR: error during balancing '.' - No space left on device There may be more info in syslog - try dmesg | tail [ 8454.159635] BTRFS info (device dm-1): relocating block group 288329039872 flags 1 [ 8590.167294] BTRFS info (device dm-1): relocating block

btrfs-tools missing btrfs device delete devid=x path ?

2014-03-23 Thread Marc MERLIN
I'm still doing some testing so that I can write some howto. I got that far after a rebalance (mmmh, that took 2 days with little data, and unfortunately 5 deadlocks and reboots). polgara:/mnt/btrfs_backupcopy# btrfs fi show Label: backupcopy uuid: eed9b55c-1d5a-40bf-a032-1be6980648e1

Re: ERROR: error during balancing '.' - No space left on device

2014-03-23 Thread Marc MERLIN
Both legolas:/mnt/btrfs_pool2# btrfs balance start -v -dusage=5 /mnt/btrfs_pool2 legolas:/mnt/btrfs_pool2# btrfs balance start -v -dusage=0 /mnt/btrfs_pool2 failed unfortunately. On Sun, Mar 23, 2014 at 12:26:32PM +0000, Duncan wrote: When it rains, it pours. What you're missing is that this is
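The two commands above use the -dusage filter, which restricts a balance to data block groups whose usage is below the given percentage. A minimal sketch of stepping through several thresholds, assuming a hypothetical helper named balance_cmds and an arbitrary list of percentages; nothing in this listing shows that doing so resolved the problem reported here. The commands are printed, not run:

```shell
#!/bin/sh
# Hypothetical helper: print balance commands with the given -dusage
# thresholds, in order. Low thresholds only touch nearly empty data
# block groups, so they relocate the least data.
# Arguments: mount point, then one or more usage percentages.
balance_cmds() {
    mnt=$1
    shift
    for pct in "$@"; do
        printf 'btrfs balance start -v -dusage=%s %s\n' "$pct" "$mnt"
    done
}

# Example: thresholds chosen for illustration only.
balance_cmds /mnt/btrfs_pool2 0 5 10 20
```

Running the printed commands one at a time, and stopping at the first failure, keeps it clear which threshold triggered an error.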

Re: btrfs-tools missing btrfs device delete devid=x path ?

2014-03-23 Thread Marc MERLIN
On Sun, Mar 23, 2014 at 04:18:43PM +0000, Hugo Mills wrote: On Sun, Mar 23, 2014 at 08:25:17AM -0700, Marc MERLIN wrote: I'm still doing some testing so that I can write some howto. I got that far after a rebalance (mmmh, that took 2 days with little data, and unfortunately 5 deadlocks

Re: ERROR: error during balancing '.' - No space left on device

2014-03-23 Thread Marc MERLIN
On Sun, Mar 23, 2014 at 04:28:25PM +0000, Hugo Mills wrote: Before you do this, can you take a btrfs-image of your metadata, and add a report to bugzilla.kernel.org? You're not the only person who's had this problem recently, and I suspect there's something still lurking in there that needs

Cannot add device is mounted for unmounted drive that used to be in raidset that is mounted

2014-03-23 Thread Marc MERLIN
I found out that when a drive used to be part of a raid system that is mounted and running without it, btrfs apparently decides that the drive is part of the mounted raidset and in use. As a result, I had to eventually dd 0's over it, btrfs device scan, and finally I was able to use it again.

Re: Cannot add device is mounted for unmounted drive that used to be in raidset that is mounted

2014-03-23 Thread Marc MERLIN
On Sun, Mar 23, 2014 at 11:09:07AM -0700, Marc MERLIN wrote: I found out that when a drive used to be part of a raid system that is mounted and running without it, btrfs apparently decides that the drive is part of the mounted raidset and in use. As a result, I had to eventually dd 0's over

Re: ERROR: error during balancing '.' - No space left on device

2014-03-23 Thread Marc MERLIN
On Sun, Mar 23, 2014 at 05:34:09PM +0000, Hugo Mills wrote: xaba on IRC has just pointed out that it looks like you're running this on a mounted filesystem -- it needs to be unmounted for btrfs-image to work reliably. Sorry, I didn't realize that, although it makes sense. btrfs-image

Re: How to handle a RAID5 arrawy with a failing drive? - raid5 mostly works, just no rebuilds

2014-03-23 Thread Marc MERLIN
On Wed, Mar 19, 2014 at 10:53:33AM -0600, Chris Murphy wrote: On Mar 19, 2014, at 9:40 AM, Marc MERLIN m...@merlins.org wrote: After adding a drive, I couldn't quite tell if it was striping over 11 drives or 10, but it felt that at least at times, it was striping over 11 drives

Re: ERROR: error during balancing '.' - No space left on device

2014-03-23 Thread Marc MERLIN
On Sun, Mar 23, 2014 at 12:10:17PM -0700, Marc MERLIN wrote: On Sun, Mar 23, 2014 at 05:34:09PM +0000, Hugo Mills wrote: xaba on IRC has just pointed out that it looks like you're running this on a mounted filesystem -- it needs to be unmounted for btrfs-image to work reliably. Sorry

Any use for mkfs.btrfs -d raid5 -m raid1 ?

2014-03-23 Thread Marc MERLIN
If I lose 2 drives on a raid5, -m raid1 should ensure I haven't lost my metadata. From there, would I indeed have small files that would be stored entirely on some of the drives that didn't go missing, and therefore I could recover some data with 2 missing drives? Or is it kind of pointless/waste

Btrfs and raid5 status with kernel 3.14, documentation, and howto

2014-03-23 Thread Marc MERLIN
Ok, thanks to the help I got from you, and my own experiments, I've written this: http://marc.merlins.org/perso/btrfs/post_2014-03-23_Btrfs-Raid5-Status.html If someone reminds me how to edit the btrfs wiki, I'm happy to copy that there, or give anyone permission to take part or all of what I

Re: Any use for mkfs.btrfs -d raid5 -m raid1 ?

2014-03-23 Thread Marc MERLIN
On Sun, Mar 23, 2014 at 10:52:29PM +0000, Hugo Mills wrote: On Sun, Mar 23, 2014 at 03:44:35PM -0700, Marc MERLIN wrote: If I lose 2 drives on a raid5, -m raid1 should ensure I haven't lost my metadata. From there, would I indeed have small files that would be stored entirely on some

Re: Btrfs and raid5 status with kernel 3.14, documentation, and howto

2014-03-24 Thread Marc MERLIN
On Mon, Mar 24, 2014 at 07:17:12PM +0000, Martin wrote: Thanks for the very good summary. So... In very brief summary, btrfs raid5 is very much a work in progress. If you know how to use it, which I didn't but do now, it's technically very usable as is. The corner cases are in having a

Re: btrfs-tools missing btrfs device delete devid=x path ?

2014-03-24 Thread Marc MERLIN
On Mon, Mar 24, 2014 at 06:38:30PM +0000, Duncan wrote: Marc MERLIN posted on Sun, 23 Mar 2014 09:25:06 -0700 as excerpted: On Sun, Mar 23, 2014 at 04:18:43PM +0000, Hugo Mills wrote: On Sun, Mar 23, 2014 at 08:25:17AM -0700, Marc MERLIN wrote: What's the syntax for removing a drive

Re: Cannot add device is mounted for unmounted drive that used to be in raidset that is mounted

2014-03-24 Thread Marc MERLIN
On Mon, Mar 24, 2014 at 07:19:14PM +0000, Duncan wrote: Marc MERLIN posted on Sun, 23 Mar 2014 11:58:16 -0700 as excerpted: On Sun, Mar 23, 2014 at 11:09:07AM -0700, Marc MERLIN wrote: I found out that when a drive used to be part of a raid system that is mounted and running without

Re: Btrfs and raid5 status with kernel 3.14, documentation, and howto

2014-03-24 Thread Marc MERLIN
On Tue, Mar 25, 2014 at 01:11:43AM +0000, Martin wrote: Yes, looking good, but for my usage I need the option to run ok with a failed drive. So, that's one to keep a development eye on for continued progress... So it does run with a failed drive, it'll just fill the logs with write errors,

How to debug very very slow file delete?

2014-03-24 Thread Marc MERLIN
I had a tree with some number of thousands of files (less than 1 million) on top of md raid5. It took 18H to rm it in 3 tries: gargamel:/mnt/dshelf2/backup/polgara# time rm -rf current.todel/ real 1087m26.491s user 0m2.448s sys 4m42.012s gargamel:/mnt/dshelf2/backup/polgara# btrfs fi show

Re: How to debug very very slow file delete? (btrfs on md-raid5)

2014-03-25 Thread Marc MERLIN
On Tue, Mar 25, 2014 at 12:13:50PM +0000, Martin wrote: On 25/03/14 01:49, Marc MERLIN wrote: I had a tree with some number of thousands of files (less than 1 million) on top of md raid5. It took 18H to rm it in 3 tries: I ran another test after typing the original Email: gargamel:/mnt

Re: RHEL/CentOS or Debian for stable deployment

2014-03-29 Thread Marc MERLIN
On Fri, Mar 28, 2014 at 11:45:03PM +0000, Hugo Mills wrote: On Fri, Mar 28, 2014 at 04:38:09PM -0700, Lists wrote: On 03/28/2014 02:42 PM, Avi Miller wrote: Have you considered Oracle Linux? We are continually backporting btrfs fixes and enhancements to our Unbreakable Enterprise Kernel

determining snapshot size

2014-03-29 Thread Marc MERLIN
I had a look at http://bj0z.wordpress.com/2011/04/27/determining-snapshot-size-in-btrfs/#comment-35 but it's quite old and does not work anymore since userland became incompatible with it. Has anyone seen something newer or have a newer fixed version of this? Thanks, Marc -- A mouse is a device

Btrfs file content missmatch incrementally sending subvolumes containing systemd journal files

2014-03-29 Thread Marc MERLIN
I had someone asking me about this bug: Btrfs file content missmatch incrementally sending subvolumes containing systemd journal files https://bugzilla.kernel.org/show_bug.cgi?id=66941 Specifically: Just to note, I have also this issue with other files: jkarlson/.config/chromium/Default/Archived

Re: btrfs send/receive still gets out of sync in 3.14.0

2014-03-29 Thread Marc MERLIN
On Sat, Mar 22, 2014 at 02:04:56PM -0700, Marc MERLIN wrote: After deleting a huge directory tree in my /home subvolume, syncing snapshots now fails with: ERROR: rmdir o1952777-157-0 failed. No such file or directory So, I'm ok again after I deleted my destination snapshot and re-init'ed

Re: Especially broken btrfs

2014-03-29 Thread Marc MERLIN
On Thu, Mar 20, 2014 at 11:21:27PM -0400, sepero...@gmx.com wrote: Hello all. I submit bugs to different foss projects regularly, but I don't really have a bug report this time. I have a broken filesystem to report. And I have no idea how to reproduce it. I am including a link to the

Re: btrfs send/receive still gets out of sync in 3.14.0

2014-03-30 Thread Marc MERLIN
On Sun, Mar 30, 2014 at 02:13:35PM +0100, Filipe David Manana wrote: On Sun, Mar 30, 2014 at 1:42 PM, Hugo Mills h...@carfax.org.uk wrote: On Sat, Mar 29, 2014 at 08:22:02PM -0700, Marc MERLIN wrote: On Sat, Mar 22, 2014 at 02:04:56PM -0700, Marc MERLIN wrote: After deleting a huge

Re: btrfs send/receive still gets out of sync in 3.14.0

2014-03-30 Thread Marc MERLIN
On Sun, Mar 30, 2014 at 04:14:59PM +0100, Filipe David Manana wrote: Cool, thanks for fixing those. Is that meant to make it in 3.14 final, or is it going to be 3.15? My guess is 3.15. Understood. I'll see if I can find your btrfs patches and apply them to 3.14. Thanks for letting me

Re: determining snapshot size - adding work to do info to btrfs send

2014-03-31 Thread Marc MERLIN
On Sat, Mar 29, 2014 at 05:21:23PM -0700, Marc MERLIN wrote: I had a look at http://bj0z.wordpress.com/2011/04/27/determining-snapshot-size-in-btrfs/#comment-35 but it's quite old and does not work anymore since userland became incompatible with it. Has anyone seen something newer or have

Re: [PATCH 00/27] Replace the old man page with asciidoc and man page for each btrfs subcommand.

2014-04-02 Thread Marc MERLIN
On Wed, Apr 02, 2014 at 09:24:10AM -0400, Chris Mason wrote: On 04/02/2014 04:29 AM, Qu Wenruo wrote: Convert the old btrfs man pages to new asciidoc and split the huge btrfs man page into subcommand man page. The asciidoc style and Makefile things are mostly simplified from git

Re: [PATCH 24/27] btrfs-progs: Convert man page for btrfs-zero-log

2014-04-04 Thread Marc MERLIN
On Wed, Apr 02, 2014 at 04:29:35PM +0800, Qu Wenruo wrote: Convert man page for btrfs-zero-log Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com --- Documentation/Makefile | 2 +- Documentation/btrfs-zero-log.txt | 39 +++ 2 files changed, 40

Re: [PATCH 14/27] btrfs-progs: Convert man page for btrfs-replace.

2014-04-04 Thread Marc MERLIN
On Wed, Apr 02, 2014 at 04:29:25PM +0800, Qu Wenruo wrote: +If the source device is not available anymore, or if the -r option is set, +the data is built only using the RAID redundancy mechanisms. +After completion of the operation, the source device is removed from the +filesystem. Would it

kernel BUG at fs/btrfs/extent_io.c:4324! (3.14-rc5)

2014-04-05 Thread Marc MERLIN
static void btrfs_release_extent_buffer_page(struct extent_buffer *eb, unsigned long start_idx) { unsigned long index; unsigned long num_pages; struct page *page; int mapped = !test_bit(EXTENT_BUFFER_DUMMY, &eb->bflags);

Re: kernel BUG at fs/btrfs/extent_io.c:4324! (3.14-rc5)

2014-04-05 Thread Marc MERLIN
On Sat, Apr 05, 2014 at 08:37:21AM -0700, Marc MERLIN wrote: static void btrfs_release_extent_buffer_page(struct extent_buffer *eb, unsigned long start_idx) { unsigned long index; unsigned long num_pages; struct page *page

Re: [PATCH 24/27] btrfs-progs: Convert man page for btrfs-zero-log

2014-04-05 Thread Marc MERLIN
On Sat, Apr 05, 2014 at 04:00:27PM -0600, cwillu wrote: +'btrfs-zero-log' will remove the log tree if log tree is corrupt, which will +allow you to mount the filesystem again. + +The common case where this happens has been fixed a long time ago, +so it is unlikely that you will see

Re: [PATCH 24/27] btrfs-progs: Convert man page for btrfs-zero-log

2014-04-05 Thread Marc MERLIN
On Sat, Apr 05, 2014 at 03:02:03PM -0700, Marc MERLIN wrote: On Sat, Apr 05, 2014 at 04:00:27PM -0600, cwillu wrote: +'btrfs-zero-log' will remove the log tree if log tree is corrupt, which will +allow you to mount the filesystem again. + +The common case where this happens has

Re: [PATCH 24/27] btrfs-progs: Convert man page for btrfs-zero-log

2014-04-05 Thread Marc MERLIN
On Sat, Apr 05, 2014 at 11:03:46PM +0100, Hugo Mills wrote: As far as I recall, -orecovery is read-write. -oro,recovery is read-only. Yes, we both corrected my Email at the same time :) Actually it's better/worse than that. From my notes at

Re: [RFC PATCH] Btrfs: send, add calculate data size flag to allow for progress estimation

2014-04-05 Thread Marc MERLIN
On Fri, Apr 04, 2014 at 04:20:41PM +0100, Filipe David Borba Manana wrote: This new send flag makes send calculate first the amount of new file data (in bytes) the send root has relatively to the parent root, or for the case of a non-incremental send, the total amount of file data we will

Re: [RFC PATCH] Btrfs: send, add calculate data size flag to allow for progress estimation

2014-04-06 Thread Marc MERLIN
On Sun, Apr 06, 2014 at 05:57:38PM +0100, Filipe David Manana wrote: I looked around and found nothing that looked similar enough. Obviously it's an assert, so I can run without it, but my source being very different from yours just made me want to check that this was most likely ok to run

btrfs on 3.14rc5 stuck on btrfs_tree_read_lock sync

2014-04-07 Thread Marc MERLIN
I was debugging why my backup failed to run, and eventually found it was stuck on sync: 14080 18:18 btrfs_tree_read_lock sync This was hung for hours on this lock. Strangely, it looks like taking my sysrq-w hung the machine pretty hard for close to 30sec, but this seems to have

Re: btrfs on 3.14rc5 stuck on btrfs_tree_read_lock sync

2014-04-07 Thread Marc MERLIN
On Mon, Apr 07, 2014 at 12:10:52PM -0400, Josef Bacik wrote: On 04/07/2014 12:05 PM, Marc MERLIN wrote: I was debugging why my backup failed to run, and eventually found it was stuck on sync: 14080 18:18 btrfs_tree_read_lock sync This was hung for hours on this lock
