Re: btrfs: kernel BUG at fs/btrfs/extent_io.c:676!

2014-10-14 Thread Chris Mason
On Sun, Oct 12, 2014 at 10:15 PM, Sasha Levin sasha.le...@oracle.com wrote:
> Ping? This BUG_ON()ing due to GFP_ATOMIC allocation failure is really silly :(
Agreed, I have a patch for this in testing. It didn't make my first pull, but I'll get it fixed up. -chris

btrfs soft lockups: locks gone crazy

2014-10-14 Thread Davidlohr Bueso
Hello, I'm getting massive amounts of CPU soft lockups in Linus's tree for today. This occurs almost immediately and is very reproducible in aim7 disk workloads using btrfs:
kernel: [ 559.800017] NMI watchdog: BUG: soft lockup - CPU#114 stuck for 22s! [reaim:44435]
...
[ 999.800070] Modules

Re: [PATCH v4] btrfs-progs: fix page align issue for lzo compress in restore

2014-10-14 Thread Marc Dietrich
This hasn't landed in a btrfs-progs branch I found. Any update? Marc
On Tuesday, 23 September 2014, 16:34:54, Gui Hecheng wrote:
> When running restore under lzo compression, bad compress length problems are encountered. It is because there is a page align problem with @decompress_lzo,

[GIT PULL] LLVMLinux patches for v3.18

2014-10-14 Thread Behan Webster
These patches remove the use of VLAIS using a new SHASH_DESC_ON_STACK macro. Some of the previously accepted VLAIS removal patches haven't used this macro; I will push new patches to consistently use it in all those older cases for 3.19. The following changes since commit

Re: [PATCH v4] btrfs-progs: fix page align issue for lzo compress in restore

2014-10-14 Thread David Sterba
On Tue, Oct 14, 2014 at 10:06:16AM +0200, Marc Dietrich wrote:
> This hasn't landed in a btrfs-progs branch I found. Any update?
I had it tagged for review and found something that needs fixing. The PAGE_CACHE_SIZE is hardcoded to 4k; this will break on filesystems with larger sectors (e.g. the

Re: btrfs random filesystem corruption in kernel 3.17

2014-10-14 Thread admin
> Summarizing what I've seen on the threads...
First of all, many thanks for summarizing the info.
> 1) The bug seems to be read-only snapshot related. The connection to send is that send creates read-only snapshots, but people creating read-only snapshots for other purposes are now reporting

Re: what is the best way to monitor raid1 drive failures?

2014-10-14 Thread Suman C
Hi, here's a simple raid1 recovery experiment that's not working as expected. Kernel: 3.17, latest mainline; progs: 3.16.1. I started with a simple raid1 mirror of 2 drives (sda and sdb). The filesystem is functional: I created one subvol, put some data, read/write tested, etc. Yanked the sdb

Re: what is the best way to monitor raid1 drive failures?

2014-10-14 Thread Rich Freeman
On Tue, Oct 14, 2014 at 10:48 AM, Suman C schakr...@gmail.com wrote:
> The new drive shows up as sdb. btrfs fi show still prints drive missing.
> mounted the filesystem with ro,degraded
> tried adding the new sdb drive which results in the following error. (-f because the new drive has a fs from

Re: what is the best way to monitor raid1 drive failures?

2014-10-14 Thread Suman C
I cannot delete that way because it would mean going below the minimum number of devices, and it fails, as explained in the wiki. The solution from the wiki is to add a new device and then delete the old one, but the problem here may be due to the new device appearing with the same name (sdb)? Suman

Scrub already running

2014-10-14 Thread Bob Williams
In my openSUSE 13.1 system, I have / and /home on separate partitions of a two-disk btrfs raid1 array.
# btrfs fi sh
Label: system  uuid: 0b07b829-9a0e-44ab-89ee-14b36a45199e
	Total devices 2  FS bytes used 11.99GiB
	devid 1 size

Re: Scrub already running

2014-10-14 Thread Calvin Walton
On Tue, 2014-10-14 at 16:18 +0100, Bob Williams wrote:
> When I run scrub on the root (system) partition now, I get
> ERROR: scrub is already running. To cancel use 'btrfs scrub cancel /'. To see the status use 'btrfs scrub status [-d] /'.
> but
> # btrfs scrub cancel /
> ERROR: scrub cancel

Re: btrfs random filesystem corruption in kernel 3.17

2014-10-14 Thread David Arendt
The corruption seems to be worse than expected. In kernel 3.16.5 I cannot mount this filesystem read/write. I'm in the process of doing a tar -> mkfs.btrfs -> untar recovery and staying on 3.16.5 for now.
[ 55.465584] parent transid verify failed on 51150848 wanted 272368 found 276401
[

Random file system corruption in 3.17 (not BTRFS related...?)

2014-10-14 Thread Robert White
Howdy. I run several gentoo systems and I upgraded two of them to kernel 3.17.0: one using BTRFS for root, one using ext3 for root (via the ext4 driver). _Both_ systems exhibited strange behavior (long pauses and then hangs requiring hard-power) within several hours. Both then had random

Re: Random file system corruption in 3.17 (not BTRFS related...?)

2014-10-14 Thread David Arendt
I didn't notice any corruption on other filesystems with kernel 3.17.0. I also didn't experience any hangs, except when trying to mount a corrupted btrfs, but that was causing a hang within less than 10 seconds. It could be that your problem is unrelated and that the corruption you are

[RFC 1/1 linux-next] btrfs: don't opencode zero_user_segment

2014-10-14 Thread Fabian Frederick
Use the function defined in include/linux/highmem.h. Note that this reverses the order of the last two function calls.
Signed-off-by: Fabian Frederick f...@skynet.be
---
 fs/btrfs/scrub.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index

Re: what is the best way to monitor raid1 drive failures?

2014-10-14 Thread Chris Murphy
On Oct 14, 2014, at 10:48 AM, Suman C schakr...@gmail.com wrote:
> mounted the filesystem with ro,degraded
> tried adding the new sdb drive which results in the following error. (-f because the new drive has a fs from the past)
> # btrfs device add -f /dev/sdb /mnt2/raid1pool
> /dev/sdb is mounted

Re: [RFC 1/1 linux-next] btrfs: don't opencode zero_user_segment

2014-10-14 Thread Zach Brown
On Tue, Oct 14, 2014 at 07:46:14PM +0200, Fabian Frederick wrote:
> use function defined in include/linux/highmem.h
> Note that this reverts 2 last function call order
And adds a BUG_ON(PAGE_CACHE_SIZE > PAGE_SIZE). We can take bets on whether that will ever trigger. - z

Re: Scrub already running

2014-10-14 Thread Bob Williams
On 14/10/14 17:25, Calvin Walton wrote:
> On Tue, 2014-10-14 at 16:18 +0100, Bob Williams wrote:
>> When I run scrub on the root (system) partition now, I get
>> ERROR: scrub is already running. To cancel use 'btrfs scrub cancel /'. To see the status

Re: Random file system corruption in 3.17 (not BTRFS related...?)

2014-10-14 Thread Robert White
On 10/14/2014 10:22 AM, David Arendt wrote:
> I didn't notice a corruption on other filesystems with kernel 3.17.0. Also I didn't experience any hangs except when trying to mount a corrupted btrfs but this was causing a hang within less than 10 seconds. It could be that your problem is unrelated

Re: what is the best way to monitor raid1 drive failures?

2014-10-14 Thread Suman C
After the reboot step, where I indicated that I mounted ro, I was unable to mount rw or rw,degraded. I get the "mount: wrong fs type, bad option, bad superblock" error if I try to mount it rw. What might be the reason for that? Suman
On Tue, Oct 14, 2014 at 12:15 PM, Chris Murphy

Re: [RFC 1/1 linux-next] btrfs: don't opencode zero_user_segment

2014-10-14 Thread Fabian Frederick
On 14 October 2014 at 21:15 Zach Brown z...@zabbo.net wrote:
> On Tue, Oct 14, 2014 at 07:46:14PM +0200, Fabian Frederick wrote:
>> use function defined in include/linux/highmem.h
>> Note that this reverts 2 last function call order
> And adds a BUG_ON(PAGE_CACHE_SIZE > PAGE_SIZE). We can take

Re: btrfs random filesystem corruption in kernel 3.17

2014-10-14 Thread Duncan
admin posted on Tue, 14 Oct 2014 13:17:41 +0200 as excerpted: And if you're affected, be aware that until we have a fix, we don't know if it'll be possible to remove the affected and currently undeletable snapshots. If it's not, at some point you'll need to do a fresh mkfs.btrfs, to get rid

Re: what is the best way to monitor raid1 drive failures?

2014-10-14 Thread Duncan
Suman C posted on Tue, 14 Oct 2014 07:48:01 -0700 as excerpted:
> Here's a simple raid1 recovery experiment that's not working as expected. kernel: 3.17, latest mainline; progs: 3.16.1. I started with a simple raid1 mirror of 2 drives (sda and sdb). The filesystem is functional, I created

Re: btrfs random filesystem corruption in kernel 3.17

2014-10-14 Thread Robert White
On 10/14/2014 02:35 PM, Duncan wrote: But at some point, presumably after a fix is in place, since the damaged snapshots aren't currently always deletable, if the fix only prevents new damage from occurring and doesn't provide a way to fix the damaged ones, then mkfs would be the only way to do

Wishlist Item :: One Subvol in Multiple Places

2014-10-14 Thread Robert White
I've got no idea if this is possible given the current storage layout, but it would be Really Nice™ if there were a way to have a single subvolume exist in more than one place in the hierarchy. I know this can be faked via mount tricks (bind or use of subvol=), but having it be a real thing would

Re: Random file system corruption in 3.17 (not BTRFS related...?)

2014-10-14 Thread Duncan
Robert White posted on Tue, 14 Oct 2014 09:54:51 -0700 as excerpted:
> On the BTRFS system much of my browser settings for firefox were trashed, particularly the cookies and saved configurations for add-ons (like which sites had scripts enabled/disabled in no-script) etc.
FWIW, this reply is

Re: btrfs random filesystem corruption in kernel 3.17

2014-10-14 Thread Duncan
Robert White posted on Tue, 14 Oct 2014 15:03:21 -0700 as excerpted:
> What happens if btrfs property set is used to (attempt to) promote the snapshot from read-only to read-write? Can the damaged snapshot then be subjected to scrub or btrfsck? e.g. btrfs property set /path/to/snapshot ro

Wishlist Item :: re/setable Snapshot property

2014-10-14 Thread Robert White
From what I can tell, the status of a subvolume as a snapshot or not is a persistent property that has exactly one effect: determining whether the subvolume is listed when using -s on btrfs subvol list. That is, the status as a snapshot or not seems to be largely cosmetic. It would be useful if that

[PATCH v2] btrfs: add more superblock checks

2014-10-14 Thread David Sterba
Populate btrfs_check_super_valid() with checks that try to verify the consistency of the superblock by additional conditions that may arise from corrupted devices or bitflips. Some of the tests are only hints and issue warnings instead of failing the mount, basically when the checks are derived from the data

Btrfs-progs pre-release 3.17

2014-10-14 Thread David Sterba
Hi, the 3.17 release is almost ready. I've updated the git repositories at http://repo.or.cz/w/btrfs-progs-unstable/devel.git https://github.com/kdave/btrfs-progs tagged as 3.17-rc3 (7fd6d933528f30a). Please give it some testing; I'm about to do a release in 1-2 days. Among other fixes and

Re: what is the best way to monitor raid1 drive failures?

2014-10-14 Thread Anand Jain
On 10/14/14 22:48, Suman C wrote: Hi, Here's a simple raid1 recovery experiment that's not working as expected. kernel: 3.17, latest mainline progs: 3.16.1 I started with a simple raid1 mirror of 2 drives (sda and sdb). The filesystem is functional, I created one subvol, put some data,