Re: Blocked tasks on 3.15.1

2014-08-18 Thread James Cloos
"MM" == Marc MERLIN <m...@merlins.org> writes:

MM> Note 3.16.0 is actually worse than 3.15 for me.

Here (a single partition btrfs), 3.16.0 works fine, but 3.17-rc1 fails again.

My /var/log is also a compressed, single-partition btrfs; that doesn't
show the problem with any version.  Just the partition with git, svn and
rsync trees.

Last night's test of 3.17-rc1 showed the problem with the first git
pull, getting stuck reading FETCH_HEAD.  All repos on that fs failed
the same way.

But rebooting back to 3.16.0 let everything work perfectly.

-JimC
-- 
James Cloos cl...@jhcloos.com OpenPGP: 0x997A9F17ED7DAEA6
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Blocked tasks on 3.15.1

2014-08-11 Thread Charles Cazabon
The blocked tasks issue that got significantly worse in 3.15 -- did anything
go into 3.16 related to this?  I didn't see a single mention of btrfs in Linus'
3.16 announcement, so I don't know whether it should be better, the same, or
worse in this respect...

I haven't seen a definite statement about this on this list, either.

Can someone more familiar with the state of development comment on this?

Charles
-- 
---
Charles Cazabon
GPL'ed software available at:   http://pyropus.ca/software/
---


Re: Blocked tasks on 3.15.1

2014-08-11 Thread Liu Bo
On Mon, Aug 11, 2014 at 08:55:21PM -0600, Charles Cazabon wrote:
 The blocked tasks issue that got significantly worse in 3.15 -- did anything
 go into 3.16 related to this?  I didn't see a single btrfs in Linus' 3.16
 announcement, so I don't know whether it should be better, the same, or worse
 in this respect...
 
 I haven't seen a definite statement about this on this list, either.
 
 Can someone more familiar with the state of development comment on this?

Good news is that we've figured out the bug and the patch is already under
testing :-) 

thanks,
-liubo


Re: Blocked tasks on 3.15.1

2014-08-11 Thread Duncan
Liu Bo posted on Tue, 12 Aug 2014 10:56:42 +0800 as excerpted:

 On Mon, Aug 11, 2014 at 08:55:21PM -0600, Charles Cazabon wrote:
 The blocked tasks issue that got significantly worse in 3.15 -- did
 anything go into 3.16 related to this?  I didn't see a single btrfs
 in Linus' 3.16 announcement, so I don't know whether it should be
 better, the same, or worse in this respect...
 
 I haven't seen a definite statement about this on this list, either.
 
 Can someone more familiar with the state of development comment on
 this?
 
 Good news is that we've figured out the bug and the patch is already
 under testing :-)

IOW, it's not in 3.16.0, but will hopefully make it into 3.16.2 (it'll
likely be too late for 3.16.1).

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman



Re: Blocked tasks on 3.15.1

2014-08-11 Thread Marc MERLIN
On Mon, Aug 11, 2014 at 08:55:21PM -0600, Charles Cazabon wrote:
 The blocked tasks issue that got significantly worse in 3.15 -- did anything
 go into 3.16 related to this?  I didn't see a single btrfs in Linus' 3.16
 announcement, so I don't know whether it should be better, the same, or worse
 in this respect...
 
 I haven't seen a definite statement about this on this list, either.

Yes, 3.15 is unusable for some workloads, mine included.
Go back to 3.14 until there is a patch in 3.16, which there isn't quite
as of right now, but very soon hopefully.

Note 3.16.0 is actually worse than 3.15 for me.

Marc
-- 
A mouse is a device used to point at the xterm you want to type in - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901


Re: Blocked tasks on 3.15.1

2014-08-07 Thread Tobias Holst
Hi

Is there anything new on this topic? I am using Ubuntu 14.04.1 and
experiencing the same problem.
- 6 HDDs
- LUKS on every HDD
- btrfs RAID6 over these 6 crypt-devices
No LVM, no nodatacow files.
Mount-options: defaults,compress-force=lzo,space_cache
With the original 3.13-kernel (3.13.0-32-generic) it is working fine.

Then I tried the following kernels from here:
http://kernel.ubuntu.com/~kernel-ppa/mainline/
linux-image-3.14.15-031415-generic_3.14.15-031415.201407311853_amd64.deb
- not even booting, kernel panic at boot.
linux-image-3.15.6-031506-generic_3.15.6-031506.201407172034_amd64.deb,
linux-image-3.15.7-031507-generic_3.15.7-031507.201407281235_amd64.deb,
and linux-image-3.16.0-031600-generic_3.16.0-031600.201408031935_amd64.deb
all cause the hangs described in this thread. When doing big IO
(unpacking a .rar-archive with multiple GB) the filesystem stops
working. Load stays very high but nothing actually happens on the
drives according to dstat. htop shows state D (uninterruptible sleep,
usually IO) for many kworker threads.
Unmounting of the btrfs-filesystem only works with -l (lazy) option.
Reboot or shutdown doesn't work because of the blocking threads. So
only a power cut works. After the reboot the last written data before
the hang is lost.
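
A minimal sketch for confirming the same symptom without htop. The helper name list_dstate_tasks is made up for illustration; it filters ps-style "state pid command" lines for tasks in state D and reads stdin, so it can also be run on a saved capture.

```shell
# Print "PID COMMAND" for every task in uninterruptible sleep (state D).
# Expects lines of the form "STATE PID COMMAND" on stdin.
list_dstate_tasks() {
    awk '$1 ~ /^D/ {
        pid = $2
        $1 = ""; $2 = ""        # drop state and pid, keep the command
        sub(/^ +/, "")          # trim the blanks left behind
        print pid, $0
    }'
}

# Live usage (procps ps):
#   ps -eo state,pid,comm | list_dstate_tasks
```

If many kworker or btrfs-transacti entries stay in this list while dstat shows no disk activity, that would match the hang described above.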

I am now back on 3.13.

Regards


2014-07-25 4:27 GMT+02:00 Cody P Schafer d...@codyps.com:

 On Tue, Jul 22, 2014 at 9:53 AM, Chris Mason c...@fb.com wrote:
 
 
  On 07/19/2014 02:23 PM, Martin Steigerwald wrote:
 
  Running 3.15.6 with this patch applied on top:
   - still causes a hang with `rsync -hPaHAXx --del /mnt/home/nyx/ 
  /home/nyx/`
  - no extra error messages printed (`dmesg | grep racing`) compared to
  without the patch
 
  I got same results with 3.16-rc5 + this patch (see thread BTRFS hang with
  3.16-rc5). 3.16-rc4 still is fine with me. No hang whatsoever so far.
 
  To recap some details (so I can have it all in one place):
   - /home/ is btrfs with compress=lzo
 
  BTRFS RAID 1 with lzo.
 
   - I have _not_ created any nodatacow files.
 
  Me neither.
 
   - Full stack is: sata - dmcrypt - lvm - btrfs (I noticed others
  mentioning the use of dmcrypt)
 
  Same, except no dmcrypt.
 
 
  Thanks for the help in tracking this down everyone.  We'll get there!
  Are you all running multi-disk systems (from a btrfs POV, more than one
  device?)  I don't care how many physical drives this maps to, just does
  btrfs think there's more than one drive.

 No, both of my btrfs filesystems are single disk.


Re: Blocked tasks on 3.15.1

2014-08-07 Thread Duncan
Tobias Holst posted on Thu, 07 Aug 2014 17:12:17 +0200 as excerpted:

 Is there anything new on this topic? I am using Ubuntu 14.04.1 and
 experiencing the same problem.
 - 6 HDDs
 - LUKS on every HDD
 - btrfs RAID6 over these 6 crypt-devices
 No LVM, no nodatacow files.
 Mount-options: defaults,compress-force=lzo,space_cache
 With the original 3.13-kernel (3.13.0-32-generic) it is working fine.

I see you're using compress-force.  See the recent replies to the
"Btrfs: fix compressed write corruption on enospc" thread.

I'm not /sure/ your case is directly related (tho the kworker code is 
pretty new and 3.13 may be working for you due to being before the 
migration to kworkers, supporting the case of it being either the same 
problem or another related to it), but that's certainly one problem 
they've recently traced down... to a bug in the kworker threads code, 
that starts a new worker that can race with the first instead of obeying 
a flag that says keep it on the first worker.

Looks like they're doing a patch that takes a slower but safer path to work 
around the kworker bug for now, as that bug was just traced (there was 
another bug, with a patch available originally hiding the ultimate 
problem, but obviously that's only half the fix as it simply revealed 
another bug underneath) and fixing it properly is likely to take some 
time.  Now that it's basically traced the workaround patch should be 
published on-list shortly and should make it into 3.17 and back into the 
stables, altho I'm not sure it'll make it into 3.16.1, etc.

But there's certainly progress. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman



Re: Blocked tasks on 3.15.1

2014-07-23 Thread Felix Seidel
On 23 Jul 2014, at 03:06, Rich Freeman r-bt...@thefreemanclan.net wrote:
 I disabled lzo and haven't had problems since.  I'm now running
 on mainline without issue, but I think I did see the hang on mainline
 when I tried enabling lzo again briefly.

Can confirm. I’m running mainline 3.16rc5 and was experiencing deadlocks
when having LZO enabled. 
Disabled it, now all seems ok.

Using btrfs RAID1 - dm-crypt - SATA.
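
A small sketch for checking which btrfs mounts currently have compression enabled, since that seems to be the common factor in these reports. The function name btrfs_compress_mounts is hypothetical; it only parses text in /proc/mounts format and changes nothing.

```shell
# Print "MOUNTPOINT OPTION" for each btrfs mount that has a compress or
# compress-force option. Input: /proc/mounts format on stdin
# (device mountpoint fstype options dump pass).
btrfs_compress_mounts() {
    awk '$3 == "btrfs" {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^compress(-force)?=/ ||
                opts[i] == "compress" || opts[i] == "compress-force")
                print $2, opts[i]
    }'
}

# Live usage:
#   btrfs_compress_mounts < /proc/mounts
```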

I’ve attached some more dmesg “blocked” messages using kernel versions
3.15.5, 3.14.6 and 3.16rc5 just in case it helps anyone.

Jul 18 23:36:58 nas kernel: INFO: task sudo:1214 blocked for more than 120 
seconds.
Jul 18 23:36:58 nas kernel:   Tainted: G   O  3.15.5-2-ARCH #1
Jul 18 23:36:58 nas kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
Jul 18 23:36:58 nas kernel: sudoD  0  1214  
1 0x0004
Jul 18 23:36:58 nas kernel:  88001d0ebc20 0086 88002cca5bb0 
00014700
Jul 18 23:36:58 nas kernel:  88001d0ebfd8 00014700 88002cca5bb0 

Jul 18 23:36:58 nas kernel:  880028ee4000 0003 284e0d53 
0002
Jul 18 23:36:58 nas kernel: Call Trace:
Jul 18 23:36:58 nas kernel:  [815110dc] ? __do_page_fault+0x2ec/0x600
Jul 18 23:36:58 nas kernel:  [81509fa9] schedule+0x29/0x70
Jul 18 23:36:58 nas kernel:  [8150a426] 
schedule_preempt_disabled+0x16/0x20
Jul 18 23:36:58 nas kernel:  [8150bda5] 
__mutex_lock_slowpath+0xe5/0x230
Jul 18 23:36:58 nas kernel:  [8150bf07] mutex_lock+0x17/0x30
Jul 18 23:36:58 nas kernel:  [811bfa24] lookup_slow+0x34/0xc0
Jul 18 23:36:58 nas kernel:  [811c1b73] path_lookupat+0x723/0x880
Jul 18 23:36:58 nas kernel:  [8114f111] ? release_pages+0xc1/0x280
Jul 18 23:36:58 nas kernel:  [811bfd97] ? getname_flags+0x37/0x130
Jul 18 23:36:58 nas kernel:  [811c1cf6] 
filename_lookup.isra.30+0x26/0x80
Jul 18 23:36:58 nas kernel:  [811c4fd7] user_path_at_empty+0x67/0xd0
Jul 18 23:36:58 nas kernel:  [81172b52] ? unmap_region+0xe2/0x130
Jul 18 23:36:58 nas kernel:  [811c5051] user_path_at+0x11/0x20
Jul 18 23:36:58 nas kernel:  [811b979a] vfs_fstatat+0x6a/0xd0
Jul 18 23:36:58 nas kernel:  [811b981b] vfs_stat+0x1b/0x20
Jul 18 23:36:58 nas kernel:  [811b9df9] SyS_newstat+0x29/0x60
Jul 18 23:36:58 nas kernel:  [8117501c] ? vm_munmap+0x4c/0x60
Jul 18 23:36:58 nas kernel:  [81175f92] ? SyS_munmap+0x22/0x30
Jul 18 23:36:58 nas kernel:  [81515fa9] system_call_fastpath+0x16/0x1b
---
Jul 19 18:34:17 nas kernel: INFO: task rsync:4900 blocked for more than 120 
seconds.
Jul 19 18:34:17 nas kernel:   Tainted: G   O  3.15.5-2-ARCH #1
Jul 19 18:34:17 nas kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
Jul 19 18:34:17 nas kernel: rsync   D  0  4900   
4899 0x
Jul 19 18:34:17 nas kernel:  880005947c20 0082 880034aa4750 
00014700
Jul 19 18:34:17 nas kernel:  880005947fd8 00014700 880034aa4750 
810a5995
Jul 19 18:34:17 nas kernel:  88011fc14700 8800dd828a30 8800cece6a00 
880005947bd8
Jul 19 18:34:17 nas kernel: Call Trace:
Jul 19 18:34:17 nas kernel:  [810a5995] ? set_next_entity+0x95/0xb0
Jul 19 18:34:17 nas kernel:  [810ac0be] ? 
pick_next_task_fair+0x46e/0x550
Jul 19 18:34:17 nas kernel:  [810136c1] ? __switch_to+0x1f1/0x540
Jul 19 18:34:17 nas kernel:  [81509fa9] schedule+0x29/0x70
Jul 19 18:34:17 nas kernel:  [8150a426] 
schedule_preempt_disabled+0x16/0x20
Jul 19 18:34:17 nas kernel:  [8150bda5] 
__mutex_lock_slowpath+0xe5/0x230
Jul 19 18:34:17 nas kernel:  [8150bf07] mutex_lock+0x17/0x30
Jul 19 18:34:17 nas kernel:  [811bfa24] lookup_slow+0x34/0xc0
Jul 19 18:34:17 nas kernel:  [811c1b73] path_lookupat+0x723/0x880
Jul 19 18:34:17 nas kernel:  [8150a2bf] ? io_schedule+0xbf/0xf0
Jul 19 18:34:17 nas kernel:  [8150a7d1] ? __wait_on_bit_lock+0x91/0xb0
Jul 19 18:34:17 nas kernel:  [811bfd97] ? getname_flags+0x37/0x130
Jul 19 18:34:17 nas kernel:  [811c1cf6] 
filename_lookup.isra.30+0x26/0x80
Jul 19 18:34:17 nas kernel:  [811c4fd7] user_path_at_empty+0x67/0xd0
Jul 19 18:34:17 nas kernel:  [811c5051] user_path_at+0x11/0x20
Jul 19 18:34:17 nas kernel:  [811b979a] vfs_fstatat+0x6a/0xd0
Jul 19 18:34:17 nas kernel:  [811d4414] ? mntput+0x24/0x40
Jul 19 18:34:17 nas kernel:  [811b983e] vfs_lstat+0x1e/0x20
Jul 19 18:34:17 nas kernel:  [811b9e59] SyS_newlstat+0x29/0x60
Jul 19 18:34:17 nas kernel:  [8108a3c4] ? task_work_run+0xa4/0xe0
Jul 19 18:34:17 nas kernel:  [8150e939] ? 
do_device_not_available+0x19/0x20
Jul 19 18:34:17 nas kernel:  [8151760e] ? 
device_not_available+0x1e/0x30
Jul 19 
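
Logs like the ones above get long quickly. As a rough sketch (the function name blocked_tasks is an assumption, not an existing tool), the hung-task reports can be reduced to a list of task names and PIDs. It matches only the "INFO: task ... blocked for more than" header, so it works on both syslog-prefixed and raw dmesg lines.

```shell
# Print "task:pid" for every hung-task report found on stdin.
blocked_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than.*/\1:\2/p'
}

# Live usage:
#   dmesg | blocked_tasks
```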

Re: Blocked tasks on 3.15.1

2014-07-23 Thread Martin Steigerwald
Am Dienstag, 22. Juli 2014, 17:15:21 schrieb Chris Mason:
 On 07/22/2014 05:13 PM, Martin Steigerwald wrote:
  Am Dienstag, 22. Juli 2014, 10:53:03 schrieb Chris Mason:
  On 07/19/2014 02:23 PM, Martin Steigerwald wrote:
  Running 3.15.6 with this patch applied on top:
   - still causes a hang with `rsync -hPaHAXx --del /mnt/home/nyx/
   /home/nyx/`
  
  - no extra error messages printed (`dmesg | grep racing`) compared to
  without the patch
  
  I got same results with 3.16-rc5 + this patch (see thread BTRFS hang
  with
  3.16-rc5). 3.16-rc4 still is fine with me. No hang whatsoever so far.
  
  To recap some details (so I can have it all in one place):
   - /home/ is btrfs with compress=lzo
  
  BTRFS RAID 1 with lzo.
  
   - I have _not_ created any nodatacow files.
  
  Me neither.
  
   - Full stack is: sata - dmcrypt - lvm - btrfs (I noticed others
  
  mentioning the use of dmcrypt)
  
  Same, except no dmcrypt.
  
  Thanks for the help in tracking this down everyone.  We'll get there!
  Are you all running multi-disk systems (from a btrfs POV, more than one
  device?)  I don't care how many physical drives this maps to, just does
  btrfs think there's more than one drive.
  
  As I said before, I am using BTRFS RAID 1: two logical volumes on two
  distinct SSDs. RAID is directly in BTRFS, no SoftRAID here (which I
  wouldn't want to use with SSDs anyway).
 
 When you say logical volumes, you mean LVM right?  Just making sure I
 know all the pieces involved.

Exactly.

As a recap from the other thread:

merkaba:~ btrfs fi sh /home
Label: 'home'  uuid: […]
Total devices 2 FS bytes used 123.20GiB
devid1 size 160.00GiB used 159.98GiB path /dev/mapper/msata-home
devid2 size 160.00GiB used 159.98GiB path /dev/dm-3

Btrfs v3.14.1

merkaba:~#1 btrfs fi df /home
Data, RAID1: total=154.95GiB, used=120.61GiB
System, RAID1: total=32.00MiB, used=48.00KiB
Metadata, RAID1: total=5.00GiB, used=2.59GiB
unknown, single: total=512.00MiB, used=0.00
merkaba:~ df -hT /home
DateisystemTyp   Größe Benutzt Verf. Verw% Eingehängt auf
/dev/mapper/msata-home btrfs  320G247G   69G   79% /home

merkaba:~ file -sk /dev/sata/home
/dev/sata/home: symbolic link to `../dm-3'
merkaba:~ file -sk /dev/dm-3 
/dev/dm-3: BTRFS Filesystem label home, sectorsize 4096, nodesize 16384, 
leafsize 16384, UUID=[…], 
132303151104/343597383680 bytes used, 2 devices


And LVM layout:

merkaba:~ lsblk
NAMEMAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda   8:00 279,5G  0 disk 
├─sda18:10 4M  0 part 
├─sda28:20   191M  0 part 
├─sda38:30   286M  0 part 
└─sda48:40   279G  0 part 
  ├─sata-home (dm-3)254:30   160G  0 lvm  
  ├─sata-swap (dm-4)254:4012G  0 lvm  [SWAP]
  └─sata-debian (dm-5)  254:5030G  0 lvm  
sdb   8:16   0 447,1G  0 disk 
├─sdb18:17   0   200M  0 part 
├─sdb28:18   0   300M  0 part /boot
└─sdb38:19   0 446,7G  0 part 
  ├─msata-home (dm-0)   254:00   160G  0 lvm  
  ├─msata-daten (dm-1)  254:10   200G  0 lvm  
  └─msata-debian (dm-2) 254:2030G  0 lvm  
sr0  11:01  1024M  0 rom 

sda is Intel SSD 320 SATA

sdb is Crucial m500 mSATA

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: Blocked tasks on 3.15.1

2014-07-23 Thread Charles Cazabon
Chris Mason c...@fb.com wrote:
 
 Thanks for the help in tracking this down everyone.  We'll get there!
 Are you all running multi-disk systems (from a btrfs POV, more than one
 device?)  I don't care how many physical drives this maps to, just does
 btrfs think there's more than one drive.

Not me, at least - I'm doing the device aggregation down at the LVM level
(sata-dmcrypt-lvm-btrfs stack), so it's presented to btrfs as a single logical
device.

Charles
-- 
---
Charles Cazabon
GPL'ed software available at:   http://pyropus.ca/software/
---


Re: Blocked tasks on 3.15.1

2014-07-22 Thread Chris Mason


On 07/19/2014 02:23 PM, Martin Steigerwald wrote:

 Running 3.15.6 with this patch applied on top:
  - still causes a hang with `rsync -hPaHAXx --del /mnt/home/nyx/ /home/nyx/`
 - no extra error messages printed (`dmesg | grep racing`) compared to
 without the patch
 
 I got same results with 3.16-rc5 + this patch (see thread BTRFS hang with 
 3.16-rc5). 3.16-rc4 still is fine with me. No hang whatsoever so far.
 
 To recap some details (so I can have it all in one place):
  - /home/ is btrfs with compress=lzo
 
 BTRFS RAID 1 with lzo.
 
  - I have _not_ created any nodatacow files.
 
 Me neither.
 
  - Full stack is: sata - dmcrypt - lvm - btrfs (I noticed others
 mentioning the use of dmcrypt)
 
 Same, except no dmcrypt.
 

Thanks for the help in tracking this down everyone.  We'll get there!
Are you all running multi-disk systems (from a btrfs POV, more than one
device?)  I don't care how many physical drives this maps to, just does
btrfs think there's more than one drive.
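
One way to answer this from the command line, as a sketch: `btrfs filesystem show` prints a "Total devices N" line per filesystem, and the hypothetical helper below just pulls N out of that output. It assumes the output layout of the btrfs-progs versions quoted in this thread.

```shell
# Print the device count btrfs reports for the first filesystem on stdin.
# Expects `btrfs filesystem show` output containing "Total devices N ...".
btrfs_device_count() {
    awk '/Total devices/ { print $3; exit }'
}

# Live usage:
#   btrfs filesystem show /home | btrfs_device_count
```

A result greater than 1 means btrfs itself sees multiple devices, regardless of how many physical drives sit underneath LVM, mdadm or dm-crypt.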

-chris


Re: Blocked tasks on 3.15.1

2014-07-22 Thread Torbjørn

On 07/22/2014 04:53 PM, Chris Mason wrote:


On 07/19/2014 02:23 PM, Martin Steigerwald wrote:


Running 3.15.6 with this patch applied on top:
  - still causes a hang with `rsync -hPaHAXx --del /mnt/home/nyx/ /home/nyx/`
- no extra error messages printed (`dmesg | grep racing`) compared to
without the patch

I got same results with 3.16-rc5 + this patch (see thread BTRFS hang with
3.16-rc5). 3.16-rc4 still is fine with me. No hang whatsoever so far.


To recap some details (so I can have it all in one place):
  - /home/ is btrfs with compress=lzo

BTRFS RAID 1 with lzo.


  - I have _not_ created any nodatacow files.

Me neither.


  - Full stack is: sata - dmcrypt - lvm - btrfs (I noticed others
mentioning the use of dmcrypt)

Same, except no dmcrypt.


Thanks for the help in tracking this down everyone.  We'll get there!
Are you all running multi-disk systems (from a btrfs POV, more than one
device?)  I don't care how many physical drives this maps to, just does
btrfs think there's more than one drive.

-chris

Hi,

In case it's interesting:
From an earlier email thread with subject: 3.15-rc6 - 
btrfs-transacti:4157 blocked for more than 120


TLDR: yes, btrfs sees multiple devices.

sata - dmcrypt - btrfs raid10
btrfs raid10 consists of multiple dmcrypt devices from multiple sata devices.

Mount: /dev/mapper/sdu on /mnt/storage type btrfs 
(rw,noatime,space_cache,compress=lzo,inode_cache,subvol=storage)

(yes I know inode_cache is not recommended for general use)

I have a nocow directory in a separate subvolume containing vm-images 
used by kvm.

The same kvm-vms are reading/writing data from that array over nfs.

I'm still holding that system on 3.14. Anything above causes blocks.

--
Torbjørn


Re: Blocked tasks on 3.15.1

2014-07-22 Thread Marc MERLIN
On Tue, Jul 22, 2014 at 10:53:03AM -0400, Chris Mason wrote:
 Thanks for the help in tracking this down everyone.  We'll get there!
 Are you all running multi-disk systems (from a btrfs POV, more than one
 device?)  I don't care how many physical drives this maps to, just does
 btrfs think there's more than one drive.

In the bugs I sent you, it was a mix of arrays that were
mdraid / dmcrypt / btrfs

I have also one array with:
disk1 / dmcrypt \
 - btrfs (2 drives visible by btrfs)
disk2 / dmcrypt /

The multidrive setup seemed a bit worse, I just destroyed it and went
back to putting all the drives together with mdadm and showing a single
dmcrypted device to btrfs.

But that is still super unstable on my server with 3.15, while being
somewhat usable with my laptop (it still hangs, but more rarely)
The one difference is that my laptop actually does
disk > dmcrypt > btrfs
while my server does 
disks > mdadm > dmcrypt > btrfs

Marc
-- 
A mouse is a device used to point at the xterm you want to type in - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901


Re: Blocked tasks on 3.15.1

2014-07-22 Thread Torbjørn

On 07/22/2014 04:53 PM, Chris Mason wrote:


On 07/19/2014 02:23 PM, Martin Steigerwald wrote:


Running 3.15.6 with this patch applied on top:
  - still causes a hang with `rsync -hPaHAXx --del /mnt/home/nyx/ /home/nyx/`
- no extra error messages printed (`dmesg | grep racing`) compared to
without the patch

I got same results with 3.16-rc5 + this patch (see thread BTRFS hang with
3.16-rc5). 3.16-rc4 still is fine with me. No hang whatsoever so far.


To recap some details (so I can have it all in one place):
  - /home/ is btrfs with compress=lzo

BTRFS RAID 1 with lzo.


  - I have _not_ created any nodatacow files.

Me neither.


  - Full stack is: sata - dmcrypt - lvm - btrfs (I noticed others
mentioning the use of dmcrypt)

Same, except no dmcrypt.


Thanks for the help in tracking this down everyone.  We'll get there!
Are you all running multi-disk systems (from a btrfs POV, more than one
device?)  I don't care how many physical drives this maps to, just does
btrfs think there's more than one drive.

-chris

3.16-rc6 with your patch on top still causes hangs here.
No traces of racing in dmesg
Hang is on a btrfs raid 0 consisting of 3 drives.
Full stack is: sata - dmcrypt - btrfs raid0

Hang was caused by
1. Several rsync -av --inplace --delete <source> <backup subvol>
2. btrfs subvolume snapshot -r <backup subvol> <backup snap>

The rsync jobs are done one at a time
btrfs is stuck when trying to create the read only snapshot

--
Torbjørn

All output via netconsole.
sysrq-w: 
https://gist.github.com/anonymous/d1837187e261f9a4cbd2#file-gistfile1-txt
sysrq-t: 
https://gist.github.com/anonymous/2bdb73f035ab9918c63d#file-gistfile1-txt
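
For others who want to capture the same kind of output: sysrq-w dumps blocked tasks and sysrq-t dumps all tasks to the kernel log. A minimal sketch follows; the wrapper name sysrq_dump and its optional second argument are made up here so the function can be tried against a scratch file. Writing to the real /proc/sysrq-trigger needs root and sysrq enabled.

```shell
# Write a sysrq key to the trigger file.
#   $1: key, "w" (blocked tasks) or "t" (all tasks)
#   $2: trigger file, defaults to /proc/sysrq-trigger
sysrq_dump() {
    key=$1
    trigger=${2:-/proc/sysrq-trigger}
    case $key in
        w|t) printf '%s\n' "$key" > "$trigger" ;;
        *)   echo "usage: sysrq_dump w|t [trigger-file]" >&2; return 1 ;;
    esac
}

# Live usage (as root), then read the result from dmesg or netconsole:
#   sysrq_dump w
```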


dmesg:
[ 9352.784136] INFO: task btrfs-transacti:3874 blocked for more than 120 
seconds.

[ 9352.784222]   Tainted: GE 3.16.0-rc6+ #64
[ 9352.784270] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[ 9352.784354] btrfs-transacti D 88042fc943c0 0  3874  2 
0x
[ 9352.784413]  8803fb9dfca0 0002 8800c4214b90 
8803fb9dffd8
[ 9352.784502]  000143c0 000143c0 88041977b260 
8803d29f23a0
[ 9352.784592]  8803d29f23a8 7fff 8800c4214b90 
880232e2c0a8

[ 9352.784682] Call Trace:
[ 9352.784726]  [8170eb59] schedule+0x29/0x70
[ 9352.784774]  [8170df99] schedule_timeout+0x209/0x280
[ 9352.784827]  [8170874b] ? __slab_free+0xfe/0x2c3
[ 9352.784879]  [810829f4] ? wake_up_worker+0x24/0x30
[ 9352.784929]  [8170f656] wait_for_completion+0xa6/0x160
[ 9352.784981]  [8109d4e0] ? wake_up_state+0x20/0x20
[ 9352.785049]  [c045b936] 
btrfs_wait_and_free_delalloc_work+0x16/0x30 [btrfs]
[ 9352.785141]  [c04658be] 
btrfs_run_ordered_operations+0x1ee/0x2c0 [btrfs]
[ 9352.785260]  [c044bbb7] btrfs_commit_transaction+0x27/0xa40 
[btrfs]

[ 9352.785324]  [c0447d65] transaction_kthread+0x1b5/0x240 [btrfs]
[ 9352.785385]  [c0447bb0] ? 
btrfs_cleanup_transaction+0x560/0x560 [btrfs]

[ 9352.785469]  [8108cc52] kthread+0xd2/0xf0
[ 9352.785517]  [8108cb80] ? kthread_create_on_node+0x180/0x180
[ 9352.785571]  [81712dfc] ret_from_fork+0x7c/0xb0
[ 9352.785620]  [8108cb80] ? kthread_create_on_node+0x180/0x180
[ 9352.785678] INFO: task kworker/u16:3:6932 blocked for more than 120 
seconds.

[ 9352.785732]   Tainted: GE 3.16.0-rc6+ #64
[ 9352.785780] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[ 9352.785863] kworker/u16:3   D 88042fd943c0 0  6932  2 
0x

[ 9352.785930] Workqueue: btrfs-flush_delalloc normal_work_helper [btrfs]
[ 9352.785983]  88035f1bbb58 0002 880417e564c0 
88035f1bbfd8
[ 9352.786072]  000143c0 000143c0 8800c1a03260 
88042fd94cd8
[ 9352.786160]  88042ffb4be8 88035f1bbbe0 0002 
81159930

[ 9352.786250] Call Trace:
[ 9352.786292]  [81159930] ? wait_on_page_read+0x60/0x60
[ 9352.786343]  [8170ee6d] io_schedule+0x9d/0x130
[ 9352.786393]  [8115993e] sleep_on_page+0xe/0x20
[ 9352.786443]  [8170f3e8] __wait_on_bit_lock+0x48/0xb0
[ 9352.786495]  [81159a4a] __lock_page+0x6a/0x70
[ 9352.786544]  [810b14a0] ? autoremove_wake_function+0x40/0x40
[ 9352.786607]  [c046711e] ? flush_write_bio+0xe/0x10 [btrfs]
[ 9352.786669]  [c046b0c0] 
extent_write_cache_pages.isra.28.constprop.46+0x3d0/0x3f0 [btrfs]

[ 9352.786766]  [c046cd2d] extent_writepages+0x4d/0x70 [btrfs]
[ 9352.786828]  [c04506f0] ? btrfs_submit_direct+0x6a0/0x6a0 
[btrfs]

[ 9352.786883]  [810b0d78] ? __wake_up_common+0x58/0x90
[ 9352.786943]  [c044e1d8] btrfs_writepages+0x28/0x30 [btrfs]
[ 9352.786997]  [811668ee] do_writepages+0x1e/0x40
[ 9352.787045]  [8115b409] __filemap_fdatawrite_range+0x59/0x60
[ 9352.787097]  [8115b4bc] filemap_flush+0x1c/0x20
[ 

Re: Blocked tasks on 3.15.1

2014-07-22 Thread Chris Mason
On 07/22/2014 03:42 PM, Torbjørn wrote:
 On 07/22/2014 04:53 PM, Chris Mason wrote:

 On 07/19/2014 02:23 PM, Martin Steigerwald wrote:

 Running 3.15.6 with this patch applied on top:
   - still causes a hang with `rsync -hPaHAXx --del /mnt/home/nyx/
 /home/nyx/`
 - no extra error messages printed (`dmesg | grep racing`) compared to
 without the patch
 I got same results with 3.16-rc5 + this patch (see thread BTRFS hang
 with
 3.16-rc5). 3.16-rc4 still is fine with me. No hang whatsoever so far.

 To recap some details (so I can have it all in one place):
   - /home/ is btrfs with compress=lzo
 BTRFS RAID 1 with lzo.

   - I have _not_ created any nodatacow files.
 Me neither.

   - Full stack is: sata - dmcrypt - lvm - btrfs (I noticed others
 mentioning the use of dmcrypt)
 Same, except no dmcrypt.

 Thanks for the help in tracking this down everyone.  We'll get there!
 Are you all running multi-disk systems (from a btrfs POV, more than one
 device?)  I don't care how many physical drives this maps to, just does
 btrfs think there's more than one drive.

 -chris
 3.16-rc6 with your patch on top still causes hangs here.
 No traces of racing in dmesg
 Hang is on a btrfs raid 0 consisting of 3 drives.
 Full stack is: sata - dmcrypt - btrfs raid0
 
 Hang was caused by
 1. Several rsync -av --inplace --delete <source> <backup subvol>
 2. btrfs subvolume snapshot -r <backup subvol> <backup snap>
 
 The rsync jobs are done one at a time
 btrfs is stuck when trying to create the read only snapshot

The trace is similar, but you're stuck trying to read the free space
cache.  This one I saw earlier this morning, but I haven't seen these
parts from the 3.15 bug reports.

Maybe they are related though, I'll dig into the 3.15 bug reports again.

-chris


Re: Blocked tasks on 3.15.1

2014-07-22 Thread Torbjørn

On 07/22/2014 09:50 PM, Chris Mason wrote:

On 07/22/2014 03:42 PM, Torbjørn wrote:

On 07/22/2014 04:53 PM, Chris Mason wrote:

On 07/19/2014 02:23 PM, Martin Steigerwald wrote:


Running 3.15.6 with this patch applied on top:
   - still causes a hang with `rsync -hPaHAXx --del /mnt/home/nyx/
/home/nyx/`
- no extra error messages printed (`dmesg | grep racing`) compared to
without the patch

I got same results with 3.16-rc5 + this patch (see thread BTRFS hang
with
3.16-rc5). 3.16-rc4 still is fine with me. No hang whatsoever so far.


To recap some details (so I can have it all in one place):
   - /home/ is btrfs with compress=lzo

BTRFS RAID 1 with lzo.


   - I have _not_ created any nodatacow files.

Me neither.


   - Full stack is: sata - dmcrypt - lvm - btrfs (I noticed others
mentioning the use of dmcrypt)

Same, except no dmcrypt.


Thanks for the help in tracking this down everyone.  We'll get there!
Are you all running multi-disk systems (from a btrfs POV, more than one
device?)  I don't care how many physical drives this maps to, just does
btrfs think there's more than one drive.

-chris

3.16-rc6 with your patch on top still causes hangs here.
No traces of racing in dmesg
Hang is on a btrfs raid 0 consisting of 3 drives.
Full stack is: sata - dmcrypt - btrfs raid0

Hang was caused by
1. Several rsync -av --inplace --delete <source> <backup subvol>
2. btrfs subvolume snapshot -r <backup subvol> <backup snap>

The rsync jobs are done one at a time
btrfs is stuck when trying to create the read only snapshot

The trace is similar, but you're stuck trying to read the free space
cache.  This one I saw earlier this morning, but I haven't seen these
parts from the 3.15 bug reports.

Maybe they are related though, I'll dig into the 3.15 bug reports again.

-chris
In case it was not clear, this hang was on a different btrfs volume than 
the 3.15 hang (but the same server).
Earlier the affected volume was readable during the hang. This time the 
volume is not readable either.


I'll keep the patched 3.16 running and see if I can trigger something 
similar to the 3.15 hang.


Thanks

--
Torbjørn


Re: Blocked tasks on 3.15.1

2014-07-22 Thread Martin Steigerwald
On Tuesday, 22 July 2014, 10:53:03, Chris Mason wrote:
 On 07/19/2014 02:23 PM, Martin Steigerwald wrote:
  Running 3.15.6 with this patch applied on top:
   - still causes a hang with `rsync -hPaHAXx --del /mnt/home/nyx/
   /home/nyx/`
  
  - no extra error messages printed (`dmesg | grep racing`) compared to
  without the patch
  
  I got same results with 3.16-rc5 + this patch (see thread BTRFS hang with
  3.16-rc5). 3.16-rc4 still is fine with me. No hang whatsoever so far.
  
  To recap some details (so I can have it all in one place):
   - /home/ is btrfs with compress=lzo
  
  BTRFS RAID 1 with lzo.
  
   - I have _not_ created any nodatacow files.
  
  Me neither.
  
   - Full stack is: sata - dmcrypt - lvm - btrfs (I noticed others
  
  mentioning the use of dmcrypt)
  
  Same, except no dmcrypt.
 
 Thanks for the help in tracking this down everyone.  We'll get there!
 Are you all running multi-disk systems (from a btrfs POV, more than one
 device?)  I don't care how many physical drives this maps to, just does
 btrfs think there's more than one drive.

As I said before, I am using BTRFS RAID 1: two logical volumes on two distinct
SSDs. RAID is directly in BTRFS, no SoftRAID here (which I wouldn't want to
use with SSDs anyway).

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: Blocked tasks on 3.15.1

2014-07-22 Thread Chris Mason
On 07/22/2014 05:13 PM, Martin Steigerwald wrote:
 On Tuesday, 22 July 2014, 10:53:03, Chris Mason wrote:
 On 07/19/2014 02:23 PM, Martin Steigerwald wrote:
 Running 3.15.6 with this patch applied on top:
  - still causes a hang with `rsync -hPaHAXx --del /mnt/home/nyx/
  /home/nyx/`

 - no extra error messages printed (`dmesg | grep racing`) compared to
 without the patch

 I got same results with 3.16-rc5 + this patch (see thread BTRFS hang with
 3.16-rc5). 3.16-rc4 still is fine with me. No hang whatsoever so far.

 To recap some details (so I can have it all in one place):
  - /home/ is btrfs with compress=lzo

 BTRFS RAID 1 with lzo.

  - I have _not_ created any nodatacow files.

 Me neither.

  - Full stack is: sata - dmcrypt - lvm - btrfs (I noticed others

 mentioning the use of dmcrypt)

 Same, except no dmcrypt.

 Thanks for the help in tracking this down everyone.  We'll get there!
 Are you all running multi-disk systems (from a btrfs POV, more than one
 device?)  I don't care how many physical drives this maps to, just does
 btrfs think there's more than one drive.
 
 As I said before, I am using BTRFS RAID 1: two logical volumes on two distinct
 SSDs. RAID is directly in BTRFS, no SoftRAID here (which I wouldn't want to
 use with SSDs anyway).
 

When you say logical volumes, you mean LVM right?  Just making sure I
know all the pieces involved.

-chris


Re: Blocked tasks on 3.15.1

2014-07-22 Thread Rich Freeman
On Tue, Jul 22, 2014 at 10:53 AM, Chris Mason c...@fb.com wrote:

 Thanks for the help in tracking this down everyone.  We'll get there!
 Are you all running multi-disk systems (from a btrfs POV, more than one
 device?)  I don't care how many physical drives this maps to, just does
 btrfs think there's more than one drive.

I've been away on vacation so I haven't been able to try your latest
patch, but I can try whatever is out there starting this weekend.

I was getting fairly consistent hangs during heavy IO (especially
rsync) on 3.15 with lzo enabled.  This is on raid1 across 5 drives,
directly against the partitions themselves (no dmcrypt, mdadm, lvm,
etc).  I disabled lzo and haven't had problems since.  I'm now running
on mainline without issue, but I think I did see the hang on mainline
when I tried enabling lzo again briefly.

Rich


Re: Blocked tasks on 3.15.1

2014-07-20 Thread Matt
[ deadlocks during rsync in 3.15 with compression enabled ]

Hi everyone,

I still haven't been able to reproduce this one here, but I'm going
through a series of tests with lzo compression forced and every
operation forced to ordered.  Hopefully it'll kick it out soon.

While I'm hammering away, could you please try this patch.  If this is
the bug you're hitting, the deadlock will go away and you'll see this
printk in the log.

thanks!

-chris

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 3668048..8ab56df 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8157,6 +8157,13 @@ void btrfs_destroy_inode(struct inode *inode)
                spin_unlock(&root->fs_info->ordered_root_lock);
        }

+       spin_lock(&root->fs_info->ordered_root_lock);
+       if (!list_empty(&BTRFS_I(inode)->ordered_operations)) {
+               list_del_init(&BTRFS_I(inode)->ordered_operations);
+printk(KERN_CRIT "racing inode deletion with ordered operations!!!\n");
+       }
+       spin_unlock(&root->fs_info->ordered_root_lock);
+
        if (test_bit(BTRFS_INODE_HAS_ORPHAN_ITEM,
                     &BTRFS_I(inode)->runtime_flags)) {
                btrfs_info(root->fs_info, "inode %llu still on the orphan list",
--



Hi Chris,

just had that hang during rsync from /home (ZFS, mirrored) to /bak
(Btrfs w. lzo compression) again with that patch applied, it doesn't
seem to be related to that issue (or patch) - only applicable to my
case, obviously - since search for that string (e.g. racing) doesn't
show anything in that message:

[16028.169347] INFO: task kworker/u16:2:11956 blocked for more than 180 seconds.
[16028.169349] Tainted: P O 3.14.13_btrfs+_BFS_test27_integration #2
[16028.169350] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[16028.169351] kworker/u16:2 D 88081ec13540 0 11956 2 0x0008
[16028.169356] Workqueue: btrfs-delalloc normal_work_helper
[16028.169358] 8806180ab8e0 0046 
0004
[16028.169359] a000 8806210f16b0 8806180abfd8
81e11500
[16028.169360] 8806210f16b0 0206 8113e6cc
88081ec135c0
[16028.169362] Call Trace:
[16028.169367] [8113e6cc] ? delayacct_end+0x7c/0x90
[16028.169370] [811689d0] ? wait_on_page_read+0x60/0x60
[16028.169374] [819cfc78] ? io_schedule+0x88/0xe0
[16028.169375] [811689d5] ? sleep_on_page+0x5/0x10
[16028.169377] [819cfffc] ? __wait_on_bit_lock+0x3c/0x90
[16028.169378] [81168ac5] ? __lock_page+0x65/0x70
[16028.169382] [810f5580] ? autoremove_wake_function+0x30/0x30
[16028.169384] [81169854] ? __find_lock_page+0x44/0x70
[16028.169385] [811698ca] ? find_or_create_page+0x2a/0xa0
[16028.169388] [8145a1cf] ? io_ctl_prepare_pages+0x4f/0x150
[16028.169390] [8145bd45] ? __load_free_space_cache+0x195/0x5d0
[16028.169392] [8145c26b] ? load_free_space_cache+0xeb/0x1b0
[16028.169395] [813fd6a1] ? cache_block_group+0x191/0x390
[16028.169396] [810f5550] ? prepare_to_wait_event+0xf0/0xf0
[16028.169398] [814085ea] ? find_free_extent+0x95a/0xdb0
[16028.169400] [81408bf9] ? btrfs_reserve_extent+0x69/0x150
[16028.169403] [81421116] ? cow_file_range+0x136/0x420
[16028.169404] [81422493] ? submit_compressed_extents+0x1f3/0x480
[16028.169406] [81422720] ? submit_compressed_extents+0x480/0x480
[16028.169407] [8144896b] ? normal_work_helper+0x1ab/0x330
[16028.169410] [810df26d] ? process_one_work+0x16d/0x490
[16028.169411] [810dff8b] ? worker_thread+0x12b/0x410
[16028.169412] [810dfe60] ? manage_workers.isra.28+0x2c0/0x2c0
[16028.169414] [810e579a] ? kthread+0xca/0xe0
[16028.169415] [810e56d0] ? kthread_create_on_node+0x180/0x180
[16028.169417] [819d3c7c] ? ret_from_fork+0x7c/0xb0
[16028.169418] [810e56d0] ? kthread_create_on_node+0x180/0x180
[16028.169422] INFO: task btrfs-transacti:12042 blocked for more than 180 seconds.
[16028.169422] Tainted: P O 3.14.13_btrfs+_BFS_test27_integration #2
[16028.169423] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[16028.169423] btrfs-transacti D 88081ec13540 0 12042 2 0x0008
[16028.169425] 88009c7adb20 0046 
88040d84ca68
[16028.169426] a000 88061f284ba0 88009c7adfd8
81e11500
[16028.169427] 88061f284ba0 88061a21dea8 811b8c2d
8805fc919e00
[16028.169428] Call Trace:
[16028.169431] [811b8c2d] ? kmem_cache_alloc_trace+0x14d/0x160
[16028.169433] [813fd632] ? cache_block_group+0x122/0x390
[16028.169434] [810f5550] ? prepare_to_wait_event+0xf0/0xf0
[16028.169436] [814085ea] ? find_free_extent+0x95a/0xdb0
[16028.169437] [81408bf9] ? btrfs_reserve_extent+0x69/0x150
[16028.169439] [81422fa8] ? __btrfs_prealloc_file_range+0xe8/0x380
[16028.169441] [8140b6f2] ? btrfs_write_dirty_block_groups+0x642/0x6d0
[16028.169442] [819cb00c] ? 

Re: Blocked tasks on 3.15.1

2014-07-19 Thread Cody P Schafer
On Thu, Jul 17, 2014 at 8:18 AM, Chris Mason c...@fb.com wrote:

 [ deadlocks during rsync in 3.15 with compression enabled ]

 Hi everyone,

 I still haven't been able to reproduce this one here, but I'm going
 through a series of tests with lzo compression forced and every
 operation forced to ordered.  Hopefully it'll kick it out soon.

 While I'm hammering away, could you please try this patch.  If this is
 the bug you're hitting, the deadlock will go away and you'll see this
 printk in the log.

 thanks!

 -chris

 diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
 index 3668048..8ab56df 100644
 --- a/fs/btrfs/inode.c
 +++ b/fs/btrfs/inode.c
 @@ -8157,6 +8157,13 @@ void btrfs_destroy_inode(struct inode *inode)
                 spin_unlock(&root->fs_info->ordered_root_lock);
         }

 +       spin_lock(&root->fs_info->ordered_root_lock);
 +       if (!list_empty(&BTRFS_I(inode)->ordered_operations)) {
 +               list_del_init(&BTRFS_I(inode)->ordered_operations);
 +printk(KERN_CRIT "racing inode deletion with ordered operations!!!\n");
 +       }
 +       spin_unlock(&root->fs_info->ordered_root_lock);
 +
         if (test_bit(BTRFS_INODE_HAS_ORPHAN_ITEM,
                      &BTRFS_I(inode)->runtime_flags)) {
                 btrfs_info(root->fs_info, "inode %llu still on the orphan list",

Thanks Chris.

Running 3.15.6 with this patch applied on top:
 - still causes a hang with `rsync -hPaHAXx --del /mnt/home/nyx/ /home/nyx/`
 - no extra error messages printed (`dmesg | grep racing`) compared to
without the patch
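
A minimal sketch of how the `dmesg | grep racing` check could be scripted against a saved log, assuming the log text is already in a string. The function name `summarize_kernel_log` and the constant names are hypothetical and not part of any tool mentioned in this thread; the marker string is the text of the printk in the patch, and the regular expression follows the "INFO: task ... blocked for more than N seconds" lines pasted in this thread.

```python
import re

# Text of the printk added by the debugging patch in this thread.
RACING_MARKER = "racing inode deletion with ordered operations"

# Header line printed by the kernel's hung-task watchdog, e.g.
# "INFO: task btrfs-transacti:2867 blocked for more than 120 seconds."
# \s+ tolerates the line wrapping seen in pasted logs.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked\s+for\s+more\s+than\s+"
    r"(?P<secs>\d+)\s+seconds"
)


def summarize_kernel_log(text):
    """Return (saw_racing_printk, [(comm, pid, seconds), ...]) for a log dump."""
    saw_racing = RACING_MARKER in text
    hung = [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK_RE.finditer(text)
    ]
    return saw_racing, hung
```

A result of `(False, [...])` with a non-empty list would correspond to the situation described above: tasks are blocked, but the patch's printk never fired.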

To recap some details (so I can have it all in one place):
 - /home/ is btrfs with compress=lzo
 - /mnt/home is btrfs with no compression enabled.
 - I have _not_ created any nodatacow files.
 - Both filesystems are on different physical disks.
 - Full stack is: sata - dmcrypt - lvm - btrfs (I noticed others
mentioning the use of dmcrypt)


Re: Blocked tasks on 3.15.1

2014-07-19 Thread Martin Steigerwald
On Saturday, 19 July 2014, 12:38:53, Cody P Schafer wrote:
 On Thu, Jul 17, 2014 at 8:18 AM, Chris Mason c...@fb.com wrote:
  [ deadlocks during rsync in 3.15 with compression enabled ]
  
  Hi everyone,
  
  I still haven't been able to reproduce this one here, but I'm going
  through a series of tests with lzo compression forced and every
  operation forced to ordered.  Hopefully it'll kick it out soon.
  
  While I'm hammering away, could you please try this patch.  If this is
  the bug you're hitting, the deadlock will go away and you'll see this
  printk in the log.
  
  thanks!
  
  -chris
  
  diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
  index 3668048..8ab56df 100644
  --- a/fs/btrfs/inode.c
  +++ b/fs/btrfs/inode.c
  @@ -8157,6 +8157,13 @@ void btrfs_destroy_inode(struct inode *inode)
  
                  spin_unlock(&root->fs_info->ordered_root_lock);
          }
  
  +       spin_lock(&root->fs_info->ordered_root_lock);
  +       if (!list_empty(&BTRFS_I(inode)->ordered_operations)) {
  +               list_del_init(&BTRFS_I(inode)->ordered_operations);
  +printk(KERN_CRIT "racing inode deletion with ordered operations!!!\n");
  +       }
  +       spin_unlock(&root->fs_info->ordered_root_lock);
  +
          if (test_bit(BTRFS_INODE_HAS_ORPHAN_ITEM,
                       &BTRFS_I(inode)->runtime_flags)) {
                  btrfs_info(root->fs_info, "inode %llu still on the orphan list",
 
 Thanks Chris.
 
 Running 3.15.6 with this patch applied on top:
  - still causes a hang with `rsync -hPaHAXx --del /mnt/home/nyx/ /home/nyx/`
 - no extra error messages printed (`dmesg | grep racing`) compared to
 without the patch

I got the same results with 3.16-rc5 + this patch (see thread "BTRFS hang with
3.16-rc5"). 3.16-rc4 still is fine with me. No hang whatsoever so far.

 To recap some details (so I can have it all in one place):
  - /home/ is btrfs with compress=lzo

BTRFS RAID 1 with lzo.

  - I have _not_ created any nodatacow files.

Me neither.

  - Full stack is: sata - dmcrypt - lvm - btrfs (I noticed others
 mentioning the use of dmcrypt)

Same, except no dmcrypt.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: Blocked tasks on 3.15.1, raid1 btrfs is no ends of trouble for me

2014-07-18 Thread Marc MERLIN
On Thu, Jul 17, 2014 at 09:18:07AM -0400, Chris Mason wrote:
 
 [ deadlocks during rsync in 3.15 with compression enabled ]
 
 Hi everyone,
 
 I still haven't been able to reproduce this one here, but I'm going
 through a series of tests with lzo compression forced and every
 operation forced to ordered.  Hopefully it'll kick it out soon.
 
 While I'm hammering away, could you please try this patch.  If this is
 the bug you're hitting, the deadlock will go away and you'll see this
 printk in the log.
 
 thanks!
 
 -chris
 
 diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
 index 3668048..8ab56df 100644
 --- a/fs/btrfs/inode.c
 +++ b/fs/btrfs/inode.c
 @@ -8157,6 +8157,13 @@ void btrfs_destroy_inode(struct inode *inode)
                 spin_unlock(&root->fs_info->ordered_root_lock);
         }
 
 +       spin_lock(&root->fs_info->ordered_root_lock);
 +       if (!list_empty(&BTRFS_I(inode)->ordered_operations)) {
 +               list_del_init(&BTRFS_I(inode)->ordered_operations);
 +printk(KERN_CRIT "racing inode deletion with ordered operations!!!\n");
 +       }
 +       spin_unlock(&root->fs_info->ordered_root_lock);
 +
         if (test_bit(BTRFS_INODE_HAS_ORPHAN_ITEM,
                      &BTRFS_I(inode)->runtime_flags)) {
                 btrfs_info(root->fs_info, "inode %llu still on the orphan list",

I've gotten more blocked messages with your patch:


See also the message I sent about memory leaks, and how enabling
kmemleak gets btrfs to deadlock reliably soon after boot.
https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg35568.html
(this was before your patch though)

With your patch (and without kmemleak):

gargamel:/etc/apache2/sites-enabled# ps -eo pid,etime,wchan:30,args |grep df
 3495    01:48:38 btrfs_statfs   df -hP -x none -x tmpfs -x iso9660 -x udf -x nfs
 4105    07:48:39 btrfs_statfs   df -hP -x none -x tmpfs -x iso9660 -x udf -x nfs
12639       48:38 btrfs_statfs   df -hP -x none -x tmpfs -x iso9660 -x udf -x nfs
12691       48:37 btrfs_statfs   df
14753    06:48:38 btrfs_statfs   df -hP -x none -x tmpfs -x iso9660 -x udf -x nfs
17214    10:48:39 btrfs_statfs   df -hP -x none -x tmpfs -x iso9660 -x udf -x nfs
17526    03:48:38 btrfs_statfs   df -hP -x none -x tmpfs -x iso9660 -x udf -x nfs
18710    09:48:38 btrfs_statfs   df -hP -x none -x tmpfs -x iso9660 -x udf -x nfs
23668    05:48:38 btrfs_statfs   df -hP -x none -x tmpfs -x iso9660 -x udf -x nfs
26675    11:37:42 btrfs_statfs   df .
26828    02:48:38 btrfs_statfs   df -hP -x none -x tmpfs -x iso9660 -x udf -x nfs
27515    08:48:38 btrfs_statfs   df -hP -x none -x tmpfs -x iso9660 -x udf -x nfs

sysrq-w does not show me output for those and I cannot understand why.

However, I have found that btrfs raid 1 on top of dmcrypt has given me no end
of trouble.
I lost that filesystem twice due to corruption, and now it hangs my machine
(strace finds that df is hanging on that partition).
gargamel:~# btrfs fi df /mnt/btrfs_raid0
Data, RAID1: total=222.00GiB, used=221.61GiB
Data, single: total=8.00MiB, used=0.00
System, RAID1: total=8.00MiB, used=48.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, RAID1: total=2.00GiB, used=1.10GiB
Metadata, single: total=8.00MiB, used=0.00
unknown, single: total=384.00MiB, used=0.00
gargamel:~# btrfs fi show /mnt/btrfs_raid0
Label: 'btrfs_raid0'  uuid: 74279e10-46e7-4ac4-8216-a291819a6691
Total devices 2 FS bytes used 222.71GiB
devid    1 size 836.13GiB used 224.03GiB path /dev/dm-3
devid    2 size 836.13GiB used 224.01GiB path /dev/mapper/raid0d2

Btrfs v3.14.1


This is not encouraging, I think I'm going to stop using raid1 in btrfs :(

I tried sysrq-t, but the output goes faster than my serial console can
capture it, so I can't get you a traceback on those df processes.
The dmesg buffer is too small.
I already have
Kernel log buffer size (16 => 64KB, 17 => 128KB) (LOG_BUF_SHIFT) [17] (NEW) 17
and the kernel config does not let me increase it to something more useful
like 24.
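
For reference, the arithmetic behind those numbers, as a small sketch: the log buffer holds 2^LOG_BUF_SHIFT bytes, so 17 gives 128 KiB and the wished-for 24 would give 16 MiB. The helper name `log_buf_bytes` is made up for illustration. Separately from the build-time option, the kernel documents a `log_buf_len=` boot parameter for requesting a larger ring buffer; this thread does not say whether it was tried here.

```python
def log_buf_bytes(shift):
    """Size in bytes of a kernel log buffer built with LOG_BUF_SHIFT=shift."""
    return 1 << shift


# Print the sizes for the two values in the config prompt and for 24.
for shift in (16, 17, 24):
    print(f"LOG_BUF_SHIFT={shift}: {log_buf_bytes(shift) // 1024} KiB")
```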

Btrfs in 3.15 has been no end of trouble for me on my 2 machines, and I can't
even capture useful info when it happens, since my long sysrq dumps get
truncated and flow faster than syslog can capture and relay them, it seems.

Do you have any suggestions on how to capture that data better?

In the meantime, the kernel log from when things started hanging is below.
The zm processes are indeed accessing that raid1 partition.

[67499.502755] INFO: task btrfs-transacti:2867 blocked for more than 120 seconds.
[67499.526860]   Not tainted 3.15.5-amd64-i915-preempt-20140714cm1 #1
[67499.548624] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[67499.575212] btrfs-transacti D 0001 0  2867  2 0x
[67499.598611]  8802135e7e10 0046 880118322158 8802135e7fd8

Re: Blocked tasks on 3.15.1, raid1 btrfs is no ends of trouble for me

2014-07-18 Thread Marc MERLIN
On Fri, Jul 18, 2014 at 05:33:45PM -0700, Marc MERLIN wrote:
 However, I have found that btrfs raid 1 on top of dmcrypt has given me no end
 of trouble.
 I lost that filesystem twice due to corruption, and now it hangs my machine
 (strace finds that df is hanging on that partition).
 gargamel:~# btrfs fi df /mnt/btrfs_raid0
 Data, RAID1: total=222.00GiB, used=221.61GiB
 Data, single: total=8.00MiB, used=0.00
 System, RAID1: total=8.00MiB, used=48.00KiB
 System, single: total=4.00MiB, used=0.00
 Metadata, RAID1: total=2.00GiB, used=1.10GiB
 Metadata, single: total=8.00MiB, used=0.00
 unknown, single: total=384.00MiB, used=0.00
 gargamel:~# btrfs fi show /mnt/btrfs_raid0
 Label: 'btrfs_raid0'  uuid: 74279e10-46e7-4ac4-8216-a291819a6691
 Total devices 2 FS bytes used 222.71GiB
 devid    1 size 836.13GiB used 224.03GiB path /dev/dm-3
 devid    2 size 836.13GiB used 224.01GiB path /dev/mapper/raid0d2
 
 Btrfs v3.14.1
 
 
 This is not encouraging, I think I'm going to stop using raid1 in btrfs :(

Sorry, this may be a bit misleading. I actually lost 2 filesystems that
were raid0 on top of dmcrypt.
This time it's raid1, and the data isn't lost, but btrfs is tripping all
over itself and taking my whole system apparently because of that
filesystem.

Marc
-- 
A mouse is a device used to point at the xterm you want to type in - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901


Re: Blocked tasks on 3.15.1, raid1 btrfs is no ends of trouble for me

2014-07-18 Thread Marc MERLIN
TL;DR: 3.15.5 (or .1 when I tried it) just hang over and over again in
multiple ways on my server.
They also hang on my laptop reliably if I enable kmemleak, but otherwise
my laptop mostly survives with 3.15.x without kmemleak (although it does
deadlock eventually, but that could be after days/weeks, not hours).

I reverted to 3.14 on that machine, and everything is good again.

As a note, this is the 3rd time I have tried to upgrade this server to 3.15 and
everything has gone to crap. Each time I go back to 3.14 and things work again,
not great since btrfs has never been great and stable for me, but it
works well enough.

On Fri, Jul 18, 2014 at 05:44:57PM -0700, Marc MERLIN wrote:
 On Fri, Jul 18, 2014 at 05:33:45PM -0700, Marc MERLIN wrote:
  However, I have found that btrfs raid 1 on top of dmcrypt has given me no
  end of trouble.
  I lost that filesystem twice due to corruption, and now it hangs my machine
  (strace finds that df is hanging on that partition).
  gargamel:~# btrfs fi df /mnt/btrfs_raid0
  Data, RAID1: total=222.00GiB, used=221.61GiB
  Data, single: total=8.00MiB, used=0.00
  System, RAID1: total=8.00MiB, used=48.00KiB
  System, single: total=4.00MiB, used=0.00
  Metadata, RAID1: total=2.00GiB, used=1.10GiB
  Metadata, single: total=8.00MiB, used=0.00
  unknown, single: total=384.00MiB, used=0.00
  gargamel:~# btrfs fi show /mnt/btrfs_raid0
  Label: 'btrfs_raid0'  uuid: 74279e10-46e7-4ac4-8216-a291819a6691
  Total devices 2 FS bytes used 222.71GiB
  devid    1 size 836.13GiB used 224.03GiB path /dev/dm-3
  devid    2 size 836.13GiB used 224.01GiB path /dev/mapper/raid0d2
  
  Btrfs v3.14.1
  
  
  This is not encouraging, I think I'm going to stop using raid1 in btrfs :(
 
 Sorry, this may be a bit misleading. I actually lost 2 filesystems that
 were raid0 on top of dmcrypt.
 This time it's raid1, and the data isn't lost, but btrfs is tripping all
 over itself and taking my whole system apparently because of that
 filesystem.

And just to show that I was wrong in pinning this down: the same 3.15.5
with your patch locked up on my root filesystem on the next boot.

This time sysrq-w worked for a change.
Excerpt:

31933   03:54 btrfs_file_llseek  tail -n 50 /var/local/src/misterhouse/data/logs/print.log
31960   32:54 btrfs_file_llseek  tail -n 50 /var/local/src/misterhouse/data/logs/print.log
32077   18:54 btrfs_file_llseek  tail -n 50 /var/local/src/misterhouse/data/logs/print.log

[ 2176.230211] tailD 8801b3a567c0 0 25396  22031 0x20020080
[ 2176.252788]  88006fed3e20 0082 00a8 
88006fed3fd8
[ 2176.276039]  8801a542a3d0 000141c0 88020c374e10 
88020c374e14
[ 2176.299273]  8801a542a3d0 88020c374e18  
88006fed3e30
[ 2176.322515] Call Trace:
[ 2176.330739]  [8161fa5e] schedule+0x73/0x75
[ 2176.346527]  [8161fd1f] schedule_preempt_disabled+0x18/0x24
[ 2176.367208]  [81620e42] __mutex_lock_slowpath+0x160/0x1d7
[ 2176.386946]  [81620ed0] mutex_lock+0x17/0x27
[ 2176.403727]  [8123a33a] btrfs_file_llseek+0x40/0x205
[ 2176.422603]  [810be59a] ? from_kgid_munged+0x12/0x1e
[ 2176.441015]  [810482f1] ? cp_stat64+0x50/0x20b
[ 2176.457841]  [81156627] vfs_llseek+0x2e/0x30
[ 2176.474606]  [81156c32] SyS_llseek+0x5b/0xaa
[ 2176.490895]  [8162ab2c] sysenter_dispatch+0x7/0x21

Full log:
http://marc.merlins.org/tmp/btrfs_hang3.txt

After reboot, it's now hanging on this:
[  362.811392] INFO: task kworker/u8:0:6 blocked for more than 120 seconds.
[  362.831717]   Not tainted 3.15.5-amd64-i915-preempt-20140714cm1 #1
[  362.851516] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  362.875213] kworker/u8:0D 88021265a800 0 6  2 0x
[  362.896672] Workqueue: btrfs-flush_delalloc normal_work_helper
[  362.914260]  8802148cbb60 0046 8802148cbb30 
8802148cbfd8
[  362.936741]  8802148c4150 000141c0 88021f3941c0 
8802148c4150
[  362.959195]  8802148cbc00 0002 810fdda8 
8802148cbb70
[  362.981602] Call Trace:
[  362.988972]  [810fdda8] ? wait_on_page_read+0x3c/0x3c
[  363.006769]  [8161fa5e] schedule+0x73/0x75
[  363.021704]  [8161fc03] io_schedule+0x60/0x7a
[  363.037414]  [810fddb6] sleep_on_page+0xe/0x12
[  363.053416]  [8161ff93] __wait_on_bit_lock+0x46/0x8a
[  363.070980]  [810fde71] __lock_page+0x69/0x6b
[  363.086722]  [810848d1] ? autoremove_wake_function+0x34/0x34
[  363.106373]  [81242ab0] lock_page+0x1e/0x21
[  363.121585]  [812465bb] extent_write_cache_pages.isra.16.constprop.32+0x10e/0x2c6
[  363.148103]  [81246a19] extent_writepages+0x4b/0x5c
[  363.166792]  [81230ce4] ? btrfs_submit_direct+0x3f4/0x3f4
[  363.187074]  [810765ec] ? 

Re: Blocked tasks on 3.15.1, raid1 btrfs is no ends of trouble for me

2014-07-18 Thread Chris Samuel
On Fri, 18 Jul 2014 05:44:57 PM Marc MERLIN wrote:

 Sorry, this may be a bit misleading. I actually lost 2 filesystems that
 were raid0 on top of dmcrypt.

Stupid question I know, but does this happen without dmcrypt?

cheers,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC




Re: Blocked tasks on 3.15.1, raid1 btrfs is no ends of trouble for me

2014-07-18 Thread Marc MERLIN
On Sat, Jul 19, 2014 at 11:59:24AM +1000, Chris Samuel wrote:
 On Fri, 18 Jul 2014 05:44:57 PM Marc MERLIN wrote:
 
  Sorry, this may be a bit misleading. I actually lost 2 filesystems that
  were raid0 on top of dmcrypt.
 
 Stupid question I know, but does this happen without dmcrypt?

It's not a stupid question: I don't use btrfs without dmcrypt, so I can't
say.
(and I'm not interested in trying :) 

Marc
-- 
A mouse is a device used to point at the xterm you want to type in - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


Re: Blocked tasks on 3.15.1

2014-07-17 Thread Chris Mason

[ deadlocks during rsync in 3.15 with compression enabled ]

Hi everyone,

I still haven't been able to reproduce this one here, but I'm going
through a series of tests with lzo compression forced and every
operation forced to ordered.  Hopefully it'll kick it out soon.

While I'm hammering away, could you please try this patch.  If this is
the bug you're hitting, the deadlock will go away and you'll see this
printk in the log.

thanks!

-chris

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 3668048..8ab56df 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8157,6 +8157,13 @@ void btrfs_destroy_inode(struct inode *inode)
                spin_unlock(&root->fs_info->ordered_root_lock);
        }

+       spin_lock(&root->fs_info->ordered_root_lock);
+       if (!list_empty(&BTRFS_I(inode)->ordered_operations)) {
+               list_del_init(&BTRFS_I(inode)->ordered_operations);
+printk(KERN_CRIT "racing inode deletion with ordered operations!!!\n");
+       }
+       spin_unlock(&root->fs_info->ordered_root_lock);
+
        if (test_bit(BTRFS_INODE_HAS_ORPHAN_ITEM,
                     &BTRFS_I(inode)->runtime_flags)) {
                btrfs_info(root->fs_info, "inode %llu still on the orphan list",


Re: Blocked tasks on 3.15.1

2014-07-02 Thread Chris Mason
On 07/02/2014 08:27 AM, Cody P Schafer wrote:

 Will do. The rsync I'm running is processing a lot of chromium cache
 files when it hangs (just for a reference), and ends up triggering a
 bunch of deletes as well.
 
 Still a problem with your v3.15.y (eb97581), here's the log with
 sysrq-t and sysrq-l
 http://bpaste.net/show/428234/
 
 Also, correction, it's a firefox cache dir rsync that seems to trigger
 it (stalls pretty early on and very consistently):
 
 [... snip ...]
 .cache/mozilla/firefox/kqtl1tlc.test/Cache/7/1F/F43F9d01
   5.23M 100%   17.82MB/s0:00:00 (xfr#452, ir-chk=1201/6659)
 .cache/mozilla/firefox/kqtl1tlc.test/Cache/7/20/
 .cache/mozilla/firefox/kqtl1tlc.test/Cache/7/20/23A66d01
 116.82K 100%  376.50kB/s0:00:00 (xfr#453, ir-chk=1200/6659)
 .cache/mozilla/firefox/kqtl1tlc.test/Cache/7/21/
 .cache/mozilla/firefox/kqtl1tlc.test/Cache/7/23/
 .cache/mozilla/firefox/kqtl1tlc.test/Cache/7/24/
 .cache/mozilla/firefox/kqtl1tlc.test/Cache/7/25/
 .cache/mozilla/firefox/kqtl1tlc.test/Cache/7/25/7C836d01
 [... stall here ...]
 

Ok, and just to clarify, are you actively using the files on the
destination outside of rsync?

-chris


Re: Blocked tasks on 3.15.1

2014-07-02 Thread Chris Mason


On 07/02/2014 09:58 AM, Chris Mason wrote:
 On 07/02/2014 08:27 AM, Cody P Schafer wrote:
 
 Will do. The rsync I'm running is processing a lot of chromium cache
 files when it hangs (just for a reference), and ends up triggering a
 bunch of deletes as well.

 Still a problem with your v3.15.y (eb97581), here's the log with
 sysrq-t and sysrq-l
 http://bpaste.net/show/428234/

 Also, correction, it's a firefox cache dir rsync that seems to trigger
 it (stalls pretty early on and very consistently):

 [... snip ...]
 .cache/mozilla/firefox/kqtl1tlc.test/Cache/7/1F/F43F9d01
   5.23M 100%   17.82MB/s0:00:00 (xfr#452, ir-chk=1201/6659)
 .cache/mozilla/firefox/kqtl1tlc.test/Cache/7/20/
 .cache/mozilla/firefox/kqtl1tlc.test/Cache/7/20/23A66d01
 116.82K 100%  376.50kB/s0:00:00 (xfr#453, ir-chk=1200/6659)
 .cache/mozilla/firefox/kqtl1tlc.test/Cache/7/21/
 .cache/mozilla/firefox/kqtl1tlc.test/Cache/7/23/
 .cache/mozilla/firefox/kqtl1tlc.test/Cache/7/24/
 .cache/mozilla/firefox/kqtl1tlc.test/Cache/7/25/
 .cache/mozilla/firefox/kqtl1tlc.test/Cache/7/25/7C836d01
 [... stall here ...]

 
 Ok, and just to clarify, are you actively using the files on the
 destination outside of rsync?

Also, do you have compression on?

-chris



Re: Blocked tasks on 3.15.1

2014-07-01 Thread Chris Mason
On 06/30/2014 07:42 PM, Cody P Schafer wrote:
 On Mon, Jun 30, 2014 at 1:30 PM, Chris Mason c...@fb.com wrote:
 On 06/30/2014 02:11 PM, Chris Mason wrote:
 On 06/29/2014 04:02 PM, Cody P Schafer wrote:
 On Fri, Jun 27, 2014 at 7:22 PM, Chris Samuel ch...@csamuel.org wrote:
 On Fri, 27 Jun 2014 05:20:41 PM Duncan wrote:

 If I'm not mistaken the fix for the 3.16 series bug was:

 ea4ebde02e08558b020c4b61bb9a4c0fcf63028e

 Btrfs: fix deadlocks with trylock on tree nodes.

 That patch applies cleanly to 3.15.2 so if it is indeed the fix it should
 probably go to -stable for the next 3.15 release..

 Unfortunately my test system died a while ago (hardware problem) and I've
 not been able to resurrect it yet.

 I'm also seeing stuck tasks on btrfs (3.14.4, 3.15.1, 3.15.2).
 I've also tried 3.15.2 with ea4ebde02e08558b020c4b61bb9a4c applied on
 top with similar results.
 I've been triggering the hang with 'rsync -hPaHAXx --del /mnt/home/a/
 /home/a/' where /mnt/home and /home are 2 separate btrfs filesystems
 on 2 separate disks.

 dmesg with w-trigger: 
 http://bpaste.net/show/419555
 --

 These traces show us waiting for IO, but it doesn't show anyone doing
 the IO.  Either we're failing to kick off our work queues or they are
 stuck on something else.

 Could you please send a sysrq-t and sysrq-l while you're stuck?  That
 will show us all the procs and all the CPUs.

 Also, do you have any nodatacow files in here?  Please say yes.

 
 kernel log from 3.15.2 + ea4ebde02 showing the blocked tasks,
 sysrq-{w,t,l} included
 http://bpaste.net/show/423296/
 
 I haven't explicitly created any nodatacow files, is there a quick
 way to tell if there are any? Right now I'm doing
 `lsattr -R /mnt/home/a/ 2>/dev/null | grep -- '^-*C-* '` to try and check.
 
 (2>/dev/null is hiding lots of "Operation not supported While reading
 flags on" warnings)
 

If you haven't turned nodatacow on intentionally, you don't have any
nodatacow files ;)  I have been trying to reproduce this with rsync and
other code that hammers on the ordered writeback, but no luck yet.

Before we spend too much time triggering it again, I'd like you to
please try a patch from Filipe that is in current mainline.  I've cherry
picked on top of 3.15.3 in a branch called v3.15.y:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git v3.15.y

-chris



Re: Blocked tasks on 3.15.1

2014-06-30 Thread Chris Mason
On 06/29/2014 04:02 PM, Cody P Schafer wrote:
 On Fri, Jun 27, 2014 at 7:22 PM, Chris Samuel ch...@csamuel.org wrote:
 On Fri, 27 Jun 2014 05:20:41 PM Duncan wrote:

 If I'm not mistaken the fix for the 3.16 series bug was:

 ea4ebde02e08558b020c4b61bb9a4c0fcf63028e

 Btrfs: fix deadlocks with trylock on tree nodes.

 That patch applies cleanly to 3.15.2 so if it is indeed the fix it should
 probably go to -stable for the next 3.15 release..

 Unfortunately my test system died a while ago (hardware problem) and I've not
 been able to resurrect it yet.
 
 I'm also seeing stuck tasks on btrfs (3.14.4, 3.15.1, 3.15.2).
 I've also tried 3.15.2 with ea4ebde02e08558b020c4b61bb9a4c applied on
 top with similar results.
 I've been triggering the hang with 'rsync -hPaHAXx --del /mnt/home/a/
 /home/a/' where /mnt/home and /home are 2 separate btrfs filesystems
 on 2 separate disks.
 
 dmesg with w-trigger: http://bpaste.net/show/419555
 --

These traces show us waiting for IO, but it doesn't show anyone doing
the IO.  Either we're failing to kick off our work queues or they are
stuck on something else.

Could you please send a sysrq-t and sysrq-l while you're stuck?  That
will show us all the procs and all the CPUs.
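
For anyone who has not collected these before, here is a minimal sketch of one way to do it from a root shell, assuming the kernel exposes /proc/sysrq-trigger. The helper name send_sysrq and the SYSRQ_TRIGGER override are made up for illustration; only the w, t and l command letters come from the SysRq interface itself.

```shell
# Path of the SysRq trigger file. The override is an assumption added for
# illustration, so the helper can be dry-run against an ordinary file.
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

# Hypothetical helper: send each given SysRq command letter in turn.
#   w = blocked (uninterruptible) tasks
#   t = all tasks
#   l = backtraces of all active CPUs
send_sysrq() {
    for key in "$@"; do
        printf '%s' "$key" > "$SYSRQ_TRIGGER" || return 1
    done
}
```

While the hang is in progress, something like `send_sysrq w t l` followed by `dmesg > hang.log` should leave all three dumps in one file (hang.log is a placeholder name). If the kernel ring buffer is too small for the full task dump, the copy in syslog may be more complete.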

-chris


Re: Blocked tasks on 3.15.1

2014-06-30 Thread Chris Mason


On 06/30/2014 02:11 PM, Chris Mason wrote:
 On 06/29/2014 04:02 PM, Cody P Schafer wrote:
 On Fri, Jun 27, 2014 at 7:22 PM, Chris Samuel ch...@csamuel.org wrote:
 On Fri, 27 Jun 2014 05:20:41 PM Duncan wrote:

 If I'm not mistaken the fix for the 3.16 series bug was:

 ea4ebde02e08558b020c4b61bb9a4c0fcf63028e

 Btrfs: fix deadlocks with trylock on tree nodes.

 That patch applies cleanly to 3.15.2 so if it is indeed the fix it should
 probably go to -stable for the next 3.15 release..

 Unfortunately my test system died a while ago (hardware problem) and I've not
 been able to resurrect it yet.

 I'm also seeing stuck tasks on btrfs (3.14.4, 3.15.1, 3.15.2).
 I've also tried 3.15.2 with ea4ebde02e08558b020c4b61bb9a4c applied on
 top with similar results.
 I've been triggering the hang with 'rsync -hPaHAXx --del /mnt/home/a/
 /home/a/' where /mnt/home and /home are 2 separate btrfs filesystems
 on 2 separate disks.

 dmesg with w-trigger: http://bpaste.net/show/419555
 --
 
 These traces show us waiting for IO, but it doesn't show anyone doing
 the IO.  Either we're failing to kick off our work queues or they are
 stuck on something else.
 
 Could you please send a sysrq-t and sysrq-l while you're stuck?  That
 will show us all the procs and all the CPUs.

Also, do you have any nodatacow files in here?  Please say yes.

-chris



Re: Blocked tasks on 3.15.1

2014-06-30 Thread Cody P Schafer
On Mon, Jun 30, 2014 at 1:30 PM, Chris Mason c...@fb.com wrote:
 On 06/30/2014 02:11 PM, Chris Mason wrote:
 On 06/29/2014 04:02 PM, Cody P Schafer wrote:
 On Fri, Jun 27, 2014 at 7:22 PM, Chris Samuel ch...@csamuel.org wrote:
 On Fri, 27 Jun 2014 05:20:41 PM Duncan wrote:

 If I'm not mistaken the fix for the 3.16 series bug was:

 ea4ebde02e08558b020c4b61bb9a4c0fcf63028e

 Btrfs: fix deadlocks with trylock on tree nodes.

 That patch applies cleanly to 3.15.2 so if it is indeed the fix it should
 probably go to -stable for the next 3.15 release..

 Unfortunately my test system died a while ago (hardware problem) and I've not
 been able to resurrect it yet.

 I'm also seeing stuck tasks on btrfs (3.14.4, 3.15.1, 3.15.2).
 I've also tried 3.15.2 with ea4ebde02e08558b020c4b61bb9a4c applied on
 top with similar results.
 I've been triggering the hang with 'rsync -hPaHAXx --del /mnt/home/a/
 /home/a/' where /mnt/home and /home are 2 separate btrfs filesystems
 on 2 separate disks.

 dmesg with w-trigger: http://bpaste.net/show/419555
 --

 These traces show us waiting for IO, but it doesn't show anyone doing
 the IO.  Either we're failing to kick off our work queues or they are
 stuck on something else.

 Could you please send a sysrq-t and sysrq-l while you're stuck?  That
 will show us all the procs and all the CPUs.

 Also, do you have any nodatacow files in here?  Please say yes.


kernel log from 3.15.2 + ea4ebde02 showing the blocked tasks,
sysrq-{w,t,l} included
http://bpaste.net/show/423296/

I haven't explicitly created any nodatacow files, is there a quick
way to tell if there are any? Right now I'm doing
`lsattr -R /mnt/home/a/ 2>/dev/null | grep -- '^-*C-* '` to try and check.

(2>/dev/null is hiding lots of "Operation not supported While reading
flags on" warnings)
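
One possible variation on that check, as a sketch only: the grep pattern assumes every other attribute column is a dash, so matching on the flag column alone may be a little more forgiving. The function names filter_nocow and list_nocow are hypothetical; 'C' is the No_COW attribute letter that lsattr prints.

```shell
# Read lsattr output on stdin and keep only entries whose flag field
# (first column) contains 'C'. Requiring the first column to consist of
# letters and dashes skips the "directory:" header lines that -R prints.
filter_nocow() {
    awk 'NF >= 2 && $1 ~ /^[-A-Za-z]+$/ && $1 ~ /C/'
}

# Hypothetical wrapper: recurse over a directory and list No_COW entries,
# discarding the "Operation not supported" warnings as in the original command.
list_nocow() {
    lsattr -R "$1" 2>/dev/null | filter_nocow
}
```

With these defined, `list_nocow /mnt/home/a/` would print nothing if no file or directory under that path carries the attribute.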


Re: Blocked tasks on 3.15.1

2014-06-30 Thread Charles Cazabon
Chris Mason c...@fb.com wrote:
 On 06/29/2014 04:02 PM, Cody P Schafer wrote:
  been able to resurrect it yet.
  
  I'm also seeing stuck tasks on btrfs (3.14.4, 3.15.1, 3.15.2).

I'm seeing these with 3.15.2 as well.

 Could you please send a sysrq-t and sysrq-l while you're stuck?  That
 will show us all the procs and all the CPUs.

For what it's worth, http://bpaste.net/show/BswHMVpHlguSrdELgv7e/ is my syslog
covering my most recent stuck event, including the results of sysrq-t and
sysrq-l.
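
When skimming a log like this one, a small filter can pull out which tasks the hung-task detector flagged. This is a sketch that assumes the usual message wording, "INFO: task NAME:PID blocked for more than N seconds"; the function name blocked_tasks and the file name syslog.txt are placeholders.

```shell
# Read a kernel log on stdin and print each distinct task name that the
# hung-task detector reported, one per line.
# Assumes the "INFO: task NAME:PID blocked for more than N seconds" wording.
# The greedy match keeps colons inside names such as kworker/u16:3.
blocked_tasks() {
    sed -n 's/.*INFO: task \(.*\):[0-9][0-9]* blocked for more than [0-9][0-9]* seconds.*/\1/p' |
        sort -u
}
```

For example, `blocked_tasks < syslog.txt` lists the stuck task names without the repeated stack traces.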

Charles
-- 
---
Charles Cazabon
GPL'ed software available at:   http://pyropus.ca/software/
---


Re: Blocked tasks on 3.15.1

2014-06-29 Thread Cody P Schafer
On Fri, Jun 27, 2014 at 7:22 PM, Chris Samuel ch...@csamuel.org wrote:
 On Fri, 27 Jun 2014 05:20:41 PM Duncan wrote:

 If I'm not mistaken the fix for the 3.16 series bug was:

 ea4ebde02e08558b020c4b61bb9a4c0fcf63028e

 Btrfs: fix deadlocks with trylock on tree nodes.

 That patch applies cleanly to 3.15.2 so if it is indeed the fix it should
 probably go to -stable for the next 3.15 release..

 Unfortunately my test system died a while ago (hardware problem) and I've not
 been able to resurrect it yet.

I'm also seeing stuck tasks on btrfs (3.14.4, 3.15.1, 3.15.2).
I've also tried 3.15.2 with ea4ebde02e08558b020c4b61bb9a4c applied on
top with similar results.
I've been triggering the hang with 'rsync -hPaHAXx --del /mnt/home/a/
/home/a/' where /mnt/home and /home are 2 separate btrfs filesystems
on 2 separate disks.

dmesg with w-trigger: http://bpaste.net/show/419555/ (3.15.2 + ea4ebde)


Re: Blocked tasks on 3.15.1

2014-06-29 Thread Cody P Schafer
On Sun, Jun 29, 2014 at 3:02 PM, Cody P Schafer d...@codyps.com wrote:
 On Fri, Jun 27, 2014 at 7:22 PM, Chris Samuel ch...@csamuel.org wrote:
 On Fri, 27 Jun 2014 05:20:41 PM Duncan wrote:

 If I'm not mistaken the fix for the 3.16 series bug was:

 ea4ebde02e08558b020c4b61bb9a4c0fcf63028e

 Btrfs: fix deadlocks with trylock on tree nodes.

 That patch applies cleanly to 3.15.2 so if it is indeed the fix it should
 probably go to -stable for the next 3.15 release..

 Unfortunately my test system died a while ago (hardware problem) and I've not
 been able to resurrect it yet.

 I'm also seeing stuck tasks on btrfs (3.14.4, 3.15.1, 3.15.2).
 I've also tried 3.15.2 with ea4ebde02e08558b020c4b61bb9a4c applied on
 top with similar results.
 I've been triggering the hang with 'rsync -hPaHAXx --del /mnt/home/a/
 /home/a/' where /mnt/home and /home are 2 separate btrfs filesystems
 on 2 separate disks.

 dmesg with w-trigger: http://bpaste.net/show/419555/ (3.15.2 + ea4ebde)

And here's the same thing but with lockdep enabled (in the hope that
that info might be useful)
http://bpaste.net/show/419899/


Re: Blocked tasks on 3.15.1

2014-06-29 Thread Rich Freeman
On Fri, Jun 27, 2014 at 8:22 PM, Chris Samuel ch...@csamuel.org wrote:
 On Fri, 27 Jun 2014 05:20:41 PM Duncan wrote:

 If I'm not mistaken the fix for the 3.16 series bug was:

 ea4ebde02e08558b020c4b61bb9a4c0fcf63028e

 Btrfs: fix deadlocks with trylock on tree nodes.

 That patch applies cleanly to 3.15.2 so if it is indeed the fix it should
 probably go to -stable for the next 3.15 release..

I can confirm that 3.15.2 definitely has the deadlock problem.  I
tried upgrading just to convince myself of this before patching it and
it only took a few hours before it stopped syncing with the usual
errors.

I applied the patch on Jun 28 around 20:00UTC.  I haven't had a
deadlock since, despite having the file system fairly active with a
few reboots, some deleted snapshots, being assimilated by the new
sysvinit replacement, etc.  That doesn't really prove anything though
- for all I know it will hang a week from now.

However, the patch seems stable so far on 3.15.2.

Rich


Re: Blocked tasks on 3.15.1

2014-06-27 Thread Tomasz Chmielewski

I've been getting blocked tasks on 3.15.1 generally at times when the
filesystem is somewhat busy (such as doing a backup via scp/clonezilla
writing to the disk).

A week ago I had enabled snapper for a day which resulted in a daily
cleanup of about 8 snapshots at once, which might have contributed,
but I've been limping along since.


I've started seeing similar on several servers, after upgrading to 3.15 
or 3.15.1. With 3.16-rc1 it was even crashing for me.

I've rolled back to the latest 3.14.x, and it's still behaving fine.

I've signalled it before on the list in the "btrfs filesystem hang with
3.15-rc3 when doing rsync" thread.


--
Tomasz Chmielewski
http://www.sslrack.com



Re: Blocked tasks on 3.15.1

2014-06-27 Thread Duncan
Tomasz Chmielewski posted on Fri, 27 Jun 2014 12:02:43 +0200 as excerpted:

 I've been getting blocked tasks on 3.15.1 generally at times when the
 filesystem is somewhat busy (such as doing a backup via scp/clonezilla
 writing to the disk).
 
 I've started seeing similar on several servers, after upgrading to 3.15
 or 3.15.1. With 3.16-rc1 it was even crashing for me.
 I've rolled back to the latest 3.14.x, and it's still behaving fine.
 
 I've signalled it before on the list in the "btrfs filesystem hang with
 3.15-rc3 when doing rsync" thread.

There is a known btrfs lockup bug that was introduced in the commit-
window btrfs pull for 3.16, that was fixed by a pull I believe the day 
before 3.16-rc2.  So 3.16-pre to rc2 is known-bad tho it'll work for a 
few minutes and didn't do any permanent damage that I could see, here.

But from 3.16-rc2 on, the 3.16-pre series has been working fine for me.

For 3.15, I didn't run the pre-releases as I had another project I was 
focusing on, but I experienced no problems with 3.15 itself.  However, my 
use-case is multiple independent small btrfs on partitioned SSD, sub-100-
GB per btrfs, so I'd be less likely to experience the blocked task issues 
that others reported, mostly on TB+ size spinning rust.

And it /did/ seem to me that the frequency of blocked-task reports was
higher for 3.15 than for previous kernel series, tho 3.15 worked fine for
me on small btrfs on SSD, for the relatively short time I ran it.

Hopefully that problem's fixed on 3.16-rc2+, but as of yet there's not 
enough 3.16-rc2+ reports out there from folks experiencing issues with 
3.15 blocked tasks to rightfully say.  What CAN be said is that the known 
3.16-series commit-window btrfs lockups bug that DID affect me was fixed 
right before rc2, and I'm running rc2+ just fine, here.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman



Re: Blocked tasks on 3.15.1

2014-06-27 Thread Rich Freeman
On Fri, Jun 27, 2014 at 9:06 AM, Duncan 1i5t5.dun...@cox.net wrote:
 Hopefully that problem's fixed on 3.16-rc2+, but as of yet there's not
 enough 3.16-rc2+ reports out there from folks experiencing issues with
 3.15 blocked tasks to rightfully say.

Any chance that it was backported to 3.15.2?  I'd rather not move to
mainline just for btrfs.

I got another block this morning and failed to capture a log before my
terminals gave out.  I switched back to 3.15.0 for the moment, and
we'll see if that fares any better.

Rich


Re: Blocked tasks on 3.15.1

2014-06-27 Thread Chris Murphy

On Jun 27, 2014, at 9:14 AM, Rich Freeman r-bt...@thefreemanclan.net wrote:

 On Fri, Jun 27, 2014 at 9:06 AM, Duncan 1i5t5.dun...@cox.net wrote:
 Hopefully that problem's fixed on 3.16-rc2+, but as of yet there's not
 enough 3.16-rc2+ reports out there from folks experiencing issues with
 3.15 blocked tasks to rightfully say.
 
 Any chance that it was backported to 3.15.2?  I'd rather not move to
 mainline just for btrfs.

The backports don't happen that quickly. I'm uncertain about specifics but I 
think many such fixes need to be demonstrated in mainline before they get 
backported to stable.


 
 I got another block this morning and failed to capture a log before my
 terminals gave out.  I switched back to 3.15.0 for the moment, and
 we'll see if that fares any better. 

Yeah I'd start going backwards. The idea of going forwards is to hopefully get 
you unstuck or extract data where otherwise you can't, it's not really a 
recommendation for production usage. It's also often useful if you can 
reproduce the block with a current rc kernel and issue sysrq+w and post that. 
Then do your regression with an older kernel.

Chris Murphy


Re: Blocked tasks on 3.15.1

2014-06-27 Thread Duncan
Chris Murphy posted on Fri, 27 Jun 2014 09:52:46 -0600 as excerpted:

 On Jun 27, 2014, at 9:14 AM, Rich Freeman r-bt...@thefreemanclan.net
 wrote:
 
 On Fri, Jun 27, 2014 at 9:06 AM, Duncan 1i5t5.dun...@cox.net wrote:
 Hopefully that problem's fixed on 3.16-rc2+, but as of yet there's not
 enough 3.16-rc2+ reports out there from folks experiencing issues with
 3.15 blocked tasks to rightfully say.
 
 Any chance that it was backported to 3.15.2?  I'd rather not move to
 mainline just for btrfs.
 
 The backports don't happen that quickly.

The lockup bug that affected early 3.16 was introduced in the commit-
window pull for 3.16, so the fix for that shouldn't have needed backporting
(unless the problem commit ended up in stable too, which I doubt but
don't know for sure).

3.15.0 didn't contain that bug, which affected me, but as I said, there 
did seem to be more blocked-task reports in 3.15, which didn't affect me.

I didn't run 3.15.1, however, staying on 3.15.0 until after 3.16-rc2 
fixed the earlier 3.16-pre series bug that had kept me from the 3.16 
series until then.  So anything that might have affected the 3.15 stable 
series after 3.15.0, I wouldn't know about.

If I'm not mistaken the fix for the 3.16 series bug was:

ea4ebde02e08558b020c4b61bb9a4c0fcf63028e

Btrfs: fix deadlocks with trylock on tree nodes.

But I think the 3.16 commit-window changes introducing the bug weren't 
btrfs specific but instead at the generic vfs level.  If that's the case, 
then it's possible that the bug was there before 3.16's commit window and 
might have been triggering some of the 3.15 reports as well, and the 3.16 
vfs change simply made it much worse.

IOW, I don't know whether that 3.16 series fix will help 3.15 or not, but 
I don't believe it'll hurt, and it /might/ help.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman



Re: Blocked tasks on 3.15.1

2014-06-27 Thread Rich Freeman
On Fri, Jun 27, 2014 at 11:52 AM, Chris Murphy li...@colorremedies.com wrote:
 On Jun 27, 2014, at 9:14 AM, Rich Freeman r-bt...@thefreemanclan.net wrote:


 I got another block this morning and failed to capture a log before my
 terminals gave out.  I switched back to 3.15.0 for the moment, and
 we'll see if that fares any better.

 Yeah I'd start going backwards. The idea of going forwards is to
 hopefully get you unstuck or extract data where otherwise you can't,
 it's not really a recommendation for production usage. It's also often
 useful if you can reproduce the block with a current rc kernel and
 issue sysrq+w and post that. Then do your regression with an older
 kernel.

So, obviously I'm getting my money's worth from the btrfs team, but
neither is always a great option as neither involves me running a
stable kernel.  3.15.0 contains CVE-2014-4014, although I'm running a
version patched for that vulnerability.  If I go back any further I'd
probably have to backport it myself, and I only know about it because
my distro patched that CVE on 3.15.0 before moving to 3.15.1.

Running 3.16 doesn't bother me much from a btrfs standpoint, but it
means I'm getting unstable updates on all the other modules as well.
It is just more to deal with.

I might give 3.15.2 a shot and see what happens, and I can always fall
back to 3.15.0 again.

Rich


Re: Blocked tasks on 3.15.1

2014-06-27 Thread Chris Samuel
On Fri, 27 Jun 2014 05:20:41 PM Duncan wrote:

 If I'm not mistaken the fix for the 3.16 series bug was:
 
 ea4ebde02e08558b020c4b61bb9a4c0fcf63028e
 
 Btrfs: fix deadlocks with trylock on tree nodes.

That patch applies cleanly to 3.15.2 so if it is indeed the fix it should 
probably go to -stable for the next 3.15 release..

Unfortunately my test system died a while ago (hardware problem) and I've not 
been able to resurrect it yet.

cheers,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC


