Re: [GIT PULL] Btrfs updates

2011-12-05 Thread Miao Xie
Hi, Chris and Oliva

On thu, 1 Dec 2011 10:39:55 -0500, Chris Mason wrote:
 git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus
 
 Has our current set of fixes.  This is fairly small, Alexandre Oliva has
 been chasing problems in our block allocator and kicked out important
 fixes.
 
 Jan Schmidt fixed a merge error in the raid repair code, we're now
 properly repairing failed blocks (io errors or crc errors) without
 having to run a scrub.
 
 Alexandre Oliva (5) commits (+8/-8):
 Btrfs: skip block groups without enough space for a cluster (+1/-1)

This patch introduces a bug: we can no longer allocate blocks from a cluster
that has enough space, which may make the block allocation fail.

The check introduced by the patch makes the allocator skip the cluster
allocation and jump to the unclustered allocation without first reclaiming
the blocks held in the cluster. If at that point all of the free space is in
the cluster and none is left in the rest of the block group, the allocation
fails. (We can trigger this bug on an SSD.)

Fortunately, the following patch written by Oliva can fix this bug.

[PATCH 08/20] Btrfs: try to allocate from cluster even at LOOP_NO_EMPTY_SIZE

Thanks
Miao

 Btrfs: start search for new cluster at the beginning (+2/-4)
 Btrfs: reset cluster's max_size when creating bitmap (+1/-0)
 Btrfs: skip allocation attempt from empty cluster (+3/-3)
 Btrfs: initialize new bitmaps' list (+1/-0)
 
 Li Zefan (1) commits (+3/-3):
 Btrfs: fix oops when calling statfs on readonly device
 
 Arnd Hannemann (1) commits (+2/-2):
 Fix URL of btrfs-progs git repository in docs
 
 Jan Schmidt (1) commits (+20/-7):
 Btrfs: fix meta data raid-repair merge problem
 
 Dan Carpenter (1) commits (+5/-0):
 btrfs scrub: handle -ENOMEM from init_ipath()
 
 Mike Fleetwood (1) commits (+1/-1):
 Btrfs: Don't error on resizing FS to same size
 
 Miao Xie (1) commits (+22/-5):
 Btrfs: fix deadlock on metadata reservation when evicting a inode
 
 Total: (11) commits (+60/-25)
 
  Documentation/filesystems/btrfs.txt |4 ++--
  fs/btrfs/ctree.h|3 +++
  fs/btrfs/extent-tree.c  |   34 +++---
  fs/btrfs/extent_io.c|   27 ---
  fs/btrfs/free-space-cache.c |2 ++
  fs/btrfs/inode.c|2 +-
  fs/btrfs/ioctl.c|2 +-
  fs/btrfs/scrub.c|5 +
  fs/btrfs/super.c|6 +++---
  9 files changed, 60 insertions(+), 25 deletions(-)
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 



Re: Blocked for more than 120 seconds

2011-12-05 Thread Chris Mason
On Sat, Dec 03, 2011 at 04:36:44PM +0200, Konstantinos Skarlatos wrote:
 unfortunately i was wrong. rc4 does not fix this issue for me when
 rsyncing large amounts of data...
 
 my mount options:
 mount -o loop,compress=zlib,compress-force btrfs_test /storage/btrfs
 the filesystem is a file on a raid5 xfs volume.

Oh, the loop + raid5 + xfs is going to cause problems.  The loop driver
is fine for testing but I wouldn't be using it in a production
environment.

-chris


Re: [GIT PULL] Btrfs updates

2011-12-05 Thread Chris Mason
On Mon, Dec 05, 2011 at 04:10:49PM +0800, Miao Xie wrote:
 Hi, Chris and Oliva
 
 On thu, 1 Dec 2011 10:39:55 -0500, Chris Mason wrote:
  git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus
  
  Has our current set of fixes.  This is fairly small, Alexandre Oliva has
  been chasing problems in our block allocator and kicked out important
  fixes.
  
  Jan Schmidt fixed a merge error in the raid repair code, we're now
  properly repairing failed blocks (io errors or crc errors) without
  having to run a scrub.
  
  Alexandre Oliva (5) commits (+8/-8):
  Btrfs: skip block groups without enough space for a cluster (+1/-1)
 
 This patch introduce a bug that we can not allocate blocks from the cluster
 with enough space and it may make the block allocation fail.
 
 This is because the check that the above patch introduced make the allocator
 skip the cluster allocation, and jump to the uncluster allocation without
 reclaiming all the blocks in the cluster, At this time, if all the free space
 is in the cluster, and no space in the block group, the allocation will fail.
 (we can trigger this bug on SSD.)
 
 Fortunately, the following patch written by Oliva can fix this bug.
 
 [PATCH 08/20] Btrfs: try to allocate from cluster even at LOOP_NO_EMPTY_SIZE

Thanks, I'll push this 08/20 out as well.

-chris


Re: [GIT PULL] Btrfs updates

2011-12-05 Thread David Sterba
On Mon, Dec 05, 2011 at 08:14:13AM -0500, Chris Mason wrote:
 On Mon, Dec 05, 2011 at 04:10:49PM +0800, Miao Xie wrote:
  [PATCH 08/20] Btrfs: try to allocate from cluster even at LOOP_NO_EMPTY_SIZE
 
 Thanks, I'll push this 08/20 out as well.

Please pick

Li Zefan: Btrfs: check if the to-be-added device is writable
Liu Bo: Btrfs: drop spin lock when memory alloc fails

(collected in branch fixes-20111205 at my repo)

I overlooked them and forgot to include them in the fixes branch before the
last pull request, sorry.


david


Re: [PATCH] Btrfs: protect orphan block rsv with spin_lock

2011-12-05 Thread Josef Bacik
On Mon, Dec 05, 2011 at 12:50:39PM +0100, Christian Brunner wrote:
 2011/12/2 Josef Bacik jo...@redhat.com:
  We've been seeing warnings coming out of the orphan commit stuff forever
  from ceph.  Turns out it's because we're racing with checking if the orphan
  block reserve is set, because we clear it outside of the spin_lock.  So
  leave the normal fastpath checks where they are, but take the spin_lock and
  _recheck_ to make sure we haven't had an orphan block rsv added in the
  meantime.  Then clear the root's orphan block rsv and release the lock.
  With this patch a user said the warnings went away and they usually showed
  up pretty soon after he started ceph.  Thanks,
 
 *sigh* - As soon as I turned my back to the server console it
 happened again on one of our nodes. That was 25 hours after I started
 the system. Usually I see these warnings a few minutes after the
 start, but there have been cases in the past where it took longer. So
 I'm not sure if the improvement is due to the patch.
 
 Josef: I was still running the patch you sent me, but there was no
 message from the printk's you added.
 

:( ok I'll try again.  Thanks,

Josef
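
The check, lock, recheck pattern from the quoted patch description can be
sketched in user space roughly as follows. This is only an illustration under
stated assumptions: the toy_ names, the pending-orphan counter and the
atomic_flag spinlock are hypothetical stand-ins, not the btrfs structures or
the actual patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the root; not the real struct btrfs_root. */
struct toy_root {
	atomic_flag lock;		/* plays the role of the spin_lock */
	atomic_int orphan_items;	/* pending orphans; > 0 means "keep the rsv" */
	void *orphan_block_rsv;		/* only written while holding the lock */
};

void toy_root_init(struct toy_root *root, void *rsv)
{
	atomic_flag_clear(&root->lock);
	atomic_init(&root->orphan_items, 0);
	root->orphan_block_rsv = rsv;
}

void toy_lock(struct toy_root *root)
{
	while (atomic_flag_test_and_set(&root->lock))
		;	/* spin until the flag was previously clear */
}

void toy_unlock(struct toy_root *root)
{
	atomic_flag_clear(&root->lock);
}

/* Returns the reserve the caller may release, or NULL if it must stay. */
void *toy_orphan_commit(struct toy_root *root)
{
	void *rsv;

	/* Fast path: cheap unlocked check, left where it was. */
	if (atomic_load(&root->orphan_items) > 0)
		return NULL;

	toy_lock(root);
	/* Recheck under the lock: an orphan may have been added meanwhile. */
	if (atomic_load(&root->orphan_items) > 0) {
		toy_unlock(root);
		return NULL;
	}
	/* Clear the reserve while still holding the lock. */
	rsv = root->orphan_block_rsv;
	root->orphan_block_rsv = NULL;
	toy_unlock(root);
	return rsv;
}
```

The point of the sketch is only the ordering: the pointer is cleared inside
the locked region, after the second check, so a concurrent add cannot slip in
between the check and the clear.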


[PATCH] Btrfs: Use kcalloc instead of kzalloc to allocate array

2011-12-05 Thread Thomas Meyer
The advantage of kcalloc is that it prevents integer overflows that could
result from multiplying the number of elements by the element size. It is
also a bit nicer to read.

The semantic patch that makes this change is available
in https://lkml.org/lkml/2011/11/25/107

Signed-off-by: Thomas Meyer tho...@m3y3r.de
---

diff -u -p a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
--- a/fs/btrfs/volumes.c 2011-11-28 19:36:48.113451068 +0100
+++ b/fs/btrfs/volumes.c 2011-11-28 19:48:10.374677247 +0100
@@ -2450,7 +2450,7 @@ static int __btrfs_alloc_chunk(struct bt
 	max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
 			     max_chunk_size);
 
-	devices_info = kzalloc(sizeof(*devices_info) * fs_devices->rw_devices,
+	devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
 			       GFP_NOFS);
 	if (!devices_info)
 		return -ENOMEM;
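
The overflow that the commit message refers to can be shown with a small
user-space sketch. The toy_ helpers below are hypothetical and only mimic the
idea of a checked element-count allocation; they are not the kernel's
kcalloc() implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a checked, zeroing array allocation. */
void *toy_kcalloc(size_t n, size_t size)
{
	/* Refuse the request if n * size cannot be represented in size_t. */
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;
	return calloc(n, size);
}

/* What the open-coded "size * n" argument computes: it wraps silently. */
size_t toy_unchecked_bytes(size_t n, size_t size)
{
	return n * size;
}
```

With n = SIZE_MAX / 2 + 1 and size = 4, the unchecked product wraps to 0, so
an open-coded allocation would ask for far fewer bytes than intended, while
the checked helper returns NULL.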


CPU usage in 3.2 RC4

2011-12-05 Thread Peeters Simon
Hi everybody,

I recently switched from a 3.1 kernel to 3.2-rc4 from Fedora Rawhide.
Since the switch I have been having serious performance issues, mainly due to
my limited CPU (a 1.6 GHz single-core Atom).
When I boot 3.1, bootup is mainly I/O limited (according to bootchart).
When I boot 3.2, bootup is completely CPU limited and takes a lot longer.

File operations (such as tar -xf linux-3.1.tar) are also CPU hogs.
According to top, most of the CPU usage is in kernel space, and I see
several btrfs threads at the top.

time tar -xf linux-3.1.tar.bz2
real    11m10.037s
user    1m26.861s
sys    5m26.738s

cpu usage
8%user,
82%sys,

flush-btrfs-5 50%
tar 45%
btrfs-endio-write 20%
btrfs-transaction 15%
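
As a rough reading aid for the time(1) figures above, the shares can be
computed as below. The helper names are hypothetical, and the numbers are the
ones quoted in this mail.

```c
#include <assert.h>

/* Convert an "XmY.ZZZs" value from time(1) into seconds. */
double toy_seconds(int minutes, double seconds)
{
	return minutes * 60.0 + seconds;
}

/* Fraction of wall-clock time during which the process was on a CPU. */
double toy_cpu_share(double real, double user, double sys)
{
	return (user + sys) / real;
}

/* Fraction of the on-CPU time that was spent in the kernel. */
double toy_sys_share(double user, double sys)
{
	return sys / (user + sys);
}
```

For real 11m10.037s, user 1m26.861s and sys 5m26.738s this gives roughly 62%
of the wall-clock time on the CPU, and about 79% of that CPU time in the
kernel.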

It would be nice if somebody could look into this and hopefully get
it fixed in 3.2 final. (I am not a kernel or btrfs guru, so I don't
know where to start looking for the problem.)

greetings

Simon Peeters


Re: Don't prevent removal of devices that break raid reqs

2011-12-05 Thread Phillip Susi

On 11/10/2011 2:32 PM, Alexandre Oliva wrote:

Instead of preventing the removal of devices that would render existing
raid10 or raid1 impossible, warn but go ahead with it; the rebalancing
code is smart enough to use different block group types.

Should the refusal remain, so that we'd only proceed with a
newly-introduced --force option or so?


I just thought of something.  When adding the second device, balance 
converts DUP to RAID1 automatically, and it is the RAID1 that prevents 
removing the second disk.  What if the chunks were left with both the 
DUP and RAID1 flags set?  That way if you explicitly requested raid1, 
then it won't let you accidentally drop below two disks, but if it were 
auto promoted from DUP, then going back to DUP is ok.
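
The idea of keeping both flags can be sketched as a small predicate. The flag
names and values below are hypothetical and are not the on-disk btrfs
block-group flags; the sketch only models the rule proposed in this mail.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical profile bits, for illustration only. */
#define TOY_PROFILE_DUP   (1u << 0)
#define TOY_PROFILE_RAID1 (1u << 1)

/*
 * May the filesystem shrink to devices_left devices?
 * RAID1 alone means the user asked for it, so two devices are required.
 * RAID1 plus DUP means it was auto-promoted, so falling back to one
 * device (and DUP) is acceptable.
 */
bool toy_can_shrink_to(unsigned int chunk_flags, unsigned int devices_left)
{
	if (devices_left == 0)
		return false;
	if (!(chunk_flags & TOY_PROFILE_RAID1))
		return true;
	if (devices_left >= 2)
		return true;
	return (chunk_flags & TOY_PROFILE_DUP) != 0;
}
```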




Re: BUG at fs/btrfs/inode.c:841!

2011-12-05 Thread Miao Xie
On mon, 05 Dec 2011 14:49:16 +0100, Jan Schmidt wrote:
 While running xfstest 013 with Chris' for-linus on an ssd I hit the
 following bug. Before that, the system was freshly booted and all I did was
 insmod and start ./check in the xfstests directory. I cannot
 reproduce it so far:
 
 Dec  5 14:22:18 oglaroon kernel: [ 1284.890509] device fsid
 f3549a38-35ea-4fc4-a027-d473bc226620 devid 1 transid 15537 /dev/sdv2
 Dec  5 14:22:18 oglaroon kernel: [ 1284.894758] Btrfs detected SSD
 devices, enabling SSD mode
 Dec  5 14:22:23 oglaroon kernel: [ 1286.466817] [ cut here
 ]
 Dec  5 14:22:23 oglaroon kernel: [ 1286.522351] kernel BUG at
 fs/btrfs/inode.c:841!
 Dec  5 14:22:23 oglaroon kernel: [ 1286.576855] invalid opcode: 
 [#1] SMP
 Dec  5 14:22:23 oglaroon kernel: [ 1286.626464] CPU 1
 Dec  5 14:22:23 oglaroon kernel: [ 1286.648512] Modules linked in: btrfs
 mpt2sas scsi_transport_sas raid_class
 Dec  5 14:22:23 oglaroon kernel: [ 1286.733894]
 Dec  5 14:22:23 oglaroon kernel: [ 1286.751785] Pid: 27856, comm:
 fsstress Not tainted 3.1.0+ #9 Supermicro X8SIL/X8SIL
 Dec  5 14:22:23 oglaroon kernel: [ 1286.844197] RIP:
 0010:[a00712e3]  [a00712e3]
 cow_file_range+0x383/0x3a0 [btrfs]
 Dec  5 14:22:23 oglaroon kernel: [ 1286.953334] RSP:
 0018:880234fe59c8  EFLAGS: 00010286
 Dec  5 14:22:23 oglaroon kernel: [ 1287.017092] RAX: ffe4
 RBX: 8802271270e0 RCX: 0020
 Dec  5 14:22:23 oglaroon kernel: [ 1287.102783] RDX: 0001
 RSI: 0001 RDI: 880232a1
 Dec  5 14:22:23 oglaroon kernel: [ 1287.188582] RBP: 880234fe5a78
 R08:  R09: 0002
 Dec  5 14:22:23 oglaroon kernel: [ 1287.274383] R10: 
 R11: 0001 R12: 000b8000
 Dec  5 14:22:23 oglaroon kernel: [ 1287.360182] R13: 1000
 R14: 000b7fff R15: 8802271273e0
 Dec  5 14:22:23 oglaroon kernel: [ 1287.445880] FS:
 7f1ca1e62700() GS:88023fc4() knlGS:
 Dec  5 14:22:23 oglaroon kernel: [ 1287.543119] CS:  0010 DS:  ES:
  CR0: 8005003b
 Dec  5 14:22:23 oglaroon kernel: [ 1287.612175] CR2: 7f1ca13515f0
 CR3: 000233888000 CR4: 06e0
 Dec  5 14:22:23 oglaroon kernel: [ 1287.698079] DR0: 
 DR1:  DR2: 
 Dec  5 14:22:23 oglaroon kernel: [ 1287.783878] DR3: 
 DR6: 0ff0 DR7: 0400
 Dec  5 14:22:23 oglaroon kernel: [ 1287.869677] Process fsstress (pid:
 27856, threadinfo 880234fe4000, task 88022f3f3ea0)
 Dec  5 14:22:23 oglaroon kernel: [ 1287.972118] Stack:
 Dec  5 14:22:23 oglaroon kernel: [ 1287.996247]  
 880234fe5a28 0001 0018
 Dec  5 14:22:23 oglaroon kernel: [ 1288.085687]  
 880227127118 ea0008a039c0 8802345d9708
 Dec  5 14:22:23 oglaroon kernel: [ 1288.175023]  88022f19a800
 1000 1000 8802271270d8
 Dec  5 14:22:23 oglaroon kernel: [ 1288.264474] Call Trace:
 Dec  5 14:22:23 oglaroon kernel: [ 1288.293802]  [a0072237]
 run_delalloc_range+0x347/0x380 [btrfs]
 Dec  5 14:22:23 oglaroon kernel: [ 1288.374434]  [a0085c18]
 __extent_writepage+0x598/0x720 [btrfs]
 Dec  5 14:22:23 oglaroon kernel: [ 1288.454995]  [813af9e6] ?
 radix_tree_gang_lookup_tag_slot+0x96/0xe0
 Dec  5 14:22:23 oglaroon kernel: [ 1288.540793]  [8113607b] ?
 find_get_pages_tag+0x11b/0x1b0
 Dec  5 14:22:23 oglaroon kernel: [ 1288.615056]  [a008626a]
 T.1043+0x2fa/0x3a0 [btrfs]
 Dec  5 14:22:23 oglaroon kernel: [ 1288.683073]  [a008634f]
 extent_writepages+0x3f/0x60 [btrfs]
 Dec  5 14:22:23 oglaroon kernel: [ 1288.760552]  [a006d0b0] ?
 acls_after_inode_item+0xd0/0xd0 [btrfs]
 Dec  5 14:22:23 oglaroon kernel: [ 1288.844193]  [813a9cbd] ?
 _atomic_dec_and_lock+0x4d/0x70
 Dec  5 14:22:23 oglaroon kernel: [ 1288.918526]  [a006a842]
 btrfs_writepages+0x22/0x30 [btrfs]
 Dec  5 14:22:23 oglaroon kernel: [ 1288.995116]  [8113deff]
 do_writepages+0x1f/0x40
 Dec  5 14:22:23 oglaroon kernel: [ 1289.060076]  [81134970]
 __filemap_fdatawrite_range+0x80/0x90
 Dec  5 14:22:23 oglaroon kernel: [ 1289.138481]  [81134b97]
 filemap_flush+0x17/0x20
 Dec  5 14:22:23 oglaroon kernel: [ 1289.203384]  [a0068f52]
 btrfs_start_delalloc_inodes+0xd2/0x230 [btrfs]
 Dec  5 14:22:23 oglaroon kernel: [ 1289.292296]  [811a9c30] ?
 __sync_filesystem+0x80/0x80
 Dec  5 14:22:23 oglaroon kernel: [ 1289.363540]  [a00439b0]
 btrfs_sync_fs+0x30/0xc0 [btrfs]
 Dec  5 14:22:23 oglaroon kernel: [ 1289.436882]  [811a9c09]
 __sync_filesystem+0x59/0x80
 Dec  5 14:22:23 oglaroon kernel: [ 1289.506017]  [811a9c47]
 sync_one_sb+0x17/0x20
 Dec  5 14:22:23 oglaroon kernel: [ 1289.568816]  [8118043e]
 iterate_supers+0x7e/0xe0
 Dec  5 14:22:23 oglaroon kernel: [ 

Re: [GIT PULL] Btrfs updates

2011-12-05 Thread Miao Xie
On Mon, 5 Dec 2011 08:14:13 -0500, Chris Mason wrote:
 On Mon, Dec 05, 2011 at 04:10:49PM +0800, Miao Xie wrote:
 Hi, Chris and Oliva

 On thu, 1 Dec 2011 10:39:55 -0500, Chris Mason wrote:
 git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus

 Has our current set of fixes.  This is fairly small, Alexandre Oliva has
 been chasing problems in our block allocator and kicked out important
 fixes.

 Jan Schmidt fixed a merge error in the raid repair code, we're now
 properly repairing failed blocks (io errors or crc errors) without
 having to run a scrub.

 Alexandre Oliva (5) commits (+8/-8):
 Btrfs: skip block groups without enough space for a cluster (+1/-1)

 This patch introduce a bug that we can not allocate blocks from the cluster
 with enough space and it may make the block allocation fail.

 This is because the check that the above patch introduced make the allocator
 skip the cluster allocation, and jump to the uncluster allocation without
 reclaiming all the blocks in the cluster, At this time, if all the free space
 is in the cluster, and no space in the block group, the allocation will fail.
 (we can trigger this bug on SSD.)

 Fortunately, the following patch written by Oliva can fix this bug.

 [PATCH 08/20] Btrfs: try to allocate from cluster even at LOOP_NO_EMPTY_SIZE
 
 Thanks, I'll push this 08/20 out as well.

I'm sorry, my testing was careless.
I tested again just now and found that the above patch does not fix the bug
completely; we still need

  [PATCH 16/20] Btrfs: try cluster but don't advance in search list

After applying these two patches, all my tests pass.

Thanks
Miao


[PATCH 1/2] vfs: make writeback_in_progress() inline

2011-12-05 Thread Miao Xie
writeback_in_progress() is very simple, and we will use writeback_in_progress()
in the module, so make it inline.

Signed-off-by: Miao Xie mi...@cn.fujitsu.com
---
 fs/fs-writeback.c           |   12 ------------
 include/linux/backing-dev.h |   12 +++++++++++-
 2 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 04cf3b9..341448c 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -59,18 +59,6 @@ struct wb_writeback_work {
  */
 int nr_pdflush_threads;
 
-/**
- * writeback_in_progress - determine whether there is writeback in progress
- * @bdi: the device's backing_dev_info structure.
- *
- * Determine whether there is writeback waiting to be handled against a
- * backing device.
- */
-int writeback_in_progress(struct backing_dev_info *bdi)
-{
-	return test_bit(BDI_writeback_running, &bdi->state);
-}
-
 static inline struct backing_dev_info *inode_to_bdi(struct inode *inode)
 {
 	struct super_block *sb = inode->i_sb;
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 3b2f9cb..ae4d7c0 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -258,7 +258,17 @@ int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ratio);
 extern struct backing_dev_info default_backing_dev_info;
 extern struct backing_dev_info noop_backing_dev_info;
 
-int writeback_in_progress(struct backing_dev_info *bdi);
+/**
+ * writeback_in_progress - determine whether there is writeback in progress
+ * @bdi: the device's backing_dev_info structure.
+ *
+ * Determine whether there is writeback waiting to be handled against a
+ * backing device.
+ */
+static inline int writeback_in_progress(struct backing_dev_info *bdi)
+{
+	return test_bit(BDI_writeback_running, &bdi->state);
+}
 
 static inline int bdi_congested(struct backing_dev_info *bdi, int bdi_bits)
 {
-- 
1.7.6.4
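
A one-line predicate defined as static inline in a header is compiled into
every file that includes it, so a module can use it without relying on a
symbol from the core kernel. A minimal user-space sketch of that shape
follows; the toy_ names and the bit number are hypothetical and are not the
kernel's definitions.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical state bit; the real enum lives in the kernel header. */
enum toy_bdi_state { TOY_BDI_WRITEBACK_RUNNING = 3 };

struct toy_bdi {
	unsigned long state;
};

/* Non-atomic bit test on an array of unsigned long words. */
static inline bool toy_test_bit(unsigned int nr, const unsigned long *addr)
{
	return (addr[nr / TOY_BITS_PER_LONG] >> (nr % TOY_BITS_PER_LONG)) & 1UL;
}

/* Header-style helper: every includer gets its own inlined copy. */
static inline bool toy_writeback_in_progress(const struct toy_bdi *bdi)
{
	return toy_test_bit(TOY_BDI_WRITEBACK_RUNNING, &bdi->state);
}
```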


[PATCH 2/2] Btrfs: fix deadlock on sb->s_umount when doing umount

2011-12-05 Thread Miao Xie
The reason for the deadlock is:
  Task                            Btrfs-cleaner
  umount()
    down_write(&s->s_umount)
    close_ctree()
      wait for the end of
      btrfs-cleaner
                                  start_transaction
                                    reserve space
                                      shrink_delalloc()
                                        writeback_inodes_sb_nr_if_idle()
                                          down_read(&sb->s_umount)
So the two tasks deadlock.

We fix it by trying to lock s_umount; if the trylock fails, the fs is being
remounted or unmounted. In that case we use btrfs's own sync function to
sync all the delalloc files. This may waste a lot of time, but since it is a
corner case, we needn't care.

Reported-by: Tsutomu Itoh t-i...@jp.fujitsu.com
Signed-off-by: Miao Xie mi...@cn.fujitsu.com
---
 fs/btrfs/extent-tree.c |   23 ++++++++++++++++++++++-
 1 files changed, 22 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 813c6bb..86c295d 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3372,6 +3372,27 @@ out:
 	return ret;
 }
 
+void btrfs_writeback_inodes_sb_nr(struct btrfs_root *root,
+				  unsigned long nr_pages)
+{
+	struct super_block *sb = root->fs_info->sb;
+
+	if (writeback_in_progress(sb->s_bdi))
+		return;
+
+	/*
+	 * If we can not get s_umount, it means the fs is on remounting or
+	 * umounting. At this time, we just sync all the delalloc file.
+	 */
+	if (down_read_trylock(&sb->s_umount)) {
+		writeback_inodes_sb_nr(sb, nr_pages);
+		up_read(&sb->s_umount);
+	} else {
+		btrfs_start_delalloc_inodes(root, 0);
+		btrfs_wait_ordered_extents(root, 0, 0);
+	}
+}
+
 /*
  * shrink metadata reservation for delalloc
  */
@@ -3416,7 +3437,7 @@ static int shrink_delalloc(struct btrfs_root *root, u64 to_reclaim,
 		smp_mb();
 		nr_pages = min_t(unsigned long, nr_pages,
 		       root->fs_info->delalloc_bytes >> PAGE_CACHE_SHIFT);
-		writeback_inodes_sb_nr_if_idle(root->fs_info->sb, nr_pages);
+		btrfs_writeback_inodes_sb_nr(root, nr_pages);
 
 		spin_lock(&space_info->lock);
 		if (reserved > space_info->bytes_may_use)
-- 
1.7.6.4
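
The trylock-with-fallback idea in this patch can be sketched in user space as
follows. This is a toy model under stated assumptions: toy_rwsem is a
hypothetical reader/writer lock built on C11 atomics, not the kernel's
rw_semaphore, and the two return values merely name which path would run.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy lock state: -1 = held by a writer, n >= 0 = n active readers. */
struct toy_rwsem {
	atomic_int state;
};

bool toy_down_read_trylock(struct toy_rwsem *sem)
{
	int cur = atomic_load(&sem->state);

	/* Retry while no writer holds the lock; never block. */
	while (cur >= 0) {
		if (atomic_compare_exchange_weak(&sem->state, &cur, cur + 1))
			return true;
	}
	return false;
}

void toy_up_read(struct toy_rwsem *sem)
{
	atomic_fetch_sub(&sem->state, 1);
}

bool toy_down_write_trylock(struct toy_rwsem *sem)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&sem->state, &expected, -1);
}

void toy_up_write(struct toy_rwsem *sem)
{
	atomic_store(&sem->state, 0);
}

enum toy_flush_path { TOY_FLUSH_WRITEBACK, TOY_FLUSH_FALLBACK_SYNC };

/* Never sleeps on the lock: either flush under it or take the fallback. */
enum toy_flush_path toy_flush_delalloc(struct toy_rwsem *s_umount)
{
	if (toy_down_read_trylock(s_umount)) {
		/* The generic writeback call would go here. */
		toy_up_read(s_umount);
		return TOY_FLUSH_WRITEBACK;
	}
	/* A writer (umount/remount) holds the lock: use the fs's own sync. */
	return TOY_FLUSH_FALLBACK_SYNC;
}
```

Because the flushing side never waits for the lock, it cannot end up waiting
on a writer that is itself waiting for the flushing side to finish.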


Re: [PATCH 2/2] Btrfs: fix deadlock on sb->s_umount when doing umount

2011-12-05 Thread Al Viro

 +void btrfs_writeback_inodes_sb_nr(struct btrfs_root *root,
 +				  unsigned long nr_pages)
 +{
 +	struct super_block *sb = root->fs_info->sb;
 +
 +	if (writeback_in_progress(sb->s_bdi))
 +		return;
 +
 +	/*
 +	 * If we can not get s_umount, it means the fs is on remounting or
 +	 * umounting. At this time, we just sync all the delalloc file.
 +	 */
 +	if (down_read_trylock(&sb->s_umount)) {
 +		writeback_inodes_sb_nr(sb, nr_pages);
 +		up_read(&sb->s_umount);
 +	} else {
 +		btrfs_start_delalloc_inodes(root, 0);
 +		btrfs_wait_ordered_extents(root, 0, 0);
 +	}
 +}

If that can race with umount, what prevents sb, its ->s_bdi et al. being freed
under you?


Re: [PATCH 1/2] vfs: make writeback_in_progress() inline

2011-12-05 Thread Miao Xie
cc Fengguang
cc Linux-kernel

On tue, 06 Dec 2011 13:35:45 +0800, Miao Xie wrote:
 writeback_in_progress() is very simple, and we will use
 writeback_in_progress() in the module, so make it inline.
 
 Signed-off-by: Miao Xie mi...@cn.fujitsu.com
 ---
  fs/fs-writeback.c           |   12 ------------
  include/linux/backing-dev.h |   12 +++++++++++-
  2 files changed, 11 insertions(+), 13 deletions(-)
 
 diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
 index 04cf3b9..341448c 100644
 --- a/fs/fs-writeback.c
 +++ b/fs/fs-writeback.c
 @@ -59,18 +59,6 @@ struct wb_writeback_work {
   */
  int nr_pdflush_threads;
  
 -/**
 - * writeback_in_progress - determine whether there is writeback in progress
 - * @bdi: the device's backing_dev_info structure.
 - *
 - * Determine whether there is writeback waiting to be handled against a
 - * backing device.
 - */
 -int writeback_in_progress(struct backing_dev_info *bdi)
 -{
 -	return test_bit(BDI_writeback_running, &bdi->state);
 -}
 -
  static inline struct backing_dev_info *inode_to_bdi(struct inode *inode)
  {
  	struct super_block *sb = inode->i_sb;
 diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
 index 3b2f9cb..ae4d7c0 100644
 --- a/include/linux/backing-dev.h
 +++ b/include/linux/backing-dev.h
 @@ -258,7 +258,17 @@ int bdi_set_max_ratio(struct backing_dev_info *bdi, unsigned int max_ratio);
  extern struct backing_dev_info default_backing_dev_info;
  extern struct backing_dev_info noop_backing_dev_info;
  
 -int writeback_in_progress(struct backing_dev_info *bdi);
 +/**
 + * writeback_in_progress - determine whether there is writeback in progress
 + * @bdi: the device's backing_dev_info structure.
 + *
 + * Determine whether there is writeback waiting to be handled against a
 + * backing device.
 + */
 +static inline int writeback_in_progress(struct backing_dev_info *bdi)
 +{
 +	return test_bit(BDI_writeback_running, &bdi->state);
 +}
  
  static inline int bdi_congested(struct backing_dev_info *bdi, int bdi_bits)
  {



Re: [PATCH 2/2] Btrfs: fix deadlock on sb->s_umount when doing umount

2011-12-05 Thread Miao Xie
On tue, 6 Dec 2011 05:49:06 +, Al Viro wrote:
 
 +void btrfs_writeback_inodes_sb_nr(struct btrfs_root *root,
 +				  unsigned long nr_pages)
 +{
 +	struct super_block *sb = root->fs_info->sb;
 +
 +	if (writeback_in_progress(sb->s_bdi))
 +		return;
 +
 +	/*
 +	 * If we can not get s_umount, it means the fs is on remounting or
 +	 * umounting. At this time, we just sync all the delalloc file.
 +	 */
 +	if (down_read_trylock(&sb->s_umount)) {
 +		writeback_inodes_sb_nr(sb, nr_pages);
 +		up_read(&sb->s_umount);
 +	} else {
 +		btrfs_start_delalloc_inodes(root, 0);
 +		btrfs_wait_ordered_extents(root, 0, 0);
 +	}
 +}
 
 If that can race with umount, what prevents sb, its ->s_bdi et al. being freed
 under you?

In fact, it happened. See the following mail.

   http://marc.info/?l=linux-btrfs&m=131495252725296&w=2

The above function is called when someone wants to modify the metadata.
Btrfs waits until all the metadata operations have ended, and only then frees
->s_bdi and the other objects. So we needn't worry about those objects.
(Maybe I misunderstood what you said. If so, I'm sorry.)

Thanks
Miao

 
