Re: [PATCH] Btrfs: shrink_delalloc check bdi write congested

2012-09-30 Thread David Sterba
Hi again,

On Sun, Sep 30, 2012 at 11:11:10AM +0900, Itaru Kitayama wrote:
 This is the correct one.
 
 Signed-off-by: Itaru Kitayama kitay...@cl.bb4.ne.jp
 
 diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
 index efb044e..c032dbe 100644
 --- a/fs/btrfs/extent-tree.c
 +++ b/fs/btrfs/extent-tree.c
 @@ -3712,8 +3712,9 @@ static void shrink_delalloc(struct btrfs_root
 *root, u64 to_reclaim, u64 orig,
 while (delalloc_bytes  loops  3) {
 max_reclaim = min(delalloc_bytes, to_reclaim);
 nr_pages = max_reclaim  PAGE_CACHE_SHIFT;
 -   writeback_inodes_sb_nr_if_idle(root-fs_info-sb, nr_pages,
 -  WB_REASON_FS_FREE_SPACE);
 +   if (!bdi_write_congested(root-fs_info-sb-s_bdi))
 +   writeback_inodes_sb_nr_if_idle(root-fs_info, 
 nr_pages,
 +  
 WB_REASON_FS_FREE_SPACE);

You don't pass the same argument to writeback_inodes_sb_nr_if_idle in
the changed code, this would not compile (root-fs_info-sb), but it's
a minor thing.

I'm more interested in the background motivation of the change, it's
clear that it tries to avoid writing data if the devices are congested,
have you measured an improvement against original behaviour?

writeback_inodes_sb_nr_if_idle checks if writeback is in progress and
does not start if this is true. That way this will not hammer the device
unnecessarily.

Your code tries to avoid even more hammering of the device when the
writes do not come from writeback. This may or may not be a good thing
(and hard to guess without a few tests, or at least I don't see that).

Shrink delalloc starts the writeback for a given number of pages and
then hopes they'll be flushed so the reserved space can be reclaimed
back. If the device is congested, this will not start the writeback and
it would be very unlikely that total delalloc bytes shrinks. The rest of
the function relies on asynchronous behaviour, it's even less clear what
it would do without the writeback call. In the worst case it could block
on 'wait_event' or at 'btrfs_wait_ordered_extents' in some case, though
this is just more of a speculation.


david
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: shrink_delalloc check bdi write congested

2012-09-30 Thread Itaru Kitayama
Hi David,
Thank you for your comments. I wanted to fix a lockdep warning on a
possible deadlock
case encountered during the xfstests with a scratch space almost full.

You are right I encountered the worst scenario you described below, I
drop this patch and
I'll look at btrfs_congested_fn more to examine the mechanisms
implemented there are
working as expected.

Thanks,

Itaru

On Mon, Oct 1, 2012 at 7:29 AM, David Sterba d...@jikos.cz wrote:
 Hi again,

 On Sun, Sep 30, 2012 at 11:11:10AM +0900, Itaru Kitayama wrote:
 This is the correct one.

 Signed-off-by: Itaru Kitayama kitay...@cl.bb4.ne.jp

 diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
 index efb044e..c032dbe 100644
 --- a/fs/btrfs/extent-tree.c
 +++ b/fs/btrfs/extent-tree.c
 @@ -3712,8 +3712,9 @@ static void shrink_delalloc(struct btrfs_root
 *root, u64 to_reclaim, u64 orig,
 while (delalloc_bytes  loops  3) {
 max_reclaim = min(delalloc_bytes, to_reclaim);
 nr_pages = max_reclaim  PAGE_CACHE_SHIFT;
 -   writeback_inodes_sb_nr_if_idle(root-fs_info-sb, nr_pages,
 -  WB_REASON_FS_FREE_SPACE);
 +   if (!bdi_write_congested(root-fs_info-sb-s_bdi))
 +   writeback_inodes_sb_nr_if_idle(root-fs_info, 
 nr_pages,
 +  
 WB_REASON_FS_FREE_SPACE);

 You don't pass the same argument to writeback_inodes_sb_nr_if_idle in
 the changed code, this would not compile (root-fs_info-sb), but it's
 a minor thing.

 I'm more interested in the background motivation of the change, it's
 clear that it tries to avoid writing data if the devices are congested,
 have you measured an improvement against original behaviour?

 writeback_inodes_sb_nr_if_idle checks if writeback is in progress and
 does not start if this is true. That way this will not hammer the device
 unnecessarily.

 Your code tries to avoid even more hammering of the device when the
 writes do not come from writeback. This may or may not be a good thing
 (and hard to guess without a few tests, or at least I don't see that).

 Shrink delalloc starts the writeback for a given number of pages and
 then hopes they'll be flushed so the reserved space can be reclaimed
 back. If the device is congested, this will not start the writeback and
 it would be very unlikely that total delalloc bytes shrinks. The rest of
 the function relies on asynchronous behaviour, it's even less clear what
 it would do without the writeback call. In the worst case it could block
 on 'wait_event' or at 'btrfs_wait_ordered_extents' in some case, though
 this is just more of a speculation.


 david
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: shrink_delalloc check bdi write congested

2012-09-29 Thread David Sterba
On Sat, Sep 29, 2012 at 10:20:09PM +0900, Itaru Kitayama wrote:
 --- a/fs/btrfs/extent-tree.c
 +++ b/fs/btrfs/extent-tree.c
 @@ -3712,7 +3712,7 @@ static void shrink_delalloc(struct btrfs_root *root, 
 u64 t
 while (delalloc_bytes  loops  3) {
 max_reclaim = min(delalloc_bytes, to_reclaim);
 nr_pages = max_reclaim  PAGE_CACHE_SHIFT;
 -   writeback_inodes_sb_nr_if_idle(root-fs_info-sb, nr_pages,
 +   if (!bdi_write_congested(root-fs_info-sb-s_bdi)) 
 writeback_in
WB_REASON_FS_FREE_SPACE);

malformed patch
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: shrink_delalloc check bdi write congested

2012-09-29 Thread Itaru Kitayama
Resubmit this after the checkpatch test.

In srhink_delalloc(), writeback starts if idle, also check the bdi is
not write congested.
The patch is against the head of the btrfs-next.

Signed-off-by: Itaru Kitayama kitay...@cl.bb4.ne.jp

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index efb044e..1aae046 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3712,8 +3712,9 @@ static void shrink_delalloc(struct btrfs_root
*root, u64 to_reclaim, u64 orig,
while (delalloc_bytes  loops  3) {
max_reclaim = min(delalloc_bytes, to_reclaim);
nr_pages = max_reclaim  PAGE_CACHE_SHIFT;
-   writeback_inodes_sb_nr_if_idle(root-fs_info-sb, nr_pages,
-  WB_REASON_FS_FREE_SPACE);
+   if (!bdi_write_congested(root-fs_info-sb-s_bdi))
+   writeback_inodes_sb_nr_if_idle(root-fs_info, nr_page,
+  WB_REASON_FS_FREE_SPACE);

/*
 * We need to wait for the async pages to actually start before


On Sun, Sep 30, 2012 at 1:21 AM, David Sterba d...@jikos.cz wrote:
 On Sat, Sep 29, 2012 at 10:20:09PM +0900, Itaru Kitayama wrote:
 --- a/fs/btrfs/extent-tree.c
 +++ b/fs/btrfs/extent-tree.c
 @@ -3712,7 +3712,7 @@ static void shrink_delalloc(struct btrfs_root *root, 
 u64 t
 while (delalloc_bytes  loops  3) {
 max_reclaim = min(delalloc_bytes, to_reclaim);
 nr_pages = max_reclaim  PAGE_CACHE_SHIFT;
 -   writeback_inodes_sb_nr_if_idle(root-fs_info-sb, nr_pages,
 +   if (!bdi_write_congested(root-fs_info-sb-s_bdi)) 
 writeback_in
WB_REASON_FS_FREE_SPACE);

 malformed patch
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: shrink_delalloc check bdi write congested

2012-09-29 Thread Itaru Kitayama
This is the correct one.

Signed-off-by: Itaru Kitayama kitay...@cl.bb4.ne.jp

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index efb044e..c032dbe 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3712,8 +3712,9 @@ static void shrink_delalloc(struct btrfs_root
*root, u64 to_reclaim, u64 orig,
while (delalloc_bytes  loops  3) {
max_reclaim = min(delalloc_bytes, to_reclaim);
nr_pages = max_reclaim  PAGE_CACHE_SHIFT;
-   writeback_inodes_sb_nr_if_idle(root-fs_info-sb, nr_pages,
-  WB_REASON_FS_FREE_SPACE);
+   if (!bdi_write_congested(root-fs_info-sb-s_bdi))
+   writeback_inodes_sb_nr_if_idle(root-fs_info, nr_pages,
+  WB_REASON_FS_FREE_SPACE);

/*
 * We need to wait for the async pages to actually start before


On Sun, Sep 30, 2012 at 10:28 AM, Itaru Kitayama kitay...@cl.bb4u.ne.jp wrote:
 Resubmit this after the checkpatch test.

 In srhink_delalloc(), writeback starts if idle, also check the bdi is
 not write congested.
 The patch is against the head of the btrfs-next.

 Signed-off-by: Itaru Kitayama kitay...@cl.bb4.ne.jp

 diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
 index efb044e..1aae046 100644
 --- a/fs/btrfs/extent-tree.c
 +++ b/fs/btrfs/extent-tree.c
 @@ -3712,8 +3712,9 @@ static void shrink_delalloc(struct btrfs_root
 *root, u64 to_reclaim, u64 orig,
 while (delalloc_bytes  loops  3) {
 max_reclaim = min(delalloc_bytes, to_reclaim);
 nr_pages = max_reclaim  PAGE_CACHE_SHIFT;
 -   writeback_inodes_sb_nr_if_idle(root-fs_info-sb, nr_pages,
 -  WB_REASON_FS_FREE_SPACE);
 +   if (!bdi_write_congested(root-fs_info-sb-s_bdi))
 +   writeback_inodes_sb_nr_if_idle(root-fs_info, nr_page,
 +  
 WB_REASON_FS_FREE_SPACE);

 /*
  * We need to wait for the async pages to actually start 
 before


 On Sun, Sep 30, 2012 at 1:21 AM, David Sterba d...@jikos.cz wrote:
 On Sat, Sep 29, 2012 at 10:20:09PM +0900, Itaru Kitayama wrote:
 --- a/fs/btrfs/extent-tree.c
 +++ b/fs/btrfs/extent-tree.c
 @@ -3712,7 +3712,7 @@ static void shrink_delalloc(struct btrfs_root *root, 
 u64 t
 while (delalloc_bytes  loops  3) {
 max_reclaim = min(delalloc_bytes, to_reclaim);
 nr_pages = max_reclaim  PAGE_CACHE_SHIFT;
 -   writeback_inodes_sb_nr_if_idle(root-fs_info-sb, nr_pages,
 +   if (!bdi_write_congested(root-fs_info-sb-s_bdi)) 
 writeback_in
WB_REASON_FS_FREE_SPACE);

 malformed patch
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html