> Only if the queue depth is not bounded. Queue depths are bounded, and
> so the distance we can go over the threshold is limited.  This is the
> fundamental principle on which the throttling is based.
> 
> Hence, if the queue is not full, then we will have either written
> dirty pages to it (i.e. wbc->nr_to_write != write_chunk, so we will
> throttle, or continue normally if write_chunk was written) or we have
> no more dirty pages left.
> 
> Having no dirty pages left on the bdi and it not being congested
> means we effectively have a clean, idle bdi. We should not be trying
> to throttle writeback here - we can't do anything to improve the
> situation by continuing to try to do writeback on this bdi, so we
> may as well give up and let the writer continue. Once we have dirty
> pages on the bdi, we'll get throttled appropriately.

OK, you convinced me.

How about this patch?  I introduced a new wbc counter that counts the
number of dirty pages encountered, including ones already under
writeback.

Dave, big thanks for your insights.

Miklos

Index: linux/include/linux/writeback.h
===================================================================
--- linux.orig/include/linux/writeback.h        2007-03-14 22:43:42.000000000 +0100
+++ linux/include/linux/writeback.h     2007-03-14 22:58:56.000000000 +0100
@@ -44,6 +44,7 @@ struct writeback_control {
        long nr_to_write;               /* Write this many pages, and decrement
                                           this for each page written */
        long pages_skipped;             /* Pages which were not written */
+       long nr_dirty;                  /* Number of dirty pages encountered */
 
        /*
         * For a_ops->writepages(): is start or end are non-zero then this is
Index: linux/mm/page-writeback.c
===================================================================
--- linux.orig/mm/page-writeback.c      2007-03-14 22:41:01.000000000 +0100
+++ linux/mm/page-writeback.c   2007-03-14 23:00:20.000000000 +0100
@@ -220,6 +220,17 @@ static void balance_dirty_pages(struct a
                        pages_written += write_chunk - wbc.nr_to_write;
                        if (pages_written >= write_chunk)
                                break;          /* We've done our duty */
+
+                       /*
+                        * If just a few dirty pages were encountered, and
+                        * the queue is not congested, then allow this dirty
+                        * producer to continue.  This resolves the deadlock
+                        * that happens when one filesystem writes back data
+                        * through another.  It should also help when a slow
+                        * device is completely blocking other writes.
+                        */
+                       if (wbc.nr_dirty < 8 && !bdi_write_congested(bdi))
+                               break;
                }
                congestion_wait(WRITE, HZ/10);
        }
@@ -612,6 +623,7 @@ retry:
                                              min(end - index, (pgoff_t)PAGEVEC_SIZE-1) + 1))) {
                unsigned i;
 
+               wbc->nr_dirty += nr_pages;
                scanned = 1;
                for (i = 0; i < nr_pages; i++) {
                        struct page *page = pvec.pages[i];