On Wed, Feb 5, 2014 at 3:17 PM, Ted Unangst <t...@tedunangst.com> wrote:
> We are missing back pressure channels from uvm to the buf cache. The
> buf cache will happily sit on 9000 free pages while uvm churns around
> trying to scavenge up one more page.

Indeed, those are its minimums (I presume, in your case), and they
are exactly the amount of memory that uvm would never even have seen
under the static cache model. So I don't agree with your statement
"we are missing back pressure channels from the uvm to the buf cache".

It looks to me like the situation you are describing is one where the
buffer cache has already backed off to its minimum (which exists to
avoid things like deadlocks in the buffer cache on delayed writes, and
fun stuff like that).
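
To make the "already backed off to its minimum" case concrete, here is
a rough toy model. It is only a sketch and not the vfs_bio.c code: the
struct, the function name and the numbers are all made up for
illustration. The point is that once the cache sits at its floor, any
further request for pages gets nothing back.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, NOT the OpenBSD buffer cache. */
struct toy_cache {
	long pages;	/* pages currently held by the cache */
	long min_pages;	/* floor kept back so the cache can make progress */
};

/*
 * Ask the cache to give back up to `want` pages.  Returns the number
 * of pages actually released, which is 0 once the cache is at (or
 * below) its floor.
 */
static long
toy_cache_backoff(struct toy_cache *c, long want)
{
	long spare;

	assert(c != NULL && want >= 0);

	spare = c->pages - c->min_pages;
	if (spare <= 0)
		return 0;	/* already at the minimum: nothing to give */
	if (want > spare)
		want = spare;	/* never dip below the floor */
	c->pages -= want;
	return want;
}
```

With a floor of 9000 pages, a cache holding 9000 pages returns 0 here
no matter how hard the caller pushes, which is the state I suspect you
are looking at.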

Or are you in a situation here where the cache has *not* backed off?


>
> Fixing this is beyond the scope of a simple diff, but here's something
> that seems to help in a lot of the common cases, particularly the pla
> deadlock detected spin cycle that people see.
>
> If we're out of memory, kick the buf cache into releasing some
> clean pages. The buf cache may eventually find itself sorely in need
> of memory and unable to get it, but this is better than nothing. I've
> deliberately saved the back pressure until we're already on the about to
> die path to minimize regressions. uvm won't steal back memory unless
> it absolutely has to.

And for the reasons you say, I think this has great potential to move
the deadlock into the buffer cache on small memory machines when we
get the entire cache filled with delwri...

Sure, we can make the minimums smaller, but in the end we still have
to fix the page daemon, or we are just delaying the inevitable or
moving the deadlock to other subsystems.




>
> Index: kern/vfs_bio.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/vfs_bio.c,v
> retrieving revision 1.154
> diff -u -p -r1.154 vfs_bio.c
> --- kern/vfs_bio.c      25 Jan 2014 04:23:31 -0000      1.154
> +++ kern/vfs_bio.c      5 Feb 2014 22:08:07 -0000
> @@ -305,6 +305,26 @@ bufadjust(int newbufpages)
>         splx(s);
>  }
>
> +int
> +buf_nukeclean(void)
> +{
> +       struct buf *bp;
> +       int n;
> +
> +       printf("nuking clean bufs\n");
> +       n = 0;
> +       while ((bp = TAILQ_FIRST(&bufqueues[BQ_CLEAN])) && n++ < 10) {
> +               bremfree(bp);
> +               if (bp->b_vp) {
> +                       RB_REMOVE(buf_rb_bufs,
> +                           &bp->b_vp->v_bufs_tree, bp);
> +                       brelvp(bp);
> +               }
> +               buf_put(bp);
> +       }
> +       return (n);
> +}
> +
>  /*
>   * Make the buffer cache back off from cachepct.
>   */
> Index: sys/buf.h
> ===================================================================
> RCS file: /cvs/src/sys/sys/buf.h,v
> retrieving revision 1.93
> diff -u -p -r1.93 buf.h
> --- sys/buf.h   21 Nov 2013 01:16:52 -0000      1.93
> +++ sys/buf.h   5 Feb 2014 22:04:09 -0000
> @@ -312,6 +312,8 @@ void        buf_fix_mapping(struct buf *, vsize
>  void   buf_alloc_pages(struct buf *, vsize_t);
>  void   buf_free_pages(struct buf *);
>
> +int    buf_nukeclean(void);
> +
>
>  void   minphys(struct buf *bp);
>  int    physio(void (*strategy)(struct buf *), dev_t dev, int flags,
> Index: uvm/uvm_pdaemon.c
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_pdaemon.c,v
> retrieving revision 1.64
> diff -u -p -r1.64 uvm_pdaemon.c
> --- uvm/uvm_pdaemon.c   30 May 2013 16:29:46 -0000      1.64
> +++ uvm/uvm_pdaemon.c   5 Feb 2014 22:04:15 -0000
> @@ -116,7 +116,7 @@ uvm_wait(const char *wmsg)
>          * check for page daemon going to sleep (waiting for itself)
>          */
>
> -       if (curproc == uvm.pagedaemon_proc) {
> +       if (curproc == uvm.pagedaemon_proc && buf_nukeclean() == 0) {
>                 /*
>                  * now we have a problem: the pagedaemon wants to go to
>                  * sleep until it frees more memory.   but how can it
> Index: uvm/uvm_pmemrange.c
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_pmemrange.c,v
> retrieving revision 1.36
> diff -u -p -r1.36 uvm_pmemrange.c
> --- uvm/uvm_pmemrange.c 29 Jan 2013 19:55:48 -0000      1.36
> +++ uvm/uvm_pmemrange.c 5 Feb 2014 22:07:37 -0000
> @@ -22,6 +22,7 @@
>  #include <sys/malloc.h>
>  #include <sys/proc.h>          /* XXX for atomic */
>  #include <sys/kernel.h>
> +#include <sys/buf.h>
>
>  /*
>   * 2 trees: addr tree and size tree.
> @@ -1883,6 +1884,13 @@ uvm_wait_pla(paddr_t low, paddr_t high,
>         const char *wmsg = "pmrwait";
>
>         if (curproc == uvm.pagedaemon_proc) {
> +               uvm_unlock_fpageq();
> +               if (buf_nukeclean() != 0) {
> +                       uvm_lock_fpageq();
> +                       return 0;
> +               }
> +               uvm_lock_fpageq();
> +
>                 /*
>                  * XXX detect pagedaemon deadlock - see comment in
>                  * uvm_wait(), as this is exactly the same issue.
>
>
