Re: svn commit: r364129 - head/sys/vm

2020-08-11 Thread Conrad Meyer
Done in r364134.

On Tue, Aug 11, 2020 at 3:31 PM Mark Johnston  wrote:
>
> On Tue, Aug 11, 2020 at 03:21:31PM -0700, Conrad Meyer wrote:
> > Hi Konstantin, Mark (raised the same question),
> >
> > On Tue, Aug 11, 2020 at 2:32 PM Konstantin Belousov  
> > wrote:
> > >
> > > On Tue, Aug 11, 2020 at 08:37:45PM +, Conrad Meyer wrote:
> > > > Author: cem
> > > > Date: Tue Aug 11 20:37:45 2020
> > > > New Revision: 364129
> > > > URL: https://svnweb.freebsd.org/changeset/base/364129
> > > >
> > > > Log:
> > > >   Add support for multithreading the inactive queue pageout within a 
> > > > domain.
> > > > ...
> > > > @@ -2488,7 +2488,7 @@ vm_page_zone_import(void *arg, void **store, int 
> > > > cnt,
> > > >* main purpose is to replenish the store of free pages.
> > > >*/
> > > >   if (vmd->vmd_severeset || curproc == pageproc ||
> > > > - !_vm_domain_allocate(vmd, VM_ALLOC_NORMAL, cnt))
> > > > + !_vm_domain_allocate(vmd, VM_ALLOC_SYSTEM, cnt))
> > >
> > > Why this change needed ?
> >
> > The change was inherited from Jeff, along with the rest of it.  I
> > don't know why he changed it, but it does seem orthogonal to the rest
> > of the revision.  This part is nonessential as far as I know, and
> > could be backed out.
>
> To me it doesn't make sense.  The difference between VM_ALLOC_NORMAL and
> _SYSTEM is that they use different thresholds for the free page count
> before deciding whether to continue the allocation.  The _NORMAL
> threshold is higher than the _SYSTEM threshold, but both are smaller
> than the "severe" threshold at which we set vmd_severeset.  In other
> words, if the free page count is such that a _NORMAL allocation would
> fail but a _SYSTEM allocation would succeed, then vmd_severeset would be
> set and we'd never get to the _vmd_domain_allocate() call to begin with.
>
> So, I think this part of the change should be reverted.  Even if the
> analysis is incorrect, it's logically separate from the rest of the
> diff.  Prior to r355003 it made more sense, but per that commit it can
> be harmful to populate the per-CPU page caches when the system is
> severely short on free pages, so I would disagree with it anyway without
> more testing.
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r364129 - head/sys/vm

2020-08-11 Thread Mark Johnston
On Tue, Aug 11, 2020 at 03:21:31PM -0700, Conrad Meyer wrote:
> Hi Konstantin, Mark (raised the same question),
> 
> On Tue, Aug 11, 2020 at 2:32 PM Konstantin Belousov  
> wrote:
> >
> > On Tue, Aug 11, 2020 at 08:37:45PM +, Conrad Meyer wrote:
> > > Author: cem
> > > Date: Tue Aug 11 20:37:45 2020
> > > New Revision: 364129
> > > URL: https://svnweb.freebsd.org/changeset/base/364129
> > >
> > > Log:
> > >   Add support for multithreading the inactive queue pageout within a 
> > > domain.
> > > ...
> > > @@ -2488,7 +2488,7 @@ vm_page_zone_import(void *arg, void **store, int 
> > > cnt,
> > >* main purpose is to replenish the store of free pages.
> > >*/
> > >   if (vmd->vmd_severeset || curproc == pageproc ||
> > > - !_vm_domain_allocate(vmd, VM_ALLOC_NORMAL, cnt))
> > > + !_vm_domain_allocate(vmd, VM_ALLOC_SYSTEM, cnt))
> >
> > Why this change needed ?
> 
> The change was inherited from Jeff, along with the rest of it.  I
> don't know why he changed it, but it does seem orthogonal to the rest
> of the revision.  This part is nonessential as far as I know, and
> could be backed out.

To me it doesn't make sense.  The difference between VM_ALLOC_NORMAL and
_SYSTEM is that they use different thresholds for the free page count
before deciding whether to continue the allocation.  The _NORMAL
threshold is higher than the _SYSTEM threshold, but both are smaller
than the "severe" threshold at which we set vmd_severeset.  In other
words, if the free page count is such that a _NORMAL allocation would
fail but a _SYSTEM allocation would succeed, then vmd_severeset would be
set and we'd never get to the _vmd_domain_allocate() call to begin with.

So, I think this part of the change should be reverted.  Even if the
analysis is incorrect, it's logically separate from the rest of the
diff.  Prior to r355003 it made more sense, but per that commit it can
be harmful to populate the per-CPU page caches when the system is
severely short on free pages, so I would disagree with it anyway without
more testing.
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r364129 - head/sys/vm

2020-08-11 Thread Conrad Meyer
Hi Konstantin, Mark (raised the same question),

On Tue, Aug 11, 2020 at 2:32 PM Konstantin Belousov  wrote:
>
> On Tue, Aug 11, 2020 at 08:37:45PM +, Conrad Meyer wrote:
> > Author: cem
> > Date: Tue Aug 11 20:37:45 2020
> > New Revision: 364129
> > URL: https://svnweb.freebsd.org/changeset/base/364129
> >
> > Log:
> >   Add support for multithreading the inactive queue pageout within a domain.
> > ...
> > @@ -2488,7 +2488,7 @@ vm_page_zone_import(void *arg, void **store, int cnt,
> >* main purpose is to replenish the store of free pages.
> >*/
> >   if (vmd->vmd_severeset || curproc == pageproc ||
> > - !_vm_domain_allocate(vmd, VM_ALLOC_NORMAL, cnt))
> > + !_vm_domain_allocate(vmd, VM_ALLOC_SYSTEM, cnt))
>
> Why this change needed ?

The change was inherited from Jeff, along with the rest of it.  I
don't know why he changed it, but it does seem orthogonal to the rest
of the revision.  This part is nonessential as far as I know, and
could be backed out.

Best,
Conrad
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r364129 - head/sys/vm

2020-08-11 Thread Mark Johnston
On Tue, Aug 11, 2020 at 08:37:45PM +, Conrad Meyer wrote:
> Author: cem
> Date: Tue Aug 11 20:37:45 2020
> New Revision: 364129
> URL: https://svnweb.freebsd.org/changeset/base/364129
> 
> Log:
>   Add support for multithreading the inactive queue pageout within a domain.
>   
>   In very high throughput workloads, the inactive scan can become overwhelmed
>   as you have many cores producing pages and a single core freeing.  Since
>   Mark's introduction of batched pagequeue operations, we can now run multiple
>   inactive threads working on independent batches.
>   
>   To avoid confusing the pid and other control algorithms, I (Jeff) do this in
>   a mpi-like fan out and collect model that is driven from the primary page
>   daemon.  It decides whether the shortfall can be overcome with a single
>   thread and if not dispatches multiple threads and waits for their results.
>   
>   The heuristic is based on timing the pageout activity and averaging a
>   pages-per-second variable which is exponentially decayed. This is visible in
>   sysctl and may be interesting for other purposes.
>   
>   I (Jeff) have verified that this does indeed double our paging throughput
>   when used with two threads. With four we tend to run into other contention
>   problems.  For now I would like to commit this infrastructure with only a
>   single thread enabled.
>   
>   The number of worker threads per domain can be controlled with the
>   'vm.pageout_threads_per_domain' tunable.
>   
>   Submitted by:   jeff (earlier version)
>   Discussed with: markj
>   Tested by:  pho
>   Sponsored by:   probably Netflix (based on contemporary commits)
>   Differential Revision:  https://reviews.freebsd.org/D21629
> 
> Modified:
>   head/sys/vm/vm_meter.c
>   head/sys/vm/vm_page.c
>   head/sys/vm/vm_page.h
>   head/sys/vm/vm_pageout.c
>   head/sys/vm/vm_pagequeue.h
> 
> Modified: head/sys/vm/vm_page.c
> ==
> --- head/sys/vm/vm_page.c Tue Aug 11 17:54:10 2020(r364128)
> +++ head/sys/vm/vm_page.c Tue Aug 11 20:37:45 2020(r364129)
> @@ -421,7 +421,7 @@ sysctl_vm_page_blacklist(SYSCTL_HANDLER_ARGS)
>   * In principle, this function only needs to set the flag PG_MARKER.
>   * Nonetheless, it write busies the page as a safety precaution.
>   */
> -static void
> +void
>  vm_page_init_marker(vm_page_t marker, int queue, uint16_t aflags)
>  {
>  
> @@ -2488,7 +2488,7 @@ vm_page_zone_import(void *arg, void **store, int cnt, 
>* main purpose is to replenish the store of free pages.
>*/
>   if (vmd->vmd_severeset || curproc == pageproc ||
> - !_vm_domain_allocate(vmd, VM_ALLOC_NORMAL, cnt))
> + !_vm_domain_allocate(vmd, VM_ALLOC_SYSTEM, cnt))

Why did this change?  It does not look related to the rest of the
change.
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r364129 - head/sys/vm

2020-08-11 Thread Konstantin Belousov
On Tue, Aug 11, 2020 at 08:37:45PM +, Conrad Meyer wrote:
> Author: cem
> Date: Tue Aug 11 20:37:45 2020
> New Revision: 364129
> URL: https://svnweb.freebsd.org/changeset/base/364129
> 
> Log:
>   Add support for multithreading the inactive queue pageout within a domain.
>   
>   In very high throughput workloads, the inactive scan can become overwhelmed
>   as you have many cores producing pages and a single core freeing.  Since
>   Mark's introduction of batched pagequeue operations, we can now run multiple
>   inactive threads working on independent batches.
>   
>   To avoid confusing the pid and other control algorithms, I (Jeff) do this in
>   a mpi-like fan out and collect model that is driven from the primary page
>   daemon.  It decides whether the shortfall can be overcome with a single
>   thread and if not dispatches multiple threads and waits for their results.
>   
>   The heuristic is based on timing the pageout activity and averaging a
>   pages-per-second variable which is exponentially decayed. This is visible in
>   sysctl and may be interesting for other purposes.
>   
>   I (Jeff) have verified that this does indeed double our paging throughput
>   when used with two threads. With four we tend to run into other contention
>   problems.  For now I would like to commit this infrastructure with only a
>   single thread enabled.
>   
>   The number of worker threads per domain can be controlled with the
>   'vm.pageout_threads_per_domain' tunable.
>   
>   Submitted by:   jeff (earlier version)
>   Discussed with: markj
>   Tested by:  pho
>   Sponsored by:   probably Netflix (based on contemporary commits)
>   Differential Revision:  https://reviews.freebsd.org/D21629
> 
> Modified:
>   head/sys/vm/vm_meter.c
>   head/sys/vm/vm_page.c
>   head/sys/vm/vm_page.h
>   head/sys/vm/vm_pageout.c
>   head/sys/vm/vm_pagequeue.h
> 
> Modified: head/sys/vm/vm_meter.c
> ==
> --- head/sys/vm/vm_meter.cTue Aug 11 17:54:10 2020(r364128)
> +++ head/sys/vm/vm_meter.cTue Aug 11 20:37:45 2020(r364129)
> @@ -552,6 +552,9 @@ vm_domain_stats_init(struct vm_domain *vmd, struct sys
>   SYSCTL_ADD_UINT(NULL, SYSCTL_CHILDREN(oid), OID_AUTO,
>   "free_severe", CTLFLAG_RD, >vmd_free_severe, 0,
>   "Severe free pages");
> + SYSCTL_ADD_UINT(NULL, SYSCTL_CHILDREN(oid), OID_AUTO,
> + "inactive_pps", CTLFLAG_RD, >vmd_inactive_pps, 0,
> + "inactive pages freed/second");
>  
>  }
>  
> 
> Modified: head/sys/vm/vm_page.c
> ==
> --- head/sys/vm/vm_page.c Tue Aug 11 17:54:10 2020(r364128)
> +++ head/sys/vm/vm_page.c Tue Aug 11 20:37:45 2020(r364129)
> @@ -421,7 +421,7 @@ sysctl_vm_page_blacklist(SYSCTL_HANDLER_ARGS)
>   * In principle, this function only needs to set the flag PG_MARKER.
>   * Nonetheless, it write busies the page as a safety precaution.
>   */
> -static void
> +void
>  vm_page_init_marker(vm_page_t marker, int queue, uint16_t aflags)
>  {
>  
> @@ -2488,7 +2488,7 @@ vm_page_zone_import(void *arg, void **store, int cnt, 
>* main purpose is to replenish the store of free pages.
>*/
>   if (vmd->vmd_severeset || curproc == pageproc ||
> - !_vm_domain_allocate(vmd, VM_ALLOC_NORMAL, cnt))
> + !_vm_domain_allocate(vmd, VM_ALLOC_SYSTEM, cnt))
Why this change needed ?

>   return (0);
>   domain = vmd->vmd_domain;
>   vm_domain_free_lock(vmd);
> 
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


svn commit: r364129 - head/sys/vm

2020-08-11 Thread Conrad Meyer
Author: cem
Date: Tue Aug 11 20:37:45 2020
New Revision: 364129
URL: https://svnweb.freebsd.org/changeset/base/364129

Log:
  Add support for multithreading the inactive queue pageout within a domain.
  
  In very high throughput workloads, the inactive scan can become overwhelmed
  as you have many cores producing pages and a single core freeing.  Since
  Mark's introduction of batched pagequeue operations, we can now run multiple
  inactive threads working on independent batches.
  
  To avoid confusing the pid and other control algorithms, I (Jeff) do this in
  a mpi-like fan out and collect model that is driven from the primary page
  daemon.  It decides whether the shortfall can be overcome with a single
  thread and if not dispatches multiple threads and waits for their results.
  
  The heuristic is based on timing the pageout activity and averaging a
  pages-per-second variable which is exponentially decayed. This is visible in
  sysctl and may be interesting for other purposes.
  
  I (Jeff) have verified that this does indeed double our paging throughput
  when used with two threads. With four we tend to run into other contention
  problems.  For now I would like to commit this infrastructure with only a
  single thread enabled.
  
  The number of worker threads per domain can be controlled with the
  'vm.pageout_threads_per_domain' tunable.
  
  Submitted by: jeff (earlier version)
  Discussed with:   markj
  Tested by:pho
  Sponsored by: probably Netflix (based on contemporary commits)
  Differential Revision:https://reviews.freebsd.org/D21629

Modified:
  head/sys/vm/vm_meter.c
  head/sys/vm/vm_page.c
  head/sys/vm/vm_page.h
  head/sys/vm/vm_pageout.c
  head/sys/vm/vm_pagequeue.h

Modified: head/sys/vm/vm_meter.c
==
--- head/sys/vm/vm_meter.c  Tue Aug 11 17:54:10 2020(r364128)
+++ head/sys/vm/vm_meter.c  Tue Aug 11 20:37:45 2020(r364129)
@@ -552,6 +552,9 @@ vm_domain_stats_init(struct vm_domain *vmd, struct sys
SYSCTL_ADD_UINT(NULL, SYSCTL_CHILDREN(oid), OID_AUTO,
"free_severe", CTLFLAG_RD, >vmd_free_severe, 0,
"Severe free pages");
+   SYSCTL_ADD_UINT(NULL, SYSCTL_CHILDREN(oid), OID_AUTO,
+   "inactive_pps", CTLFLAG_RD, >vmd_inactive_pps, 0,
+   "inactive pages freed/second");
 
 }
 

Modified: head/sys/vm/vm_page.c
==
--- head/sys/vm/vm_page.c   Tue Aug 11 17:54:10 2020(r364128)
+++ head/sys/vm/vm_page.c   Tue Aug 11 20:37:45 2020(r364129)
@@ -421,7 +421,7 @@ sysctl_vm_page_blacklist(SYSCTL_HANDLER_ARGS)
  * In principle, this function only needs to set the flag PG_MARKER.
  * Nonetheless, it write busies the page as a safety precaution.
  */
-static void
+void
 vm_page_init_marker(vm_page_t marker, int queue, uint16_t aflags)
 {
 
@@ -2488,7 +2488,7 @@ vm_page_zone_import(void *arg, void **store, int cnt, 
 * main purpose is to replenish the store of free pages.
 */
if (vmd->vmd_severeset || curproc == pageproc ||
-   !_vm_domain_allocate(vmd, VM_ALLOC_NORMAL, cnt))
+   !_vm_domain_allocate(vmd, VM_ALLOC_SYSTEM, cnt))
return (0);
domain = vmd->vmd_domain;
vm_domain_free_lock(vmd);

Modified: head/sys/vm/vm_page.h
==
--- head/sys/vm/vm_page.h   Tue Aug 11 17:54:10 2020(r364128)
+++ head/sys/vm/vm_page.h   Tue Aug 11 20:37:45 2020(r364129)
@@ -630,6 +630,7 @@ vm_page_t vm_page_find_least(vm_object_t, vm_pindex_t)
 void vm_page_free_invalid(vm_page_t);
 vm_page_t vm_page_getfake(vm_paddr_t paddr, vm_memattr_t memattr);
 void vm_page_initfake(vm_page_t m, vm_paddr_t paddr, vm_memattr_t memattr);
+void vm_page_init_marker(vm_page_t marker, int queue, uint16_t aflags);
 int vm_page_insert (vm_page_t, vm_object_t, vm_pindex_t);
 void vm_page_invalid(vm_page_t m);
 void vm_page_launder(vm_page_t m);

Modified: head/sys/vm/vm_pageout.c
==
--- head/sys/vm/vm_pageout.cTue Aug 11 17:54:10 2020(r364128)
+++ head/sys/vm/vm_pageout.cTue Aug 11 20:37:45 2020(r364129)
@@ -82,6 +82,7 @@ __FBSDID("$FreeBSD$");
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -163,6 +164,12 @@ SYSCTL_INT(_vm, OID_AUTO, panic_on_oom,
 SYSCTL_INT(_vm, OID_AUTO, pageout_update_period,
CTLFLAG_RWTUN, _pageout_update_period, 0,
"Maximum active LRU update period");
+
+/* Access with get_pageout_threads_per_domain(). */
+static int pageout_threads_per_domain = 1;
+SYSCTL_INT(_vm, OID_AUTO, pageout_threads_per_domain, CTLFLAG_RDTUN,
+_threads_per_domain, 0,
+"Number of worker threads comprising each per-domain pagedaemon");
   
 SYSCTL_INT(_vm,