Re: svn commit: r332365 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs

2018-04-11 Thread Mark Johnston
On Wed, Apr 11, 2018 at 12:51:55AM -0400, Allan Jude wrote:
> On 2018-04-10 09:56, Mark Johnston wrote:
> > Author: markj
> > Date: Tue Apr 10 13:56:06 2018
> > New Revision: 332365
> > URL: https://svnweb.freebsd.org/changeset/base/332365
> > 
> > Log:
> >   Set zfs_arc_free_target to v_free_target.
> >   
> >   Page daemon output is now regulated by a PID controller with a setpoint
> >   of v_free_target. Moreover, the page daemon now wakes up regularly
> >   rather than waiting for a wakeup from another thread. This means that
> >   the free page count is unlikely to drop below the old
> >   zfs_arc_free_target value, and as a result the ARC was not readily
> >   freeing pages under memory pressure. Address the immediate problem by
> >   updating zfs_arc_free_target to match the page daemon's new behaviour.
> >   
> >   Reported and tested by:   truckman
> >   Discussed with:   jeff
> >   X-MFC with:   r329882
> > >   Differential Revision: https://reviews.freebsd.org/D14994
> > 
> 
> On a somewhat unrelated note, can we rename this sysctl and change it to
> be counted in bytes? When users are tuning ZFS, every other ZFS value is
> in bytes, not pages.
> 
> Maybe keep the current variable as it is, in pages, and set it by
> dividing the user-set value by the page size.
> 
> The current name is great, but I wouldn't want anyone to end up setting
> it to 4096x the value they actually want if we just changed it out from
> under them.

Sure, any suggestions for what the new sysctl should be named?
___
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"


Re: svn commit: r332365 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs

2018-04-10 Thread Allan Jude
On 2018-04-10 09:56, Mark Johnston wrote:
> Author: markj
> Date: Tue Apr 10 13:56:06 2018
> New Revision: 332365
> URL: https://svnweb.freebsd.org/changeset/base/332365
> 
> Log:
>   Set zfs_arc_free_target to v_free_target.
>   
>   Page daemon output is now regulated by a PID controller with a setpoint
>   of v_free_target. Moreover, the page daemon now wakes up regularly
>   rather than waiting for a wakeup from another thread. This means that
>   the free page count is unlikely to drop below the old
>   zfs_arc_free_target value, and as a result the ARC was not readily
>   freeing pages under memory pressure. Address the immediate problem by
>   updating zfs_arc_free_target to match the page daemon's new behaviour.
>   
>   Reported and tested by: truckman
>   Discussed with: jeff
>   X-MFC with: r329882
>   Differential Revision:  https://reviews.freebsd.org/D14994
> 

On a somewhat unrelated note, can we rename this sysctl and change it to
be counted in bytes? When users are tuning ZFS, every other ZFS value is
in bytes, not pages.

Maybe keep the current variable as it is, in pages, and set it by
dividing the user-set value by the page size.

The current name is great, but I wouldn't want anyone to end up setting
it to 4096x the value they actually want if we just changed it out from
under them.
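
To illustrate the conversion suggested above, here is a minimal sketch in C. The helper names are hypothetical and are not part of arc.c or any existing sysctl handler; the page size is passed in as a parameter rather than taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a byte-valued tunable set by the user
 * into the page count that the existing page-based variable expects.
 */
static uint64_t
arc_free_target_bytes_to_pages(uint64_t bytes, uint64_t page_size)
{
	assert(page_size > 0);
	/* Round down: a partial page does not count as a free page. */
	return (bytes / page_size);
}

/*
 * Hypothetical inverse: report the current page-based value in bytes.
 */
static uint64_t
arc_free_target_pages_to_bytes(uint64_t pages, uint64_t page_size)
{
	return (pages * page_size);
}
```

With 4096-byte pages, a user asking for 256 MiB would end up with 65536 pages, which is the kind of 4096x difference a silent unit change would otherwise cause.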

-- 
Allan Jude



Re: svn commit: r332365 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs

2018-04-10 Thread Slawa Olhovchenkov
On Tue, Apr 10, 2018 at 11:17:33AM -0400, Mark Johnston wrote:

> On Tue, Apr 10, 2018 at 05:09:57PM +0300, Slawa Olhovchenkov wrote:
> > On Tue, Apr 10, 2018 at 01:56:06PM +, Mark Johnston wrote:
> > 
> > > Author: markj
> > > Date: Tue Apr 10 13:56:06 2018
> > > New Revision: 332365
> > > URL: https://svnweb.freebsd.org/changeset/base/332365
> > > 
> > > Log:
> > >   Set zfs_arc_free_target to v_free_target.
> > >   
> > >   Page daemon output is now regulated by a PID controller with a setpoint
> > >   of v_free_target. Moreover, the page daemon now wakes up regularly
> > >   rather than waiting for a wakeup from another thread. This means that
> > >   the free page count is unlikely to drop below the old
> > >   zfs_arc_free_target value, and as a result the ARC was not readily
> > >   freeing pages under memory pressure. Address the immediate problem by
> > >   updating zfs_arc_free_target to match the page daemon's new behaviour.
> > 
> > Can you explain some more about the new page daemon algorithm (and
> > reclaiming free memory from zones)?
> 
> The old algorithm was pretty simple: there was a free page target and
> below that, a wakeup threshold. Any time a thread allocated a page and
> in so doing caused the free page count to drop below the wakeup
> threshold, that thread would wake up the page daemon, which would scan
> the inactive queue and free pages until either the free target or the
> end of the inactive queue was reached.
> 
> This is simple and easy to reason about, but has some drawbacks. When
> memory pressure is constant, it leads to bursts of CPU usage and lock
> contention. The static watermarks may also be insufficient for some
> demanding workloads. In particular, the wakeup threshold might be too
> low, thus allowing the free page count to drop to dangerous levels and
> triggering expensive memory shortage handling (i.e., VM_WAIT).
> 
> The new algorithm uses a control loop to dynamically compute a target
> for each scan of the inactive queue. The loop takes as input the
> magnitude of the page shortage (v_free_target - v_free_count) and keeps
> track of the rate of change of this difference (i.e., the rate at which
> free pages are being consumed) and the sum of this difference over time
> (i.e., a cumulative value for the magnitude of recent page shortages).
> These factors are used to compute "shortage", the number of pages to
> reclaim with the goal of maintaining a free page count of v_free_target.
> 
> The effect of the new algorithm is that the page daemon runs more
> frequently but for shorter durations, so its CPU usage is more even. It
> responds dynamically to the demands of the workload, so the shortcomings
> of a pair of static watermarks are gone.
> 
> r329882 doesn't really change anything with respect to reclamation of
> pages from UMA zones. There are some plans to address shortcomings there
> in the near future though.

Thanks, very nice explanation.
IMHO, in this case the old zfs_arc_free_target is best for ZFS: setting
zfs_arc_free_target too close to vm_cnt.v_free_min can run the
arc_target correction too often, costing CPU and overusing the memory
subsystem.
ZFS needs more accurate, ZFS-specific pressure handling, and I am trying
that in D7538.

> > PS: ZFS needs some more time to free pages from the ARC. Also, vanilla
> > ZFS has broken logic for counting the ARC's used and free memory. To
> > count system-wide used and free memory correctly, in-zone free memory
> > needs to be accounted for.
> 
> Yes, there are a number of problems in this area predating r329882. This
> commit is really just a bandaid.


Re: svn commit: r332365 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs

2018-04-10 Thread Mark Johnston
On Tue, Apr 10, 2018 at 05:09:57PM +0300, Slawa Olhovchenkov wrote:
> On Tue, Apr 10, 2018 at 01:56:06PM +, Mark Johnston wrote:
> 
> > Author: markj
> > Date: Tue Apr 10 13:56:06 2018
> > New Revision: 332365
> > URL: https://svnweb.freebsd.org/changeset/base/332365
> > 
> > Log:
> >   Set zfs_arc_free_target to v_free_target.
> >   
> >   Page daemon output is now regulated by a PID controller with a setpoint
> >   of v_free_target. Moreover, the page daemon now wakes up regularly
> >   rather than waiting for a wakeup from another thread. This means that
> >   the free page count is unlikely to drop below the old
> >   zfs_arc_free_target value, and as a result the ARC was not readily
> >   freeing pages under memory pressure. Address the immediate problem by
> >   updating zfs_arc_free_target to match the page daemon's new behaviour.
> 
> Can you explain some more about the new page daemon algorithm (and
> reclaiming free memory from zones)?

The old algorithm was pretty simple: there was a free page target and
below that, a wakeup threshold. Any time a thread allocated a page and
in so doing caused the free page count to drop below the wakeup
threshold, that thread would wake up the page daemon, which would scan
the inactive queue and free pages until either the free target or the
end of the inactive queue was reached.
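
As an illustration only, the old scheme can be modelled with a toy sketch in C. This is not the kernel code: the structure, field names and function names are made up for the example, and the "scan" simply moves one page at a time from a counter standing in for the inactive queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the old scheme; all names here are illustrative. */
struct old_pagedaemon {
	int free_count;		/* current number of free pages */
	int free_target;	/* a scan stops once this is reached */
	int wakeup_thresh;	/* dropping below this wakes the daemon */
	int inactive;		/* pages left on the inactive queue */
};

/*
 * Free pages from the inactive queue until the free target or the end
 * of the queue is reached. Returns the number of pages freed.
 */
static int
old_pagedaemon_scan(struct old_pagedaemon *pd)
{
	int freed = 0;

	while (pd->free_count < pd->free_target && pd->inactive > 0) {
		pd->inactive--;
		pd->free_count++;
		freed++;
	}
	return (freed);
}

/*
 * Allocate one page. If that drops the free count below the wakeup
 * threshold, the allocating thread wakes the daemon (modelled here as
 * a direct call). Returns true if the daemon was woken.
 */
static bool
old_alloc_page(struct old_pagedaemon *pd)
{
	assert(pd->free_count > 0);
	pd->free_count--;
	if (pd->free_count < pd->wakeup_thresh) {
		old_pagedaemon_scan(pd);
		return (true);
	}
	return (false);
}
```

In this model nothing happens until the threshold is crossed, and then a single scan does all the work at once, which is the bursty behaviour described next.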

This is simple and easy to reason about, but has some drawbacks. When
memory pressure is constant, it leads to bursts of CPU usage and lock
contention. The static watermarks may also be insufficient for some
demanding workloads. In particular, the wakeup threshold might be too
low, thus allowing the free page count to drop to dangerous levels and
triggering expensive memory shortage handling (i.e., VM_WAIT).

The new algorithm uses a control loop to dynamically compute a target
for each scan of the inactive queue. The loop takes as input the
magnitude of the page shortage (v_free_target - v_free_count) and keeps
track of the rate of change of this difference (i.e., the rate at which
free pages are being consumed) and the sum of this difference over time
(i.e., a cumulative value for the magnitude of recent page shortages).
These factors are used to compute "shortage", the number of pages to
reclaim with the goal of maintaining a free page count of v_free_target.
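
To make the three terms concrete, here is a rough sketch of a PID-style computation in C. It is a generic illustration under stated assumptions, not the code from r329882: the structure, the gains expressed as integer divisors, and the clamping of the integral term at zero are all choices made for this example.

```c
#include <assert.h>

/* Illustrative controller state; names and gains are assumptions. */
struct pid_sketch {
	int setpoint;	/* desired free page count, i.e. v_free_target */
	int integral;	/* running sum of recent shortages */
	int prev_error;	/* shortage seen on the previous interval */
	int kp_div;	/* proportional gain, as a divisor (> 0) */
	int ki_div;	/* integral gain, as a divisor (> 0) */
	int kd_div;	/* derivative gain, as a divisor (> 0) */
};

/* Return the number of pages to reclaim during this interval. */
static int
pid_sketch_shortage(struct pid_sketch *pc, int free_count)
{
	int error, derivative, output;

	assert(pc->kp_div > 0 && pc->ki_div > 0 && pc->kd_div > 0);

	/* Magnitude of the shortage: v_free_target - v_free_count. */
	error = pc->setpoint - free_count;
	/* Rate of change: how fast free pages are being consumed. */
	derivative = error - pc->prev_error;
	/* Cumulative shortage; a surplus is not banked as credit. */
	pc->integral += error;
	if (pc->integral < 0)
		pc->integral = 0;
	pc->prev_error = error;

	output = error / pc->kp_div + pc->integral / pc->ki_div +
	    derivative / pc->kd_div;
	/* Never ask for a negative number of pages. */
	return (output > 0 ? output : 0);
}
```

Called once per interval with the current free page count, this asks for more pages when a shortage appears suddenly (large derivative) or persists (growing integral), and for none when the free count is above the setpoint.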

The effect of the new algorithm is that the page daemon runs more
frequently but for shorter durations, so its CPU usage is more even. It
responds dynamically to the demands of the workload, so the shortcomings
of a pair of static watermarks are gone.

r329882 doesn't really change anything with respect to reclamation of
pages from UMA zones. There are some plans to address shortcomings there
in the near future though.

> PS: ZFS needs some more time to free pages from the ARC. Also, vanilla
> ZFS has broken logic for counting the ARC's used and free memory. To
> count system-wide used and free memory correctly, in-zone free memory
> needs to be accounted for.

Yes, there are a number of problems in this area predating r329882. This
commit is really just a bandaid.


Re: svn commit: r332365 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs

2018-04-10 Thread Slawa Olhovchenkov
On Tue, Apr 10, 2018 at 01:56:06PM +, Mark Johnston wrote:

> Author: markj
> Date: Tue Apr 10 13:56:06 2018
> New Revision: 332365
> URL: https://svnweb.freebsd.org/changeset/base/332365
> 
> Log:
>   Set zfs_arc_free_target to v_free_target.
>   
>   Page daemon output is now regulated by a PID controller with a setpoint
>   of v_free_target. Moreover, the page daemon now wakes up regularly
>   rather than waiting for a wakeup from another thread. This means that
>   the free page count is unlikely to drop below the old
>   zfs_arc_free_target value, and as a result the ARC was not readily
>   freeing pages under memory pressure. Address the immediate problem by
>   updating zfs_arc_free_target to match the page daemon's new behaviour.

Can you explain some more about the new page daemon algorithm (and
reclaiming free memory from zones)?

PS: ZFS needs some more time to free pages from the ARC. Also, vanilla
ZFS has broken logic for counting the ARC's used and free memory. To
count system-wide used and free memory correctly, in-zone free memory
needs to be accounted for.

>  arc_free_target_init(void *unused __unused)
>  {
>  
> - zfs_arc_free_target = (vm_cnt.v_free_min / 10) * 11;
> + zfs_arc_free_target = vm_cnt.v_free_target;
>  }


svn commit: r332365 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs

2018-04-10 Thread Mark Johnston
Author: markj
Date: Tue Apr 10 13:56:06 2018
New Revision: 332365
URL: https://svnweb.freebsd.org/changeset/base/332365

Log:
  Set zfs_arc_free_target to v_free_target.
  
  Page daemon output is now regulated by a PID controller with a setpoint
  of v_free_target. Moreover, the page daemon now wakes up regularly
  rather than waiting for a wakeup from another thread. This means that
  the free page count is unlikely to drop below the old
  zfs_arc_free_target value, and as a result the ARC was not readily
  freeing pages under memory pressure. Address the immediate problem by
  updating zfs_arc_free_target to match the page daemon's new behaviour.
  
  Reported and tested by:   truckman
  Discussed with:   jeff
  X-MFC with:   r329882
  Differential Revision: https://reviews.freebsd.org/D14994

Modified:
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c

Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
==============================================================================
--- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c   Tue Apr 10 13:47:09 2018        (r332364)
+++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c   Tue Apr 10 13:56:06 2018        (r332365)
@@ -389,7 +389,7 @@ static void
 arc_free_target_init(void *unused __unused)
 {
 
-   zfs_arc_free_target = (vm_cnt.v_free_min / 10) * 11;
+   zfs_arc_free_target = vm_cnt.v_free_target;
 }
 SYSINIT(arc_free_target_init, SI_SUB_KTHREAD_PAGE, SI_ORDER_ANY,
 arc_free_target_init, NULL);
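
For readers following the arithmetic, the effect of the one-line change can be sketched as below. This is a simplified model, not code from arc.c: the function names are hypothetical, and the pressure check assumes only that the ARC starts freeing pages once the free page count falls below zfs_arc_free_target.

```c
#include <assert.h>
#include <stdbool.h>

/* Old initial value: roughly 110% of v_free_min (integer arithmetic). */
static unsigned int
old_arc_free_target(unsigned int v_free_min)
{
	return ((v_free_min / 10) * 11);
}

/* New initial value: the page daemon's own setpoint. */
static unsigned int
new_arc_free_target(unsigned int v_free_target)
{
	return (v_free_target);
}

/*
 * Simplified model (an assumption of this sketch): the ARC sees memory
 * pressure when the free page count is below its target.
 */
static bool
arc_sees_pressure(unsigned int free_count, unsigned int arc_free_target)
{
	return (free_count < arc_free_target);
}
```

With made-up values of v_free_min = 25000 and v_free_target = 85000 pages, the old target is 27500. If the page daemon holds the free count near its setpoint, say at 80000, the old target never registers pressure while the new one does.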