On Wed 03-06-20 17:48:04, Feng Tang wrote:
> On Tue, Jun 02, 2020 at 12:02:22AM -0400, Qian Cai wrote:
> > 
> > 
> > > On Jun 1, 2020, at 11:37 PM, Feng Tang <feng.t...@intel.com> wrote:
> > > 
> > > I re-run the same benchmark with v5.7 and 5.7+remove_warning kernels,
> > > the overall performance change is trivial (which is expected)
> > > 
> > >   1330147            +0.1%    1331032        will-it-scale.72.processes
> > > 
> > > But the perf stats of "self" show a big change for __vm_enough_memory() 
> > > 
> > >      0.27            -0.3        0.00        pp.self.__vm_enough_memory
> > > 
> > > I post the full compare result in the end.
> > 
> > I don’t really see what that means exactly, but I suppose the warning has 
> > been there for so long and no one seems to have noticed much trouble (or 
> > benefit) from it, so I think you will probably need to come up with a proper 
> > justification explaining why it is a problem now, how your patchset 
> > suddenly starts to trigger the warning, and why there is no better way 
> > than to suffer this debuggability regression (probably tiny, but still).
> 
> Thanks for the suggestion, and I updated the commit log.
> 
> 
> >From 1633da8228bd3d0dcbbd8df982977ad4594962a1 Mon Sep 17 00:00:00 2001
> From: Feng Tang <feng.t...@intel.com>
> Date: Fri, 29 May 2020 08:48:48 +0800
> Subject: [PATCH] mm/util.c: remove the VM_WARN_ONCE for vm_committed_as
>  underflow check
> 
> This check was added by 82f71ae4a2b8 ("mm: catch memory commitment underflow")
> in 2014 as a safety check for issues which had already been fixed,
> and few reports have been caught by it since, as described in its
> commit log:
> 
> : This shouldn't happen any more - the previous two patches fixed
> : the committed_as underflow issues.
> 
> The warning was recently triggered by Qian Cai when he used the LTP memory
> stress suite to test an RFC patchset which tries to improve the scalability
> of the per-cpu counter 'vm_committed_as' by choosing a bigger 'batch' number
> for the loose overcommit policies (OVERCOMMIT_ALWAYS and OVERCOMMIT_GUESS),
> while keeping the current number for OVERCOMMIT_NEVER.
> 
> With that patchset, when the system first uses a loose policy, the
> 'vm_committed_as' count can be a big negative value, as the big 'batch'
> number allows a large deviation. When the policy is then changed to
> OVERCOMMIT_NEVER, the 'batch' is decreased to a much smaller value,
> and this WARN check fires.
> 
> To mitigate this, one proposed solution is to queue work on all online
> CPUs to do a local sync of 'vm_committed_as' when changing the policy to
> OVERCOMMIT_NEVER, plus some global syncing to guarantee the case can't
> be hit.
> 
> But that solution is costly and slow. Given that this check hasn't shown
> real trouble or benefit, simply drop it from one hot path of MM; the perf
> stats do show a tiny saving from removing it.
> 
> Reported-by: Qian Cai <c...@lca.pw> 
> Signed-off-by: Feng Tang <feng.t...@intel.com>
> Cc: Konstantin Khlebnikov <koc...@gmail.com>
> Cc: Michal Hocko <mho...@suse.com>
> Cc: Andi Kleen <andi.kl...@intel.com>

Acked-by: Michal Hocko <mho...@suse.com>

> ---
>  mm/util.c | 8 --------
>  1 file changed, 8 deletions(-)
> 
> diff --git a/mm/util.c b/mm/util.c
> index 9b3be03..c63c8e4 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -814,14 +814,6 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin)
>  {
>       long allowed;
>  
> -     /*
> -      * A transient decrease in the value is unlikely, so no need
> -      * READ_ONCE() for vm_committed_as.count.
> -      */
> -     VM_WARN_ONCE(data_race(percpu_counter_read(&vm_committed_as) <
> -                     -(s64)vm_committed_as_batch * num_online_cpus()),
> -                     "memory commitment underflow");
> -
>       vm_acct_memory(pages);
>  
>       /*
> -- 
> 2.7.4
> 

-- 
Michal Hocko
SUSE Labs
