Re: [patch 2/3] mm, oom: remove unnecessary check for NULL zonelist

2014-08-04 Thread David Rientjes
On Sat, 2 Aug 2014, Johannes Weiner wrote:

> > I see one concern: that panic_on_oom == 1 will not trigger on pagefault 
> > when constrained by cpusets.  To address that, I'll state that, since 
> > cpuset-constrained allocations are the allocation context for pagefaults,
> > panic_on_oom == 1 should not trigger on pagefault when constrained by 
> > cpusets.
> 
> I expressed my concern pretty clearly above: out_of_memory() wants the
> zonelist that was used during the failed allocation, you are passing a
> non-sensical value in there that only happens to have the same type.
> 

It's certainly meaningful, the particular zonelist chosen isn't important 
because we don't care about the ordering and pagefaults are not going to 
be using __GFP_THISNODE.  In this context, we only need to pass a zonelist 
that includes all zones because constrained_alloc() tests if the 
allocation is cpuset-constrained based on the gfp flags.  We'll get 
CONSTRAINT_CPUSET in that case.

This is important because the behavior of panic_on_oom differs, as you 
pointed out, depending on the constraint.  pagefault_out_of_memory(), with 
my patch, will always get CONSTRAINT_CPUSET when needed and 
check_panic_on_oom() will behave correctly now for cpusets.

> We simply don't have the right information at the end of the page
> fault handler to respect constrained allocations.  Case in point:
> nodemask is unset from pagefault_out_of_memory(), so we still kill
> based on mempolicy even though check_panic_on_oom() says it wouldn't.
> 

That is, in fact, the only last bit of information we need in the 
pagefault handler to make correct decisions.  It's important, too, since 
if the vma of the faulting address is constrained by a mempolicy, we want 
to avoid needless killing a process that has a mempolicy with a disjoint 
set of nodes.

> The code change is not an adequate solution for the problem we have
> here and the changelog is an insult to everybody who wants to make
> sense of this from the git history later on.
> 

We can also address mempolicies by modifying the page fault handler and 
passing the vma and faulting address to make the correct panic_on_oom 
decisions but also filter processes that have mempolicies that consist 
solely of a disjoint set of nodes.  I'll post that patch series as well.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 2/3] mm, oom: remove unnecessary check for NULL zonelist

2014-08-04 Thread David Rientjes
On Sat, 2 Aug 2014, Johannes Weiner wrote:

  I see one concern: that panic_on_oom == 1 will not trigger on pagefault 
  when constrained by cpusets.  To address that, I'll state that, since 
  cpuset-constrained allocations are the allocation context for pagefaults,
  panic_on_oom == 1 should not trigger on pagefault when constrained by 
  cpusets.
 
 I expressed my concern pretty clearly above: out_of_memory() wants the
 zonelist that was used during the failed allocation, you are passing a
 non-sensical value in there that only happens to have the same type.
 

It's certainly meaningful, the particular zonelist chosen isn't important 
because we don't care about the ordering and pagefaults are not going to 
be using __GFP_THISNODE.  In this context, we only need to pass a zonelist 
that includes all zones because constrained_alloc() tests if the 
allocation is cpuset-constrained based on the gfp flags.  We'll get 
CONSTRAINT_CPUSET in that case.

This is important because the behavior of panic_on_oom differs, as you 
pointed out, depending on the constraint.  pagefault_out_of_memory(), with 
my patch, will always get CONSTRAINT_CPUSET when needed and 
check_panic_on_oom() will behave correctly now for cpusets.

 We simply don't have the right information at the end of the page
 fault handler to respect constrained allocations.  Case in point:
 nodemask is unset from pagefault_out_of_memory(), so we still kill
 based on mempolicy even though check_panic_on_oom() says it wouldn't.
 

That is, in fact, the only last bit of information we need in the 
pagefault handler to make correct decisions.  It's important, too, since 
if the vma of the faulting address is constrained by a mempolicy, we want 
to avoid needless killing a process that has a mempolicy with a disjoint 
set of nodes.

 The code change is not an adequate solution for the problem we have
 here and the changelog is an insult to everybody who wants to make
 sense of this from the git history later on.
 

We can also address mempolicies by modifying the page fault handler and 
passing the vma and faulting address to make the correct panic_on_oom 
decisions but also filter processes that have mempolicies that consist 
solely of a disjoint set of nodes.  I'll post that patch series as well.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 2/3] mm, oom: remove unnecessary check for NULL zonelist

2014-08-02 Thread Johannes Weiner
On Fri, Aug 01, 2014 at 02:42:19PM -0700, David Rientjes wrote:
> On Fri, 1 Aug 2014, Johannes Weiner wrote:
> 
> > > > out_of_memory() wants the zonelist that was used during allocation,
> > > > not just the random first node's zonelist that's simply picked to
> > > > serialize page fault OOM kills system-wide.
> > > > 
> > > > This would even change how panic_on_oom behaves for page fault OOMs
> > > > (in a completely unpredictable way) if we get CONSTRAINED_CPUSET.
> > > > 
> > > > This change makes no sense to me.
> > > > 
> > > 
> > > Allocations during fault will be constrained by the cpuset's mems, if we 
> > > are oom then why would we panic when panic_on_oom == 1?
> > 
> > Can you please address the concerns I raised?
> > 
> 
> I see one concern: that panic_on_oom == 1 will not trigger on pagefault 
> when constrained by cpusets.  To address that, I'll state that, since 
> cpuset-constrained allocations are the allocation context for pagefaults,
> panic_on_oom == 1 should not trigger on pagefault when constrained by 
> cpusets.

I expressed my concern pretty clearly above: out_of_memory() wants the
zonelist that was used during the failed allocation, you are passing a
non-sensical value in there that only happens to have the same type.

We simply don't have the right information at the end of the page
fault handler to respect constrained allocations.  Case in point:
nodemask is unset from pagefault_out_of_memory(), so we still kill
based on mempolicy even though check_panic_on_oom() says it wouldn't.

The code change is not an adequate solution for the problem we have
here and the changelog is an insult to everybody who wants to make
sense of this from the git history later on.

But the much bigger problem is that you continue to fail to address
even basic feedback and instead consistently derail discussions with
unrelated drivel and circular arguments.  As long as you continue to
do that I don't think we should be merging any of your patches.

> > And please describe user-visible changes in the changelog.
> > 
> 
> Ok, Andrew please annotate the changelog for 
> mm-oom-remove-unnecessary-check-for-null-zonelist.patch by including:
> 
> This also causes panic_on_oom == 1 to not panic the machine when the 
> pagefault is constrained by the mems of current's cpuset.  That behavior 
> agrees with the semantics of the sysctl in Documentation/sysctl/vm.txt.

Great, now we have a cleanup patch with the side-effect of changing
user-visible behavior and introducing non-sensical code semantics.

Nacked-by: Johannes Weiner 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 2/3] mm, oom: remove unnecessary check for NULL zonelist

2014-08-02 Thread Johannes Weiner
On Fri, Aug 01, 2014 at 02:42:19PM -0700, David Rientjes wrote:
 On Fri, 1 Aug 2014, Johannes Weiner wrote:
 
out_of_memory() wants the zonelist that was used during allocation,
not just the random first node's zonelist that's simply picked to
serialize page fault OOM kills system-wide.

This would even change how panic_on_oom behaves for page fault OOMs
(in a completely unpredictable way) if we get CONSTRAINED_CPUSET.

This change makes no sense to me.

   
   Allocations during fault will be constrained by the cpuset's mems, if we 
   are oom then why would we panic when panic_on_oom == 1?
  
  Can you please address the concerns I raised?
  
 
 I see one concern: that panic_on_oom == 1 will not trigger on pagefault 
 when constrained by cpusets.  To address that, I'll state that, since 
 cpuset-constrained allocations are the allocation context for pagefaults,
 panic_on_oom == 1 should not trigger on pagefault when constrained by 
 cpusets.

I expressed my concern pretty clearly above: out_of_memory() wants the
zonelist that was used during the failed allocation, you are passing a
non-sensical value in there that only happens to have the same type.

We simply don't have the right information at the end of the page
fault handler to respect constrained allocations.  Case in point:
nodemask is unset from pagefault_out_of_memory(), so we still kill
based on mempolicy even though check_panic_on_oom() says it wouldn't.

The code change is not an adequate solution for the problem we have
here and the changelog is an insult to everybody who wants to make
sense of this from the git history later on.

But the much bigger problem is that you continue to fail to address
even basic feedback and instead consistently derail discussions with
unrelated drivel and circular arguments.  As long as you continue to
do that I don't think we should be merging any of your patches.

  And please describe user-visible changes in the changelog.
  
 
 Ok, Andrew please annotate the changelog for 
 mm-oom-remove-unnecessary-check-for-null-zonelist.patch by including:
 
 This also causes panic_on_oom == 1 to not panic the machine when the 
 pagefault is constrained by the mems of current's cpuset.  That behavior 
 agrees with the semantics of the sysctl in Documentation/sysctl/vm.txt.

Great, now we have a cleanup patch with the side-effect of changing
user-visible behavior and introducing non-sensical code semantics.

Nacked-by: Johannes Weiner han...@cmpxchg.org
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 2/3] mm, oom: remove unnecessary check for NULL zonelist

2014-08-01 Thread David Rientjes
On Fri, 1 Aug 2014, Johannes Weiner wrote:

> > > out_of_memory() wants the zonelist that was used during allocation,
> > > not just the random first node's zonelist that's simply picked to
> > > serialize page fault OOM kills system-wide.
> > > 
> > > This would even change how panic_on_oom behaves for page fault OOMs
> > > (in a completely unpredictable way) if we get CONSTRAINED_CPUSET.
> > > 
> > > This change makes no sense to me.
> > > 
> > 
> > Allocations during fault will be constrained by the cpuset's mems, if we 
> > are oom then why would we panic when panic_on_oom == 1?
> 
> Can you please address the concerns I raised?
> 

I see one concern: that panic_on_oom == 1 will not trigger on pagefault 
when constrained by cpusets.  To address that, I'll state that, since 
cpuset-constrained allocations are the allocation context for pagefaults,
panic_on_oom == 1 should not trigger on pagefault when constrained by 
cpusets.

> And please describe user-visible changes in the changelog.
> 

Ok, Andrew please annotate the changelog for 
mm-oom-remove-unnecessary-check-for-null-zonelist.patch by including:

This also causes panic_on_oom == 1 to not panic the machine when the 
pagefault is constrained by the mems of current's cpuset.  That behavior 
agrees with the semantics of the sysctl in Documentation/sysctl/vm.txt.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 2/3] mm, oom: remove unnecessary check for NULL zonelist

2014-08-01 Thread Johannes Weiner
On Fri, Aug 01, 2014 at 02:10:37AM -0700, David Rientjes wrote:
> On Thu, 31 Jul 2014, Johannes Weiner wrote:
> 
> > out_of_memory() wants the zonelist that was used during allocation,
> > not just the random first node's zonelist that's simply picked to
> > serialize page fault OOM kills system-wide.
> > 
> > This would even change how panic_on_oom behaves for page fault OOMs
> > (in a completely unpredictable way) if we get CONSTRAINED_CPUSET.
> > 
> > This change makes no sense to me.
> > 
> 
> Allocations during fault will be constrained by the cpuset's mems, if we 
> are oom then why would we panic when panic_on_oom == 1?

Can you please address the concerns I raised?

And please describe user-visible changes in the changelog.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 2/3] mm, oom: remove unnecessary check for NULL zonelist

2014-08-01 Thread David Rientjes
On Thu, 31 Jul 2014, Johannes Weiner wrote:

> out_of_memory() wants the zonelist that was used during allocation,
> not just the random first node's zonelist that's simply picked to
> serialize page fault OOM kills system-wide.
> 
> This would even change how panic_on_oom behaves for page fault OOMs
> (in a completely unpredictable way) if we get CONSTRAINED_CPUSET.
> 
> This change makes no sense to me.
> 

Allocations during fault will be constrained by the cpuset's mems, if we 
are oom then why would we panic when panic_on_oom == 1?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 2/3] mm, oom: remove unnecessary check for NULL zonelist

2014-08-01 Thread David Rientjes
On Thu, 31 Jul 2014, Johannes Weiner wrote:

 out_of_memory() wants the zonelist that was used during allocation,
 not just the random first node's zonelist that's simply picked to
 serialize page fault OOM kills system-wide.
 
 This would even change how panic_on_oom behaves for page fault OOMs
 (in a completely unpredictable way) if we get CONSTRAINED_CPUSET.
 
 This change makes no sense to me.
 

Allocations during fault will be constrained by the cpuset's mems, if we 
are oom then why would we panic when panic_on_oom == 1?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 2/3] mm, oom: remove unnecessary check for NULL zonelist

2014-08-01 Thread Johannes Weiner
On Fri, Aug 01, 2014 at 02:10:37AM -0700, David Rientjes wrote:
 On Thu, 31 Jul 2014, Johannes Weiner wrote:
 
  out_of_memory() wants the zonelist that was used during allocation,
  not just the random first node's zonelist that's simply picked to
  serialize page fault OOM kills system-wide.
  
  This would even change how panic_on_oom behaves for page fault OOMs
  (in a completely unpredictable way) if we get CONSTRAINED_CPUSET.
  
  This change makes no sense to me.
  
 
 Allocations during fault will be constrained by the cpuset's mems, if we 
 are oom then why would we panic when panic_on_oom == 1?

Can you please address the concerns I raised?

And please describe user-visible changes in the changelog.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 2/3] mm, oom: remove unnecessary check for NULL zonelist

2014-08-01 Thread David Rientjes
On Fri, 1 Aug 2014, Johannes Weiner wrote:

   out_of_memory() wants the zonelist that was used during allocation,
   not just the random first node's zonelist that's simply picked to
   serialize page fault OOM kills system-wide.
   
   This would even change how panic_on_oom behaves for page fault OOMs
   (in a completely unpredictable way) if we get CONSTRAINED_CPUSET.
   
   This change makes no sense to me.
   
  
  Allocations during fault will be constrained by the cpuset's mems, if we 
  are oom then why would we panic when panic_on_oom == 1?
 
 Can you please address the concerns I raised?
 

I see one concern: that panic_on_oom == 1 will not trigger on pagefault 
when constrained by cpusets.  To address that, I'll state that, since 
cpuset-constrained allocations are the allocation context for pagefaults,
panic_on_oom == 1 should not trigger on pagefault when constrained by 
cpusets.

 And please describe user-visible changes in the changelog.
 

Ok, Andrew please annotate the changelog for 
mm-oom-remove-unnecessary-check-for-null-zonelist.patch by including:

This also causes panic_on_oom == 1 to not panic the machine when the 
pagefault is constrained by the mems of current's cpuset.  That behavior 
agrees with the semantics of the sysctl in Documentation/sysctl/vm.txt.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 2/3] mm, oom: remove unnecessary check for NULL zonelist

2014-07-31 Thread Johannes Weiner
On Wed, Jul 23, 2014 at 06:16:32PM -0700, David Rientjes wrote:
> If the pagefault handler is modified to pass a non-NULL zonelist then an 
> unnecessary check for a NULL zonelist in constrained_alloc() can be removed.
>
> Signed-off-by: David Rientjes 
> ---
>  mm/oom_kill.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -208,8 +208,6 @@ static enum oom_constraint constrained_alloc(struct 
> zonelist *zonelist,
>   /* Default to all available memory */
>   *totalpages = totalram_pages + total_swap_pages;
>  
> - if (!zonelist)
> - return CONSTRAINT_NONE;
>   /*
>* Reach here only when __GFP_NOFAIL is used. So, we should avoid
>* to kill current.We have to random task kill in this case.
> @@ -696,7 +694,7 @@ void pagefault_out_of_memory(void)
>  
>   zonelist = node_zonelist(first_memory_node, GFP_KERNEL);
>   if (try_set_zonelist_oom(zonelist, GFP_KERNEL)) {
> - out_of_memory(NULL, 0, 0, NULL, false);
> + out_of_memory(zonelist, 0, 0, NULL, false);

out_of_memory() wants the zonelist that was used during allocation,
not just the random first node's zonelist that's simply picked to
serialize page fault OOM kills system-wide.

This would even change how panic_on_oom behaves for page fault OOMs
(in a completely unpredictable way) if we get CONSTRAINED_CPUSET.

This change makes no sense to me.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 2/3] mm, oom: remove unnecessary check for NULL zonelist

2014-07-31 Thread Johannes Weiner
On Wed, Jul 23, 2014 at 06:16:32PM -0700, David Rientjes wrote:
 If the pagefault handler is modified to pass a non-NULL zonelist then an 
 unnecessary check for a NULL zonelist in constrained_alloc() can be removed.

 Signed-off-by: David Rientjes rient...@google.com
 ---
  mm/oom_kill.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)
 
 diff --git a/mm/oom_kill.c b/mm/oom_kill.c
 --- a/mm/oom_kill.c
 +++ b/mm/oom_kill.c
 @@ -208,8 +208,6 @@ static enum oom_constraint constrained_alloc(struct 
 zonelist *zonelist,
   /* Default to all available memory */
   *totalpages = totalram_pages + total_swap_pages;
  
 - if (!zonelist)
 - return CONSTRAINT_NONE;
   /*
* Reach here only when __GFP_NOFAIL is used. So, we should avoid
* to kill current.We have to random task kill in this case.
 @@ -696,7 +694,7 @@ void pagefault_out_of_memory(void)
  
   zonelist = node_zonelist(first_memory_node, GFP_KERNEL);
   if (try_set_zonelist_oom(zonelist, GFP_KERNEL)) {
 - out_of_memory(NULL, 0, 0, NULL, false);
 + out_of_memory(zonelist, 0, 0, NULL, false);

out_of_memory() wants the zonelist that was used during allocation,
not just the random first node's zonelist that's simply picked to
serialize page fault OOM kills system-wide.

This would even change how panic_on_oom behaves for page fault OOMs
(in a completely unpredictable way) if we get CONSTRAINED_CPUSET.

This change makes no sense to me.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/