Re: [PATCH 3/5] kernel.h: Add non_block_start/end()

2019-09-03 Thread Jason Gunthorpe
On Tue, Sep 03, 2019 at 09:28:23AM +0200, Daniel Vetter wrote:

> > Cleanest would be a new header I guess, together with might_sleep().
> > But moving that is a bit much I think, there's almost 500 callers of
> > that one from a quick git grep
> >
> > > If dropping do while is the only change then I can edit it in..
> > > I think we have the acks now
> >
> > Yeah sounds simplest, thanks.
> 
> Hi Jason,
> 
> Do you expect me to resend now, or do you plan to do the patchwork
> appeasement when applying? I've seen you merged the other patches
> (thanks!), but not these two here.

Sorry, I didn't get to this before I started travelling, and deferred
it since we were having linux-next related problems with hmm.git. I
hope to do it today.

I will fix it up as promised

Thanks,
Jason


Re: [PATCH 3/5] kernel.h: Add non_block_start/end()

2019-09-03 Thread Daniel Vetter
On Wed, Aug 28, 2019 at 8:56 PM Daniel Vetter  wrote:
> On Wed, Aug 28, 2019 at 8:43 PM Jason Gunthorpe  wrote:
> > On Wed, Aug 28, 2019 at 08:33:13PM +0200, Daniel Vetter wrote:
> > > On Wed, Aug 28, 2019 at 12:50 AM Jason Gunthorpe  wrote:
> > > >
> > > > > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > > > > index 4fa360a13c1e..82f84cfe372f 100644
> > > > > +++ b/include/linux/kernel.h
> > > > > @@ -217,7 +217,9 @@ extern void __cant_sleep(const char *file, int 
> > > > > line, int preempt_offset);
> > > > >   * might_sleep - annotation for functions that can sleep
> > > > >   *
> > > > >   * this macro will print a stack trace if it is executed in an atomic
> > > > > - * context (spinlock, irq-handler, ...).
> > > > > + * context (spinlock, irq-handler, ...). Additional sections where 
> > > > > blocking is
> > > > > + * not allowed can be annotated with non_block_start() and 
> > > > > non_block_end()
> > > > > + * pairs.
> > > > >   *
> > > > >   * This is a useful debugging help to be able to catch problems 
> > > > > early and not
> > > > >   * be bitten later when the calling function happens to sleep when 
> > > > > it is not
> > > > > @@ -233,6 +235,25 @@ extern void __cant_sleep(const char *file, int 
> > > > > line, int preempt_offset);
> > > > >  # define cant_sleep() \
> > > > >   do { __cant_sleep(__FILE__, __LINE__, 0); } while (0)
> > > > >  # define sched_annotate_sleep()  (current->task_state_change = 0)
> > > > > +/**
> > > > > + * non_block_start - annotate the start of section where sleeping is 
> > > > > prohibited
> > > > > + *
> > > > > + * This is on behalf of the oom reaper, specifically when it is 
> > > > > calling the mmu
> > > > > + * notifiers. The problem is that if the notifier were to block on, 
> > > > > for example,
> > > > > + * mutex_lock() and if the process which holds that mutex were to 
> > > > > perform a
> > > > > + * sleeping memory allocation, the oom reaper is now blocked on 
> > > > > completion of
> > > > > + * that memory allocation. Other blocking calls like wait_event() 
> > > > > pose similar
> > > > > + * issues.
> > > > > + */
> > > > > +# define non_block_start() \
> > > > > + do { current->non_block_count++; } while (0)
> > > > > +/**
> > > > > + * non_block_end - annotate the end of section where sleeping is 
> > > > > prohibited
> > > > > + *
> > > > > + * Closes a section opened by non_block_start().
> > > > > + */
> > > > > +# define non_block_end() \
> > > > > + do { WARN_ON(current->non_block_count-- == 0); } while (0)
> > > >
> > > > check-patch does not like these, and I agree
> > > >
> > > > #101: FILE: include/linux/kernel.h:248:
> > > > +# define non_block_start() \
> > > > +   do { current->non_block_count++; } while (0)
> > > >
> > > > /tmp/tmp1spfxufy/0006-kernel-h-Add-non_block_start-end-.patch:108: 
> > > > WARNING: Single statement macros should not use a do {} while (0) loop
> > > > #108: FILE: include/linux/kernel.h:255:
> > > > +# define non_block_end() \
> > > > +   do { WARN_ON(current->non_block_count-- == 0); } while (0)
> > > >
> > > > Please use a static inline?
> > >
> > > We need get_current() plus the task_struct, so this gets real messy
> > > real fast. Not even sure which header this would fit in, or whether
> > > I'd need to create a new one. You're insisting on this or respinning
> > > with the do { } while (0) dropped ok.
> >
> > > My preference is always a static inline, but if the headers are so
> > twisty we need to use #define to solve a missing include, then I
> > wouldn't insist on it.
>
> Cleanest would be a new header I guess, together with might_sleep().
> But moving that is a bit much I think, there's almost 500 callers of
> that one from a quick git grep
>
> > If dropping do while is the only change then I can edit it in..
> > I think we have the acks now
>
> Yeah sounds simplest, thanks.

Hi Jason,

Do you expect me to resend now, or do you plan to do the patchwork
appeasement when applying? I've seen you merged the other patches
(thanks!), but not these two here.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

Re: [PATCH 3/5] kernel.h: Add non_block_start/end()

2019-08-28 Thread Daniel Vetter
On Wed, Aug 28, 2019 at 8:43 PM Jason Gunthorpe  wrote:
> On Wed, Aug 28, 2019 at 08:33:13PM +0200, Daniel Vetter wrote:
> > On Wed, Aug 28, 2019 at 12:50 AM Jason Gunthorpe  wrote:
> > >
> > > > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > > > index 4fa360a13c1e..82f84cfe372f 100644
> > > > +++ b/include/linux/kernel.h
> > > > @@ -217,7 +217,9 @@ extern void __cant_sleep(const char *file, int 
> > > > line, int preempt_offset);
> > > >   * might_sleep - annotation for functions that can sleep
> > > >   *
> > > >   * this macro will print a stack trace if it is executed in an atomic
> > > > - * context (spinlock, irq-handler, ...).
> > > > + * context (spinlock, irq-handler, ...). Additional sections where 
> > > > blocking is
> > > > + * not allowed can be annotated with non_block_start() and 
> > > > non_block_end()
> > > > + * pairs.
> > > >   *
> > > >   * This is a useful debugging help to be able to catch problems early 
> > > > and not
> > > >   * be bitten later when the calling function happens to sleep when it 
> > > > is not
> > > > @@ -233,6 +235,25 @@ extern void __cant_sleep(const char *file, int 
> > > > line, int preempt_offset);
> > > >  # define cant_sleep() \
> > > >   do { __cant_sleep(__FILE__, __LINE__, 0); } while (0)
> > > >  # define sched_annotate_sleep()  (current->task_state_change = 0)
> > > > +/**
> > > > + * non_block_start - annotate the start of section where sleeping is 
> > > > prohibited
> > > > + *
> > > > + * This is on behalf of the oom reaper, specifically when it is 
> > > > calling the mmu
> > > > + * notifiers. The problem is that if the notifier were to block on, 
> > > > for example,
> > > > + * mutex_lock() and if the process which holds that mutex were to 
> > > > perform a
> > > > + * sleeping memory allocation, the oom reaper is now blocked on 
> > > > completion of
> > > > + * that memory allocation. Other blocking calls like wait_event() pose 
> > > > similar
> > > > + * issues.
> > > > + */
> > > > +# define non_block_start() \
> > > > + do { current->non_block_count++; } while (0)
> > > > +/**
> > > > + * non_block_end - annotate the end of section where sleeping is 
> > > > prohibited
> > > > + *
> > > > + * Closes a section opened by non_block_start().
> > > > + */
> > > > +# define non_block_end() \
> > > > + do { WARN_ON(current->non_block_count-- == 0); } while (0)
> > >
> > > check-patch does not like these, and I agree
> > >
> > > #101: FILE: include/linux/kernel.h:248:
> > > +# define non_block_start() \
> > > +   do { current->non_block_count++; } while (0)
> > >
> > > /tmp/tmp1spfxufy/0006-kernel-h-Add-non_block_start-end-.patch:108: 
> > > WARNING: Single statement macros should not use a do {} while (0) loop
> > > #108: FILE: include/linux/kernel.h:255:
> > > +# define non_block_end() \
> > > +   do { WARN_ON(current->non_block_count-- == 0); } while (0)
> > >
> > > Please use a static inline?
> >
> > We need get_current() plus the task_struct, so this gets real messy
> > real fast. Not even sure which header this would fit in, or whether
> > I'd need to create a new one. You're insisting on this or respinning
> > with the do { } while (0) dropped ok.
>
> > My preference is always a static inline, but if the headers are so
> twisty we need to use #define to solve a missing include, then I
> wouldn't insist on it.

Cleanest would be a new header I guess, together with might_sleep().
But moving that is a bit much I think, there's almost 500 callers of
that one from a quick git grep

> If dropping do while is the only change then I can edit it in..
> I think we have the acks now

Yeah sounds simplest, thanks.
-Daniel


Re: [PATCH 3/5] kernel.h: Add non_block_start/end()

2019-08-28 Thread Jason Gunthorpe
On Wed, Aug 28, 2019 at 08:33:13PM +0200, Daniel Vetter wrote:
> On Wed, Aug 28, 2019 at 12:50 AM Jason Gunthorpe  wrote:
> >
> > > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > > index 4fa360a13c1e..82f84cfe372f 100644
> > > +++ b/include/linux/kernel.h
> > > @@ -217,7 +217,9 @@ extern void __cant_sleep(const char *file, int line, 
> > > int preempt_offset);
> > >   * might_sleep - annotation for functions that can sleep
> > >   *
> > >   * this macro will print a stack trace if it is executed in an atomic
> > > - * context (spinlock, irq-handler, ...).
> > > + * context (spinlock, irq-handler, ...). Additional sections where 
> > > blocking is
> > > + * not allowed can be annotated with non_block_start() and 
> > > non_block_end()
> > > + * pairs.
> > >   *
> > >   * This is a useful debugging help to be able to catch problems early 
> > > and not
> > >   * be bitten later when the calling function happens to sleep when it is 
> > > not
> > > @@ -233,6 +235,25 @@ extern void __cant_sleep(const char *file, int line, 
> > > int preempt_offset);
> > >  # define cant_sleep() \
> > >   do { __cant_sleep(__FILE__, __LINE__, 0); } while (0)
> > >  # define sched_annotate_sleep()  (current->task_state_change = 0)
> > > +/**
> > > + * non_block_start - annotate the start of section where sleeping is 
> > > prohibited
> > > + *
> > > + * This is on behalf of the oom reaper, specifically when it is calling 
> > > the mmu
> > > + * notifiers. The problem is that if the notifier were to block on, for 
> > > example,
> > > + * mutex_lock() and if the process which holds that mutex were to 
> > > perform a
> > > + * sleeping memory allocation, the oom reaper is now blocked on 
> > > completion of
> > > + * that memory allocation. Other blocking calls like wait_event() pose 
> > > similar
> > > + * issues.
> > > + */
> > > +# define non_block_start() \
> > > + do { current->non_block_count++; } while (0)
> > > +/**
> > > + * non_block_end - annotate the end of section where sleeping is 
> > > prohibited
> > > + *
> > > + * Closes a section opened by non_block_start().
> > > + */
> > > +# define non_block_end() \
> > > + do { WARN_ON(current->non_block_count-- == 0); } while (0)
> >
> > check-patch does not like these, and I agree
> >
> > #101: FILE: include/linux/kernel.h:248:
> > +# define non_block_start() \
> > +   do { current->non_block_count++; } while (0)
> >
> > /tmp/tmp1spfxufy/0006-kernel-h-Add-non_block_start-end-.patch:108: WARNING: 
> > Single statement macros should not use a do {} while (0) loop
> > #108: FILE: include/linux/kernel.h:255:
> > +# define non_block_end() \
> > +   do { WARN_ON(current->non_block_count-- == 0); } while (0)
> >
> > Please use a static inline?
> 
> We need get_current() plus the task_struct, so this gets real messy
> real fast. Not even sure which header this would fit in, or whether
> I'd need to create a new one. You're insisting on this or respinning
> with the do { } while (0) dropped ok.

My preference is always a static inline, but if the headers are so
twisty we need to use #define to solve a missing include, then I
wouldn't insist on it.

If dropping do while is the only change then I can edit it in..
I think we have the acks now

Jason
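
For readers following along, here is a minimal user-space sketch of the two
options under discussion: plain expression macros with the do { } while (0)
dropped, and static inlines. It is only an illustration. struct task_model,
the "current" stand-in, warn_on_model() and the *_inline names are
hypothetical and not kernel API; the real macros use get_current() and
WARN_ON().

```c
#include <assert.h>

/* Hypothetical user-space stand-ins; none of these names are kernel API. */
struct task_model {
	int non_block_count;
};

static struct task_model task_model_current;
#define current (&task_model_current)

static int warn_count;

/* Stand-in for WARN_ON(): counts hits and returns the condition. */
static inline int warn_on_model(int cond)
{
	if (cond)
		warn_count++;
	return cond;
}

/* Option 1: keep the macros, but as plain expressions without do/while. */
#define non_block_start()	(current->non_block_count++)
#define non_block_end()		warn_on_model(current->non_block_count-- == 0)

/* Option 2: static inlines, which need the full task type to be visible. */
static inline void non_block_start_inline(void)
{
	current->non_block_count++;
}

static inline void non_block_end_inline(void)
{
	warn_on_model(current->non_block_count-- == 0);
}

/* Both forms are single statements, so an unbraced if/else stays correct. */
static int open_and_close(int use_macros)
{
	if (use_macros)
		non_block_start();
	else
		non_block_start_inline();

	if (use_macros)
		non_block_end();
	else
		non_block_end_inline();

	assert(current->non_block_count == 0);
	return warn_count;
}
```

Either form expands to a single statement, which is why the do { } while (0)
wrapper buys nothing here. The inline variant is the one that needs the task
structure definition in scope at the point of declaration, which is the
header problem mentioned above.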


Re: [PATCH 3/5] kernel.h: Add non_block_start/end()

2019-08-28 Thread Daniel Vetter
On Wed, Aug 28, 2019 at 12:50 AM Jason Gunthorpe  wrote:
>
> > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > index 4fa360a13c1e..82f84cfe372f 100644
> > +++ b/include/linux/kernel.h
> > @@ -217,7 +217,9 @@ extern void __cant_sleep(const char *file, int line, 
> > int preempt_offset);
> >   * might_sleep - annotation for functions that can sleep
> >   *
> >   * this macro will print a stack trace if it is executed in an atomic
> > - * context (spinlock, irq-handler, ...).
> > + * context (spinlock, irq-handler, ...). Additional sections where 
> > blocking is
> > + * not allowed can be annotated with non_block_start() and non_block_end()
> > + * pairs.
> >   *
> >   * This is a useful debugging help to be able to catch problems early and 
> > not
> >   * be bitten later when the calling function happens to sleep when it is 
> > not
> > @@ -233,6 +235,25 @@ extern void __cant_sleep(const char *file, int line, 
> > int preempt_offset);
> >  # define cant_sleep() \
> >   do { __cant_sleep(__FILE__, __LINE__, 0); } while (0)
> >  # define sched_annotate_sleep()  (current->task_state_change = 0)
> > +/**
> > + * non_block_start - annotate the start of section where sleeping is 
> > prohibited
> > + *
> > + * This is on behalf of the oom reaper, specifically when it is calling 
> > the mmu
> > + * notifiers. The problem is that if the notifier were to block on, for 
> > example,
> > + * mutex_lock() and if the process which holds that mutex were to perform a
> > + * sleeping memory allocation, the oom reaper is now blocked on completion 
> > of
> > + * that memory allocation. Other blocking calls like wait_event() pose 
> > similar
> > + * issues.
> > + */
> > +# define non_block_start() \
> > + do { current->non_block_count++; } while (0)
> > +/**
> > + * non_block_end - annotate the end of section where sleeping is prohibited
> > + *
> > + * Closes a section opened by non_block_start().
> > + */
> > +# define non_block_end() \
> > + do { WARN_ON(current->non_block_count-- == 0); } while (0)
>
> check-patch does not like these, and I agree
>
> #101: FILE: include/linux/kernel.h:248:
> +# define non_block_start() \
> +   do { current->non_block_count++; } while (0)
>
> /tmp/tmp1spfxufy/0006-kernel-h-Add-non_block_start-end-.patch:108: WARNING: 
> Single statement macros should not use a do {} while (0) loop
> #108: FILE: include/linux/kernel.h:255:
> +# define non_block_end() \
> +   do { WARN_ON(current->non_block_count-- == 0); } while (0)
>
> Please use a static inline?

We need get_current() plus the task_struct, so this gets real messy
real fast. Not even sure which header this would fit in, or whether
I'd need to create a new one. Are you insisting on this, or is respinning
with the do { } while (0) dropped ok?

Thanks, Daniel

> Also, can we get one more ack on this patch?
>
> Jason





Re: [PATCH 3/5] kernel.h: Add non_block_start/end()

2019-08-28 Thread Michal Hocko
On Mon 26-08-19 22:14:23, Daniel Vetter wrote:
> In some special cases we must not block, but there's not a
> spinlock, preempt-off, irqs-off or similar critical section already
> that arms the might_sleep() debug checks. Add a non_block_start/end()
> pair to annotate these.
> 
> This will be used in the oom paths of mmu-notifiers, where blocking is
> not allowed to make sure there's forward progress. Quoting Michal:
> 
> "The notifier is called from quite a restricted context - oom_reaper -
> which shouldn't depend on any locks or sleepable conditionals. The code
> should be swift as well but we mostly do care about it to make a forward
> progress. Checking for sleepable context is the best thing we could come
> up with that would describe these demands at least partially."
> 
> Peter also asked whether we want to catch spinlocks on top, but Michal
> said those are less of a problem because spinlocks can't have an
> indirect dependency upon the page allocator and hence close the loop
> with the oom reaper.
> 
> Suggested by Michal Hocko.
> 
> v2:
> - Improve commit message (Michal)
> - Also check in schedule, not just might_sleep (Peter)
> 
> v3: It works better when I actually squash in the fixup I had lying
> around :-/
> 
> v4: Pick the suggestion from Andrew Morton to give non_block_start/end
> some good kerneldoc comments. I added that other blocking calls like
> wait_event pose similar issues, since that's the other example we
> discussed.
> 
> Cc: Jason Gunthorpe 
> Cc: Peter Zijlstra 
> Cc: Ingo Molnar 
> Cc: Andrew Morton 
> Cc: Michal Hocko 
> Cc: David Rientjes 
> Cc: "Christian König" 
> Cc: Daniel Vetter 
> Cc: "Jérôme Glisse" 
> Cc: linux...@kvack.org
> Cc: Masahiro Yamada 
> Cc: Wei Wang 
> Cc: Andy Shevchenko 
> Cc: Thomas Gleixner 
> Cc: Jann Horn 
> Cc: Feng Tang 
> Cc: Kees Cook 
> Cc: Randy Dunlap 
> Cc: linux-ker...@vger.kernel.org
> Acked-by: Christian König  (v1)
> Acked-by: Peter Zijlstra (Intel) 
> Signed-off-by: Daniel Vetter 

Acked-by: Michal Hocko 

Thanks and sorry for being mostly silent/slow in discussions here.
ETOOBUSY.

> ---
>  include/linux/kernel.h | 25 ++++++++++++++++++++++++-
>  include/linux/sched.h  |  4 ++++
>  kernel/sched/core.c    | 19 ++++++++++++++-----
>  3 files changed, 42 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index 4fa360a13c1e..82f84cfe372f 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -217,7 +217,9 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
>   * might_sleep - annotation for functions that can sleep
>   *
>   * this macro will print a stack trace if it is executed in an atomic
> - * context (spinlock, irq-handler, ...).
> + * context (spinlock, irq-handler, ...). Additional sections where blocking is
> + * not allowed can be annotated with non_block_start() and non_block_end()
> + * pairs.
>   *
>   * This is a useful debugging help to be able to catch problems early and not
>   * be bitten later when the calling function happens to sleep when it is not
> @@ -233,6 +235,25 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
>  # define cant_sleep() \
>   do { __cant_sleep(__FILE__, __LINE__, 0); } while (0)
>  # define sched_annotate_sleep()  (current->task_state_change = 0)
> +/**
> + * non_block_start - annotate the start of section where sleeping is prohibited
> + *
> + * This is on behalf of the oom reaper, specifically when it is calling the mmu
> + * notifiers. The problem is that if the notifier were to block on, for example,
> + * mutex_lock() and if the process which holds that mutex were to perform a
> + * sleeping memory allocation, the oom reaper is now blocked on completion of
> + * that memory allocation. Other blocking calls like wait_event() pose similar
> + * issues.
> + */
> +# define non_block_start() \
> + do { current->non_block_count++; } while (0)
> +/**
> + * non_block_end - annotate the end of section where sleeping is prohibited
> + *
> + * Closes a section opened by non_block_start().
> + */
> +# define non_block_end() \
> + do { WARN_ON(current->non_block_count-- == 0); } while (0)
>  #else
>static inline void ___might_sleep(const char *file, int line,
>  int preempt_offset) { }
> @@ -241,6 +262,8 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
>  # define might_sleep() do { might_resched(); } while (0)
>  # define cant_sleep() do { } while (0)
>  # define sched_annotate_sleep() do { } while (0)
> +# define non_block_start() do { } while (0)
> +# define non_block_end() do { } while (0)
>  #endif
>  
>  #define might_sleep_if(cond) do { if (cond) might_sleep(); } while (0)
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index b6ec130dff9b..e8bb965f5019 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -980,6 +980,10 @@ struct task_struct {

Re: [PATCH 3/5] kernel.h: Add non_block_start/end()

2019-08-27 Thread Jason Gunthorpe
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index 4fa360a13c1e..82f84cfe372f 100644
> +++ b/include/linux/kernel.h
> @@ -217,7 +217,9 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
>   * might_sleep - annotation for functions that can sleep
>   *
>   * this macro will print a stack trace if it is executed in an atomic
> - * context (spinlock, irq-handler, ...).
> + * context (spinlock, irq-handler, ...). Additional sections where blocking is
> + * not allowed can be annotated with non_block_start() and non_block_end()
> + * pairs.
>   *
>   * This is a useful debugging help to be able to catch problems early and not
>   * be bitten later when the calling function happens to sleep when it is not
> @@ -233,6 +235,25 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
>  # define cant_sleep() \
>   do { __cant_sleep(__FILE__, __LINE__, 0); } while (0)
>  # define sched_annotate_sleep()  (current->task_state_change = 0)
> +/**
> + * non_block_start - annotate the start of section where sleeping is prohibited
> + *
> + * This is on behalf of the oom reaper, specifically when it is calling the mmu
> + * notifiers. The problem is that if the notifier were to block on, for example,
> + * mutex_lock() and if the process which holds that mutex were to perform a
> + * sleeping memory allocation, the oom reaper is now blocked on completion of
> + * that memory allocation. Other blocking calls like wait_event() pose similar
> + * issues.
> + */
> +# define non_block_start() \
> + do { current->non_block_count++; } while (0)
> +/**
> + * non_block_end - annotate the end of section where sleeping is prohibited
> + *
> + * Closes a section opened by non_block_start().
> + */
> +# define non_block_end() \
> + do { WARN_ON(current->non_block_count-- == 0); } while (0)

check-patch does not like these, and I agree

#101: FILE: include/linux/kernel.h:248:
+# define non_block_start() \
+   do { current->non_block_count++; } while (0)

/tmp/tmp1spfxufy/0006-kernel-h-Add-non_block_start-end-.patch:108: WARNING: Single statement macros should not use a do {} while (0) loop
#108: FILE: include/linux/kernel.h:255:
+# define non_block_end() \
+   do { WARN_ON(current->non_block_count-- == 0); } while (0)

Please use a static inline?

Also, can we get one more ack on this patch?

Jason


[PATCH 3/5] kernel.h: Add non_block_start/end()

2019-08-26 Thread Daniel Vetter
In some special cases we must not block, but there's not a
spinlock, preempt-off, irqs-off or similar critical section already
that arms the might_sleep() debug checks. Add a non_block_start/end()
pair to annotate these.
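
As a rough illustration of the intended usage, here is a user-space sketch
(not the kernel code). It assumes a hypothetical struct task_model standing
in for task_struct, a "current" macro pointing at a single static instance,
and warn_on_model()/might_sleep_model() as stand-ins for WARN_ON() and
might_sleep(). Only the non_block_start()/non_block_end() names come from
this patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for the kernel's task_struct / current. */
struct task_model {
	int non_block_count;	/* depth of open non-blocking sections */
	int warnings;		/* how many times a debug check fired */
};

static struct task_model task_model_current;
#define current (&task_model_current)

/* Stand-in for WARN_ON(): record and report, then return the condition. */
static int warn_on_model(int cond, const char *what)
{
	if (cond) {
		current->warnings++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

#define non_block_start() (current->non_block_count++)
#define non_block_end() \
	warn_on_model(current->non_block_count-- == 0, \
		      "unbalanced non_block_end()")

/* Stand-in for might_sleep(): complain inside a non-blocking section. */
static void might_sleep_model(void)
{
	warn_on_model(current->non_block_count != 0,
		      "sleeping call inside non-blocking section");
}

/* Hypothetical callback that may block, e.g. by taking a mutex. */
static void notifier_that_may_block(void)
{
	might_sleep_model();
}

/* Returns how many warnings the annotated section produced. */
static int run_annotated_section(void)
{
	int before = current->warnings;

	non_block_start();
	notifier_that_may_block();	/* flagged: blocking is not allowed here */
	non_block_end();

	notifier_that_may_block();	/* fine: the section is closed again */

	assert(current->non_block_count == 0);	/* start/end were balanced */
	return current->warnings - before;
}
```

The counter lets sections nest, and the check in non_block_end() catches a
close without a matching open. In this model only the call made between
start and end is reported.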

This will be used in the oom paths of mmu-notifiers, where blocking is
not allowed to make sure there's forward progress. Quoting Michal:

"The notifier is called from quite a restricted context - oom_reaper -
which shouldn't depend on any locks or sleepable conditionals. The code
should be swift as well but we mostly do care about it to make a forward
progress. Checking for sleepable context is the best thing we could come
up with that would describe these demands at least partially."

Peter also asked whether we want to catch spinlocks on top, but Michal
said those are less of a problem because spinlocks can't have an
indirect dependency upon the page allocator and hence close the loop
with the oom reaper.

Suggested by Michal Hocko.

v2:
- Improve commit message (Michal)
- Also check in schedule, not just might_sleep (Peter)

v3: It works better when I actually squash in the fixup I had lying
around :-/

v4: Pick the suggestion from Andrew Morton to give non_block_start/end
some good kerneldoc comments. I added that other blocking calls like
wait_event pose similar issues, since that's the other example we
discussed.

Cc: Jason Gunthorpe 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Andrew Morton 
Cc: Michal Hocko 
Cc: David Rientjes 
Cc: "Christian König" 
Cc: Daniel Vetter 
Cc: "Jérôme Glisse" 
Cc: linux...@kvack.org
Cc: Masahiro Yamada 
Cc: Wei Wang 
Cc: Andy Shevchenko 
Cc: Thomas Gleixner 
Cc: Jann Horn 
Cc: Feng Tang 
Cc: Kees Cook 
Cc: Randy Dunlap 
Cc: linux-ker...@vger.kernel.org
Acked-by: Christian König  (v1)
Acked-by: Peter Zijlstra (Intel) 
Signed-off-by: Daniel Vetter 
---
 include/linux/kernel.h | 25 ++++++++++++++++++++++++-
 include/linux/sched.h  |  4 ++++
 kernel/sched/core.c    | 19 ++++++++++++++-----
 3 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 4fa360a13c1e..82f84cfe372f 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -217,7 +217,9 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
  * might_sleep - annotation for functions that can sleep
  *
  * this macro will print a stack trace if it is executed in an atomic
- * context (spinlock, irq-handler, ...).
+ * context (spinlock, irq-handler, ...). Additional sections where blocking is
+ * not allowed can be annotated with non_block_start() and non_block_end()
+ * pairs.
  *
  * This is a useful debugging help to be able to catch problems early and not
  * be bitten later when the calling function happens to sleep when it is not
@@ -233,6 +235,25 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
 # define cant_sleep() \
    do { __cant_sleep(__FILE__, __LINE__, 0); } while (0)
 # define sched_annotate_sleep()  (current->task_state_change = 0)
+/**
+ * non_block_start - annotate the start of section where sleeping is prohibited
+ *
+ * This is on behalf of the oom reaper, specifically when it is calling the mmu
+ * notifiers. The problem is that if the notifier were to block on, for example,
+ * mutex_lock() and if the process which holds that mutex were to perform a
+ * sleeping memory allocation, the oom reaper is now blocked on completion of
+ * that memory allocation. Other blocking calls like wait_event() pose similar
+ * issues.
+ */
+# define non_block_start() \
+   do { current->non_block_count++; } while (0)
+/**
+ * non_block_end - annotate the end of section where sleeping is prohibited
+ *
+ * Closes a section opened by non_block_start().
+ */
+# define non_block_end() \
+   do { WARN_ON(current->non_block_count-- == 0); } while (0)
 #else
   static inline void ___might_sleep(const char *file, int line,
   int preempt_offset) { }
@@ -241,6 +262,8 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset);
 # define might_sleep() do { might_resched(); } while (0)
 # define cant_sleep() do { } while (0)
 # define sched_annotate_sleep() do { } while (0)
+# define non_block_start() do { } while (0)
+# define non_block_end() do { } while (0)
 #endif
 
 #define might_sleep_if(cond) do { if (cond) might_sleep(); } while (0)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index b6ec130dff9b..e8bb965f5019 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -980,6 +980,10 @@ struct task_struct {
    struct mutex_waiter *blocked_on;
 #endif
 
+#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
+   int non_block_count;
+#endif
+
 #ifdef CONFIG_TRACE_IRQFLAGS
    unsigned int    irq_events;
    unsigned long   hardirq_enable_ip;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 45dceec209f4..0d01c7994a9a