David Rientjes wrote:
> On Tue, 2 Feb 2016, Michal Hocko wrote:
> > > I'm baffled by any reference to "memcg oom heavy loads", I don't
> > > understand this paragraph, sorry. If a memcg is oom, we shouldn't be
> > > disrupting the global runqueue by running oom_reaper at a high priority.
> > >
On Tue, 2 Feb 2016, Tetsuo Handa wrote:
> Maybe we all agree with introducing the OOM reaper without queuing, but I do
> want to see a guarantee that the next OOM-kill operation can be scheduled
> before trying to build a reliable queuing chain.
>
The race can be fixed in two ways which I've already
On Tue, 2 Feb 2016, Michal Hocko wrote:
> > Not exclude them, but I would have expected untrack_pfn().
>
> My understanding is that vm_normal_page will do the right thing for
> those mappings - especially for CoW VM_PFNMAP which are normal pages
> AFAIU. Wrt. to untrack_pfn I was relying that
Michal Hocko wrote:
> > In this case, the oom reaper has ignored the next victim and doesn't do
> > anything; the simple race has prevented it from zapping memory and does
> > not reduce the livelock probability.
> >
> > This can be solved either by queueing mm's to reap or involving the oom
>
On Mon 01-02-16 19:02:06, David Rientjes wrote:
> On Thu, 28 Jan 2016, Michal Hocko wrote:
>
> > [...]
> > > > +static bool __oom_reap_vmas(struct mm_struct *mm)
> > > > +{
> > > > + struct mmu_gather tlb;
> > > > + struct vm_area_struct *vma;
> > > > + struct zap_details
On Thu, 28 Jan 2016, Michal Hocko wrote:
> [...]
> > > +static bool __oom_reap_vmas(struct mm_struct *mm)
> > > +{
> > > + struct mmu_gather tlb;
> > > + struct vm_area_struct *vma;
> > > + struct zap_details details = {.check_swap_entries = true,
> > > +
On Wed 27-01-16 17:28:10, David Rientjes wrote:
> On Wed, 6 Jan 2016, Michal Hocko wrote:
>
> > From: Michal Hocko
> >
> > This is based on the idea from Mel Gorman discussed during LSFMM 2015 and
> > independently brought up by Oleg Nesterov.
> >
>
> Suggested-bys?
Sure, why not.
> > The
On Wed, 6 Jan 2016, Michal Hocko wrote:
> From: Michal Hocko
>
> This is based on the idea from Mel Gorman discussed during LSFMM 2015 and
> independently brought up by Oleg Nesterov.
>
Suggested-bys?
> The OOM killer currently allows killing only a single task in the
> hope that the task
On Thu 07-01-16 20:23:04, Tetsuo Handa wrote:
[...]
> According to commit a2b829d95958da20 ("mm/oom_kill.c: avoid attempting
> to kill init sharing same memory"), the patch below is needed to avoid
> killing the init process with SIGSEGV.
>
> --
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>
On Wed 06-01-16 09:26:12, Paul Gortmaker wrote:
> [Re: [PATCH 1/2] mm, oom: introduce oom reaper] On 06/01/2016 (Wed 10:10)
> Michal Hocko wrote:
>
> > On Mon 21-12-15 15:38:21, Paul Gortmaker wrote:
> > [...]
> > > ...use one of the non-modular initcalls
[Re: [PATCH 1/2] mm, oom: introduce oom reaper] On 06/01/2016 (Wed 10:10)
Michal Hocko wrote:
> On Mon 21-12-15 15:38:21, Paul Gortmaker wrote:
> [...]
> > ...use one of the non-modular initcalls here? I'm trying to clean up most
> > of
> > the non-modular uses of m
On Mon 21-12-15 15:38:21, Paul Gortmaker wrote:
[...]
> ...use one of the non-modular initcalls here? I'm trying to clean up most of
> the non-modular uses of modular macros etc. since:
>
> (1) it is easy to accidentally code up an unused module_exit function
> (2) it can be misleading when
On Fri 25-12-15 12:35:37, Michal Hocko wrote:
[...]
> Thanks I will try to reproduce early next year. But so far I think this
> is just a general issue of MADV_DONTNEED vs. truncate and oom_reaper is
> just lucky to trigger it. There shouldn't be anything oom_reaper
> specific here. Maybe there is
On Thu 24-12-15 20:06:50, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > This is VM_BUG_ON_PAGE(page_mapped(page), page), right? Could you attach
> > the full kernel log? It all smells like a race when OOM reaper tears
> > down the mapping and there is a truncate still in progress. But hitting
> >
On Thu 24-12-15 13:44:03, Ross Zwisler wrote:
> On Thu, Dec 24, 2015 at 2:47 AM, Michal Hocko wrote:
> > On Wed 23-12-15 16:00:09, Ross Zwisler wrote:
> > [...]
> >> While running xfstests on next-20151223 I hit a pair of kernel BUGs
> >> that bisected to this commit:
> >>
> >> 1eb3a80d8239 ("mm,
On Thu, Dec 24, 2015 at 4:06 AM, Tetsuo Handa wrote:
> Michal Hocko wrote:
>> This is VM_BUG_ON_PAGE(page_mapped(page), page), right? Could you attach
>> the full kernel log? It all smells like a race when OOM reaper tears
>> down the mapping and there is a truncate still in progress. But hitting
On Thu, Dec 24, 2015 at 2:47 AM, Michal Hocko wrote:
> On Wed 23-12-15 16:00:09, Ross Zwisler wrote:
> [...]
>> While running xfstests on next-20151223 I hit a pair of kernel BUGs
>> that bisected to this commit:
>>
>> 1eb3a80d8239 ("mm, oom: introduce oom reaper")
>
> Thank you for the report
Michal Hocko wrote:
> This is VM_BUG_ON_PAGE(page_mapped(page), page), right? Could you attach
> the full kernel log? It all smells like a race when OOM reaper tears
> down the mapping and there is a truncate still in progress. But hitting
> the BUG_ON just because of that doesn't make much sense
On Wed 23-12-15 16:00:09, Ross Zwisler wrote:
[...]
> While running xfstests on next-20151223 I hit a pair of kernel BUGs
> that bisected to this commit:
>
> 1eb3a80d8239 ("mm, oom: introduce oom reaper")
Thank you for the report and the bisection.
> Here is a BUG produced by generic/029 when
On Tue, Dec 15, 2015 at 11:36 AM, Michal Hocko wrote:
> From: Michal Hocko
>
> This is based on the idea from Mel Gorman discussed during LSFMM 2015 and
> independently brought up by Oleg Nesterov.
>
> The OOM killer currently allows killing only a single task in the
> hope that the task will
On Tue, Dec 15, 2015 at 1:36 PM, Michal Hocko wrote:
> From: Michal Hocko
>
> This is based on the idea from Mel Gorman discussed during LSFMM 2015 and
> independently brought up by Oleg Nesterov.
>
[...]
Since this is always built-in, can we
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>
On Fri 18-12-15 13:14:00, Andrew Morton wrote:
> On Fri, 18 Dec 2015 12:54:55 +0100 Michal Hocko wrote:
>
> > /* Retry the down_read_trylock(mmap_sem) a few times */
> > - while (attempts++ < 10 && !__oom_reap_vmas(mm))
> > - msleep_interruptible(100);
> > + while (attempts++ <
Tetsuo Handa wrote:
> Complete log is at http://I-love.SAKURA.ne.jp/tmp/serial-20151218.txt.xz .
> --
> [ 438.304082] Killed process 12680 (oom_reaper-test) total-vm:4324kB,
> anon-rss:120kB, file-rss:0kB, shmem-rss:0kB
> [ 439.318951] oom_reaper: attempts=11
> [ 445.581171]
On Fri, 18 Dec 2015 12:54:55 +0100 Michal Hocko wrote:
> /* Retry the down_read_trylock(mmap_sem) a few times */
> - while (attempts++ < 10 && !__oom_reap_vmas(mm))
> - msleep_interruptible(100);
> + while (attempts++ < 10 && !__oom_reap_vmas(mm)) {
> +
On Thu 17-12-15 13:13:56, Andrew Morton wrote:
[...]
> Also, re-reading your description:
>
> : It has been shown (e.g. by Tetsuo Handa) that it is not that hard to
> : construct workloads which break the core assumption mentioned above and
> : the OOM victim might take an unbounded amount of time
Michal Hocko wrote:
> On Wed 16-12-15 16:50:35, Andrew Morton wrote:
> > On Tue, 15 Dec 2015 19:36:15 +0100 Michal Hocko wrote:
> [...]
> > > +static void oom_reap_vmas(struct mm_struct *mm)
> > > +{
> > > + int attempts = 0;
> > > +
> > > + while (attempts++ < 10 && !__oom_reap_vmas(mm))
> > > +
On Thu 17-12-15 12:00:04, Andrew Morton wrote:
> On Thu, 17 Dec 2015 11:55:11 -0800 Linus Torvalds wrote:
>
> > On Thu, Dec 17, 2015 at 5:02 AM, Michal Hocko wrote:
> > > Ups. You are right. I will go with msleep_interruptible(100).
> >
> > I don't think that's right.
> >
> > If a signal
On Thu 17-12-15 16:15:21, Andrew Morton wrote:
> On Tue, 15 Dec 2015 19:36:15 +0100 Michal Hocko wrote:
>
> > This patch reduces the probability of such a lockup by introducing a
> > specialized kernel thread (oom_reaper)
>
> CONFIG_MMU=n:
>
> slub.c:(.text+0x4184): undefined reference to
On Tue, 15 Dec 2015 19:36:15 +0100 Michal Hocko wrote:
> This patch reduces the probability of such a lockup by introducing a
> specialized kernel thread (oom_reaper)
CONFIG_MMU=n:
slub.c:(.text+0x4184): undefined reference to `tlb_gather_mmu'
slub.c:(.text+0x41bc): undefined reference to
On Thu, 17 Dec 2015 14:02:24 +0100 Michal Hocko wrote:
> > I guess it means that the __oom_reap_vmas() success rate is nice and
> > high ;)
>
> I had debugging trace_printks around this and there were no retries
> during my testing so I was probably lucky to not trigger the mmap_sem
>
On Thu, 17 Dec 2015 11:55:11 -0800 Linus Torvalds wrote:
> On Thu, Dec 17, 2015 at 5:02 AM, Michal Hocko wrote:
> > Ups. You are right. I will go with msleep_interruptible(100).
>
> I don't think that's right.
>
> If a signal happens, that loop is now (again) just busy-looping.
It's called
On Thu, Dec 17, 2015 at 5:02 AM, Michal Hocko wrote:
> Ups. You are right. I will go with msleep_interruptible(100).
I don't think that's right.
If a signal happens, that loop is now (again) just busy-looping. That
doesn't sound right, although with the maximum limit of 10 attempts,
maybe it's
On Wed 16-12-15 16:50:35, Andrew Morton wrote:
> On Tue, 15 Dec 2015 19:36:15 +0100 Michal Hocko wrote:
[...]
> > +static void oom_reap_vmas(struct mm_struct *mm)
> > +{
> > + int attempts = 0;
> > +
> > + while (attempts++ < 10 && !__oom_reap_vmas(mm))
> > +
On Tue, 15 Dec 2015 19:36:15 +0100 Michal Hocko wrote:
> From: Michal Hocko
>
> This is based on the idea from Mel Gorman discussed during LSFMM 2015 and
> independently brought up by Oleg Nesterov.
>
> The OOM killer currently allows killing only a single task in the
> hope that the task