Re: [PATCH 1/2] fs: fsnotify: account fsnotify metadata to kmemcg

2018-06-29 Thread Michal Hocko
ify_perm_event_cachep caches with SLAB_ACCOUNT and instead specify > > __GFP_ACCOUNT manually? Otherwise the patch looks good to me. > > > > Hi Jan, IMHO having a visible __GFP_ACCOUNT along with > memalloc_use_memcg() makes the code more explicit and readable that we > want to targeted/remote memcg charging. Agreed. If you had an implicit SLAB_ACCOUNT then you could get inconsistencies when some allocations would get charged to the current task while others would not. -- Michal Hocko SUSE Labs

Re: [PATCH] mm,oom: Bring OOM notifier callbacks to outside of OOM killer.

2018-06-29 Thread Michal Hocko
On Thu 28-06-18 14:31:05, Paul E. McKenney wrote: > On Thu, Jun 28, 2018 at 01:39:42PM +0200, Michal Hocko wrote: > > On Wed 27-06-18 07:31:25, Paul E. McKenney wrote: > > > On Wed, Jun 27, 2018 at 09:22:07AM +0200, Michal Hocko wrote: > > > > On Tue 26-06-18 10

Re: [PATCH] mm: hugetlb: yield when prepping struct pages

2018-06-29 Thread Michal Hocko
implementation has been hacked into the existing hugetlb code in a quite ugly way. We have done some cleanups since then but there is still a lot of room for improvements. -- Michal Hocko SUSE Labs

Re: [PATCH] memcg, oom: move out_of_memory back to the charge path

2018-06-29 Thread Michal Hocko
On Thu 28-06-18 16:19:07, Greg Thelen wrote: > Michal Hocko wrote: [...] > > + if (mem_cgroup_out_of_memory(memcg, mask, order)) > > + return OOM_SUCCESS; > > + > > + WARN(1,"Memory cgroup charge failed because of no reclaimable memory! "

Re: [RFC v2 PATCH 2/2] mm: mmap: zap pages with read mmap_sem for large mapping

2018-06-28 Thread Michal Hocko
On Wed 27-06-18 10:23:39, Yang Shi wrote: > > > On 6/27/18 12:24 AM, Michal Hocko wrote: > > On Tue 26-06-18 18:03:34, Yang Shi wrote: > > > > > > On 6/26/18 12:43 AM, Peter Zijlstra wrote: > > > > On Mon, Jun 25, 2018 at 05:06:23PM -0700, Yang Sh

Re: [PATCH v10 2/2] Refactor part of the oom report in dump_header

2018-06-28 Thread Michal Hocko
rictions. It, however, doesn't > > provide any information about memory cgroup the victim belongs to. This > > information can be interesting for container users because they can find > > the victim's container much more easily. > > > > I follow the advices of David Rientjes and

Re: [PATCH] mm: hugetlb: yield when prepping struct pages

2018-06-28 Thread Michal Hocko
eported as > successfully setup. > > Signed-off-by: Cannon Matthews Acked-by: Michal Hocko Thanks! > --- > mm/hugetlb.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index a963f2034dfc..d38273c32d3b 100644 > ---

Re: [PATCH] mm,oom: Bring OOM notifier callbacks to outside of OOM killer.

2018-06-28 Thread Michal Hocko
On Wed 27-06-18 07:31:25, Paul E. McKenney wrote: > On Wed, Jun 27, 2018 at 09:22:07AM +0200, Michal Hocko wrote: > > On Tue 26-06-18 10:03:45, Paul E. McKenney wrote: > > [...] > > > 3.Something else? > > > > How hard it would be to use a d

[PATCH] memcg, oom: move out_of_memory back to the charge path

2018-06-28 Thread Michal Hocko
From: Michal Hocko 3812c8c8f395 ("mm: memcg: do not trap chargers with full callstack on OOM") has changed the ENOMEM semantic of memcg charges. Rather than invoking the oom killer from the charging context it delays the oom killer to the page fault path (pagefault_out_of_memory). Th

Re: [PATCH] mm: drop VM_BUG_ON from __get_free_pages

2018-06-27 Thread Michal Hocko
On Wed 27-06-18 14:14:12, Andrew Morton wrote: > On Wed, 27 Jun 2018 09:50:01 +0200 Vlastimil Babka wrote: > > > On 06/27/2018 09:34 AM, Michal Hocko wrote: > > > On Tue 26-06-18 10:04:16, Andrew Morton wrote: > > > > > > And as I've argued before the c

Re: Memory zeroed when made available to user process

2018-06-27 Thread Michal Hocko
is a bad idea and the flag should have never been merged. I've just mentioned it for completeness. -- Michal Hocko SUSE Labs

Re: [PATCH 2/2] mm: set PG_dma_pinned on get_user_pages*()

2018-06-27 Thread Michal Hocko
On Wed 27-06-18 13:53:49, Jan Kara wrote: > On Wed 27-06-18 13:32:21, Michal Hocko wrote: [...] > > Apart from that, do we really care about 32b here? Big DIO, IB users > > seem to be 64b only AFAIU. > > IMO it is a bad habit to leave unprivileged-user-triggerable oops

Re: [PATCH 2/2] mm: set PG_dma_pinned on get_user_pages*()

2018-06-27 Thread Michal Hocko
On Tue 26-06-18 18:48:25, Jan Kara wrote: > On Tue 26-06-18 15:47:57, Michal Hocko wrote: > > On Mon 18-06-18 12:21:46, Dan Williams wrote: > > [...] > > > I do think we should explore a page flag for pages that are "long > > > term" pinned. Michal asked

Re: [PATCH] mm: drop VM_BUG_ON from __get_free_pages

2018-06-27 Thread Michal Hocko
On Wed 27-06-18 12:47:39, Vlastimil Babka wrote: > On 06/27/2018 09:54 AM, Michal Hocko wrote: > > On Wed 27-06-18 09:50:01, Vlastimil Babka wrote: > >> On 06/27/2018 09:34 AM, Michal Hocko wrote: > >>> On Tue 26-06-18 10:04:16, Andrew Morton wrote: > >>>

Re: why do we still need bootmem allocator?

2018-06-27 Thread Michal Hocko
On Wed 27-06-18 13:11:44, Mike Rapoport wrote: > On Mon, Jun 25, 2018 at 10:09:41AM -0600, Rob Herring wrote: > > On Mon, Jun 25, 2018 at 8:08 AM Michal Hocko wrote: > > > > > > Hi, > > > I am wondering why do we still keep mm/bootmem.c when most architectu

Re: [PATCH] mm: drop VM_BUG_ON from __get_free_pages

2018-06-27 Thread Michal Hocko
On Wed 27-06-18 09:50:01, Vlastimil Babka wrote: > On 06/27/2018 09:34 AM, Michal Hocko wrote: > > On Tue 26-06-18 10:04:16, Andrew Morton wrote: > > > > And as I've argued before the code would be wrong regardless. We would > > leak the memory or worse touch somebody'

Re: [PATCH] mm: drop VM_BUG_ON from __get_free_pages

2018-06-27 Thread Michal Hocko
On Wed 27-06-18 09:34:20, Michal Hocko wrote: > On Tue 26-06-18 10:04:16, Andrew Morton wrote: [...] > > Really, the changelog isn't right. There *is* a real reason to blow > > up. Effectively the caller is attempting to obtain the virtual address > > of a highmem page w

Re: [PATCH] mm: drop VM_BUG_ON from __get_free_pages

2018-06-27 Thread Michal Hocko
On Tue 26-06-18 10:04:16, Andrew Morton wrote: > On Tue, 26 Jun 2018 15:57:39 +0200 Vlastimil Babka wrote: > > > On 06/22/2018 06:28 PM, Michal Hocko wrote: > > > From: Michal Hocko > > > > > > There is no real reason to blow up just because the caller d

Re: [RFC v2 PATCH 2/2] mm: mmap: zap pages with read mmap_sem for large mapping

2018-06-27 Thread Michal Hocko
g two different approaches, > it looks this approach is the most straight-forward one. Yes, you just have to be careful about the max vma count limit. -- Michal Hocko SUSE Labs

Re: [PATCH] mm,oom: Bring OOM notifier callbacks to outside of OOM killer.

2018-06-27 Thread Michal Hocko
On Tue 26-06-18 10:03:45, Paul E. McKenney wrote: [...] > 3.Something else? How hard would it be to use a different API than oom notifiers? E.g. a shrinker which just kicks all the pending callbacks if the reclaim priority reaches low values (e.g. 0)? -- Michal Hocko SUSE Labs
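
The shrinker idea in the message above can be modelled in a few lines of plain C. This is only a userspace sketch and not the kernel shrinker API: struct pending_queue, sketch_shrink() and SKETCH_KICK_PRIORITY are hypothetical names, and it assumes the reclaim priority counts down towards 0 as memory pressure grows.

```c
#include <stddef.h>

/* Hypothetical stand-in for a queue of pending callbacks. */
struct pending_queue {
	size_t nr_pending;	/* callbacks still waiting to run */
	size_t nr_flushed;	/* callbacks that have been kicked */
};

/* Assumed threshold: only act once reclaim is at its most desperate. */
#define SKETCH_KICK_PRIORITY 0

/*
 * Model of a shrinker callback: do nothing while the reclaim priority is
 * above the threshold, kick every pending callback once it is reached.
 * Returns the number of callbacks kicked by this call.
 */
size_t sketch_shrink(struct pending_queue *q, int priority)
{
	size_t kicked;

	if (priority > SKETCH_KICK_PRIORITY)
		return 0;

	kicked = q->nr_pending;
	q->nr_flushed += kicked;
	q->nr_pending = 0;
	return kicked;
}
```

With five pending callbacks, a call with a high priority value leaves the queue untouched, while a call with priority 0 flushes all five.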

Re: Kernel crash after "mm: initialize pages on demand during boot"

2018-06-27 Thread Michal Hocko
struct page *page = pfn_to_page(pfn); > + __init_single_page(page, pfn, zone, > nid); > + } > + } > + break; > + } > + > /* >* Check given memblock attribute by firmware which can affect >* kernel memory layout. If zone==ZONE_MOVABLE but memory is > @@ -5515,6 +5538,9 @@ void __meminit memmap_init_zone(unsigned long size, int > nid, unsigned long zone, > continue; > } > } > +#else > + if (!update_defer_init(pgdat, pfn, end_pfn, _initialised)) > + break; > #endif > > not_early: > > > > This second change fixed the issue for me as well. I just want to report the > issue and can submit a patch if one of approaches above are acceptable, and I > did not miss anything. > > Thanks, > -- > []'s > Herton -- Michal Hocko SUSE Labs

Re: [PATCH] mm: drop VM_BUG_ON from __get_free_pages

2018-06-26 Thread Michal Hocko
On Tue 26-06-18 15:57:39, Vlastimil Babka wrote: > On 06/22/2018 06:28 PM, Michal Hocko wrote: > > From: Michal Hocko > > > > There is no real reason to blow up just because the caller doesn't know > > that __get_free_pages cannot return highmem pages. Simply fix that

Re: [PATCH 2/2] mm: set PG_dma_pinned on get_user_pages*()

2018-06-26 Thread Michal Hocko
so their usage should better not be unbound. -- Michal Hocko SUSE Labs

Re: [PATCH v3 0/3] fix free pmd/pte page handlings on x86

2018-06-26 Thread Michal Hocko
On Tue 26-06-18 10:45:11, Thomas Gleixner wrote: > On Tue, 26 Jun 2018, Michal Hocko wrote: > > On Mon 25-06-18 21:15:03, Kani Toshimitsu wrote: > > > Lastly, for the code maintenance, I believe this memory allocation keeps > > > the code much simpler than it wou

Re: [PATCH v3 0/3] fix free pmd/pte page handlings on x86

2018-06-26 Thread Michal Hocko
On Mon 25-06-18 21:15:03, Kani Toshimitsu wrote: > On Mon, 2018-06-25 at 19:53 +0200, Michal Hocko wrote: > > On Mon 25-06-18 14:56:26, Kani Toshimitsu wrote: > > > On Sun, 2018-06-24 at 15:19 +0200, Thomas Gleixner wrote: > > > > On Wed,

Re: [PATCH v2] mm/memblock: add missing include

2018-06-25 Thread Michal Hocko
ck.o > > The #ifdef has been simplified from: > > #if defined(CONFIG_HAVE_MEMBLOCK) && defined(CONFIG_NO_BOOTMEM) > > to simply: > > #if defined(CONFIG_NO_BOOTMEM) Well, I would appreciate an explanation why we need the NO_BOOTMEM guard in the first place rather

Re: why do we still need bootmem allocator?

2018-06-25 Thread Michal Hocko
On Mon 25-06-18 10:09:41, Rob Herring wrote: > On Mon, Jun 25, 2018 at 8:08 AM Michal Hocko wrote: > > > > Hi, > > I am wondering why do we still keep mm/bootmem.c when most architectures > > already moved to nobootmem. Is there any fundamental reason why others

Re: [PATCH v3 0/3] fix free pmd/pte page handlings on x86

2018-06-25 Thread Michal Hocko
Joerg that allocating memory inside a function that is supposed to free page tables is far from ideal. More so given that the allocation is hardcoded GFP_KERNEL. We already have this antipattern in functions to allocate page tables and it has turned out to be a maintenance PITA long term. So if there is a way around that then I would strongly suggest finding a different solution. Whether that is sufficient to ditch the whole series is not my call though. -- Michal Hocko SUSE Labs

Re: [PATCH] mm,oom: Bring OOM notifier callbacks to outside of OOM killer.

2018-06-25 Thread Michal Hocko
On Mon 25-06-18 16:04:04, peter enderborg wrote: > On 06/25/2018 03:07 PM, Michal Hocko wrote: > > > On Mon 25-06-18 15:03:40, peter enderborg wrote: > >> On 06/20/2018 01:55 PM, Michal Hocko wrote: > >>> On Wed 20-06-18 20:20:38, Tetsuo Handa wrote: > >>

why do we still need bootmem allocator?

2018-06-25 Thread Michal Hocko
in that regard? -- Michal Hocko SUSE Labs

Re: [PATCH] mm/memblock: add missing include and #ifdef

2018-06-25 Thread Michal Hocko
en do not compile this code for !HAVE_MEMBLOCK AFAICS. > /** > * memblock_virt_alloc_internal - allocate boot memory block > * @size: size of memory block to be allocated in bytes > @@ -1433,6 +1435,7 @@ void * __init memblock_virt_alloc_try_nid( > (u64)max_addr); > return NULL; > } > +#endif > > /** > * __memblock_free_early - free boot memory block > -- > 2.11.0 -- Michal Hocko SUSE Labs

Re: [PATCH] mm,oom: Bring OOM notifier callbacks to outside of OOM killer.

2018-06-25 Thread Michal Hocko
On Mon 25-06-18 15:03:40, peter enderborg wrote: > On 06/20/2018 01:55 PM, Michal Hocko wrote: > > On Wed 20-06-18 20:20:38, Tetsuo Handa wrote: > >> Sleeping with oom_lock held can cause AB-BA lockup bug because > >> __alloc_pages_may_oom() does n

Re: [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-06-25 Thread Michal Hocko
On Mon 25-06-18 12:34:43, Paolo Bonzini wrote: > On 25/06/2018 10:45, Michal Hocko wrote: > > On Mon 25-06-18 10:10:18, Paolo Bonzini wrote: > >> On 25/06/2018 09:57, Michal Hocko wrote: > >>> On Sun 24-06-18 10:11:21, Paolo Bonzini wrote: > >>>

Re: [RFC v2 PATCH 2/2] mm: mmap: zap pages with read mmap_sem for large mapping

2018-06-25 Thread Michal Hocko
> VM_DEAD, it should be for both 32-bit and 64-bit. Do we really need any special handling for 32b? Who is going to create GB mappings for all this to be worth doing? -- Michal Hocko SUSE Labs

Re: [PATCH v2 1/4] lib/rhashtable: simplify bucket_table_alloc()

2018-06-25 Thread Michal Hocko
> will now > also use GFP_ATOMIC | __GFP_NOWARN. However, I consider this a positive > consequence > as for the same reasons we want nowarn semantics in bucket_table_alloc(). > > Signed-off-by: Davidlohr Bueso Acked-by: Michal Hocko > --- > > v2: > - Changes based on

Re: [patch] mm, oom: fix unnecessary killing of additional processes

2018-06-25 Thread Michal Hocko
On Fri 22-06-18 11:49:14, David Rientjes wrote: > On Fri, 22 Jun 2018, Michal Hocko wrote: > > > > > preempt_disable() is required because it calls kvm_kick_many_cpus() > > > > with > > > > wait == true because KVM_REQ_APIC_PAGE_RELO

Re: [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-06-25 Thread Michal Hocko
On Mon 25-06-18 10:10:18, Paolo Bonzini wrote: > On 25/06/2018 09:57, Michal Hocko wrote: > > On Sun 24-06-18 10:11:21, Paolo Bonzini wrote: > >> On 22/06/2018 17:02, Michal Hocko wrote: > >>> @@ -7215,6 +7216,8 @@ void kvm_arch_mmu_notifier_invalidat

Re: [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-06-25 Thread Michal Hocko
On Sun 24-06-18 10:11:21, Paolo Bonzini wrote: > On 22/06/2018 17:02, Michal Hocko wrote: > > @@ -7215,6 +7216,8 @@ void kvm_arch_mmu_notifier_invalidate_range(struct > > kvm *kvm, > > apic_address = gfn_to_hva(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT); >

[PATCH] mm: drop VM_BUG_ON from __get_free_pages

2018-06-22 Thread Michal Hocko
From: Michal Hocko There is no real reason to blow up just because the caller doesn't know that __get_free_pages cannot return highmem pages. Simply fix that up silently. Even if we have some confused users such a fixup will not be harmful. Signed-off-by: Michal Hocko --- Hi Andrew, previously
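
A minimal userspace sketch of what "fix that up silently" could look like: instead of asserting on a highmem request, the highmem bit is masked out before the request is passed on. The flag names, their bit values and sketch_fixup_gfp() are made up for illustration; the snippet above does not show the actual change.

```c
/* Hypothetical flag bits; the real GFP flag values are not shown above. */
#define SKETCH_GFP_KERNEL  0x01u
#define SKETCH_GFP_HIGHMEM 0x02u

/*
 * Model of the silent fixup: a caller that wrongly asks for highmem
 * gets the request quietly downgraded instead of triggering a BUG.
 */
unsigned int sketch_fixup_gfp(unsigned int gfp_mask)
{
	return gfp_mask & ~SKETCH_GFP_HIGHMEM;
}
```

A correct caller is unaffected, since clearing a bit that was never set leaves the mask unchanged. That is why the fixup is harmless even for confused users.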

Re: [PATCH] kasan: depend on CONFIG_SLUB_DEBUG

2018-06-22 Thread Michal Hocko
c: Andrew Morton > Cc: Andrey Ryabinin > Cc: > Cc: > Cc: > Signed-off-by: Jason A. Donenfeld This is the simplest way to do it, but I strongly suspect that the whole SLUB_DEBUG is not really necessary. Acked-by: Michal Hocko > --- > lib/Kconfig.kasan | 1 + > 1 file cha

Re: [Intel-gfx] [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-06-22 Thread Michal Hocko
On Fri 22-06-18 16:36:49, Chris Wilson wrote: > Quoting Michal Hocko (2018-06-22 16:02:42) > > Hi, > > this is an RFC and not tested at all. I am not very familiar with the > > mmu notifiers semantics very much so this is a crude attempt to achieve > > what I need basica

Re: [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-06-22 Thread Michal Hocko
; mmu_notifier *mn, > > /* notification is exclusive, but interval is inclusive */ > > end -= 1; > > - amdgpu_mn_read_lock(rmn); > > + if (amdgpu_mn_read_lock(rmn, blockable)) > > + return -EAGAIN; > > it = interval_tree_iter_first(>objects, start, end); > > while (it) { > > @@ -262,6 +277,8 @@ static void amdgpu_mn_invalidate_range_start_hsa(struct > > mmu_notifier *mn, > > amdgpu_amdkfd_evict_userptr(mem, mm); > > } > > } > > + > > + return 0; > > } > > /** -- Michal Hocko SUSE Labs

Re: [patch] mm, oom: fix unnecessary killing of additional processes

2018-06-22 Thread Michal Hocko
On Fri 22-06-18 09:42:57, Michal Hocko wrote: > On Thu 21-06-18 13:50:53, David Rientjes wrote: > > On Thu, 21 Jun 2018, Michal Hocko wrote: > > > > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > > > > index 6bcecc325e7e..ac08f5d71

Re: [PATCH v9] Refactor part of the oom report in dump_header

2018-06-22 Thread Michal Hocko
straint and static const char * const > oom_constraint_text[] to two parts, am I right ? Just split the patch into two parts. The first to add oom_constraint* and use it. And the second which adds the missing memcg information to the oom report. -- Michal Hocko SUSE Labs

Re: [PATCH v9] Refactor part of the oom report in dump_header

2018-06-22 Thread Michal Hocko
> > > I do not get why you separate this specific part out. > > oom_constraint_text is not used in the patch. It is almost always > > preferable to have a user of newly added functionality. > > So do I need to separate this part ? You misunderstood my suggestion. Let me be more specific. Please separate the whole new oom_constraint including its _usage_. -- Michal Hocko SUSE Labs

Re: [PATCH 1/2] arm64: avoid alloc memory on offline node

2018-06-22 Thread Michal Hocko
On Fri 22-06-18 16:58:05, Hanjun Guo wrote: > On 2018/6/20 19:51, Punit Agrawal wrote: > > Xie XiuQi writes: > > > >> Hi Lorenzo, Punit, > >> > >> > >> On 2018/6/20 0:32, Lorenzo Pieralisi wrote: > >>> On Tue, Jun 19, 2018 at 04:

Re: dm bufio: Reduce dm_bufio_lock contention

2018-06-22 Thread Michal Hocko
rent_is_kswapd() && > +(sc->gfp_mask & (__GFP_NORETRY | __GFP_FS)) != __GFP_NORETRY > && > current_may_throttle() && pgdat_memcg_congested(pgdat, root)) > wait_iff_congested(BLK_RW_ASYNC, HZ/10); > -- Michal Hocko SUSE Labs
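
The mask test added in the diff above is easy to misread, so here is a small userspace sketch of just that sub-expression. The flag bit values and the name sketch_mask_allows_throttle() are assumptions for illustration; only the shape of the comparison is taken from the quoted hunk.

```c
#include <stdbool.h>

/* Hypothetical bit values standing in for __GFP_FS and __GFP_NORETRY. */
#define SKETCH_GFP_FS      0x1u
#define SKETCH_GFP_NORETRY 0x2u

/*
 * Mirrors (gfp_mask & (__GFP_NORETRY | __GFP_FS)) != __GFP_NORETRY.
 * The result is false only when NORETRY is set and FS is clear, so that
 * is the one combination which skips the throttling in the hunk above.
 */
bool sketch_mask_allows_throttle(unsigned int gfp_mask)
{
	return (gfp_mask & (SKETCH_GFP_NORETRY | SKETCH_GFP_FS))
		!= SKETCH_GFP_NORETRY;
}
```

In other words, three of the four flag combinations still reach the remaining conditions (current_may_throttle() and the congestion check); only a NORETRY request without FS is exempted.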

Re: [PATCH v9] Refactor part of the oom report in dump_header

2018-06-22 Thread Michal Hocko
provide any information about memory cgroup the victim belongs to. This information can be interesting for container users because they can find the victim's container much more easily. " > I follow the advices of David Rientjes and Michal Hocko, and refactor > part of the oom

Re: [patch] mm, oom: fix unnecessary killing of additional processes

2018-06-22 Thread Michal Hocko
On Thu 21-06-18 13:50:53, David Rientjes wrote: > On Thu, 21 Jun 2018, Michal Hocko wrote: > > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > > > index 6bcecc325e7e..ac08f5d711be 100644 > > > > --- a/arch/x86/kvm/x86.c > > > >

Re: [RFC PATCH] memcg, oom: move out_of_memory back to the charge path

2018-06-21 Thread Michal Hocko
On Thu 21-06-18 10:37:51, Johannes Weiner wrote: > On Thu, Jun 21, 2018 at 10:09:27AM +0200, Michal Hocko wrote: > > @@ -496,14 +496,14 @@ void mem_cgroup_print_oom_info(struct mem_cgroup > > *memcg, > > > > static inline void mem_cgroup_oom_enable(void) >

Re: [PATCH] slub: track number of slabs irrespective of CONFIG_SLUB_DEBUG

2018-06-21 Thread Michal Hocko
to reduce the code/data footprint to the > minimum necessary while sacrificing debuggability etc etc. > > Maybe make it impossible to disable CONFIG_SLUB_DEBUG if CGROUPs are in > use? Why don't we simply remove the config option altogether and effectively make it always enabled? -- Michal Hocko SUSE Labs

Re: [PATCH] mm: mempool: Remove unused argument in kasan_unpoison_element() and remove_element()

2018-06-21 Thread Michal Hocko
> Signed-off-by: Jia-Ju Bai Acked-by: Michal Hocko > --- > mm/mempool.c | 12 ++-- > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/mm/mempool.c b/mm/mempool.c > index 5c9dce34719b..3076ab3f7bc4 100644 > --- a/mm/mempool.c > +++ b/mm/mempoo

Re: [PATCH] mm,oom: Bring OOM notifier callbacks to outside of OOM killer.

2018-06-21 Thread Michal Hocko
On Thu 21-06-18 20:27:41, Tetsuo Handa wrote: [] > On 2018/06/21 16:31, Michal Hocko wrote: > > On Wed 20-06-18 15:36:45, David Rientjes wrote: > > [...] > >> That makes me think that "oom_notify_list" isn't very intuitive: it can > >> free memory as

Re: [PATCH 0/4] Small cleanup for memoryhotplug

2018-06-21 Thread Michal Hocko
> register_mem_sect_under_node > > drivers/base/memory.c | 2 - > drivers/base/node.c | 52 +- > include/linux/node.h | 21 +-- > mm/memory_hotplug.c | 101 > ++++++ > 4 files changed, 71 insertions(+), 105 deletions(-) > > -- > 2.13.6 > -- Michal Hocko SUSE Labs

Re: [RFC PATCH] memcg, oom: move out_of_memory back to the charge path

2018-06-21 Thread Michal Hocko
This is an updated version with feedback from Johannes integrated. Still not runtime tested but I am posting it to make further review easier. From ed2796dc3894f93ddf0fc9ec74b83c58abc2b4ff Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Wed, 20 Jun 2018 10:25:10 +0200 Subject: [PATCH] me

Re: [patch] mm, oom: fix unnecessary killing of additional processes

2018-06-21 Thread Michal Hocko
On Thu 21-06-18 09:45:37, Michal Hocko wrote: > On Wed 20-06-18 13:34:52, David Rientjes wrote: > > On Wed, 20 Jun 2018, Michal Hocko wrote: [...] > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > > index 6bcecc325e7e..ac08f5d711be 100644 > > >

Re: [patch] mm, oom: fix unnecessary killing of additional processes

2018-06-21 Thread Michal Hocko
On Wed 20-06-18 13:34:52, David Rientjes wrote: > On Wed, 20 Jun 2018, Michal Hocko wrote: > > > On Tue 19-06-18 10:33:16, Michal Hocko wrote: > > [...] > > > As I've said, if you are not willing to work on a proper solution, I > > > will, but my nack holds

Re: [RFC PATCH] memcg, oom: move out_of_memory back to the charge path

2018-06-21 Thread Michal Hocko
On Wed 20-06-18 15:38:36, Johannes Weiner wrote: > On Wed, Jun 20, 2018 at 05:31:48PM +0200, Michal Hocko wrote: > > * Please note that mem_cgroup_oom_synchronize might fail to find a > > * victim and then we have rely on mem_cgroup_oom_synchronize otherwise > >

Re: [PATCH] mm,oom: Bring OOM notifier callbacks to outside of OOM killer.

2018-06-21 Thread Michal Hocko
undamentally different here? Sure those pages should be reclaimed as the last resort but we already do have priority for slab shrinking so we know that the system is struggling when reaching the lowest priority. Isn't that enough to express the need for current oom notifier implementations? -- Michal Hocko SUSE Labs

Re: [RFC PATCH] memcg, oom: move out_of_memory back to the charge path

2018-06-20 Thread Michal Hocko
On Wed 20-06-18 17:31:48, Michal Hocko wrote: > On Wed 20-06-18 11:18:12, Johannes Weiner wrote: [...] > > 1) Why warn for kernel allocations, but not userspace ones? This > > should have a comment at least. > > I am not sure I understand. We do warn for

Re: [RFC PATCH] memcg, oom: move out_of_memory back to the charge path

2018-06-20 Thread Michal Hocko
On Wed 20-06-18 11:18:12, Johannes Weiner wrote: > On Wed, Jun 20, 2018 at 12:37:36PM +0200, Michal Hocko wrote: [...] > > -static void mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order) > > +enum oom_status { > > + OOM_SUCCESS, > > + OO

Re: [PATCH] mm,oom: Bring OOM notifier callbacks to outside of OOM killer.

2018-06-20 Thread Michal Hocko
On Wed 20-06-18 21:21:21, Tetsuo Handa wrote: > On 2018/06/20 20:55, Michal Hocko wrote: > > On Wed 20-06-18 20:20:38, Tetsuo Handa wrote: > >> Sleeping with oom_lock held can cause AB-BA lockup bug because > >> __alloc_pages_may_oom() does n

Re: [patch] mm, oom: fix unnecessary killing of additional processes

2018-06-20 Thread Michal Hocko
On Tue 19-06-18 10:33:16, Michal Hocko wrote: [...] > As I've said, if you are not willing to work on a proper solution, I > will, but my nack holds for this patch until we see no other way around > existing and real world problems. OK, so I gave it a quick try and it doesn't look all

Re: [PATCH] mm,oom: Bring OOM notifier callbacks to outside of OOM killer.

2018-06-20 Thread Michal Hocko
t us from moving them to shrinkers instead? -- Michal Hocko SUSE Labs

Re: [PATCH] mm/madvise: allow MADV_DONTNEED to free memory that is MLOCK_ONFAULT

2018-06-20 Thread Michal Hocko
On Fri 15-06-18 15:36:07, Jason Baron wrote: > > > On 06/13/2018 03:15 AM, Michal Hocko wrote: > > On Wed 13-06-18 08:32:19, Vlastimil Babka wrote: [...] > >> BTW I didn't get why we should allow this for MADV_DONTNEED but not > >> MADV_FREE. Can you expand

[RFC PATCH] memcg, oom: move out_of_memory back to the charge path

2018-06-20 Thread Michal Hocko
From: Michal Hocko 3812c8c8f395 ("mm: memcg: do not trap chargers with full callstack on OOM") has changed the ENOMEM semantic of memcg charges. Rather than invoking the oom killer from the charging context it delays the oom killer to the page fault path (pagefault_out_of_memory). Th

Re: [RFC v2 PATCH 2/2] mm: mmap: zap pages with read mmap_sem for large mapping

2018-06-20 Thread Michal Hocko
ferences to these pages will generate > SIGSEGV.” Yes, this is true but I guess what Yang Shi meant was that a userspace access racing with munmap is not well defined. You never know whether you get your data, #PF or SEGV because it depends on timing. The user visible change might be that you lose co

Re: [RFC v2 PATCH 2/2] mm: mmap: zap pages with read mmap_sem for large mapping

2018-06-20 Thread Michal Hocko
> I'm supposed this is safe as what Michal said before. I didn't get to read your patches carefully yet but I am wondering why you need to split in the first place. Why can't you simply unmap the range (madvise(DONTNEED)) under the read lock and then take the lock for write to finish the rest? -- Michal Hocko SUSE Labs

Re: [PATCH 1/2] arm64: avoid alloc memory on offline node

2018-06-19 Thread Michal Hocko
t copying this antipattern is not really nice. So it is good as a quick fix but it would be definitely much better to have a robust fix. Who knows how many other places might hit this. You certainly do not want to add a hack like this all over... -- Michal Hocko SUSE Labs

Re: [PATCH 1/2] arm64: avoid alloc memory on offline node

2018-06-19 Thread Michal Hocko
. Could you double check that zonelists for node 3 are generated correctly? -- Michal Hocko SUSE Labs

Re: dm bufio: Reduce dm_bufio_lock contention

2018-06-19 Thread Michal Hocko
ping. No real objection to fixing wrong __GFP_NORETRY usage. But __GFP_NORETRY can sleep. Nothing will really change in that regard. It does a reclaim and that _might_ sleep. But seriously, isn't the best way around the throttling issue to use PF_LESS_THROTTLE? -- Michal Hocko SUSE Labs

Re: [patch] mm, oom: fix unnecessary killing of additional processes

2018-06-19 Thread Michal Hocko
top > of mess and then being surprised that the result is a mess. Are we? The current oom_reaper certainly has some shortcomings that are addressable. We have started simple to cover most cases and move on with more complex heuristics based on real life bug reports. But we _do_ have a quite straightforward feedback based algorithm to reclaim oom victims. This is a solid ground for future development. Something we never had before. So I am really wondering what is all the mess about. -- Michal Hocko SUSE Labs

Re: [patch] mm, oom: fix unnecessary killing of additional processes

2018-06-19 Thread Michal Hocko
I do insist on coming up with a reasonable solution rather than random hacks. Jeez, the oom killer was full of these. As I've said, if you are not willing to work on a proper solution, I will, but my nack holds for this patch until we see no other way around existing and real world problems. -- Michal Hocko SUSE Labs

Re: [PATCH v3] x86/e820: put !E820_TYPE_RAM regions into memblock.reserved

2018-06-15 Thread Michal Hocko
On Fri 15-06-18 10:00:00, Pavel Tatashin wrote: [...] > But, I think the 2nd patch with the optimization above should go along this > this fix. Yes, ideally with some numbers. -- Michal Hocko SUSE Labs

Re: [PATCH v3] x86/e820: put !E820_TYPE_RAM regions into memblock.reserved

2018-06-15 Thread Michal Hocko
ilable > ranges by putting them into memblock.reserved. > > Fixes: f7f99100d8d9 ("mm: stop zeroing memory during allocation in vmemmap") > Signed-off-by: Naoya Horiguchi > Tested-by: Oscar Salvador OK, this makes sense to me. It is definitely much better than the

Re: [PATCH v1] mm: zero remaining unavailable struct pages (Re: kernel panic in reading /proc/kpageflags when enabling RAM-simulated PMEM)

2018-06-15 Thread Michal Hocko
On Fri 15-06-18 01:07:22, Naoya Horiguchi wrote: > On Thu, Jun 14, 2018 at 09:00:50AM +0200, Michal Hocko wrote: > > On Thu 14-06-18 05:16:18, Naoya Horiguchi wrote: > > > On Wed, Jun 13, 2018 at 11:07:00AM +0200, Michal Hocko wrote: > > > > On Wed 13-06-18 0

Re: [patch] mm, oom: fix unnecessary killing of additional processes

2018-06-15 Thread Michal Hocko
guarantee forward progress. > > The reaping timeout is intentionally set for a substantial amount of time > since oom livelock is a very rare occurrence and it's better to optimize > for preventing additional (unnecessary) oom killing than a scenario that > is much more unlikely. >

Re: [PATCH v1] mm: zero remaining unavailable struct pages (Re: kernel panic in reading /proc/kpageflags when enabling RAM-simulated PMEM)

2018-06-14 Thread Michal Hocko
On Thu 14-06-18 05:16:18, Naoya Horiguchi wrote: > On Wed, Jun 13, 2018 at 11:07:00AM +0200, Michal Hocko wrote: > > On Wed 13-06-18 05:41:08, Naoya Horiguchi wrote: > > [...] > > > From: Naoya Horiguchi > > > Date: Wed, 13 Jun 2018 12:43:27 +0900 > >

Re: [PATCH] mm: cma: honor __GFP_ZERO flag in cma_alloc()

2018-06-13 Thread Michal Hocko
for anything other than GFP_KERNEL btw.? If not then, shouldn't we simply drop the gfp argument altogether rather than give users a false hope for different gfp modes that are not really supported and grow broken code? -- Michal Hocko SUSE Labs

Re: [rfc patch] mm, oom: fix unnecessary killing of additional processes

2018-06-13 Thread Michal Hocko
On Wed 13-06-18 22:20:49, Tetsuo Handa wrote: > On 2018/06/05 17:57, Michal Hocko wrote: > >> For this reason, we see testing harnesses often oom killed immediately > >> after running a unittest that stresses reclaim or compaction by inducing a > >> system-wide oom

Re: [PATCH] mm/madvise: allow MADV_DONTNEED to free memory that is MLOCK_ONFAULT

2018-06-13 Thread Michal Hocko
eal requires to analyze what that would mean for other vmas which are excluded now. -- Michal Hocko SUSE Labs

Re: [PATCH v1] mm: zero remaining unavailable struct pages (Re: kernel panic in reading /proc/kpageflags when enabling RAM-simulated PMEM)

2018-06-13 Thread Michal Hocko
pages in the > gap range are left uninitialized. > > We have a function zero_resv_unavail() which does zeroing the struct > pages outside memblock.memory, but currently it covers only the reserved > unavailable range (i.e. memblock.memory && !memblock.reserved). > This patch extends it to cover all unavailable range, which fixes > the reported issue. Thanks for pinpointing this, Naoya! I am wondering why we cannot simply mark the excluded ranges to be reserved instead. -- Michal Hocko SUSE Labs

Re: [PATCH] mm/madvise: allow MADV_DONTNEED to free memory that is MLOCK_ONFAULT

2018-06-13 Thread Michal Hocko
On Wed 13-06-18 09:51:23, Vlastimil Babka wrote: > On 06/13/2018 09:15 AM, Michal Hocko wrote: > > On Wed 13-06-18 08:32:19, Vlastimil Babka wrote: [...] > >> I think more concerning than guaranteeing no later major fault is > >> possible data loss, e.g. replacing

Re: [PATCH] mm/madvise: allow MADV_DONTNEED to free memory that is MLOCK_ONFAULT

2018-06-13 Thread Michal Hocko
On Wed 13-06-18 08:32:19, Vlastimil Babka wrote: > On 06/12/2018 04:11 PM, Jason Baron wrote: > > > > > > On 06/12/2018 03:46 AM, Michal Hocko wrote: > >> On Mon 11-06-18 12:23:58, Jason Baron wrote: > >>> On 06/11/2018 11:03 AM, Michal Hocko wrote: >

Re: [PATCH 1/2] arm64: avoid alloc memory on offline node

2018-06-12 Thread Michal Hocko
On Tue 12-06-18 16:08:03, Punit Agrawal wrote: > Michal Hocko writes: [...] > > Well, the standard way to handle memory less NUMA nodes is to simply > > fallback to the closest NUMA node. We even have an API for that > > (numa_mem_id). > > CONFIG_HAVE_MEMORYLESS n

Re: [PATCH] mm/madvise: allow MADV_DONTNEED to free memory that is MLOCK_ONFAULT

2018-06-12 Thread Michal Hocko
On Mon 11-06-18 12:23:58, Jason Baron wrote: > On 06/11/2018 11:03 AM, Michal Hocko wrote: > > So can we start discussing whether we want to allow MADV_DONTNEED on > > mlocked areas and what downsides it might have? Sure it would turn the > > strong mlock guarantee to have t

Re: [PATCH] mm/madvise: allow MADV_DONTNEED to free memory that is MLOCK_ONFAULT

2018-06-11 Thread Michal Hocko
On Mon 11-06-18 10:51:44, Jason Baron wrote: > On 06/11/2018 03:20 AM, Michal Hocko wrote: > > [CCing linux-api - please make sure to CC this mailing list anytime you > > are touching user visible apis] > > > > On Fri 08-06-18 14:56:52, Jason Baron wrote: &

Re: [PATCH 1/2] arm64: avoid alloc memory on offline node

2018-06-11 Thread Michal Hocko
On Mon 11-06-18 08:43:03, Bjorn Helgaas wrote: > On Mon, Jun 11, 2018 at 08:32:10PM +0800, Xie XiuQi wrote: > > Hi Michal, > > > > On 2018/6/11 16:52, Michal Hocko wrote: > > > On Mon 11-06-18 11:23:18, Xie XiuQi wrote: > > >> Hi Michal, > >

Re: [PATCH v1 00/10] mm: online/offline 4MB chunks controlled by device driver

2018-06-11 Thread Michal Hocko
On Mon 11-06-18 13:53:49, David Hildenbrand wrote: > On 24.05.2018 23:07, David Hildenbrand wrote: > > On 24.05.2018 16:22, Michal Hocko wrote: > >> I will go over the rest of the email later I just wanted to make this > >> point clear because I suspect we

Re: [PATCH 1/2] arm64: avoid alloc memory on offline node

2018-06-11 Thread Michal Hocko
On Mon 11-06-18 11:23:18, Xie XiuQi wrote: > Hi Michal, > > On 2018/6/7 20:21, Michal Hocko wrote: > > On Thu 07-06-18 19:55:53, Hanjun Guo wrote: > >> On 2018/6/7 18:55, Michal Hocko wrote: > > [...] > >>> I am not sure I have the full context but p

Re: [PATCH] mm: fix null pointer dereference in mem_cgroup_protected

2018-06-11 Thread Michal Hocko
d). > Fixes: bf8d5d52ffe8 ("memcg: introduce memory.min") I guess. > Reported-by: Shakeel Butt > Signed-off-by: Roman Gushchin > Cc: Johannes Weiner > Cc: Michal Hocko > Cc: Andrew Morton Acked-by: Michal Hocko I really do not see why the whole min limit thing h

Re: [PATCH] mm/madvise: allow MADV_DONTNEED to free memory that is MLOCK_ONFAULT

2018-06-11 Thread Michal Hocko
K_ONFAULT is for userspace to know when pages > are locked in memory and thus to know when page faults will occur. > > Signed-off-by: Jason Baron > Cc: Andrew Morton > Cc: Michal Hocko > Cc: Vlastimil Babka > Cc: Joonsoo Kim > Cc: Mel Gorman > Cc: Kirill A. Shutem

Re: [PATCH v7 2/2] Refactor part of the oom report in dump_header

2018-06-11 Thread Michal Hocko
y), but I need to > > pass a new parameter(constraint) for oom_kill_process. > > Another option is to add the constraint to the oom_control structure. Which would make more sense because oom_control should contain the full OOM context. -- Michal Hocko SUSE Labs
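
The suggestion above (carry the constraint inside the OOM context rather than passing it as an extra parameter) can be sketched in a few lines of standalone C. This is only an illustration: every `toy_` name below is hypothetical and is not the kernel's `oom_control` definition or its report format.

```c
#include <stdio.h>

/* Hypothetical stand-in for the allocation constraint that triggered the OOM. */
enum toy_oom_constraint {
	TOY_CONSTRAINT_NONE,
	TOY_CONSTRAINT_CPUSET,
	TOY_CONSTRAINT_MEMORY_POLICY,
	TOY_CONSTRAINT_MEMCG,
};

/* Hypothetical OOM context: the constraint travels with the other fields. */
struct toy_oom_control {
	int order;                          /* allocation order */
	enum toy_oom_constraint constraint; /* stored once, read by every consumer */
};

static const char *toy_constraint_name(enum toy_oom_constraint c)
{
	switch (c) {
	case TOY_CONSTRAINT_CPUSET:
		return "cpuset";
	case TOY_CONSTRAINT_MEMORY_POLICY:
		return "mempolicy";
	case TOY_CONSTRAINT_MEMCG:
		return "memcg";
	default:
		return "none";
	}
}

/*
 * The report helper needs only the context pointer; no separate
 * constraint argument has to be threaded through the call chain.
 * Returns the snprintf() result (length of the formatted line).
 */
static int toy_format_header(const struct toy_oom_control *oc,
			     char *buf, size_t len)
{
	return snprintf(buf, len, "oom: order=%d constraint=%s",
			oc->order, toy_constraint_name(oc->constraint));
}
```

With the constraint stored in the context, any function that already receives the context pointer can report it without a signature change.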

Re: [PATCH 1/2] arm64: avoid alloc memory on offline node

2018-06-07 Thread Michal Hocko
On Thu 07-06-18 19:55:53, Hanjun Guo wrote: > On 2018/6/7 18:55, Michal Hocko wrote: [...] > > I am not sure I have the full context but pci_acpi_scan_root calls > > kzalloc_node(sizeof(*info), GFP_KERNEL, node) > > and that should fall back to whatever node that is

Re: [RFC][PATCH 1/2] memcg: Ensure every task that uses an mm is in the same memory cgroup

2018-06-07 Thread Michal Hocko
On Thu 07-06-18 06:42:49, Eric W. Biederman wrote: > Michal Hocko writes: [...] > > Btw. MMF_ALIEN_MM could be used in the OOM proper as well. > > There are two big issues I see with your suggested alternative. > 1) cgroupv1 the task interface. We still need to deny m

Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

2018-06-07 Thread Michal Hocko
top charts. > > So it seems that 4.17 is not doing a good job to move the memory to the right > NUMA > node after the process has been moved. > > 8< > > The above is an excerpt from performance testing on 4.16 and 4.17 kernels. > > For now I'm merely making

Re: [PATCH 1/2] arm64: avoid alloc memory on offline node

2018-06-07 Thread Michal Hocko
> > > Acked-by: Will Deacon > > I agree, this doesn't feel like something we should be avoiding in the > caller of kzalloc_node(). > > I would not expect kzalloc_node() to return memory that's offline, no > matter what node we told it to allocate from. I could imagine it > returning failure, or returning memory from a node that *is* online, > but returning a pointer to offline memory seems broken. > > Are we putting memory that's offline in the free list? I don't know > where to look to figure this out. I am not sure I have the full context but pci_acpi_scan_root calls kzalloc_node(sizeof(*info), GFP_KERNEL, node) and that should fall back to whatever node is online. An offline node shouldn't keep any pages behind. So there must be something else going on here and the patch is not the right way to handle it. What does faddr2line __alloc_pages_nodemask+0xf0 say on this kernel? -- Michal Hocko SUSE Labs
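
The fallback behaviour expected in the message above (a request for an offline node is served from an online one) can be modelled with a small standalone C sketch. It is a toy under stated assumptions, not the kernel allocator: the `toy_` names, the fixed four-node topology, and the "smallest distance wins" rule are all made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define TOY_MAX_NODES 4
#define TOY_NO_NODE (-1)

/* Hypothetical topology: which nodes are online and how far apart they are. */
struct toy_topology {
	bool online[TOY_MAX_NODES];
	int distance[TOY_MAX_NODES][TOY_MAX_NODES]; /* smaller means closer */
};

/*
 * Return the node an allocation should come from: the preferred node
 * if it is online, otherwise the closest online node, or TOY_NO_NODE
 * if no node is online at all. Ties go to the lowest node id.
 */
static int toy_pick_node(const struct toy_topology *t, int preferred)
{
	int best = TOY_NO_NODE;
	int best_dist = INT_MAX;

	assert(preferred >= 0 && preferred < TOY_MAX_NODES);

	if (t->online[preferred])
		return preferred;

	for (int n = 0; n < TOY_MAX_NODES; n++) {
		if (!t->online[n])
			continue;
		if (t->distance[preferred][n] < best_dist) {
			best_dist = t->distance[preferred][n];
			best = n;
		}
	}
	return best;
}
```

In this model an offline node is never returned, which matches the expectation stated in the thread that the caller should not need its own check.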

Re: [PATCH] mremap: Remove LATENCY_LIMIT from mremap to reduce the number of TLB shootdowns

2018-06-07 Thread Michal Hocko
two patches can stack on top > of each other. Yes, I think the other patch still makes some sense. I do not see why it is not helping much but I hope we will learn that. This is a reasonable step in the meantime. I like removing the limit more than the previous version, which merely tweaked it. > Signed-off-

Re: [RFC][PATCH 1/2] memcg: Ensure every task that uses an mm is in the same memory cgroup

2018-06-06 Thread Michal Hocko
make sure to support migration if all processes sharing +* this mm are migrating together. +*/ + if (WARN_ON_ONCE(test_bit(MMF_ALIEN_MM, &mm->flags))) { + mmput(mm); + return -EBUSY; + } + /* We move charges except for creative uses of CLONE_VM */ if (mm->memcg == from) { VM_BUG_ON(mc.from); -- Michal Hocko SUSE Labs

Re: [PATCH 2/3] mm: add find_alloc_contig_pages() interface

2018-06-06 Thread Michal Hocko
type before isolation. This is more a question to Vlastimil, Joonsoo. But my understanding is that it doesn't matter. MIGRATE_MOVABLE will not block other allocations. So we seem to need it only for MIGRATE_CMA. The latter should die sooner or later hopefully so this awful kludge should just die with it. -- Michal Hocko SUSE Labs
