Re: [PATCH] mm/migrate: initialize pud_entry in migrate_vma()

2019-08-26 Thread Vlastimil Babka
On 7/20/19 1:32 AM, Ralph Campbell wrote: > When CONFIG_MIGRATE_VMA_HELPER is enabled, migrate_vma() calls > migrate_vma_collect() which initializes a struct mm_walk but > didn't initialize mm_walk.pud_entry. (Found by code inspection) > Use a C structure initialization to make sure it is set to NU
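
The initialization style this patch description refers to can be sketched in plain C. This is only an illustration: `struct walk_ops`, `collect_pmd()` and `make_walk()` are hypothetical stand-ins, not the kernel's `struct mm_walk` or the code in the patch. With a designated initializer, every member that is not named is zero-initialized, so a callback pointer that is left out ends up NULL.

```c
#include <stddef.h>

/* Hypothetical stand-in for a page-walk descriptor; not the kernel's struct mm_walk. */
struct walk_ops {
    int (*pud_entry)(void *arg);   /* deliberately left out of the initializer below */
    int (*pmd_entry)(void *arg);
    void *private_data;
};

/* Hypothetical callback used only to populate one member. */
static int collect_pmd(void *arg)
{
    (void)arg;
    return 0;
}

/*
 * Designated initializer: members that are not named (here pud_entry)
 * are zero-initialized, so the pointer is NULL rather than stack garbage.
 */
static struct walk_ops make_walk(void *priv)
{
    struct walk_ops walk = {
        .pmd_entry = collect_pmd,
        .private_data = priv,
    };
    return walk;
}
```

Assigning the members one by one to an uninitialized local would leave `pud_entry` indeterminate, which is the kind of omission the description says was found by code inspection.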

[PATCH v2 1/2] mm, sl[ou]b: improve memory accounting

2019-08-26 Thread Vlastimil Babka
. (SLAB doesn't actually use page allocator directly, so no change there). Ideally SLOB and SLUB would be handled in separate patches, but due to the shared kmalloc_order() function and different kfree() implementations, it's easier to patch both at once to prevent inconsistencies. Signe

[PATCH v2 2/2] mm, sl[aou]b: guarantee natural alignment for kmalloc(power-of-two)

2019-08-26 Thread Vlastimil Babka
org/linux-btrfs/c3157c8e8e0e7588312b40c853f65c02fe6c957a.1566399731.git.christophe.le...@c-s.fr/ [2] https://lore.kernel.org/linux-fsdevel/20190225040904.5557-1-ming@redhat.com/ [3] https://lwn.net/Articles/787740/ Signed-off-by: Vlastimil Babka --- Documentation/core-api/memory-allocation.rst | 4 ++ inclu
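
As a rough userspace sketch of what "natural alignment" in the subject means: a block whose size is a power of two is naturally aligned when its address is a multiple of that size. The helper names below are assumptions made for illustration and are not taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when n has exactly one bit set. */
static bool is_power_of_two(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical check: a power-of-two sized block is naturally aligned
 * when the low log2(size) bits of its address are all zero.
 */
static bool is_naturally_aligned(const void *ptr, size_t size)
{
    return is_power_of_two(size) && ((uintptr_t)ptr & (size - 1)) == 0;
}
```

A caller relying on such a guarantee could assert `is_naturally_aligned(buf, 512)` on a 512-byte buffer before handing it to code that assumes sector alignment.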

Re: [PATCH 2/3] xfs: add kmem_alloc_io()

2019-08-22 Thread Vlastimil Babka
On 8/22/19 3:17 PM, Dave Chinner wrote: > On Thu, Aug 22, 2019 at 02:19:04PM +0200, Vlastimil Babka wrote: >> On 8/22/19 2:07 PM, Dave Chinner wrote: >> > On Thu, Aug 22, 2019 at 01:14:30PM +0200, Vlastimil Babka wrote: >> > >> > No, the problem is this

Re: [v2 PATCH -mm] mm: account deferred split THPs into MemAvailable

2019-08-22 Thread Vlastimil Babka
ted by before calling madvise(MADV_DONTNEED): >> MemAvailable: 43531960 kB >> AnonPages: 1096660 kB >> KReclaimable: 26156 kB >> AnonHugePages: 1056768 kB >> >> After calling madvise(MADV_DONTNEED): >> MemAvailable: 44411164 kB >> AnonPa

Re: [PATCH 2/3] xfs: add kmem_alloc_io()

2019-08-22 Thread Vlastimil Babka
On 8/22/19 2:07 PM, Dave Chinner wrote: > On Thu, Aug 22, 2019 at 01:14:30PM +0200, Vlastimil Babka wrote: > > No, the problem is this (using kmalloc as a general term for > allocation, whether it be kmalloc, kmem_cache_alloc, alloc_page, etc) > >some random kernel

Re: [PATCH 2/3] xfs: add kmem_alloc_io()

2019-08-22 Thread Vlastimil Babka
On 8/22/19 12:14 PM, Dave Chinner wrote: > On Thu, Aug 22, 2019 at 11:10:57AM +0200, Peter Zijlstra wrote: >> >> Ah, current_gfp_context() already seems to transfer PF_MEMALLOC_NOFS >> into the GFP flags. >> >> So are we sure it is broken and needs mending? > > Well, that's what we are trying to

[PATCH v2 0/4] debug_pagealloc improvements through page_owner

2019-08-20 Thread Vlastimil Babka
Patch 4. SLUB debug tracking additionally stores cpu, pid and timestamp. This could be added later, if deemed useful enough to justify the additional page_ext structure size. Vlastimil Babka (4): mm, page_owner: handle THP splits correctly mm, page_owner: record page owner for each subpage

[PATCH v2 3/4] mm, page_owner: keep owner info when freeing the page

2019-08-20 Thread Vlastimil Babka
pages are irrelevant for the memory statistics or leak detection that's the typical use case of the file, anyway. Signed-off-by: Vlastimil Babka --- include/linux/page_ext.h | 1 + mm/page_owner.c | 34 -- 2 files changed, 25 insertions(+), 10 deletions(-

[PATCH v2 4/4] mm, page_owner, debug_pagealloc: save and dump freeing stack trace

2019-08-20 Thread Vlastimil Babka
__x64_sys_clone+0x75/0x80 do_syscall_64+0x6e/0x1e0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f10af854a10 ... Signed-off-by: Vlastimil Babka --- .../admin-guide/kernel-parameters.txt | 2 + mm/Kconfig.debug | 4 +- mm/page_owner.c

[PATCH v2 2/4] mm, page_owner: record page owner for each subpage

2019-08-20 Thread Vlastimil Babka
rder allocations are compound pages with true head and tail pages). When reading the page_owner debugfs file, keep skipping the "tail" pages so that stats gathered by existing scripts don't get inflated. Signed-off-by: Vlastimil Babka --- mm/page_owner.c | 40 +

[PATCH v2 1/4] mm, page_owner: handle THP splits correctly

2019-08-20 Thread Vlastimil Babka
. This patch fixes that by adding the split_page_owner() call into __split_huge_page(). Fixes: a9627bc5e34e ("mm/page_owner: introduce split_page_owner and replace manual handling") Reported-by: Kirill A. Shutemov Cc: sta...@vger.kernel.org Signed-off-by: Vlastimil Babka --- mm/huge_me

Re: [PATCH] btrfs: fix allocation of bitmap pages.

2019-08-20 Thread Vlastimil Babka
On 8/20/19 4:30 AM, Christoph Hellwig wrote: > On Mon, Aug 19, 2019 at 07:46:00PM +0200, David Sterba wrote: >> Another thing that is lost is the slub debugging support for all >> architectures, because get_zeroed_pages lacking the red zones and sanity >> checks. >> >> I find working with raw page

Re: [RFC] mm: Proactive compaction

2019-08-20 Thread Vlastimil Babka
+CC Khalid Aziz who proposed a different approach: https://lore.kernel.org/linux-mm/20190813014012.30232-1-khalid.a...@oracle.com/T/#u On 8/16/19 11:43 PM, Nitin Gupta wrote: > For some applications we need to allocate almost all memory as > hugepages. However, on a running system, higher order al

Re: [PATCH] mm/page_alloc: cleanup __alloc_pages_direct_compact()

2019-08-19 Thread Vlastimil Babka
On 8/17/19 12:51 PM, Pengfei Li wrote: > This patch cleans up the if(page). > > No functional change. > > Signed-off-by: Pengfei Li I don't see much benefit here. The indentation wasn't that bad that it had to be reduced using goto. But the patch is not incorrect so I'm not NACKing. > --- > m

Re: [PATCH 1/3] mm, page_owner: record page owner for each subpage

2019-08-19 Thread Vlastimil Babka
On 8/19/19 1:57 PM, Kirill A. Shutemov wrote: > On Mon, Aug 19, 2019 at 11:55:51AM +, Kirill A. Shutemov wrote: >>> @@ -2533,6 +2534,8 @@ static void __split_huge_page(struct page *page, >>> struct list_head *list, >>> >>> remap_page(head); >>> >>> + split_page_owner(head, HPAGE_PMD_

Re: [PATCH 1/3] mm, page_owner: record page owner for each subpage

2019-08-19 Thread Vlastimil Babka
On 8/16/19 4:04 PM, Kirill A. Shutemov wrote: > On Fri, Aug 16, 2019 at 12:13:59PM +0200, Vlastimil Babka wrote: >> Currently, page owner info is only recorded for the first page of a >> high-order >> allocation, and copied to tail pages in the event of a split page. Wit

[PATCH 2/3] mm, page_owner: keep owner info when freeing the page

2019-08-16 Thread Vlastimil Babka
pages are irrelevant for the memory statistics or leak detection that's the typical use case of the file, anyway. Signed-off-by: Vlastimil Babka --- include/linux/page_ext.h | 1 + mm/page_owner.c | 34 -- 2 files changed, 25 insertions(+), 10 deletions(-

[PATCH 1/3] mm, page_owner: record page owner for each subpage

2019-08-16 Thread Vlastimil Babka
rder allocations are compound pages with true head and tail pages). When reading the page_owner debugfs file, keep skipping the "tail" pages so that stats gathered by existing scripts don't get inflated. Signed-off-by: Vlastimil Babka --- mm/page_owner.c | 40 +

[PATCH 0/3] debug_pagealloc improvements through page_owner

2019-08-16 Thread Vlastimil Babka
d and timestamp. This could be added later, if deemed useful enough to justify the additional page_ext structure size. Vlastimil Babka (3): mm, page_owner: record page owner for each subpage mm, page_owner: keep owner info when freeing the page mm, page_owner, debug_pagealloc: save and dump fr

[PATCH 3/3] mm, page_owner, debug_pagealloc: save and dump freeing stack trace

2019-08-16 Thread Vlastimil Babka
__x64_sys_clone+0x75/0x80 do_syscall_64+0x6e/0x1e0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f10af854a10 ... Signed-off-by: Vlastimil Babka --- .../admin-guide/kernel-parameters.txt | 2 + mm/Kconfig.debug | 4 +- mm/page_owner.c

Re: [RFC PATCH 2/2] mm/gup: introduce vaddr_pin_pages_remote()

2019-08-16 Thread Vlastimil Babka
On 8/15/19 3:35 PM, Jan Kara wrote: >> >> So when the GUP user uses MMU notifiers to stop writing to pages whenever >> they are writeprotected with page_mkclean(), they don't really need page >> pin - their access is then fully equivalent to any other mmap userspace >> access and filesystem knows

Re: [RESEND PATCH 1/2 -mm] mm: account lazy free pages separately

2019-08-14 Thread Vlastimil Babka
On 8/12/19 7:00 PM, Yang Shi wrote: >> I can see that memcg rss size was the primary problem David was looking >> at. But MemAvailable will not help with that, right? Moreover is > > Yes, but David actually would like to have memcg MemAvailable (the > accounter like the global one), which should

Re: [RESEND PATCH 1/2 -mm] mm: account lazy free pages separately

2019-08-14 Thread Vlastimil Babka
On 8/9/19 8:26 PM, Yang Shi wrote: > Here the new counter is introduced for patch 2/2 to account deferred > split THPs into available memory since NR_ANON_THPS may contain > non-deferred split THPs. > > I could use an internal counter for deferred split THPs, but if it is > accounted by mod_nod

Re: [PATCH 3/3] mm/mmap.c: extract __vma_unlink_list as counter part for __vma_link_list

2019-08-14 Thread Vlastimil Babka
On 8/14/19 8:57 AM, Wei Yang wrote: > On Tue, Aug 13, 2019 at 10:16:11PM -0700, Christoph Hellwig wrote: >>Btw, is there any good reason we don't use a list_head for vma linkage? > > Not sure, maybe there is some historical reason? Seems it was single-linked until 2010 commit 297c5eee3724 ("mm: m

Re: [patch] mm, page_alloc: move_freepages should not examine struct page of reserved memory

2019-08-14 Thread Vlastimil Babka
On 8/13/19 7:22 PM, David Rientjes wrote: > On Tue, 13 Aug 2019, Vlastimil Babka wrote: > >> > After commit 907ec5fca3dc ("mm: zero remaining unavailable struct pages"), >> > struct page of reserved memory is zeroed. This causes page->flags to be 0 >>

Re: Let's talk about the elephant in the room - the Linux kernel's inability to gracefully handle low memory pressure

2019-08-13 Thread Vlastimil Babka
On 8/9/19 7:31 PM, Johannes Weiner wrote: >> It made a difference, but not enough, it seems. Before the patch I could >> observe "io:full avg10" around 75% and "memory:full avg10" around 20%, >> after the patch, "memory:full avg10" went to around 45%, while io stayed >> the same (BTW should the ref

Re: [patch] mm, page_alloc: move_freepages should not examine struct page of reserved memory

2019-08-13 Thread Vlastimil Babka
On 8/13/19 5:37 AM, David Rientjes wrote: > After commit 907ec5fca3dc ("mm: zero remaining unavailable struct pages"), > struct page of reserved memory is zeroed. This causes page->flags to be 0 > and fixes issues related to reading /proc/kpageflags, for example, of > reserved memory. > > The VM_

Re: [RFC PATCH v2] mm: slub: print kernel addresses in slub debug messages

2019-08-12 Thread Vlastimil Babka
On 8/9/19 4:46 AM, Matthew Wilcox wrote: > On Fri, Aug 09, 2019 at 09:08:37AM +0800, miles.c...@mediatek.com wrote: >> Possible approaches are: >> 1. stop printing kernel addresses >> 2. print with %pK, >> 3. print with %px. > > No. The point of obscuring kernel addresses is that if the attacker

Re: [PATCH] hugetlbfs: fix hugetlb page migration/fault race causing SIGBUS

2019-08-12 Thread Vlastimil Babka
On 8/12/19 10:45 AM, Michal Hocko wrote: > On Sun 11-08-19 19:46:14, Sasha Levin wrote: >> On Fri, Aug 09, 2019 at 03:17:18PM -0700, Andrew Morton wrote: >>> On Fri, 9 Aug 2019 08:46:33 +0200 Michal Hocko wrote: >>> >>> It should work if we ask stable trees maintainers not to backport >>> such pat

[tip:x86/mm] x86/kconfig: Remove X86_DIRECT_GBPAGES dependency on !DEBUG_PAGEALLOC

2019-08-12 Thread tip-bot for Vlastimil Babka
Commit-ID: 2e1da13fba4cb529c2c8c1d9f657690d1e853d7d Gitweb: https://git.kernel.org/tip/2e1da13fba4cb529c2c8c1d9f657690d1e853d7d Author: Vlastimil Babka AuthorDate: Wed, 7 Aug 2019 15:02:58 +0200 Committer: Thomas Gleixner CommitDate: Mon, 12 Aug 2019 14:52:30 +0200 x86/kconfig: Remove

Re: Let's talk about the elephant in the room - the Linux kernel's inability to gracefully handle low memory pressure

2019-08-09 Thread Vlastimil Babka
On 8/8/19 7:27 PM, Johannes Weiner wrote: > On Thu, Aug 08, 2019 at 04:47:18PM +0200, Vlastimil Babka wrote: >> On 8/7/19 10:51 PM, Johannes Weiner wrote: >>> From 9efda85451062dea4ea287a886e515efefeb1545 Mon Sep 17 00:00:00 2001 >>> From: Johannes Weiner >>>

Re: [PATCH] mm, vmscan: Do not special-case slab reclaim when watermarks are boosted

2019-08-09 Thread Vlastimil Babka
ctable and can lead to abnormal results for normal workloads. This > patch restores the expected behaviour that slab and page cache is > balanced consistently for a workload with a steady allocation ratio of > slab/pagecache pages. It also means that if there are workloads that > favour the preservation of slab over pagecache that it can be tuned via > vm.vfs_cache_pressure whereas the vanilla kernel effectively ignores > the parameter when boosting is active. > > Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external > fragmentation event occurs") > Signed-off-by: Mel Gorman > Reviewed-by: Dave Chinner > Cc: sta...@vger.kernel.org # v5.0+ Acked-by: Vlastimil Babka

Re: [PATCH 1/3] mm/mlock.c: convert put_page() to put_user_page*()

2019-08-09 Thread Vlastimil Babka
On 8/9/19 12:59 AM, John Hubbard wrote: >>> That's true. However, I'm not sure munlocking is where the >>> put_user_page() machinery is intended to be used anyway? These are >>> short-term pins for struct page manipulation, not e.g. dirtying of page >>> contents. Reading commit fc1d8e7cca2d I don't

Re: [PATCH v2] mm/mmap.c: refine find_vma_prev with rb_last

2019-08-09 Thread Vlastimil Babka
This patch refines find_vma_prev with rb_last to make it a little nicer > to read. > > Signed-off-by: Wei Yang Acked-by: Vlastimil Babka Nit below: > --- > v2: leverage rb_last > --- > mm/mmap.c | 9 +++-- > 1 file changed, 3 insertions(+), 6 deletions(-) > > diff --

Re: Let's talk about the elephant in the room - the Linux kernel's inability to gracefully handle low memory pressure

2019-08-08 Thread Vlastimil Babka
On 8/7/19 10:51 PM, Johannes Weiner wrote: > From 9efda85451062dea4ea287a886e515efefeb1545 Mon Sep 17 00:00:00 2001 > From: Johannes Weiner > Date: Mon, 5 Aug 2019 13:15:16 -0400 > Subject: [PATCH] psi: trigger the OOM killer on severe thrashing Thanks a lot, perhaps finally we are going to eat t

Re: [PATCH 1/3] mm/mlock.c: convert put_page() to put_user_page*()

2019-08-08 Thread Vlastimil Babka
On 8/8/19 8:21 AM, Michal Hocko wrote: > On Wed 07-08-19 16:32:08, John Hubbard wrote: >> On 8/7/19 4:01 AM, Michal Hocko wrote: >>> On Mon 05-08-19 15:20:17, john.hubb...@gmail.com wrote: From: John Hubbard For pages that were retained via get_user_pages*(), release those pages >>>

Re: [PATCH] mm/mmap.c: refine data locality of find_vma_prev

2019-08-08 Thread Vlastimil Babka
On 8/8/19 5:26 AM, Wei Yang wrote: > > @@ -2270,12 +2270,9 @@ find_vma_prev(struct mm_struct *mm, unsigned long addr, > if (vma) { > *pprev = vma->vm_prev; > } else { > - struct rb_node *rb_node = mm->mm_rb.rb_node; > - *pprev = NULL; > -

[PATCH] x86/kconfig: remove X86_DIRECT_GBPAGES dependency on !DEBUG_PAGEALLOC

2019-08-07 Thread Vlastimil Babka
but not enabled. Signed-off-by: Vlastimil Babka --- arch/x86/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 222855cc0158..58eae28c3dd6 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1503,7 +1503,7 @@ config X86_5LEVEL

Re: [PATCH] mm/compaction: remove unnecessary zone parameter in isolate_migratepages()

2019-08-07 Thread Vlastimil Babka
On 8/6/19 5:16 PM, Pengfei Li wrote: > Like commit 40cacbcb3240 ("mm, compaction: remove unnecessary zone > parameter in some instances"), remove unnecessary zone parameter. > > No functional change. > > Signed-off-by: Pengfei Li Acked-by: Vlastimil Babka &

Re: oom-killer

2019-08-06 Thread Vlastimil Babka
On 8/5/19 5:34 PM, Pankaj Suryawanshi wrote: > On Mon, Aug 5, 2019 at 5:35 PM Michal Hocko wrote: >> >> On Mon 05-08-19 13:56:20, Vlastimil Babka wrote: >> > On 8/5/19 1:24 PM, Michal Hocko wrote: >> > >> [ 727.954355] CPU: 0 PID: 56 Comm: kworker/u8:2 T

Re: Let's talk about the elephant in the room - the Linux kernel's inability to gracefully handle low memory pressure

2019-08-06 Thread Vlastimil Babka
On 8/6/19 3:08 AM, Suren Baghdasaryan wrote: >> @@ -1280,3 +1285,50 @@ static int __init psi_proc_init(void) >> return 0; >> } >> module_init(psi_proc_init); >> + >> +#define OOM_PRESSURE_LEVEL 80 >> +#define OOM_PRESSURE_PERIOD (10 * NSEC_PER_SEC) > > 80% of the last 10 seconds s

Re: [PATCH] mm/mmap.c: refine data locality of find_vma_prev

2019-08-06 Thread Vlastimil Babka
On 8/6/19 10:11 AM, Wei Yang wrote: > When addr is out of the range of the whole rb_tree, pprev will points to > the biggest node. find_vma_prev gets is by going through the right most s/biggest/last/ ? or right-most? > node of the tree. > > Since only the last node is the one it is looking for,

Re: [PATCH] mm/mempolicy.c: Remove unnecessary nodemask check in kernel_migrate_pages()

2019-08-06 Thread Vlastimil Babka
to evolution of the code, where initially it was added to prevent calling the syscall with bogus nodes, but now that's achieved by cpuset_mems_allowed(). > Cc: Andrea Arcangeli > Cc: Andrew Morton > Cc: Dan Williams > Cc: Michal Hocko > Cc: Oscar Salvador > Cc:

Re: [PATCH v2 4/4] hugetlbfs: don't retry when pool page allocations start to fail

2019-08-06 Thread Vlastimil Babka
do not use the > aggressive retry algorithm on successive attempts. The allocation > will still succeed if there is memory available, but it will not try > as hard to free up memory. > > Signed-off-by: Mike Kravetz Acked-by: Vlastimil Babka Thanks.

Re: [PATCH V2] fork: Improve error message for corrupted page tables

2019-08-06 Thread Vlastimil Babka
ge print function (from printk(KERN_ALERT, ..) to pr_alert()) so > that it matches the other print statement. > > Cc: Ingo Molnar > Cc: Vlastimil Babka > Cc: Peter Zijlstra > Cc: Andrew Morton > Cc: Anshuman Khandual > Acked-by: Dave Hansen > Suggested-by: Dave Han

Re: [PATCH 1/3] mm, reclaim: make should_continue_reclaim perform dryrun detection

2019-08-05 Thread Vlastimil Babka
On 8/5/19 6:58 PM, Mike Kravetz wrote: >> Signed-off-by: Vlastimil Babka > > Acked-by: Mike Kravetz > > Would you like me to add this to the series, or do you want to send later? Please add, thanks!

Re: [PATCH] fork: Improve error message for corrupted page tables

2019-08-05 Thread Vlastimil Babka
On 8/2/19 8:46 AM, Prakhya, Sai Praneeth wrote: > +static const char * const resident_page_types[NR_MM_COUNTERS] = { > + "MM_FILEPAGES", > + "MM_ANONPAGES", > + "MM_SWAPENTS", > + "MM_SHMEMPAGES", > +}; But please let's not put this in a header file. We're asking

Re: [PATCH 1/3] mm, reclaim: make should_continue_reclaim perform dryrun detection

2019-08-05 Thread Vlastimil Babka
On 8/5/19 11:27 AM, Hillf Danton wrote: BTW, can you please do something about your mail client's lack of In-Reply-To/References headers, which breaks threadings? See Documentation/process/email-clients.rst: Email clients should generate and maintain References: or In-Reply-To: headers so that ma

Re: Let's talk about the elephant in the room - the Linux kernel's inability to gracefully handle low memory pressure

2019-08-05 Thread Vlastimil Babka
On 8/4/19 11:23 AM, Artem S. Tashkinov wrote: > Hello, > > There's this bug which has been bugging many people for many years > already and which is reproducible in less than a few minutes under the > latest and greatest kernel, 5.2.6. All the kernel parameters are set to > defaults. > > Steps to

Re: oom-killer

2019-08-05 Thread Vlastimil Babka
On 8/5/19 1:24 PM, Michal Hocko wrote: >> [ 727.954355] CPU: 0 PID: 56 Comm: kworker/u8:2 Tainted: P O >> 4.14.65 #606 > [...] >> [ 728.029390] [] (oom_kill_process) from [] >> (out_of_memory+0x140/0x368) >> [ 728.037569] r10:0001 r9:c12169bc r8:0041 r7:c121e680 r6:c1216588

Re: [PATCH 1/3] mm, reclaim: make should_continue_reclaim perform dryrun detection

2019-08-05 Thread Vlastimil Babka
On 8/5/19 10:42 AM, Vlastimil Babka wrote: > On 8/3/19 12:39 AM, Mike Kravetz wrote: >> From: Hillf Danton >> >> Address the issue of should_continue_reclaim continuing true too often >> for __GFP_RETRY_MAYFAIL attempts when !nr_reclaimed and nr_scanned. >> This

Re: [PATCH 3/3] hugetlbfs: don't retry when pool page allocations start to fail

2019-08-05 Thread Vlastimil Babka
On 8/3/19 12:39 AM, Mike Kravetz wrote: > When allocating hugetlbfs pool pages via /proc/sys/vm/nr_hugepages, > the pages will be interleaved between all nodes of the system. If > nodes are not equal, it is quite possible for one node to fill up > before the others. When this happens, the code st

Re: [PATCH 2/3] mm, compaction: raise compaction priority after it withdrawns

2019-08-05 Thread Vlastimil Babka
On 8/3/19 12:39 AM, Mike Kravetz wrote: > From: Vlastimil Babka > > Mike Kravetz reports that "hugetlb allocations could stall for minutes or > hours > when should_compact_retry() would return true more often than it should. > Specifically, this was in the case w

Re: [PATCH 1/3] mm, reclaim: make should_continue_reclaim perform dryrun detection

2019-08-05 Thread Vlastimil Babka
ty of plenty of inactive pages. IOW with dryrun detected, we are > sure we have reclaimed as many pages as we could. > > Cc: Mike Kravetz > Cc: Mel Gorman > Cc: Michal Hocko > Cc: Vlastimil Babka > Cc: Johannes Weiner > Signed-off-by: Hillf Danton > Tested-by:

[PATCH STABLE 4.9] x86, mm, gup: prevent get_page() race with munmap in paravirt guest

2019-08-02 Thread Vlastimil Babka
52fc2b50de ("x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)"). That commit with follow-ups should also be backported for full safety, although our reproducer didn't hit a problem without that backport. Reproduced-by: Oscar Salvador Signed-off-by: Vlastimil

Re: [RFC PATCH 2/3] mm, compaction: use MIN_COMPACT_COSTLY_PRIORITY everywhere for costly orders

2019-08-02 Thread Vlastimil Babka
On 8/1/19 10:33 PM, Mike Kravetz wrote: > On 8/1/19 6:01 AM, Vlastimil Babka wrote: >> Could you try testing the patch below instead? It should hopefully >> eliminate the stalls. If it makes hugepage allocation give up too early, >> we'll know we have to involve __GFP

Re: [RFC PATCH 2/3] mm, compaction: use MIN_COMPACT_COSTLY_PRIORITY everywhere for costly orders

2019-08-02 Thread Vlastimil Babka
On 8/1/19 10:33 PM, Mike Kravetz wrote: > On 8/1/19 6:01 AM, Vlastimil Babka wrote: >> Could you try testing the patch below instead? It should hopefully >> eliminate the stalls. If it makes hugepage allocation give up too early, >> we'll know we have to involve __GFP_RET

Re: [RFC PATCH 2/3] mm, compaction: use MIN_COMPACT_COSTLY_PRIORITY everywhere for costly orders

2019-08-01 Thread Vlastimil Babka
On 7/31/19 10:30 PM, Mike Kravetz wrote: > On 7/31/19 5:06 AM, Vlastimil Babka wrote: >> On 7/24/19 7:50 PM, Mike Kravetz wrote: >>> For PAGE_ALLOC_COSTLY_ORDER allocations, >>> MIN_COMPACT_COSTLY_PRIORITY is minimum (highest priority). Other >>> plac

Re: [RFC PATCH 1/3] mm, reclaim: make should_continue_reclaim perform dryrun detection

2019-08-01 Thread Vlastimil Babka
On 7/31/19 11:11 PM, Mike Kravetz wrote: > On 7/31/19 4:08 AM, Vlastimil Babka wrote: >> >> I agree this is an improvement overall, but perhaps the patch does too >> many things at once. The reshuffle is one thing and makes sense. The >> change of the last return

Re: [PATCH 4.9 57/83] mm: prevent get_user_pages() from overflowing page refcount

2019-07-31 Thread Vlastimil Babka
On 6/9/19 6:42 PM, Greg Kroah-Hartman wrote: > From: Linus Torvalds > > commit 8fde12ca79aff9b5ba951fce1a2641901b8d8e64 upstream. > > If the page refcount wraps around past zero, it will be freed while > there are still four billion references to it. One of the possible > avenues for an attacke

Re: [RFC PATCH 3/3] hugetlbfs: don't retry when pool page allocations start to fail

2019-07-31 Thread Vlastimil Babka
On 7/25/19 7:15 PM, Mike Kravetz wrote: > On 7/25/19 1:13 AM, Mel Gorman wrote: >> On Wed, Jul 24, 2019 at 10:50:14AM -0700, Mike Kravetz wrote: >>> When allocating hugetlbfs pool pages via /proc/sys/vm/nr_hugepages, >>> the pages will be interleaved between all nodes of the system. If >>> nodes a

Re: [RFC PATCH 2/3] mm, compaction: use MIN_COMPACT_COSTLY_PRIORITY everywhere for costly orders

2019-07-31 Thread Vlastimil Babka
On 7/24/19 7:50 PM, Mike Kravetz wrote: > For PAGE_ALLOC_COSTLY_ORDER allocations, MIN_COMPACT_COSTLY_PRIORITY is > minimum (highest priority). Other places in the compaction code key off > of MIN_COMPACT_PRIORITY. Costly order allocations will never get to > MIN_COMPACT_PRIORITY. Therefore, som

Re: [RFC PATCH 1/3] mm, reclaim: make should_continue_reclaim perform dryrun detection

2019-07-31 Thread Vlastimil Babka
ming pages if we know that there are not > enough inactive lru pages left to satisfy the costly allocation. > > We can give up reclaiming pages too if we see dryrun occur, with the > certainty of plenty of inactive pages. IOW with dryrun detected, we are > sure we have reclaimed as ma

Re: [PATCH] mm: compaction: Avoid 100% CPU usage during compaction when a task is killed

2019-07-25 Thread Vlastimil Babka
t; > I haven't included a Reported-and-tested-by as the reporters real name > is unknown but this was caught and repaired due to their testing and > tracing. If they want a tag added then hopefully they'll say so before > this gets merged. > > Bugzilla: htt

Re: [v4 PATCH 2/2] mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind

2019-07-24 Thread Vlastimil Babka
On 7/23/19 7:35 AM, Yang Shi wrote: > > > On 7/22/19 6:02 PM, Andrew Morton wrote: >> On Mon, 22 Jul 2019 09:25:09 +0200 Vlastimil Babka wrote: >> >>>> since there may be pages off LRU temporarily. We should migrate other >>>> pages if MPOL_MF_M

Re: [v4 PATCH 2/2] mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind

2019-07-22 Thread Vlastimil Babka
age_add() to check if the page is movable or not, if it > is unmovable, just return -EIO. But do not abort pte walk immediately, > since there may be pages off LRU temporarily. We should migrate other > pages if MPOL_MF_MOVE* is specified. Set has_unmovable flag if some > pages could not be moved, then return -EIO for mbind() eventually. > > With this change the above test would return -EIO as expected. > > Cc: Vlastimil Babka > Cc: Michal Hocko > Cc: Mel Gorman > Signed-off-by: Yang Shi Reviewed-by: Vlastimil Babka Thanks!

Re: [v3 PATCH 2/2] mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind

2019-07-19 Thread Vlastimil Babka
On 7/18/19 7:17 PM, Yang Shi wrote: > When running syzkaller internally, we ran into the below bug on 4.9.x > kernel: > > kernel BUG at mm/huge_memory.c:2124! > invalid opcode: [#1] SMP KASAN > Dumping ftrace buffer: >(ftrace buffer empty) > Modules linked in: > CPU: 0 PID: 1518 Comm: syz

Re: [v3 PATCH 1/2] mm: mempolicy: make the behavior consistent when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified

2019-07-19 Thread Vlastimil Babka
whatever steps it deems necessary - attempt rollback, > determine which exact page(s) are violating the policy, etc. > > Make queue_pages_range() return 1 to indicate there are unmovable pages > or vma is not migratable. > > The #2 is not handled correctly in the current kernel, the fol

Re: incoming

2019-07-19 Thread Vlastimil Babka
On 7/19/19 12:56 AM, Andrew Morton wrote: > > The rest of MM and a kernel-wide procfs cleanup. > > > > Summary of the more significant patches: Thanks for that! Perhaps now it would be nice if this went also to linux-mm and lkml, as mm-commits is sort of hidden. Vlastimil

Re: [v3 PATCH 2/2] mm: thp: fix false negative of shmem vma's THP eligibility

2019-07-18 Thread Vlastimil Babka
On 7/18/19 11:44 PM, Andrew Morton wrote: > On Wed, 19 Jun 2019 09:28:42 -0700 Yang Shi > wrote: > >>> Sorry for replying rather late, and not in the v2 thread, but unlike >>> Hugh I'm not convinced that we should include vma size/alignment in the >>> test for reporting THPeligible, which was su

Re: [v2 PATCH 2/2] mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind

2019-07-17 Thread Vlastimil Babka
On 7/17/19 8:23 PM, Yang Shi wrote: > > > On 7/16/19 10:28 AM, Yang Shi wrote: >> >> >> On 7/16/19 5:07 AM, Vlastimil Babka wrote: >>> On 6/22/19 2:20 AM, Yang Shi wrote: >>>> @@ -969,10 +975,21 @@ static long do_get_mempolicy(int *policy,

Re: incoming

2019-07-17 Thread Vlastimil Babka
On 7/17/19 6:13 PM, Linus Torvalds wrote: > On Wed, Jul 17, 2019 at 1:47 AM Vlastimil Babka wrote: >> >> So I've tried now to provide an example what I had in mind, below. > > I'll take it as a trial. I added one-line notes about coda and the > PTRACE_GET_S

Re: [v2 PATCH 1/2] mm: mempolicy: make the behavior consistent when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified

2019-07-17 Thread Vlastimil Babka
On 7/16/19 7:18 PM, Yang Shi wrote: >> I think after your patch, you miss putback_movable_pages() in cases >> where some were queued, and later the walk returned -EIO. The previous >> code doesn't miss it, but it's also not obvious due to the multiple if >> (!err) checks. I would rewrite it some th

Re: incoming

2019-07-17 Thread Vlastimil Babka
On 7/17/19 1:25 AM, Andrew Morton wrote: > > Most of the rest of MM and just about all of the rest of everything > else. Hi, as I've mentioned at LSF/MM [1], I think it would be nice if mm pull requests had summaries similar to other subsystems. I see they are now more structured (thanks!), but

Re: [v2 PATCH 2/2] mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind

2019-07-16 Thread Vlastimil Babka
On 6/22/19 2:20 AM, Yang Shi wrote: > @@ -969,10 +975,21 @@ static long do_get_mempolicy(int *policy, nodemask_t > *nmask, > /* > * page migration, thp tail pages can be passed. > */ > -static void migrate_page_add(struct page *page, struct list_head *pagelist, > +static int migrate_page_add(

Re: [v2 PATCH 1/2] mm: mempolicy: make the behavior consistent when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified

2019-07-16 Thread Vlastimil Babka
On 7/16/19 10:12 AM, Vlastimil Babka wrote: >> --- a/mm/mempolicy.c >> +++ b/mm/mempolicy.c >> @@ -429,11 +429,14 @@ static inline bool queue_pages_required(struct page >> *page, >> } >> >> /* >> - * queue_pages_pmd() has three possible

Re: [v2 PATCH 1/2] mm: mempolicy: make the behavior consistent when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified

2019-07-16 Thread Vlastimil Babka
whatever steps it deems necessary - attempt rollback, > determine which exact page(s) are violating the policy, etc. > > Make queue_pages_range() return 1 to indicate there are unmovable pages > or vma is not migratable. > > The #2 is not handled correctly in the current kernel, the fol

Re: [PATCH] mm/mempolicy: Fix an incorrect rebind node in mpol_rebind_nodemask

2019-06-27 Thread Vlastimil Babka
hanges with expected wrt actual results would be nice, but I think the above should be fine by itself) Reviewed-by: Vlastimil Babka > --- > mm/mempolicy.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/mempolicy.c b/mm/mempolicy.c > index e3ab1d9..a60

Re: [PATCH] mm/mempolicy: Fix an incorrect rebind node in mpol_rebind_nodemask

2019-06-27 Thread Vlastimil Babka
On 6/27/19 5:57 AM, Andrew Morton wrote: > On Mon, 27 May 2019 21:58:17 +0800 zhong jiang wrote: > >> On 2019/5/27 20:23, Vlastimil Babka wrote: >>> On 5/25/19 8:28 PM, Andrew Morton wrote: >>>> (Cc Vlastimil) >>> Oh dear, 2 years and I forgot all the

Re: [PATCH] mm: fix setting the high and low watermarks

2019-06-23 Thread Vlastimil Babka
On 6/21/19 4:07 PM, Bharath Vedartham wrote: > Do you think this could cause a race condition between > __setup_per_zone_wmarks and pgdat_watermark_boosted which checks whether > the watermark_boost of each zone is non-zero? pgdat_watermark_boosted is > not called with a zone lock. > Here is a prob

Re: [PATCH] mm: fix setting the high and low watermarks

2019-06-21 Thread Vlastimil Babka
ks-seem-higher-than-predicted-by-documentation-sysctl-vm/525687 > > Signed-off-by: Alan Jenkins > Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external > fragmentation event occurs") > Cc: sta...@vger.kernel.org Nice catch, tha

Re: [PATCH] mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind

2019-06-21 Thread Vlastimil Babka
On 6/20/19 6:08 PM, Yang Shi wrote: > > > On 6/20/19 12:18 AM, Vlastimil Babka wrote: >> On 6/19/19 8:19 PM, Yang Shi wrote: >>>>>> This is getting even more muddy TBH. Is there any reason that we >>>>>> have to >>>>>> handle

Re: [PATCH RFC] proc/meminfo: add NetBuffers counter for socket buffers

2019-06-20 Thread Vlastimil Babka
On 5/15/19 1:55 PM, Konstantin Khlebnikov wrote: > Socket buffers always were dark-matter that lives by its own rules. Is the information even exported somewhere e.g. in sysfs or via netlink yet? > This patch adds line NetBuffers that exposes most common kinds of them. Did you encounter a situat

Re: [PATCH] mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind

2019-06-20 Thread Vlastimil Babka
On 6/19/19 8:19 PM, Yang Shi wrote: This is getting even more muddy TBH. Is there any reason that we have to handle this problem during the isolation phase rather the migration? >>> I think it was already said that if pages can't be isolated, then >>> migration phase won't process t

Re: [v3 PATCH 2/2] mm: thp: fix false negative of shmem vma's THP eligibility

2019-06-19 Thread Vlastimil Babka
On 6/13/19 6:44 AM, Yang Shi wrote: > The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each > vma") introduced THPeligible bit for processes' smaps. But, when checking > the eligibility for shmem vma, __transparent_hugepage_enabled() is > called to override the result from shmem_

Re: [PATCH] mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind

2019-06-19 Thread Vlastimil Babka
On 6/19/19 7:21 AM, Michal Hocko wrote: > On Tue 18-06-19 14:13:16, Yang Shi wrote: > [...] >> >> I used to have !__PageMovable(page), but it was removed since the >> aforementioned reason. I could add it back. >> >> For the temporary off LRU page, I did a quick search, it looks the most >> paths h

Re: [PATCH] mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind

2019-06-19 Thread Vlastimil Babka
On 6/18/19 7:06 PM, Yang Shi wrote: > The BUG_ON was removed by commit > d44d363f65780f2ac2ec672164555af54896d40d ("mm: don't assume anonymous > pages have SwapBacked flag") since 4.12. Perhaps that commit should be sent to stable@ ? Although with VM_BUG_ON() this is less critical than plain BUG

Re: [PATCH 1/1] mm/page_owner: store page_owner's gfp_mask in stackdepot itself

2019-06-17 Thread Vlastimil Babka
On 6/7/19 7:53 AM, Sai Charan Sane wrote: > Memory overhead of 4MB is reduced by storing gfp_mask in stackdepot along > with stacktrace. Stackdepot memory usage increased by ~100kb for 4GB of RAM. > > Page owner logs from dmesg: > Before patch: > allocated 20971520 bytes of pag

Re: kernel BUG at mm/swap_state.c:170!

2019-06-17 Thread Vlastimil Babka
On 5/29/19 7:32 PM, Mikhail Gavrilov wrote: > On Wed, 29 May 2019 at 09:05, Mikhail Gavrilov > wrote: >> >> Hi folks. >> I am observed kernel panic after update to git tag 5.2-rc2. >> This crash happens at memory pressing when swap being used. >> >> Unfortunately in journalctl saved only this: >>

Re: kernel BUG at mm/swap_state.c:170!

2019-06-17 Thread Vlastimil Babka
On 6/16/19 12:12 PM, Mikhail Gavrilov wrote: > Hi, > I finished today bisecting kernel. > And first bad commit for me was cd736d8b67fb22a85a68c1ee8020eb0d660615ec That's commit "tcp: fix retrans timestamp on passive Fast Open" which is almost certainly not the culprit. > Can you look into this?

Re: [PATCH v2 1/3] fs/fuse, splice_write: Don't access pipe->buffers without pipe_lock()

2019-06-12 Thread Vlastimil Babka
On 7/17/18 6:00 PM, Andrey Ryabinin wrote: > fuse_dev_splice_write() reads pipe->buffers to determine the size of > 'bufs' array before taking the pipe_lock(). This is not safe as > another thread might change the 'pipe->buffers' between the allocation > and taking the pipe_lock(). So we end up wit

Re: question: should_compact_retry limit

2019-06-05 Thread Vlastimil Babka
On 6/5/19 6:05 PM, Mike Kravetz wrote: > On 6/5/19 12:58 AM, Vlastimil Babka wrote: >> On 6/5/19 1:30 AM, Mike Kravetz wrote: >> Hmm I guess we didn't expect compaction_withdrawn() to be so >> consistently returned. Do you know what value of compact_result is there >>

Re: question: should_compact_retry limit

2019-06-05 Thread Vlastimil Babka
On 6/5/19 1:30 AM, Mike Kravetz wrote: > While looking at some really long hugetlb page allocation times, I noticed > instances where should_compact_retry() was returning true more often that > I expected. In one allocation attempt, it returned true 765668 times in a > row. To me, this was unexpe

[PATCH 0/3] debug_pagealloc improvements

2019-06-03 Thread Vlastimil Babka
configured in when building a distro kernel without extra overhead, and debugging page use after free or double free can be enabled simply by rebooting with debug_pagealloc=on. Vlastimil Babka (3): mm, debug_pagelloc: use static keys to enable debugging mm, page_alloc: more extensive free page

[PATCH 1/3] mm, debug_pagelloc: use static keys to enable debugging

2019-06-03 Thread Vlastimil Babka
rhead when not boot-enabled (including page allocator fast paths) using static keys. This patch introduces one for debug_pagealloc core functionality, and another for the optional guard page functionality (enabled by booting with debug_guardpage_minorder=X). Signed-off-by: Vlastimil Babka Cc: J

[PATCH 2/3] mm, page_alloc: more extensive free page checking with debug_pagealloc

2019-06-03 Thread Vlastimil Babka
being moved between pcplists and free lists *in addition* to when allocated from or freed to the pcplists. When debug_pagealloc is not enabled on boot, the overhead in fast paths should be virtually none thanks to the use of static key. Signed-off-by: Vlastimil Babka Cc: Mel Gorman --- m

[PATCH 3/3] mm, debug_pagealloc: use a page type instead of page_ext flag

2019-06-03 Thread Vlastimil Babka
when debug_pagealloc is enabled and there are no other features requiring the page_ext array. Signed-off-by: Vlastimil Babka Cc: Joonsoo Kim Cc: Matthew Wilcox --- .../admin-guide/kernel-parameters.txt | 10 ++--- include/linux/mm.h| 10 + include/linux

Re: [PATCH v10 1/3] mm: Shuffle initial free memory to improve memory-side-cache utilization

2019-05-31 Thread Vlastimil Babka
On 2/1/19 6:15 AM, Dan Williams wrote: > --- a/init/Kconfig > +++ b/init/Kconfig > @@ -1714,6 +1714,29 @@ config SLAB_FREELIST_HARDENED > sacrifies to harden the kernel slab allocator against common > freelist exploit methods. > > +config SHUFFLE_PAGE_ALLOCATOR > + bool "Page

Re: [PATCH v2] mm: mlockall error for flag MCL_ONFAULT

2019-05-27 Thread Vlastimil Babka
ckall() incorrectly. > > Fixes: b0f205c2a308 ("mm: mlock: add mlock flags to enable VM_LOCKONFAULT > usage") > Signed-off-by: Stefan Potyra > Reviewed-by: Daniel Jordan > Acked-by: Michal Hocko Acked-by: Vlastimil Babka Thanks, shame we didn't catch it dur
