Re: [PATCH] KVM: SVM: Mark SEV launch secret pages as dirty.

2020-08-07 Thread David Rientjes
On Thu, 6 Aug 2020, Cfir Cohen wrote: > The LAUNCH_SECRET command performs encryption of the > launch secret memory contents. Mark pinned pages as > dirty, before unpinning them. > This matches the logic in sev_launch_update(). > > Signed-off-by: Cfir Cohen Acked-by: David Rientjes

Re: [PATCH v3] mm/slab.c: add node spinlock protect in __cache_free_alien

2020-07-30 Thread David Rientjes
On Thu, 30 Jul 2020, qiang.zh...@windriver.com wrote: > From: Zhang Qiang > > for example: > node0 > cpu0 cpu1 > slab_dead_cpu > >mutex_lock(&slab_mutex) > >cpuup_canceled slab_dead_cpu

Re: [PATCH] mm: slab: Avoid the use of one-element array and use struct_size() helper

2020-07-29 Thread David Rientjes
On Wed, 29 Jul 2020, Qianli Zhao wrote: > From: Qianli Zhao > > There is a regular need in the kernel to provide a way to declare having a > dynamically sized set of trailing elements in a structure. Kernel code should > always use “flexible array members”[1] for these cases. The older style of
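The quoted patch description refers to flexible array members and the struct_size() helper. A minimal userspace sketch of that idea follows; `struct sample_buf`, `sample_buf_size()` and `sample_buf_alloc()` are hypothetical names invented for illustration, and the size helper is only a stand-in for the kernel's struct_size(), assuming the same saturate-to-SIZE_MAX behavior on overflow.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record type with a C99 flexible array member. */
struct sample_buf {
	size_t count;
	int items[];	/* flexible array member: must be the last field */
};

/*
 * Userspace stand-in for the kernel's struct_size(): returns the bytes
 * needed for the header plus n trailing elements, or SIZE_MAX if the
 * arithmetic would overflow.
 */
static size_t sample_buf_size(size_t n)
{
	size_t header = sizeof(struct sample_buf);
	size_t elem = sizeof(int);

	if (n > (SIZE_MAX - header) / elem)
		return SIZE_MAX;
	return header + n * elem;
}

/* Allocate a zeroed buffer with room for n items; NULL on overflow or OOM. */
static struct sample_buf *sample_buf_alloc(size_t n)
{
	size_t bytes = sample_buf_size(n);
	struct sample_buf *buf;

	if (bytes == SIZE_MAX)
		return NULL;
	buf = calloc(1, bytes);
	if (buf)
		buf->count = n;
	return buf;
}
```

Compared with a one-element trailing array, sizeof() on the struct here covers only the header, so the allocation size needs no "minus one element" correction.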

Re: Reply: [PATCH] mm/slab.c: add node spinlock protect in __cache_free_alien

2020-07-29 Thread David Rientjes
On Wed, 29 Jul 2020, Zhang, Qiang wrote: > > From: Zhang Qiang > > > > We should add node spinlock protect "n->alien" which may be > > assigned to NULL in cpuup_canceled func. cause address access > > exception. > > > > >Hi, do you have an example NULL pointer dereference where you have hit >

Re: [PATCH] mm/slab.c: add node spinlock protect in __cache_free_alien

2020-07-28 Thread David Rientjes
On Tue, 28 Jul 2020, qiang.zh...@windriver.com wrote: > From: Zhang Qiang > > We should add node spinlock protect "n->alien" which may be > assigned to NULL in cpuup_canceled func. cause address access > exception. > Hi, do you have an example NULL pointer dereference where you have hit

Re: [PATCH 2/4] dma-pool: Get rid of dma_in_atomic_pool()

2020-07-09 Thread David Rientjes
On Thu, 9 Jul 2020, Nicolas Saenz Julienne wrote: > The function is only used once and can be simplified to a one-liner. > > Signed-off-by: Nicolas Saenz Julienne I'll leave this one to Christoph to decide on. One thing I really liked about hacking around in kernel/dma is the coding style,

Re: [PATCH] dma-pool: use single atomic pool for both DMA zones

2020-07-09 Thread David Rientjes
> Hmm, this is not what I expected from the previous thread. I thought > > > we'd just use one dma pool based on runtime available of the zones.. > > > > I may be misunderstanding you, but isn't that going back to how things used > > to > > be before pulling i

Re: [PATCH] dma-pool: Do not allocate pool memory from CMA

2020-07-09 Thread David Rientjes
ional coherent pools to map to gfp > mask") > Reported-by: Jeremy Linton > Signed-off-by: Nicolas Saenz Julienne Acked-by: David Rientjes Thanks Nicolas!

Re: [BUG] XHCI getting ZONE_DMA32 memory > than its bus_dma_limit

2020-07-05 Thread David Rientjes
On Fri, 3 Jul 2020, Robin Murphy wrote: > > Just for the record the offending commit is: c84dc6e68a1d2 ("dma-pool: add > > additional coherent pools to map to gfp mask"). > > > > On Thu, 2020-07-02 at 12:49 -0500, Jeremy Linton wrote: > > > Hi, > > > > > > Using 5.8rc3: > > > > > > The rpi4

Re: [PATCH 3/3] mm/vmscan: replace implicit RECLAIM_ZONE checks with explicit checks

2020-07-01 Thread David Rientjes
On Wed, 1 Jul 2020, Dave Hansen wrote: > On 7/1/20 1:04 PM, Ben Widawsky wrote: > >> +static inline bool node_reclaim_enabled(void) > >> +{ > >> + /* Is any node_reclaim_mode bit set? */ > >> + return node_reclaim_mode & (RECLAIM_ZONE|RECLAIM_WRITE|RECLAIM_UNMAP); > >> +} > >> + > >> extern
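The helper quoted above can be tried out in userspace with a sketch like the one below. The bit values are an assumption taken from the documented zone_reclaim_mode sysctl values (1, 2 and 4), and the `node_reclaim_mode` variable here is only a local stand-in for the kernel global of the same name.

```c
#include <stdbool.h>

/* Assumed bit values, matching the documented zone_reclaim_mode settings. */
#define RECLAIM_ZONE	(1 << 0)	/* node reclaim on */
#define RECLAIM_WRITE	(1 << 1)	/* write out dirty pages during reclaim */
#define RECLAIM_UNMAP	(1 << 2)	/* unmap pages during reclaim */

/* Userspace stand-in for the kernel global of the same name. */
static int node_reclaim_mode;

/* Mirrors the quoted helper: is any node_reclaim_mode bit set? */
static inline bool node_reclaim_enabled(void)
{
	return node_reclaim_mode & (RECLAIM_ZONE | RECLAIM_WRITE | RECLAIM_UNMAP);
}
```

With an explicit mask, a mode of RECLAIM_WRITE alone (value 2) still counts as enabled, while a value with none of the three known bits set does not.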

Re: [PATCH 2/3] mm/vmscan: move RECLAIM* bits to uapi header

2020-07-01 Thread David Rientjes
> > Signed-off-by: Dave Hansen > Cc: Ben Widawsky > Cc: Alex Shi > Cc: Daniel Wagner > Cc: "Tobin C. Harding" > Cc: Christoph Lameter > Cc: Andrew Morton > Cc: Huang Ying > Cc: Dan Williams > Cc: Qian Cai > Cc: Daniel Wagner Acked-by: David Rientjes

Re: [PATCH 1/3] mm/vmscan: restore zone_reclaim_mode ABI

2020-07-01 Thread David Rientjes
Cc: "Tobin C. Harding" > Cc: Christoph Lameter > Cc: Andrew Morton > Cc: Huang Ying > Cc: Dan Williams > Cc: Qian Cai > Cc: Daniel Wagner > Cc: sta...@vger.kernel.org Acked-by: David Rientjes

Re: [PATCH 3/3] mm/vmscan: replace implicit RECLAIM_ZONE checks with explicit checks

2020-07-01 Thread David Rientjes
On Wed, 1 Jul 2020, Dave Hansen wrote: > diff -puN include/linux/swap.h~mm-vmscan-node_reclaim_mode_helper > include/linux/swap.h > --- a/include/linux/swap.h~mm-vmscan-node_reclaim_mode_helper 2020-07-01 > 08:22:13.650955330 -0700 > +++ b/include/linux/swap.h 2020-07-01 08:22:13.659955330

Re: [RFC][PATCH 3/8] mm/vmscan: Attempt to migrate page in lieu of discard

2020-07-01 Thread David Rientjes
On Wed, 1 Jul 2020, Dave Hansen wrote: > Even if they don't allocate directly from PMEM, is it OK for such an app > to get its cold data migrated to PMEM? That's a much more subtle > question and I suspect the kernel isn't going to have a single answer > for it. I suspect we'll need a

Re: [RFC][PATCH 3/8] mm/vmscan: Attempt to migrate page in lieu of discard

2020-07-01 Thread David Rientjes
On Wed, 1 Jul 2020, Yang Shi wrote: > > We can do this if we consider pmem not to be a separate memory tier from > > the system perspective, however, but rather the socket perspective. In > > other words, a node can only demote to a series of exclusive pmem ranges > > and promote to the same

Re: [RFC][PATCH 3/8] mm/vmscan: Attempt to migrate page in lieu of discard

2020-07-01 Thread David Rientjes
On Wed, 1 Jul 2020, Dave Hansen wrote: > > Could this cause us to break a user's mbind() or allow a user to > > circumvent their cpuset.mems? > > In its current form, yes. > > My current rationale for this is that while it's not as deferential as > it can be to the user/kernel ABI contract,

Re: [PATCH v3] mm, slab: Check GFP_SLAB_BUG_MASK before alloc_pages in kmalloc_order

2020-07-01 Thread David Rientjes
On Wed, 1 Jul 2020, Long Li wrote: > diff --git a/mm/slab.c b/mm/slab.c > index ac7a223d9ac3..2850fe3c5fb8 100644 > --- a/mm/slab.c > +++ b/mm/slab.c > @@ -2573,13 +2573,9 @@ static struct page *cache_grow_begin(struct kmem_cache > *cachep, >* Be lazy and only check for valid flags here,

Re: [RFC][PATCH 3/8] mm/vmscan: Attempt to migrate page in lieu of discard

2020-06-30 Thread David Rientjes
On Tue, 30 Jun 2020, Yang Shi wrote: > > > From: Dave Hansen > > > > > > If a memory node has a preferred migration path to demote cold pages, > > > attempt to move those inactive pages to that migration node before > > > reclaiming. This will better utilize available memory, provide a faster >

Re: [RFC][PATCH 3/8] mm/vmscan: Attempt to migrate page in lieu of discard

2020-06-30 Thread David Rientjes
> > #Signed-off-by: Keith Busch > Signed-off-by: Dave Hansen > Cc: Keith Busch > Cc: Yang Shi > Cc: David Rientjes > Cc: Huang Ying > Cc: Dan Williams > --- > > b/include/linux/migrate.h | 6 > b/includ

Re: [PATCH v4 01/26] mm: Do page fault accounting in handle_mm_fault

2020-06-30 Thread David Rientjes
On Tue, 30 Jun 2020, Peter Xu wrote: > @@ -4408,6 +4440,34 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, > unsigned long address, > mem_cgroup_oom_synchronize(false); > } > > + if (ret & (VM_FAULT_RETRY | VM_FAULT_ERROR)) > + return ret;

Re: ERROR: "min_low_pfn" undefined!

2020-06-29 Thread David Rientjes
On Tue, 30 Jun 2020, kernel test robot wrote: > Hi Alexander, > > FYI, the error/warning still remains. > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > master > head: 7c30b859a947535f2213277e827d7ac7dcff9c84 > commit: f220df66f67684246ae1bf4a4e479efc7c2f325a

Re:Re: [PATCH] mm: remove the redundancy code

2020-06-29 Thread David Rientjes
On Tue, 30 Jun 2020, 苏辉 wrote: > I am sorry that i did not consider the memory hotplug case, > and i think we should add a new param to distinguish two different cases. > No need, we can simply continue setting zone->zone_start_pfn unless there is a bug to be fixed (and, if so, please send a

Re: [PATCH] mm: remove the redundancy code

2020-06-29 Thread David Rientjes
On Tue, 30 Jun 2020, Su Hui wrote: > remove the redundancy code, the zone_start_pfn > is assigned from zone->zone_start_pfn > Signed-off-by: Su Hui I don't think this is redundant, it's used by memory hotplug when onlining new memory. > --- > mm/page_alloc.c | 2 -- > 1 file changed, 2

Re: [patch] dma-pool: warn when coherent pool is depleted

2020-06-27 Thread David Rientjes
> > kernel command line. > > > > Provide some guidance on the failure and a recommended minimum size for > > the pools (double the size). > > > > Signed-off-by: David Rientjes > > Tested-by: Guenter Roeck > > Also confirmed that coherent_pool=256k works

Re: [mm, slub] c91e241f56: WARNING:at_mm/slub.c:#kmem_cache_open

2020-06-27 Thread David Rientjes
On Sat, 27 Jun 2020, kernel test robot wrote: > Greeting, > > FYI, we noticed the following commit (built with gcc-9): > > commit: c91e241f569e7f9b0e2946841ef884b22a09f624 ("mm, slub: introduce > kmem_cache_debug_flags()-fix") > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git

Re: [PATCH v8 1/4] mm/madvise: pass task and mm to do_madvise

2020-06-24 Thread David Rientjes
hannes Weiner > Cc: Shakeel Butt > Cc: John Dias > Cc: Joel Fernandes > Cc: Alexander Duyck > Cc: SeongJae Park > Cc: Christian Brauner > Cc: Kirill Tkhai > Cc: Oleksandr Natalenko > Cc: SeongJae Park > Cc: Christian Brauner > Cc: Acked-by: David Rientjes

Re: [PATCH v8 3/4] mm/madvise: introduce process_madvise() syscall: an external memory hinting API

2020-06-24 Thread David Rientjes
te use case for MADV_HUGEPAGE, which is overloaded. Today, MADV_HUGEPAGE controls enablement depending on system config and controls defrag behavior based on system config. It also cannot be opted out of without setting MADV_NOHUGEPAGE :) I was thinking of a flag that users could use to trigger an immediate collapse in process context regardless of the system config. So I'm a big advocate of this flags parameter and consider it an absolute must for the API. Acked-by: David Rientjes

Re: [PATCH v8 2/4] pid: move pidfd_get_pid() to pid.c

2020-06-24 Thread David Rientjes
: Johannes Weiner > Cc: John Dias > Cc: Kirill Tkhai > Cc: Michal Hocko > Cc: Oleksandr Natalenko > Cc: Sandeep Patil > Cc: SeongJae Park > Cc: SeongJae Park > Cc: Shakeel Butt > Cc: Sonny Rao > Cc: Tim Murray > Cc: Christian Brauner > Cc: Acked-by: David Rientjes

Re: [PATCH v8 4/4] mm/madvise: check fatal signal pending of target process

2020-06-24 Thread David Rientjes
: Kirill Tkhai > Cc: Michal Hocko > Cc: Oleksandr Natalenko > Cc: Sandeep Patil > Cc: SeongJae Park > Cc: SeongJae Park > Cc: Shakeel Butt > Cc: Sonny Rao > Cc: Tim Murray > Cc: Christian Brauner > Cc: Acked-by: David Rientjes

[patch] dma-pool: warn when coherent pool is depleted

2020-06-21 Thread David Rientjes
for the pools (double the size). Signed-off-by: David Rientjes --- kernel/dma/pool.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/kernel/dma/pool.c b/kernel/dma/pool.c --- a/kernel/dma/pool.c +++ b/kernel/dma/pool.c @@ -239,12 +239,16 @@ void *dma_alloc_from_pool(struct

Re: [PATCH v2] dma-pool: Fix too large DMA pools on medium systems

2020-06-21 Thread David Rientjes
On Sun, 21 Jun 2020, Guenter Roeck wrote: > >> This patch results in a boot failure in some of my powerpc boot tests, > >> specifically those testing boots from mptsas1068 devices. Error message: > >> > >> mptsas :00:02.0: enabling device ( -> 0002) > >> mptbase: ioc0: Initiating bringup

Re: kernel BUG at mm/huge_memory.c:2613!

2020-06-21 Thread David Rientjes
_pcm_hw_params+0x3fd/0x490 [snd_pcm] > > > [ 40.287593] snd_pcm_common_ioctl+0x1c5/0x1110 [snd_pcm] > > > [ 40.287601] ? snd_pcm_info_user+0x64/0x80 [snd_pcm] > > > [ 40.287608] snd_pcm_ioctl+0x23/0x30 [snd_pcm] > > > [ 40.287613] ksys_ioctl+0x8

Re: kernel BUG at mm/huge_memory.c:2613!

2020-06-19 Thread David Rientjes
4d/0x90 > [ 40.287627] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Hi Roman, If you have CONFIG_AMD_MEM_ENCRYPT set, this should be resolved by commit dbed452a078d56bc7f1abecc3edd6a75e8e4484e Author: David Rientjes Date: Thu Jun 11 00:25:57 2020 -0700 dma-pool: decouple DMA_REMAP from DMA_COHERENT_POOL Or you might want to wait for 5.8-rc2 instead which includes this fix.

Re: [PATCH] dma-mapping: DMA_COHERENT_POOL should select GENERIC_ALLOCATOR

2020-06-18 Thread David Rientjes
On Thu, 18 Jun 2020, Christoph Hellwig wrote: > The dma coherent pool code needs genalloc. Move the select over > from DMA_REMAP, which doesn't actually need it. > > Fixes: dbed452a078d ("dma-pool: decouple DMA_REMAP from DMA_COHERENT_POOL") > Reported-by: kernel tes

Re: Linux 5.8-rc1 BUG unable to handle page fault (snd_pcm)

2020-06-15 Thread David Rientjes
On Mon, 15 Jun 2020, Shuah Khan wrote: > I am seeing the following problem on my system. I haven't started debug > yet. Is this a known issue? > > [9.791309] BUG: unable to handle page fault for address: b1e78165d000 > [9.791328] #PF: supervisor write access in kernel mode > [

[patch for-5.8 3/4] dma-direct: check return value when encrypting or decrypting memory

2020-06-11 Thread David Rientjes
, there is no alternative other than to leak the memory. Fixes: c10f07aa27da ("dma/direct: Handle force decryption for DMA coherent buffers in common code") Cc: sta...@vger.kernel.org # 4.17+ Signed-off-by: David Rientjes --- kernel/dma/direct.c | 19 ++- 1 file changed, 14

[patch for-5.8 2/4] dma-direct: re-encrypt memory if dma_direct_alloc_pages() fails

2020-06-11 Thread David Rientjes
If arch_dma_set_uncached() fails after memory has been decrypted, it needs to be re-encrypted before freeing. Fixes: fa7e2247c572 ("dma-direct: make uncached_kernel_address more general") Cc: sta...@vger.kernel.org # 5.7 Signed-off-by: David Rientjes --- kernel/dma/direct.c | 6
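The unwind order described above (re-encrypt before freeing when a later step fails) can be sketched in userspace as follows. Every name here (`struct fake_page`, the `fake_*` functions, `freed_while_decrypted`) is hypothetical and only models the control flow; none of it is the kernel's dma-direct code.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a page's encryption state; not a kernel structure. */
struct fake_page {
	bool decrypted;
};

/* Set if any page is ever freed while still marked decrypted. */
static bool freed_while_decrypted;

/* Hypothetical stand-ins for set_memory_decrypted()/set_memory_encrypted(). */
static int fake_set_decrypted(struct fake_page *p)
{
	p->decrypted = true;
	return 0;
}

static int fake_set_encrypted(struct fake_page *p)
{
	p->decrypted = false;
	return 0;
}

/* Hypothetical stand-in for arch_dma_set_uncached(); fails on request. */
static int fake_set_uncached(struct fake_page *p, bool should_fail)
{
	(void)p;
	return should_fail ? -1 : 0;
}

static void fake_free(struct fake_page *p)
{
	if (p->decrypted)
		freed_while_decrypted = true;
	free(p);
}

/* Allocate, decrypt, then remap; undo the decryption if the remap fails. */
static struct fake_page *fake_alloc_coherent(bool uncached_fails)
{
	struct fake_page *p = calloc(1, sizeof(*p));

	if (!p)
		return NULL;
	if (fake_set_decrypted(p))
		goto out_free;
	if (fake_set_uncached(p, uncached_fails))
		goto out_encrypt;
	return p;

out_encrypt:
	fake_set_encrypted(p);	/* restore state before returning the page */
out_free:
	fake_free(p);
	return NULL;
}
```

The two labels keep the teardown in reverse order of setup, so a failure after decryption cannot skip the re-encryption step.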

[patch for-5.8 1/4] dma-direct: always align allocation size in dma_direct_alloc_pages()

2020-06-11 Thread David Rientjes
cation size for dma_alloc_need_uncached() when CONFIG_DMA_DIRECT_REMAP is disabled but CONFIG_ARCH_HAS_DMA_SET_UNCACHED is enabled. Cc: sta...@vger.kernel.org Signed-off-by: David Rientjes --- kernel/dma/direct.c | 17 ++--- 1 file changed, 10 insertions(+), 7 deletions(-) diff

[patch for-5.8 4/4] dma-direct: add missing set_memory_decrypted() for coherent mapping

2020-06-11 Thread David Rientjes
When a coherent mapping is created in dma_direct_alloc_pages(), it needs to be decrypted if the device requires unencrypted DMA before returning. Fixes: 3acac065508f ("dma-mapping: merge the generic remapping helpers into dma-direct") Cc: sta...@vger.kernel.org # 5.5+ Signed-off

[patch for-5.8 0/4] dma-direct: dma_direct_alloc_pages() fixes for AMD SEV

2020-06-11 Thread David Rientjes
While debugging recently reported issues concerning DMA allocation practices when CONFIG_AMD_MEM_ENCRYPT is enabled, some curiosities arose when looking at dma_direct_alloc_pages() behavior. Fix these up. These are likely all stable material, so proposing for 5.8. --- kernel/dma/direct.c | 42

[patch for-5.8] dma-pool: decouple DMA_REMAP from DMA_COHERENT_POOL

2020-06-11 Thread David Rientjes
: 82fef0ad811f ("x86/mm: unencrypted non-blocking DMA allocations use coherent pools") Suggested-by: Christoph Hellwig Signed-off-by: David Rientjes --- kernel/dma/Kconfig | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig --

Re: next-0519 on thinkpad x60: sound related? window manager crash

2020-06-09 Thread David Rientjes
On Tue, 9 Jun 2020, Christoph Hellwig wrote: > > Working theory is that CONFIG_DMA_NONCOHERENT_MMAP getting set is causing > > the error_code in the page fault path. Debugging with Alex off-thread we > > found that dma_{alloc,free}_from_pool() are not getting called from the > > new code in

Re: next-0519 on thinkpad x60: sound related? window manager crash

2020-06-08 Thread David Rientjes
On Mon, 8 Jun 2020, Alex Xu (Hello71) wrote: > Excerpts from Christoph Hellwig's message of June 8, 2020 2:19 am: > > Can you do a listing using gdb where this happens? > > > > gdb vmlinux > > > > l *(snd_pcm_hw_params+0x3f3) > > > > ? > > > > (gdb) l *(snd_pcm_hw_params+0x3f3) >

Re: [PATCH v2] dma-pool: Fix too large DMA pools on medium systems

2020-06-08 Thread David Rientjes
by: Geert Uytterhoeven This works as well and is much more readable. Thanks Geert! Acked-by: David Rientjes

Re: 82fef0ad811f "x86/mm: unencrypted non-blocking DMA allocations use coherent pools" was Re: next-0519 on thinkpad x60: sound related? window manager crash

2020-06-07 Thread David Rientjes
On Sun, 7 Jun 2020, Alex Xu (Hello71) wrote: > > On Sun, 7 Jun 2020, Pavel Machek wrote: > > > >> > I have a similar issue, caused between aaa2faab4ed8 and b170290c2836. > >> > > >> > [ 20.263098] BUG: unable to handle page fault for address: > >> > b2b582cc2000 > >> > [ 20.263104]

Re: 82fef0ad811f "x86/mm: unencrypted non-blocking DMA allocations use coherent pools" was Re: next-0519 on thinkpad x60: sound related? window manager crash

2020-06-07 Thread David Rientjes
On Sun, 7 Jun 2020, Pavel Machek wrote: > > I have a similar issue, caused between aaa2faab4ed8 and b170290c2836. > > > > [ 20.263098] BUG: unable to handle page fault for address: > > b2b582cc2000 > > [ 20.263104] #PF: supervisor write access in kernel mode > > [ 20.263105] #PF:

Re: [PATCH] mm: thp: Add new kernel parameters transparent_hugepage_defrag/khugepaged_defrag

2020-06-03 Thread David Rientjes
On Wed, 3 Jun 2020, Vlastimil Babka wrote: > > There is no way to set up the defrag options in boot time. And it's > > useful to set it up by default instead of making it work by a > > systemd/upstart service or put the command to set up defrag inside > > /etc/rc.local. > > > > Signed-off-by:

Re: [PATCH] mm, compaction: Indicate when compaction is manually triggered by sysctl

2020-05-10 Thread David Rientjes
On Fri, 8 May 2020, Guilherme Piccoli wrote: > On Fri, May 8, 2020 at 3:31 PM David Rientjes wrote: > > It doesn't make sense because it's only being done here for the entire > > system, there are also per-node sysfs triggers so you could do something > > like iterate ove

Re: [PATCH] mm, compaction: Indicate when compaction is manually triggered by sysctl

2020-05-08 Thread David Rientjes
On Thu, 7 May 2020, Guilherme G. Piccoli wrote: > Well...you can think that the problem we are trying to solve was more > like...admin forgot if they triggered or not the compaction hehe > So, counting on the user to keep track of it is what I'd like to > avoid. And thinking about drop_caches

Re: [PATCH] slab: Replace zero-length array with flexible-array

2020-05-08 Thread David Rientjes
le. > > [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html > [2] https://github.com/KSPP/linux/issues/21 > [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") > > Signed-off-by: Gustavo A. R. Silva Acked-by: David Rientjes

Re: [PATCH] slub: limit count of partial slabs scanned to gather statistics

2020-05-07 Thread David Rientjes
On Thu, 7 May 2020, Konstantin Khlebnikov wrote: > > > > To get exact count of free and used objects slub have to scan list of > > > > partial slabs. This may take at long time. Scanning holds spinlock and > > > > blocks allocations which move partial slabs to per-cpu lists and back. > > > > > >

Re: [PATCH v2 0/5] Statsfs: a new ram-based file sytem for Linux kernel statistics

2020-05-05 Thread David Rientjes
On Tue, 5 May 2020, Paolo Bonzini wrote: > >>> Since this is becoming a generic API (good!!), maybe we can discuss > >>> possible ways to optimize gathering of stats in mass? > >> Sure, the idea of a binary format was considered from the beginning in > >> [1], and it can be done either together

Re: [PATCH v2 0/5] Statsfs: a new ram-based file sytem for Linux kernel statistics

2020-05-04 Thread David Rientjes
On Mon, 4 May 2020, Emanuele Giuseppe Esposito wrote: > There is currently no common way for Linux kernel subsystems to expose > statistics to userspace shared throughout the Linux kernel; subsystems > have to take care of gathering and displaying statistics by themselves, > for example in the

Re: [PATCH v5.6-rt] mm: slub: Always flush the delayed empty slubs in flush_all()

2020-05-04 Thread David Rientjes
_slab() > reference to released kmem_cache > > Fixes: f0b231101c94 ("mm/SLUB: delay giving back empty slubs to IRQ enabled > regions") > Signed-off-by: Kevin Hao Acked-by: David Rientjes

Re: [PATCH] slub: limit count of partial slabs scanned to gather statistics

2020-05-04 Thread David Rientjes
On Mon, 4 May 2020, Konstantin Khlebnikov wrote: > To get exact count of free and used objects slub have to scan list of > partial slabs. This may take at long time. Scanning holds spinlock and > blocks allocations which move partial slabs to per-cpu lists and back. > > Example found in the

Re: [patch] mm, oom: stop reclaiming if GFP_ATOMIC will start failing soon

2020-04-28 Thread David Rientjes
On Tue, 28 Apr 2020, Vlastimil Babka wrote: > > I took a look at doing a quick-fix for the > > direct-reclaimers-get-their-stuff-stolen issue about a million years > > ago. I don't recall where it ended up. It's pretty trivial for the > > direct reclaimer to free pages into

Re: [PATCH] mm/vmstat: Reduce zone lock hold time when reading /proc/pagetypeinfo

2019-10-22 Thread David Rientjes
On Tue, 22 Oct 2019, Waiman Long wrote: > >>> and used nr_free to compute the missing count. Since MIGRATE_MOVABLE > >>> is usually the largest one on large memory systems, this is the one > >>> to be skipped. Since the printing order is migration-type => order, we > >>> will have to store the

Re: [PATCH] mm: update comments in slub.c

2019-10-13 Thread David Rientjes
On Mon, 7 Oct 2019, Yu Zhao wrote: > Slub doesn't use PG_active and PG_error anymore. > > Signed-off-by: Yu Zhao Acked-by: David Rientjes

Re: [rfc] mm, hugetlb: allow hugepage allocations to excessively reclaim

2019-10-04 Thread David Rientjes
On Fri, 4 Oct 2019, Michal Hocko wrote: > Requesting the userspace to drop _all_ page cache in order allocate a > number of hugetlb pages or any other affected __GFP_RETRY_MAYFAIL > requests is simply not reasonable IMHO. It can be used as a fallback when writing to nr_hugepages and the amount

Re: [PATCH] mm/slub: fix a deadlock in show_slab_objects()

2019-10-03 Thread David Rientjes
On Thu, 3 Oct 2019, Qian Cai wrote: > > > diff --git a/mm/slub.c b/mm/slub.c > > > index 42c1b3af3c98..922cdcf5758a 100644 > > > --- a/mm/slub.c > > > +++ b/mm/slub.c > > > @@ -4838,7 +4838,15 @@ static ssize_t show_slab_objects(struct kmem_cache > > > *s, > > > } > > > } > > > > >

Re: [PATCH] mm/slub: fix a deadlock in show_slab_objects()

2019-10-03 Thread David Rientjes
On Thu, 3 Oct 2019, Qian Cai wrote: > diff --git a/mm/slub.c b/mm/slub.c > index 42c1b3af3c98..922cdcf5758a 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -4838,7 +4838,15 @@ static ssize_t show_slab_objects(struct kmem_cache *s, > } > } > > - get_online_mems(); > +/* >

Re: [rfc] mm, hugetlb: allow hugepage allocations to excessively reclaim

2019-10-03 Thread David Rientjes
On Thu, 3 Oct 2019, Vlastimil Babka wrote: > I think the key differences between Mike's tests and Michal's is this part > from Mike's mail linked above: > > "I 'tested' by simply creating some background activity and then seeing > how many hugetlb pages could be allocated. Of course, many tries

Re: [PATCH] mm: vmscan: remove unused scan_control parameter from pageout()

2019-10-03 Thread David Rientjes
On Fri, 4 Oct 2019, Yang Shi wrote: > Since lumpy reclaim was removed in v3.5 scan_control is not used by > may_write_to_{queue|inode} and pageout() anymore, remove the unused > parameter. > > Cc: Mel Gorman > Cc: Johannes Weiner > Cc: Michal Hocko > Signed-off-by: Ya

[rfc] mm, hugetlb: allow hugepage allocations to excessively reclaim

2019-10-02 Thread David Rientjes
by commit b39d0ee2632d but hugetlb allocations are admittedly beyond the scope of what the patch is intended to address (thp allocations). Cc: Mike Kravetz Signed-off-by: David Rientjes --- Mike, you alluded that you may want to opt hugetlbfs out of this for the time being in https://marc.info/?l=li

Re: [patch for-5.3 0/4] revert immediate fallback to remote hugepages

2019-10-02 Thread David Rientjes
On Wed, 2 Oct 2019, Michal Hocko wrote: > > > If > > > hugetlb wants to stress this to the fullest extent possible, it already > > > appropriately uses __GFP_RETRY_MAYFAIL. > > > > Which doesn't work anymore right now, and should again after this patch. > > I didn't get to fully digest the

Re: [patch for-5.3 0/4] revert immediate fallback to remote hugepages

2019-10-01 Thread David Rientjes
On Tue, 1 Oct 2019, Vlastimil Babka wrote: > diff --git a/mm/mempolicy.c b/mm/mempolicy.c > index 4ae967bcf954..2c48146f3ee2 100644 > --- a/mm/mempolicy.c > +++ b/mm/mempolicy.c > @@ -2129,18 +2129,20 @@ alloc_pages_vma(gfp_t gfp, int order, struct > vm_area_struct *vma, > nmask =

Re: [PATCH v5 0/7] hugetlb_cgroup: Add hugetlb_cgroup reservation limits

2019-09-26 Thread David Rientjes
On Tue, 24 Sep 2019, Mina Almasry wrote: > > I personally prefer the one counter approach only for the reason that it > > exposes less information about hugetlb reservations. I was not around > > for the introduction of hugetlb reservations, but I have fixed several > > issues having to do with

Re: [PATCH] mm, vmpressure: Fix a signedness bug in vmpressure_register_event()

2019-09-26 Thread David Rientjes
ore > and it's sort of confusing. > > Fixes: 3cadfa2b9497 ("mm/vmpressure.c: convert to use match_string() helper") > Signed-off-by: Dan Carpenter Acked-by: David Rientjes

Re: [patch for-5.3 0/4] revert immediate fallback to remote hugepages

2019-09-26 Thread David Rientjes
On Wed, 25 Sep 2019, Michal Hocko wrote: > I am especially interested about this part. The more I think about this > the more I am convinced that the underlying problem really is in the pre > mature fallback in the fast path. I appreciate you taking the time to continue to look at this but I'm

Re: [PATCH] mm: slub: print_hex_dump() with DUMP_PREFIX_OFFSET

2019-09-21 Thread David Rientjes
On Fri, 20 Sep 2019, Miles Chen wrote: > Since commit ad67b74d2469d9b8 ("printk: hash addresses printed with %p"), > The use DUMP_PREFIX_OFFSET instead of DUMP_PREFIX_ADDRESS with > print_hex_dump() can generate more useful messages. > > In the following example, it's easier get the offset of

Re: [PATCH] mm/slub: fix -Wunused-function compiler warnings

2019-09-17 Thread David Rientjes
On Tue, 17 Sep 2019, Qian Cai wrote: > tid_to_cpu() and tid_to_event() are only used in note_cmpxchg_failure() > when SLUB_DEBUG_CMPXCHG=y, so when SLUB_DEBUG_CMPXCHG=n by default, > Clang will complain that those unused functions. > > Signed-off-by: Qian Cai Acked-by: David Rientjes

Re: [RFC] mm: Proactive compaction

2019-09-17 Thread David Rientjes
On Tue, 17 Sep 2019, John Hubbard wrote: > > We've had good success with periodically compacting memory on a regular > > cadence on systems with hugepages enabled. The cadence itself is defined > > by the admin but it causes khugepaged[*] to periodically wakeup and invoke > > compaction in an

Re: [RFC PATCH] mm/slub: remove left-over debugging code

2019-09-17 Thread David Rientjes
On Tue, 17 Sep 2019, Qian Cai wrote: > > The cmpxchg failures could likely be more generalized beyond SLUB since > > there will be other dependencies in the kernel than just this allocator. > > OK, SLUB_RESILIENCY_TEST is fine to keep around and maybe be turned into a > Kconfig option to make

Re: [RFC] mm: Proactive compaction

2019-09-16 Thread David Rientjes
On Fri, 16 Aug 2019, Nitin Gupta wrote: > For some applications we need to allocate almost all memory as > hugepages. However, on a running system, higher order allocations can > fail if the memory is fragmented. Linux kernel currently does > on-demand compaction as we request more hugepages but

Re: WARNING in implement

2019-09-16 Thread David Rientjes
On Mon, 16 Sep 2019, syzbot wrote: > Hello, > > syzbot found the following crash on: > > HEAD commit: f0df5c1b usb-fuzzer: main usb gadget fuzzer driver > git tree: https://github.com/google/kasan.git usb-fuzzer > console output: https://syzkaller.appspot.com/x/log.txt?x=170b213e60

Re: WARNING in __alloc_pages_nodemask

2019-09-16 Thread David Rientjes
On Mon, 16 Sep 2019, syzbot wrote: > Hello, > > syzbot found the following crash on: > > HEAD commit: f0df5c1b usb-fuzzer: main usb gadget fuzzer driver > git tree: https://github.com/google/kasan.git usb-fuzzer > console output: https://syzkaller.appspot.com/x/log.txt?x=14b1537160

Re: [RFC PATCH] mm/slub: remove left-over debugging code

2019-09-16 Thread David Rientjes
On Mon, 16 Sep 2019, Qian Cai wrote: > SLUB_RESILIENCY_TEST and SLUB_DEBUG_CMPXCHG look like some left-over > debugging code during the internal development that probably nobody uses > it anymore. Remove them to make the world greener. Adding Pengfei Li who has been working on a patchset for

Re: [RESEND v4 7/7] mm, slab_common: Modify kmalloc_caches[type][idx] to kmalloc_caches[idx][type]

2019-09-15 Thread David Rientjes
e_await_sw_fence 417 405 -12 > ida_alloc_range 955 934 -21 > Total: Before=14874316, After=14873867, chg -0.00% > > Signed-off-by: Pengfei Li This also seems more intuitive. Acked-by: David Rientjes

Re: [RESEND v4 4/7] mm, slab: Return ZERO_SIZE_ALLOC for zero sized kmalloc requests

2019-09-15 Thread David Rientjes
On Mon, 16 Sep 2019, Pengfei Li wrote: > This is a preparation patch, just replace 0 with ZERO_SIZE_ALLOC > as the return value of zero sized requests. > > Signed-off-by: Pengfei Li Acked-by: David Rientjes

Re: [RESEND v4 5/7] mm, slab_common: Make kmalloc_caches[] start at size KMALLOC_MIN_SIZE

2019-09-15 Thread David Rientjes
On Mon, 16 Sep 2019, Pengfei Li wrote: > Currently, kmalloc_cache[] is not sorted by size, kmalloc_cache[0] > is kmalloc-96, kmalloc_cache[1] is kmalloc-192 (when ARCH_DMA_MINALIGN > is not defined). > > As suggested by Vlastimil Babka, > > "Since you're doing these cleanups, have you

Re: [RESEND v4 3/7] mm, slab_common: Use enum kmalloc_cache_type to iterate over kmalloc caches

2019-09-15 Thread David Rientjes
On Mon, 16 Sep 2019, Pengfei Li wrote: > The type of local variable *type* of new_kmalloc_cache() should > be enum kmalloc_cache_type instead of int, so correct it. > > Signed-off-by: Pengfei Li > Acked-by: Vlastimil Babka > Acked-by: Roman Gushchin Acked-by: David Rientjes

Re: [RESEND v4 2/7] mm, slab: Remove unused kmalloc_size()

2019-09-15 Thread David Rientjes
On Mon, 16 Sep 2019, Pengfei Li wrote: > The size of kmalloc can be obtained from kmalloc_info[], > so remove kmalloc_size() that will not be used anymore. > > Signed-off-by: Pengfei Li > Acked-by: Vlastimil Babka > Acked-by: Roman Gushchin Acked-by: David Rientjes

Re: [RESEND v4 1/7] mm, slab: Make kmalloc_info[] contain all types of names

2019-09-15 Thread David Rientjes
tter off as INIT_KMALLOC_INFO? Nothing major though, so: Acked-by: David Rientjes

Re: [RESEND v4 6/7] mm, slab_common: Initialize the same size of kmalloc_caches[]

2019-09-15 Thread David Rientjes
On Mon, 16 Sep 2019, Pengfei Li wrote: > diff --git a/mm/slab_common.c b/mm/slab_common.c > index 2aed30deb071..e7903bd28b1f 100644 > --- a/mm/slab_common.c > +++ b/mm/slab_common.c > @@ -1165,12 +1165,9 @@ void __init setup_kmalloc_cache_index_table(void) >

Re: [patch for-5.3 0/4] revert immediate fallback to remote hugepages

2019-09-08 Thread David Rientjes
On Sun, 8 Sep 2019, Vlastimil Babka wrote: > > On Sat, 7 Sep 2019, Linus Torvalds wrote: > > > >>> Andrea acknowledges the swap storm that he reported would be fixed with > >>> the last two patches in this series > >> > >> The problem is that even you aren't arguing that those patches should >

Re: [patch for-5.3 0/4] revert immediate fallback to remote hugepages

2019-09-07 Thread David Rientjes
On Sat, 7 Sep 2019, Linus Torvalds wrote: > > Andrea acknowledges the swap storm that he reported would be fixed with > > the last two patches in this series > > The problem is that even you aren't arguing that those patches should > go into 5.3. > For three reasons: (a) we lack a test result

Re: [patch for-5.3 0/4] revert immediate fallback to remote hugepages

2019-09-07 Thread David Rientjes
2019, David Rientjes wrote: > On Wed, 4 Sep 2019, Linus Torvalds wrote: > > > > This series reverts those reverts and attempts to propose a more sane > > > default allocation strategy specifically for hugepages. Andrea > > > acknowledges this is likely to fix t

Re: [rfc 3/4] mm, page_alloc: avoid expensive reclaim when compaction may not succeed

2019-09-06 Thread David Rientjes
ge allocator that works for everybody. > >> It is also not helpful to thrash a zone by doing excessive reclaim if > >> compaction may not be able to access that memory. If order-0 watermarks > >> fail and the allocation order is sufficiently large, it is l

Re: [rfc 3/4] mm, page_alloc: avoid expensive reclaim when compaction may not succeed

2019-09-06 Thread David Rientjes
On Thu, 5 Sep 2019, Mike Kravetz wrote: > I don't have a specific test for this. It is somewhat common for people > to want to allocate "as many hugetlb pages as possible". Therefore, they > will try to allocate more pages than reasonable for their environment and > take what they can get. I

Re: [patch for-5.3 0/4] revert immediate fallback to remote hugepages

2019-09-05 Thread David Rientjes
On Wed, 4 Sep 2019, Andrea Arcangeli wrote: > > This is an admittedly hacky solution that shouldn't cause anybody to > > regress based on NUMA and the semantics of MADV_HUGEPAGE for the past > > 4 1/2 years for users whose workload does fit within a socket. > > How can you live with the below

Re: [patch for-5.3 0/4] revert immediate fallback to remote hugepages

2019-09-05 Thread David Rientjes
On Wed, 4 Sep 2019, Linus Torvalds wrote: > > This series reverts those reverts and attempts to propose a more sane > > default allocation strategy specifically for hugepages. Andrea > > acknowledges this is likely to fix the swap storms that he originally > > reported that resulted in the

[bug] __blk_mq_run_hw_queue suspicious rcu usage

2019-09-04 Thread David Rientjes
Hi Christoph, Jens, and Ming, While booting a 5.2 SEV-enabled guest we have encountered the following WARNING that is followed up by a BUG because we are in atomic context while trying to call set_memory_decrypted: WARNING: suspicious RCU usage 5.2.0 #1 Not tainted

Re: [RFC PATCH] mm, oom: disable dump_tasks by default

2019-09-04 Thread David Rientjes
On Wed, 4 Sep 2019, Michal Hocko wrote: > > > It's primary purpose is > > > to help analyse oom victim selection decision. > > > > I disagree, for I use the process list for understanding what / how many > > processes are consuming what kind of memory (without crashing the system) > > for

[rfc 4/4] mm, page_alloc: allow hugepage fallback to remote nodes when madvised

2019-09-04 Thread David Rientjes
o allocate hugepages or the vma is advised to explicitly want to try hard for hugepages that remote allocation is better when local allocation and memory compaction have both failed. Signed-off-by: David Rientjes --- mm/mempolicy.c | 11 +++ 1 file changed, 11 insertions(+) diff

[rfc 3/4] mm, page_alloc: avoid expensive reclaim when compaction may not succeed

2019-09-04 Thread David Rientjes
excessive reclaim if compaction may not be able to access that memory. If order-0 watermarks fail and the allocation order is sufficiently large, it is likely better to fail the allocation rather than thrashing the zone. Signed-off-by: David Rientjes --- mm/page_alloc.c | 22

[patch for-5.3 0/4] revert immediate fallback to remote hugepages

2019-09-04 Thread David Rientjes
Two commits: commit a8282608c88e08b1782141026eab61204c1e533f Author: Andrea Arcangeli Date: Tue Aug 13 15:37:53 2019 -0700 Revert "mm, thp: restore node-local hugepage allocations" commit 92717d429b38e4f9f934eed7e605cc42858f1839 Author: Andrea Arcangeli Date: Tue Aug 13 15:37:50 2019

[patch for-5.3 1/4] Revert "Revert "mm, thp: restore node-local hugepage allocations""

2019-09-04 Thread David Rientjes
m_mode is not a solution to this problem since it does not only impact hugepage allocations but rather changes the memory allocation strategy for *all* page allocations. Signed-off-by: David Rientjes --- include/linux/mempolicy.h | 2 -- mm/huge_memory.c | 42 +++--

[patch for-5.3 2/4] Revert "Revert "Revert "mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask""

2019-09-04 Thread David Rientjes
ault page allocation strategy, so this patch reverts a cleanup done on a strategy that is now reverted and thus is the least risky option for 5.3. Signed-off-by: David Rientjes --- include/linux/gfp.h | 12 mm/huge_memory.c| 27 +-- mm/m

Re: [PATCH] mm, oom: consider present pages for the node size

2019-08-29 Thread David Rientjes
ble problems > because the oom calculates a ratio against totalpages and used memory > cannot exceed present pages but it is confusing and wrong from code > point of view. > > Noticed-by: David Hildenbrand > Signed-off-by: Michal Hocko Acked-by: David Rientjes

Re: [PATCH] mm/oom: Add oom_score_adj value to oom Killed process message

2019-08-21 Thread David Rientjes
On Wed, 21 Aug 2019, Michal Hocko wrote: > > vm.oom_dump_tasks is pretty useful, however, so it's curious why you > > haven't left it enabled :/ > > Because it generates a lot of output potentially. Think of a workload > with too many tasks which is not uncommon. Probably better to always
