Re: [PATCH v1 0/9] mm/memory: optimize unmap/zap with PTE-mapped THP

2024-01-31 Thread Michal Hocko
some configurations. -- Michal Hocko SUSE Labs

Re: [PATCH v1 0/9] mm/memory: optimize unmap/zap with PTE-mapped THP

2024-01-31 Thread Michal Hocko
means it should be considered low latency guarantee feature. A lot has changed since the limit was introduced and the current latency numbers will surely be different than back then. As long as soft lockups do not trigger again this should be acceptable IMHO. -- Michal Hocko SUSE Labs

Re: [PATCH v7 7/7] mm/memory_hotplug: Enable runtime update of memmap_on_memory parameter

2023-08-07 Thread Michal Hocko
tively harmful. Just have a look at ridiculously small memory blocks on ppc. I do understand that it makes some sense to be aligned to the memory model (so sparsemem section aligned). In an ideal world, memory hotplug v2 interface (if we ever go that path) should be physical memory range based. -- Michal

Re: [PATCH v3 1/2] nmi_backtrace: Allow excluding an arbitrary CPU

2023-08-07 Thread Michal Hocko
On Fri 04-08-23 09:06:07, Doug Anderson wrote: > Hi, > > On Fri, Aug 4, 2023 at 8:02 AM Michal Hocko wrote: > > > > > > It would have been slightly safer to modify > > > > arch_trigger_cpumask_backtrace > > > > by switching arguments so that

Re: [PATCH v3 1/2] nmi_backtrace: Allow excluding an arbitrary CPU

2023-08-04 Thread Michal Hocko
On Fri 04-08-23 06:56:51, Doug Anderson wrote: > Hi, > > On Fri, Aug 4, 2023 at 12:50 AM Michal Hocko wrote: > > > > On Thu 03-08-23 16:07:57, Douglas Anderson wrote: > > > The APIs that allow backtracing across CPUs have always had a way to > > > exclude

Re: [PATCH v3 1/2] nmi_backtrace: Allow excluding an arbitrary CPU

2023-08-04 Thread Michal Hocko
sk) { return false; > Signed-off-by: Douglas Anderson Anyway Acked-by: Michal Hocko > --- > > Changes in v3: > - ("nmi_backtrace: Allow excluding an arbitrary CPU") new for v3. > > arch/arm/include/asm/irq.h | 2 +- > arch/arm/kernel/sm

Re: [PATCH v7 7/7] mm/memory_hotplug: Enable runtime update of memmap_on_memory parameter

2023-08-03 Thread Michal Hocko
On Wed 02-08-23 18:59:04, Michal Hocko wrote: > On Wed 02-08-23 17:54:04, David Hildenbrand wrote: > > On 02.08.23 17:50, Michal Hocko wrote: > > > On Wed 02-08-23 10:15:04, Aneesh Kumar K V wrote: > > > > On 8/1/23 4:20 PM, Michal Hocko wrote: > > > > >

Re: [PATCH v7 7/7] mm/memory_hotplug: Enable runtime update of memmap_on_memory parameter

2023-08-02 Thread Michal Hocko
On Wed 02-08-23 17:54:04, David Hildenbrand wrote: > On 02.08.23 17:50, Michal Hocko wrote: > > On Wed 02-08-23 10:15:04, Aneesh Kumar K V wrote: > > > On 8/1/23 4:20 PM, Michal Hocko wrote: > > > > On Tue 01-08-23 14:58:29, Aneesh Kumar K V wrote: > > > >

Re: [PATCH v7 7/7] mm/memory_hotplug: Enable runtime update of memmap_on_memory parameter

2023-08-02 Thread Michal Hocko
On Wed 02-08-23 10:15:04, Aneesh Kumar K V wrote: > On 8/1/23 4:20 PM, Michal Hocko wrote: > > On Tue 01-08-23 14:58:29, Aneesh Kumar K V wrote: > >> On 8/1/23 2:28 PM, Michal Hocko wrote: > >>> On Tue 01-08-23 10:11:16, Aneesh Kumar K.V wrote: > >>>>

Re: [PATCH v7 7/7] mm/memory_hotplug: Enable runtime update of memmap_on_memory parameter

2023-08-01 Thread Michal Hocko
On Tue 01-08-23 14:58:29, Aneesh Kumar K V wrote: > On 8/1/23 2:28 PM, Michal Hocko wrote: > > On Tue 01-08-23 10:11:16, Aneesh Kumar K.V wrote: > >> Allow updating memmap_on_memory mode after the kernel boot. Memory > >> hotplug done after the mode update will u

Re: [PATCH v7 0/7] Add support for memmap on memory feature on ppc64

2023-08-01 Thread Michal Hocko
justification and use case description IMHO. That being said for patches 1 - 4 and 6 feel free to add Acked-by: Michal Hocko On Tue 01-08-23 10:11:09, Aneesh Kumar K.V wrote: > This patch series update memmap on memory feature to fall back to > memmap allocation outside the memory block if the ali

Re: [PATCH v7 4/7] mm/memory_hotplug: Support memmap_on_memory when memmap is not aligned to pageblocks

2023-08-01 Thread Michal Hocko
once("Memory hotplug will waste %ld pages in each > memory block\n", > + memmap_pages - PFN_UP(memory_block_memmap_size())); -- Michal Hocko SUSE Labs

Re: [PATCH v7 7/7] mm/memory_hotplug: Enable runtime update of memmap_on_memory parameter

2023-08-01 Thread Michal Hocko
rds about the usecase. Why could we live with this being static and now need it to be dynamic? -- Michal Hocko SUSE Labs

Re: [PATCH v6 6/7] mm/memory_hotplug: Embed vmem_altmap details in memory block

2023-07-27 Thread Michal Hocko
On Thu 27-07-23 15:02:12, Aneesh Kumar K V wrote: > On 7/27/23 2:55 PM, Michal Hocko wrote: > > On Thu 27-07-23 13:32:31, Aneesh Kumar K.V wrote: > >> With memmap on memory, some architecture needs more details w.r.t altmap > >> such as base_pfn, end_pfn, etc to un

Re: [PATCH v6 4/7] mm/memory_hotplug: Support memmap_on_memory when memmap is not aligned to pageblocks

2023-07-27 Thread Michal Hocko
On Thu 27-07-23 14:57:17, Aneesh Kumar K V wrote: > On 7/27/23 2:53 PM, Michal Hocko wrote: > > On Thu 27-07-23 13:32:29, Aneesh Kumar K.V wrote: > > [...] > >> + if (mode == MEMMAP_ON_MEMORY_FORCE) { > >> + unsigned long memmap_pages = > >

Re: [PATCH v6 7/7] mm/memory_hotplug: Enable runtime update of memmap_on_memory parameter

2023-07-27 Thread Michal Hocko
, 19 insertions(+), 16 deletions(-) -- Michal Hocko SUSE Labs

Re: [PATCH v6 6/7] mm/memory_hotplug: Embed vmem_altmap details in memory block

2023-07-27 Thread Michal Hocko
106,7 @@ static void memory_block_release(struct device *dev) > { > struct memory_block *mem = to_memory_block(dev); > > + WARN_ON(mem->altmap); What is this supposed to catch? A comment would be handy so that we know what to look at should it ever trigger. > kfree(mem); > } -- Michal Hocko SUSE Labs

Re: [PATCH v6 4/7] mm/memory_hotplug: Support memmap_on_memory when memmap is not aligned to pageblocks

2023-07-27 Thread Michal Hocko
e depends on the block size and that can vary. I think it would make more sense to print the block size and the vmemmap reservation and for the force case also any wasted amount on top (if any). -- Michal Hocko SUSE Labs

Re: [next-20230705] kernel BUG mm/memcontrol.c:3715! (ltp/madvise06)

2023-07-07 Thread Michal Hocko
nt_cgmem("memory.current"); > > 94 print_cgmem("memory.swap.current"); > > 95 print_cgmem("memory.kmem.usage_in_bytes"); <<== this line. > > 96 } > > > > If I comment line 95 from the testcase, it completes successfully. > > The handling for _KMEM was removed from mem_cgroup_read_u64() > incorrectly. > It is used by the still existing kmem.*usage*_in_bytes in addition to > the now removed kmem.*limit*_in_bytes. > (And kmem.max_usage_in_bytes, kmem.failcnt) > > The testcase seems to be fine, it actually did its job. Correct. The updated patch has been already posted http://lkml.kernel.org/r/zke5wxdbvpi5c...@dhcp22.suse.cz Thanks for the report! -- Michal Hocko SUSE Labs

Re: [PATCH 1/1] mm: introduce vm_flags_reset_once to replace WRITE_ONCE vm_flags updates

2023-01-31 Thread Michal Hocko
the git blame would be more visible when the conversion is from WRITE_ONCE. One way or the other Acked-by: Michal Hocko > --- > Notes: > - The patch applies cleanly over mm-unstable > - The SHA in Fixes: line is from mm-unstable, so is... unstable > > include/linux/mm.h | 7 +++ >

Re: [PATCH v3 6/7] mm: introduce mod_vm_flags_nolock and use it in untrack_pfn

2023-01-26 Thread Michal Hocko
ch situation, when VMA is > not part of VMA tree and locking it is not required. > Pass a hint to untrack_pfn to conditionally use mod_vm_flags_nolock for > flags modification and to avoid assertion. > > Signed-off-by: Suren Baghdasaryan Acked-by: Michal Hocko Thanks! > --- >

Re: [PATCH v3 2/7] mm: introduce vma->vm_flags wrapper functions

2023-01-26 Thread Michal Hocko
istent on that much even in the core kernel - e.g. init_rwsem vs. mutex_init) Acked-by: Michal Hocko > --- > include/linux/mm.h | 37 + > include/linux/mm_types.h | 10 +- > 2 files changed, 46 insertions(+), 1 deletion(-) > > diff

Re: [PATCH v3 2/7] mm: introduce vma->vm_flags wrapper functions

2023-01-26 Thread Michal Hocko
t it would introduce a huge additional > churn (800+ hits) with no obvious benefits, I think. Does that clarify > the intent of this trick? I think that makes sense at this stage. The conversion patch is quite large already. Maybe the final renaming could be done on top of everything and patch generated by coccinelle. The const union is a neat trick but a potential lockdep assert is a nice plus as well. I wouldn't see it as a top priority though. -- Michal Hocko SUSE Labs

Re: [PATCH v2 4/6] mm: replace vma->vm_flags indirect modification in ksm_madvise

2023-01-25 Thread Michal Hocko
On Wed 25-01-23 08:57:48, Suren Baghdasaryan wrote: > On Wed, Jan 25, 2023 at 1:38 AM 'Michal Hocko' via kernel-team > wrote: > > > > On Wed 25-01-23 00:38:49, Suren Baghdasaryan wrote: > > > Replace indirect modifications to vma->vm_flags with calls to modif

Re: [PATCH v2 6/6] mm: export dump_mm()

2023-01-25 Thread Michal Hocko
; > Signed-off-by: Suren Baghdasaryan Acked-by: Michal Hocko > --- > mm/debug.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/mm/debug.c b/mm/debug.c > index 9d3d893dc7f4..96d594e16292 100644 > --- a/mm/debug.c > +++ b/mm/debug.c > @@ -215,6 +215

Re: [PATCH v2 5/6] mm: introduce mod_vm_flags_nolock and use it in untrack_pfn

2023-01-25 Thread Michal Hocko
tistics and freeing VMAs */ > mas_set(&mas_detach, start); > remove_mt(mm, &mas_detach); > @@ -2704,7 +2708,7 @@ unsigned long mmap_region(struct file *file, unsigned > long addr, > > /* Undo any partial mapping done by a device driver. */ > unmap_region(mm, &mm->mm_mt, vma, prev, next, vma->vm_start, > - vma->vm_end); > + vma->vm_end, true); > } > if (file && (vm_flags & VM_SHARED)) > mapping_unmap_writable(file->f_mapping); > @@ -3031,7 +3035,7 @@ void exit_mmap(struct mm_struct *mm) > tlb_gather_mmu_fullmm(&tlb, mm); > /* update_hiwater_rss(mm) here? but nobody should be looking */ > /* Use ULONG_MAX here to ensure all VMAs in the mm are unmapped */ > - unmap_vmas(&tlb, &mm->mm_mt, vma, 0, ULONG_MAX); > + unmap_vmas(&tlb, &mm->mm_mt, vma, 0, ULONG_MAX, false); > mmap_read_unlock(mm); > > /* > -- > 2.39.1 -- Michal Hocko SUSE Labs

Re: [PATCH v2 4/6] mm: replace vma->vm_flags indirect modification in ksm_madvise

2023-01-25 Thread Michal Hocko
cation attempts. Those BUG_ONs scream too much IMHO. KSM is an MM internal code so I guess we should be willing to trust it. > Signed-off-by: Suren Baghdasaryan Acked-by: Michal Hocko -- Michal Hocko SUSE Labs

Re: [PATCH v2 2/6] mm: replace VM_LOCKED_CLEAR_MASK with VM_LOCKED_MASK

2023-01-25 Thread Michal Hocko
On Wed 25-01-23 00:38:47, Suren Baghdasaryan wrote: > To simplify the usage of VM_LOCKED_CLEAR_MASK in clear_vm_flags(), > replace it with VM_LOCKED_MASK bitmask and convert all users. > > Signed-off-by: Suren Baghdasaryan Acked-by: Michal Hocko > --- > include/linux/mm.h

Re: [PATCH v2 1/6] mm: introduce vma->vm_flags modifier functions

2023-01-25 Thread Michal Hocko
; operations. Introduce modifier functions for vm_flags to be used whenever > flags are updated. This way we can better check and control correct > locking behavior during these updates. > > Signed-off-by: Suren Baghdasaryan Acked-by: Michal Hocko > --- >

Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-23 Thread Michal Hocko
On Mon 23-01-23 19:30:43, Matthew Wilcox wrote: > On Mon, Jan 23, 2023 at 08:18:37PM +0100, Michal Hocko wrote: > > On Mon 23-01-23 18:23:08, Matthew Wilcox wrote: > > > On Mon, Jan 23, 2023 at 09:46:20AM -0800, Suren Baghdasaryan wrote: > > [...] > > > >

Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-23 Thread Michal Hocko
not think we want to have something like that in the future either but that is really hard to envision. I am claiming that it is subtle and potentially error prone to have two different ways of mass vma freeing wrt. locking. Also, don't we have a very similar situation during last munmaps? -- Michal Hocko SUSE Labs

Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-23 Thread Michal Hocko
On Mon 23-01-23 09:07:34, Suren Baghdasaryan wrote: > On Mon, Jan 23, 2023 at 8:55 AM Michal Hocko wrote: > > > > On Mon 23-01-23 08:22:53, Suren Baghdasaryan wrote: > > > On Mon, Jan 23, 2023 at 1:56 AM Michal Hocko wrote: > > > > > > > > On

Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-23 Thread Michal Hocko
On Mon 23-01-23 08:22:53, Suren Baghdasaryan wrote: > On Mon, Jan 23, 2023 at 1:56 AM Michal Hocko wrote: > > > > On Fri 20-01-23 09:50:01, Suren Baghdasaryan wrote: > > > On Fri, Jan 20, 2023 at 9:32 AM Matthew Wilcox > > > wrote: > > [...] > >

Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-23 Thread Michal Hocko
he vma life time assurance? Jann has already shown how rwsem is not safe wrt to unlock and free without RCU. -- Michal Hocko SUSE Labs

Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-23 Thread Michal Hocko
On Fri 20-01-23 08:20:43, Suren Baghdasaryan wrote: > On Fri, Jan 20, 2023 at 12:52 AM Michal Hocko wrote: > > > > On Thu 19-01-23 10:52:03, Suren Baghdasaryan wrote: > > > On Thu, Jan 19, 2023 at 4:59 AM Michal Hocko wrote: > > > > > > > > On

Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-20 Thread Michal Hocko
On Thu 19-01-23 11:17:07, Paul E. McKenney wrote: > On Thu, Jan 19, 2023 at 01:52:14PM +0100, Michal Hocko wrote: > > On Wed 18-01-23 11:01:08, Suren Baghdasaryan wrote: > > > On Wed, Jan 18, 2023 at 10:34 AM Paul E. McKenney > > > wrote: > > [...] > >

Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-20 Thread Michal Hocko
On Thu 19-01-23 10:52:03, Suren Baghdasaryan wrote: > On Thu, Jan 19, 2023 at 4:59 AM Michal Hocko wrote: > > > > On Mon 09-01-23 12:53:34, Suren Baghdasaryan wrote: > > > call_rcu() can take a long time when callback offloading is enabled. > > > Its use in the v

Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-19 Thread Michal Hocko
ust link all the vmas into linked list and use a single call_rcu instead, no? This would both simplify the implementation, remove the scaling issue as well and we do not have to argue whether VM_AREA_FREE_LIST_MAX should be epsilon or epsilon + 1. -- Michal Hocko SUSE Labs
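The suggestion above (link the objects into one list and defer the whole list once, instead of deferring each object separately) can be illustrated with a small user-space sketch. Everything here is hypothetical: `struct node`, `struct free_batch`, `batch_add()` and `batch_free_all()` are illustration names, not kernel APIs, and the final walk merely stands in for what a single call_rcu() callback would do after a grace period.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object standing in for a VMA; only the list link matters. */
struct node {
	struct node *next;
};

/* Hypothetical batch: a singly linked list of objects awaiting release. */
struct free_batch {
	struct node *head;
	size_t count;
};

/* Queue an object on the batch instead of deferring it on its own. */
static void batch_add(struct free_batch *b, struct node *n)
{
	assert(n != NULL);
	n->next = b->head;
	b->head = n;
	b->count++;
}

/*
 * The one deferred callback: walk the list and free every entry.
 * Returns the number of objects released and leaves the batch empty.
 */
static size_t batch_free_all(struct free_batch *b)
{
	size_t freed = 0;
	struct node *n = b->head;

	while (n) {
		struct node *next = n->next;	/* read the link before freeing */

		free(n);
		n = next;
		freed++;
	}
	b->head = NULL;
	b->count = 0;
	return freed;
}
```

With this shape there is no batch-size constant to tune: the cost of deferral is paid once per list, however many objects were queued.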

Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-19 Thread Michal Hocko
ink I've seen such a case. > Thanks for clarifications, Paul! Thanks for the explanation Paul. I have to say this has caught me by surprise. There are just not enough details about the benchmark to understand what is going on but I find it rather surprising that call_rcu can induce a higher overhead than the actual kmem_cache_free which is the callback. My naive understanding has been that call_rcu is a really fast way to defer the execution to the RCU safe context to do the final cleanup. -- Michal Hocko SUSE Labs

Re: [PATCH 17/41] mm/mmap: move VMA locking before anon_vma_lock_write call

2023-01-19 Thread Michal Hocko
On Wed 18-01-23 13:48:13, Suren Baghdasaryan wrote: > On Wed, Jan 18, 2023 at 1:33 PM Michal Hocko wrote: [...] > > So it will become: > > Move VMA flag modification (which now implies VMA locking) before > > vma_adjust_trans_huge() to ensure the modifications are done a

Re: [PATCH 17/41] mm/mmap: move VMA locking before anon_vma_lock_write call

2023-01-18 Thread Michal Hocko
On Wed 18-01-23 10:09:29, Suren Baghdasaryan wrote: > On Wed, Jan 18, 2023 at 1:23 AM Michal Hocko wrote: > > > > On Tue 17-01-23 18:01:01, Suren Baghdasaryan wrote: > > > On Tue, Jan 17, 2023 at 7:16 AM Michal Hocko wrote: > > > > > > > > On

Re: [PATCH 12/41] mm: add per-VMA lock and helper functions to control it

2023-01-18 Thread Michal Hocko
On Wed 18-01-23 09:36:44, Suren Baghdasaryan wrote: > On Wed, Jan 18, 2023 at 7:11 AM 'Michal Hocko' via kernel-team > wrote: > > > > On Wed 18-01-23 14:23:32, Jann Horn wrote: > > > On Wed, Jan 18, 2023 at 1:28 PM Michal Hocko wrote: > > > > On

Re: [PATCH 12/41] mm: add per-VMA lock and helper functions to control it

2023-01-18 Thread Michal Hocko
On Wed 18-01-23 14:23:32, Jann Horn wrote: > On Wed, Jan 18, 2023 at 1:28 PM Michal Hocko wrote: > > On Tue 17-01-23 19:02:55, Jann Horn wrote: > > > +locking maintainers > > > > > > On Mon, Jan 9, 2023 at 9:54 PM Suren Baghdasaryan > > > wr

Re: [PATCH 12/41] mm: add per-VMA lock and helper functions to control it

2023-01-18 Thread Michal Hocko
to sync up with ongoing readers. vma manipulation functions like __adjust_vma make my head spin. Would it make more sense to have a rcu style synchronization point in vm_area_free directly before call_rcu? This would add an overhead of uncontended down_write of course. -- Michal Hocko SUSE Labs

Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-18 Thread Michal Hocko
On Tue 17-01-23 17:19:46, Suren Baghdasaryan wrote: > On Tue, Jan 17, 2023 at 7:57 AM Michal Hocko wrote: > > > > On Mon 09-01-23 12:53:34, Suren Baghdasaryan wrote: > > > call_rcu() can take a long time when callback offloading is enabled. > > > Its use in the v

Re: [PATCH 26/41] kernel/fork: assert no VMA readers during its destruction

2023-01-18 Thread Michal Hocko
On Tue 17-01-23 17:53:00, Suren Baghdasaryan wrote: > On Tue, Jan 17, 2023 at 7:42 AM 'Michal Hocko' via kernel-team > wrote: > > > > On Mon 09-01-23 12:53:21, Suren Baghdasaryan wrote: > > > Assert there are no holders of VMA lock for reading when it is

Re: [PATCH 18/41] mm/khugepaged: write-lock VMA while collapsing a huge page

2023-01-18 Thread Michal Hocko
On Tue 17-01-23 21:28:06, Jann Horn wrote: > On Tue, Jan 17, 2023 at 4:25 PM Michal Hocko wrote: > > On Mon 09-01-23 12:53:13, Suren Baghdasaryan wrote: > > > Protect VMA from concurrent page fault handler while collapsing a huge > > > page. Page fault handler ne

Re: [PATCH 17/41] mm/mmap: move VMA locking before anon_vma_lock_write call

2023-01-18 Thread Michal Hocko
On Tue 17-01-23 18:01:01, Suren Baghdasaryan wrote: > On Tue, Jan 17, 2023 at 7:16 AM Michal Hocko wrote: > > > > On Mon 09-01-23 12:53:12, Suren Baghdasaryan wrote: > > > Move VMA flag modification (which now implies VMA locking) before > > > anon_vma_lock

Re: [PATCH 12/41] mm: add per-VMA lock and helper functions to control it

2023-01-18 Thread Michal Hocko
On Tue 17-01-23 21:54:58, Matthew Wilcox wrote: > On Tue, Jan 17, 2023 at 01:21:47PM -0800, Suren Baghdasaryan wrote: > > On Tue, Jan 17, 2023 at 7:12 AM Michal Hocko wrote: > > > > > > On Tue 17-01-23 16:04:26, Michal Hocko wrote: > > > > On Mon 09-0

Re: [PATCH 41/41] mm: replace rw_semaphore with atomic_t in vma_lock

2023-01-17 Thread Michal Hocko
n actual real life numbers. This whole thing is quite big enough that we do not have to go through "is this new synchronization primitive correct and behaving reasonably" exercise. -- Michal Hocko SUSE Labs

Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-17 Thread Michal Hocko
, place VMAs into > a list and free them in groups using one call_rcu() call per group. Please add some data to justify this additional complexity. -- Michal Hocko SUSE Labs

Re: [PATCH 28/41] mm: introduce lock_vma_under_rcu to be used from arch-specific code

2023-01-17 Thread Michal Hocko
etry; > + } > + > + rcu_read_unlock(); > + return vma; > +inval: > + rcu_read_unlock(); > + count_vm_vma_lock_event(VMA_LOCK_ABORT); > + return NULL; > +} > +#endif /* CONFIG_PER_VMA_LOCK */ > + > #ifndef __PAGETABLE_P4D_FOLDED > /* > * Allocate p4d page table. > -- > 2.39.0 -- Michal Hocko SUSE Labs

Re: [PATCH 26/41] kernel/fork: assert no VMA readers during its destruction

2023-01-17 Thread Michal Hocko
READ_ONCE(vma->vm_mm->mm_lock_seq), > + vma); Do we really need to check for vm_lock_seq? rwsem_is_locked should tell us something is wrong on its own, no? This could be somebody racing with the vma destruction and using the write lock. Unlikely but I do not see why to narrow debugging scope. -- Michal Hocko SUSE Labs

Re: [PATCH 18/41] mm/khugepaged: write-lock VMA while collapsing a huge page

2023-01-17 Thread Michal Hocko
andling and THP collapsing need to be mutually exclusive currently so in order to keep that assumption you have to mark the vma write locked? Also it is not really clear to me how that handles other vmas which can share the same thp? -- Michal Hocko SUSE Labs

Re: [PATCH 17/41] mm/mmap: move VMA locking before anon_vma_lock_write call

2023-01-17 Thread Michal Hocko
On Mon 09-01-23 12:53:12, Suren Baghdasaryan wrote: > Move VMA flag modification (which now implies VMA locking) before > anon_vma_lock_write to match the locking order of page fault handler. Does this changelog assume per vma locking in the #PF? -- Michal Hocko SUSE Labs

Re: [PATCH 13/41] mm: introduce vma->vm_flags modifier functions

2023-01-17 Thread Michal Hocko
On Tue 17-01-23 16:09:03, Michal Hocko wrote: > On Mon 09-01-23 12:53:08, Suren Baghdasaryan wrote: > > To keep vma locking correctness when vm_flags are modified, add modifier > > functions to be used whenever flags are updated. > > > > Signed-off-by: Suren Baghdas

Re: [PATCH 12/41] mm: add per-VMA lock and helper functions to control it

2023-01-17 Thread Michal Hocko
On Tue 17-01-23 16:04:26, Michal Hocko wrote: > On Mon 09-01-23 12:53:07, Suren Baghdasaryan wrote: > > Introduce a per-VMA rw_semaphore to be used during page fault handling > > instead of mmap_lock. Because there are cases when multiple VMAs need > > to be exclusively l

Re: [PATCH 13/41] mm: introduce vma->vm_flags modifier functions

2023-01-17 Thread Michal Hocko
unsigned long set, unsigned long clear) > +{ > + vma_write_lock(vma); > + vma->vm_flags |= set; > + vma->vm_flags &= ~clear; > +} > + This is a rather unusual pattern. There is no note about locking involved in the naming and also why is the locking part of this interface in the first place? I can see reason for access functions to actually check for lock asserts. -- Michal Hocko SUSE Labs
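The two interface shapes discussed in the message above can be contrasted in a user-space sketch. This is not the kernel code: `struct vma_sketch`, its `write_locked` field and both helper names are hypothetical stand-ins, and a plain boolean replaces the real per-VMA lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a VMA; only what the sketch needs. */
struct vma_sketch {
	unsigned long flags;
	bool write_locked;	/* stand-in for "per-VMA write lock is held" */
};

/*
 * Shape suggested in the review: the accessor only asserts that the
 * caller already holds the lock, then applies set before clear.
 */
static inline void mod_flags_assert_locked(struct vma_sketch *vma,
					   unsigned long set,
					   unsigned long clear)
{
	assert(vma->write_locked);
	vma->flags |= set;
	vma->flags &= ~clear;
}

/*
 * Shape of the quoted patch: the modifier takes the lock itself,
 * which the function name does not reveal.
 */
static inline void mod_flags_locking(struct vma_sketch *vma,
				     unsigned long set,
				     unsigned long clear)
{
	vma->write_locked = true;	/* stand-in for vma_write_lock(vma) */
	mod_flags_assert_locked(vma, set, clear);
}
```

Because set is applied before clear, a bit present in both masks ends up cleared in either variant.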

Re: [PATCH 12/41] mm: add per-VMA lock and helper functions to control it

2023-01-17 Thread Michal Hocko
truct mm_struct *mm, > struct task_struct *p, > seqcount_init(&mm->write_protect_seq); > mmap_init_lock(mm); > INIT_LIST_HEAD(&mm->mmlist); > +#ifdef CONFIG_PER_VMA_LOCK > + WRITE_ONCE(mm->mm_lock_seq, 0); > +#endif The mm shouldn't be visible so why WRITE_ONCE? -- Michal Hocko SUSE Labs

Re: [PATCH 12/41] mm: add per-VMA lock and helper functions to control it

2023-01-17 Thread Michal Hocko
rig, new); > } > return new; > @@ -1145,6 +1146,9 @@ static struct mm_struct *mm_init(struct mm_struct *mm, > struct task_struct *p, > seqcount_init(&mm->write_protect_seq); > mmap_init_lock(mm); > INIT_LIST_HEAD(&mm->mmlist); > +#ifdef CONFIG_PER_VMA_LOCK > + WRITE_ONCE(mm->mm_lock_seq, 0); > +#endif > mm_pgtables_bytes_init(mm); > mm->map_count = 0; > mm->locked_vm = 0; > diff --git a/mm/init-mm.c b/mm/init-mm.c > index c9327abb771c..33269314e060 100644 > --- a/mm/init-mm.c > +++ b/mm/init-mm.c > @@ -37,6 +37,9 @@ struct mm_struct init_mm = { > .page_table_lock = __SPIN_LOCK_UNLOCKED(init_mm.page_table_lock), > .arg_lock = __SPIN_LOCK_UNLOCKED(init_mm.arg_lock), > .mmlist = LIST_HEAD_INIT(init_mm.mmlist), > +#ifdef CONFIG_PER_VMA_LOCK > + .mm_lock_seq = 0, > +#endif > .user_ns = &init_user_ns, > .cpu_bitmap = CPU_BITS_NONE, > #ifdef CONFIG_IOMMU_SVA > -- > 2.39.0 -- Michal Hocko SUSE Labs

Re: [PATCH 09/41] mm: rcu safe VMA freeing

2023-01-17 Thread Michal Hocko
); > +#endif Is it safe to have vma with already freed vma_name? I suspect this is safe because of mmap_lock but is there any reason to split the freeing process and have this potential UAF lurking? > } > > static void account_kernel_stack(struct task_struct *tsk, int account) > -- > 2.39.0 -- Michal Hocko SUSE Labs

Re: [PATCH 08/41] mm: introduce CONFIG_PER_VMA_LOCK

2023-01-11 Thread Michal Hocko
On Wed 11-01-23 09:49:08, Suren Baghdasaryan wrote: > On Wed, Jan 11, 2023 at 9:37 AM Michal Hocko wrote: > > > > On Wed 11-01-23 09:04:41, Suren Baghdasaryan wrote: > > > On Wed, Jan 11, 2023 at 8:44 AM Michal Hocko wrote: > > > > > > > > On

Re: [PATCH 08/41] mm: introduce CONFIG_PER_VMA_LOCK

2023-01-11 Thread Michal Hocko
On Wed 11-01-23 09:04:41, Suren Baghdasaryan wrote: > On Wed, Jan 11, 2023 at 8:44 AM Michal Hocko wrote: > > > > On Wed 11-01-23 08:28:49, Suren Baghdasaryan wrote: > > [...] > > > Anyhow. Sounds like the overhead of the current design is small enough > > >

Re: [PATCH 08/41] mm: introduce CONFIG_PER_VMA_LOCK

2023-01-11 Thread Michal Hocko
his stage. -- Michal Hocko SUSE Labs

Re: [PATCH 08/41] mm: introduce CONFIG_PER_VMA_LOCK

2023-01-11 Thread Michal Hocko
Sure there might be a lot of those but your patch increases it by rwsem (without the last patch) which is something like 40B on top of 136B vma so we are talking about 400B in total which even with wild mapcount limits shouldn't really be prohibitive. With a default map count limit we are talking

Re: [PATCH] mm: remove zap_page_range and create zap_vma_pages

2023-01-03 Thread Michal Hocko
ngle(). > - Remove zap_page_range. > > [1] > https://lore.kernel.org/linux-mm/20221114235507.294320-2-mike.krav...@oracle.com/ > Suggested-by: Peter Xu > Signed-off-by: Mike Kravetz This looks even better than the previous version. Acked-by: Michal Hocko minor nit [...] >

Re: [RFC PATCH] mm: remove zap_page_range and change callers to use zap_vma_page_range

2022-12-19 Thread Michal Hocko
e_single rather than adding a new wrapper but nothing really critical. > Also, change madvise_dontneed_single_vma to use this new routine. > > [1] > https://lore.kernel.org/linux-mm/20221114235507.294320-2-mike.krav...@oracle.com/ > Suggested-by: Peter Xu > Signed-off-by: Mike Kr

Re: [RFC PATCH 0/7] Memory hotplug/hotremove at subsection size

2021-05-07 Thread Michal Hocko
Is there any reason to stick with that memory model for an advanced feature you are working on? -- Michal Hocko SUSE Labs

Re: [PATCH v1 4/4] powernv/memtrace: don't abuse memory hot(un)plug infrastructure for memory allocations

2020-11-03 Thread Michal Hocko
sed > along PG_reserved in hibernation code will always return "true" > on powerpc, resulting in the pages getting touched. It's too > generic - e.g., indicates boot allocations. > > Note 3: For now, we keep using memory_block_size_bytes() as minimum >

Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-08-18 Thread Michal Hocko
://bugzilla.kernel.org/show_bug.cgi?id=202187 > >> > >> So... do we merge this patch or not? Seems that the overall view is > >> "risky but nobody is likely to do anything better any time soon"? > > > > Can we decide on this one way or the other? > > Hmm, not sure who's the person to decide. I tend to prefer doing the > node renaming, handling this in ppc code; Agreed. That would be a safer option. -- Michal Hocko SUSE Labs

Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-07-03 Thread Michal Hocko
On Fri 03-07-20 13:32:21, David Hildenbrand wrote: > On 03.07.20 12:59, Michal Hocko wrote: > > On Fri 03-07-20 11:24:17, Michal Hocko wrote: > >> [Cc Andi] > >> > >> On Fri 03-07-20 11:10:01, Michal Suchanek wrote: > >>> On Wed, Jul 01, 2020 at 02:

Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-07-03 Thread Michal Hocko
On Fri 03-07-20 11:24:17, Michal Hocko wrote: > [Cc Andi] > > On Fri 03-07-20 11:10:01, Michal Suchanek wrote: > > On Wed, Jul 01, 2020 at 02:21:10PM +0200, Michal Hocko wrote: > > > On Wed 01-07-20 13:30:57, David Hildenbrand wrote: > [.

Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-07-03 Thread Michal Hocko
[Cc Andi] On Fri 03-07-20 11:10:01, Michal Suchanek wrote: > On Wed, Jul 01, 2020 at 02:21:10PM +0200, Michal Hocko wrote: > > On Wed 01-07-20 13:30:57, David Hildenbrand wrote: [...] > > > Yep, looks like it. > > > > > > [0.009726] SRAT: PXM 1 ->

Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-07-02 Thread Michal Hocko
On Thu 02-07-20 12:14:08, Srikar Dronamraju wrote: > * Michal Hocko [2020-07-01 14:21:10]: > > > > >>>>>> > > > >>>>>> 2. Also existence of dummy node also leads to inconsistent > > > >>>>>> in

Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-07-01 Thread Michal Hocko
; used by the kernel and can be used arbitrarily? > > > > I thought Michal Hocko already gave a clear picture on why mapping is a bad > idea. https://lore.kernel.org/lkml/20200316085425.gb11...@dhcp22.suse.cz/t/#u > Are you suggesting that we add that as part of the changel

Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-07-01 Thread Michal Hocko
On Wed 01-07-20 13:30:57, David Hildenbrand wrote: > On 01.07.20 13:06, David Hildenbrand wrote: > > On 01.07.20 13:01, Srikar Dronamraju wrote: > >> * David Hildenbrand [2020-07-01 12:15:54]: > >> > >>> On 01.07.20 12:04, Srikar Dronamraju wrote: >

Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-07-01 Thread Michal Hocko
RC we have discussed testing in the previous version and David has provided a way to emulate these configurations on x86. Did you manage to use those instructions for additional testing on other than ppc architectures? > Cc: linuxppc-dev@lists.ozlabs.org > Cc: linux...@kvack.org > Cc

Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-17 Thread Michal Hocko
On Wed 17-06-20 05:23:21, Matthew Wilcox wrote: > On Wed, Jun 17, 2020 at 01:31:57PM +0200, Michal Hocko wrote: > > On Wed 17-06-20 04:08:20, Matthew Wilcox wrote: > > > If you call vfree() under > > > a spinlock, you're in trouble. in_atomic() only knows

Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-17 Thread Michal Hocko
On Wed 17-06-20 04:08:20, Matthew Wilcox wrote: > On Wed, Jun 17, 2020 at 09:12:12AM +0200, Michal Hocko wrote: > > On Tue 16-06-20 17:37:11, Matthew Wilcox wrote: > > > Not just performance critical, but correctness critical. Since kvfree() > > > may allocate from the

Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-17 Thread Michal Hocko
e from the vmalloc allocator, I really think that kvfree() > should assert that it's !in_atomic(). Otherwise we can get into trouble > if we end up calling vfree() and have to take the mutex. FWIW __vfree already checks for atomic context and puts the work into a deferred context. So this should be safe. It should be used as a last resort, though. -- Michal Hocko SUSE Labs

Re: [PATCH v4 1/3] mm/slab: Use memzero_explicit() in kzfree()

2020-06-16 Thread Michal Hocko
;slab: introduce kzfree()") > Cc: sta...@vger.kernel.org > Signed-off-by: Waiman Long Acked-by: Michal Hocko Although I am not really sure this is stable material. Is there any known instance where the memset was optimized out from kzfree? > --- > mm/slab_common.c | 2
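For context on why a plain memset() before freeing is a concern at all: a compiler may drop a store to memory it can prove is never read again. A common way to prevent that is to follow the memset() with a compiler barrier that names the buffer as an input. The sketch below is a user-space approximation under stated assumptions: `zero_explicit_sketch()` is a hypothetical name, and the inline-asm barrier is a GCC/Clang extension rather than portable C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: clear a buffer in a way the optimizer cannot
 * discard as a dead store.
 */
static void zero_explicit_sketch(void *p, size_t n)
{
	assert(p != NULL || n == 0);
	memset(p, 0, n);
	/*
	 * Empty asm statement that takes p as an input and clobbers
	 * memory: the compiler must assume the zeroed bytes are read
	 * here, so the memset() above has to be kept.
	 */
	__asm__ __volatile__("" : : "r"(p) : "memory");
}
```

A typical call site would clear a key or password buffer immediately before releasing it.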

Re: [PATCH v2 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-05-04 Thread Michal Hocko
On Thu 30-04-20 12:48:20, Srikar Dronamraju wrote: > * Michal Hocko [2020-04-29 14:22:11]: > > > On Wed 29-04-20 07:11:45, Srikar Dronamraju wrote: > > > > > > > > > > By marking, N_ONLINE as NODE_MASK_NONE, lets stop assuming that No

Re: [PATCH v2 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-04-29 Thread Michal Hocko
MA Multi node but with no CPUs and memory from node 0. Have you tested on something else than ppc? Each arch does the NUMA setup separately and this is a big mess. E.g. x86 marks even memory less nodes (see init_memory_less_node) as online. Honestly I have a hard time evaluating the effect of this

Re: [PATCH 1/2] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-04-14 Thread Michal Hocko
; > followed by some editing of the kfree_sensitive() kerneldoc and the > use of memzero_explicit() instead of memset(). > > Suggested-by: Joe Perches > Signed-off-by: Waiman Long Makes sense. I haven't checked all the conversions and will rely on the script doing the right thing. The core MM part is correct. Acked-by: Michal Hocko -- Michal Hocko SUSE Labs

Re: [PATCH RFC] mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP (was: Re: [PATCH v3 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA)

2020-04-09 Thread Michal Hocko
On Thu 09-04-20 22:41:19, Baoquan He wrote: > On 04/02/20 at 10:01am, Michal Hocko wrote: > > On Wed 01-04-20 10:51:55, Mike Rapoport wrote: > > > Hi, > > > > > > On Wed, Apr 01, 2020 at 01:42:27PM +0800, Baoquan He wrote: > > [...] >

Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

2020-04-09 Thread Michal Hocko
On Thu 09-04-20 10:12:20, David Hildenbrand wrote: > On 09.04.20 09:59, Michal Hocko wrote: > > On Thu 09-04-20 17:26:01, Michael Ellerman wrote: > >> David Hildenbrand writes: > >> > >>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all

Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

2020-04-09 Thread Michal Hocko
ose messages in the kernel log? From the below you can clearly tell that there are kernel allocations which prevent hot remove from happening. If the overall size of the debugging output is a concern then we can think of a way to reduce it. E.g. once you have a couple of pages reported, all others from the same block are likely not very interesting. -- Michal Hocko SUSE Labs
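The suggestion above (stop reporting after a couple of pages from the same block) could look roughly like the following user-space sketch. Everything here is an assumption made for illustration: the struct, the function name, the message text, and the cap of two reports per block do not come from the kernel source.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_REPORTS_PER_BLOCK 2	/* hypothetical cap, "a couple of pages" */

/* Tracks how many pages were already reported for the current block. */
struct block_report {
	unsigned long block_id;	/* block currently being reported */
	unsigned int reported;	/* pages printed so far for that block */
};

/*
 * Report one page that blocks hot remove. Returns 1 if a message was
 * printed, 0 if it was suppressed because the block already hit the cap.
 */
static int report_unmovable_page(struct block_report *st,
				 unsigned long block_id, unsigned long pfn)
{
	assert(st != NULL);

	/* A new block resets the counter. */
	if (st->block_id != block_id) {
		st->block_id = block_id;
		st->reported = 0;
	}
	if (st->reported >= MAX_REPORTS_PER_BLOCK)
		return 0;

	st->reported++;
	printf("block %lu: unmovable page at pfn %#lx\n", block_id, pfn);
	return 1;
}
```

The state is initialized with a block_id that no real block uses (for example ~0UL) so that the first report always starts a fresh count.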

Re: [PATCH v1 2/2] mm/memory_hotplug: remove is_mem_section_removable()

2020-04-07 Thread Michal Hocko
On Tue 07-04-20 15:54:16, David Hildenbrand wrote: > Fortunately, all users of is_mem_section_removable() are gone. Get rid of > it, including some now unnecessary functions. > > Cc: Michael Ellerman > Cc: Benjamin Herrenschmidt > Cc: Michal Hocko > Cc: Andrew Morton

Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

2020-04-07 Thread Michal Hocko
he initial > hotremove support of lmbs. I am not familiar with this code but it makes sense to keep it in sync with the global behavior. > Cc: Nathan Fontenot > Cc: Michael Ellerman > Cc: Benjamin Herrenschmidt > Cc: Paul Mackerras > Cc: Michal Hocko > Cc: Andrew Morton

Re: [PATCH RFC] mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP (was: Re: [PATCH v3 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA)

2020-04-02 Thread Michal Hocko
pensate the addition of nid to the memblock structures. Well, we can make memblock_region->nid defined only for CONFIG_NUMA. memblock_get_region_node would then unconditionally return 0 on UMA. Essentially the same way we do NUMA for other MM code. I only see a few direct uses of region->nid. -- Michal Hocko SUSE Labs
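The idea of a node id field that exists only on NUMA builds, with an accessor that returns 0 otherwise, can be sketched as follows. This is a stand-alone illustration, not the kernel's memblock code: struct region, region_node() and region_set_node() are hypothetical names, and CONFIG_NUMA is treated as an ordinary preprocessor symbol passed with -DCONFIG_NUMA.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical region descriptor; nid is compiled in only for NUMA. */
struct region {
	unsigned long base;
	unsigned long size;
#ifdef CONFIG_NUMA
	int nid;
#endif
};

/* Node of a region: the stored nid on NUMA, always 0 on UMA. */
static inline int region_node(const struct region *r)
{
	assert(r != NULL);
#ifdef CONFIG_NUMA
	return r->nid;
#else
	return 0;
#endif
}

/* Set the node of a region; a no-op on UMA where there is no field. */
static inline void region_set_node(struct region *r, int nid)
{
	assert(r != NULL);
#ifdef CONFIG_NUMA
	r->nid = nid;
#else
	(void)nid;
#endif
}
```

Callers that go through the two accessors compile unchanged in both configurations; only code that touches the field directly needs an #ifdef, which is why the number of direct users matters.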

Re: [PATCH v3 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2020-03-31 Thread Michal Hocko
On Tue 31-03-20 22:03:32, Baoquan He wrote: > Hi Michal, > > On 03/31/20 at 10:55am, Michal Hocko wrote: > > On Tue 31-03-20 11:14:23, Mike Rapoport wrote: > > > Maybe I mis-read the code, but I don't see how this could happen. In the > > > HAVE_MEMBLOCK_NOD

Re: [PATCH v3 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2020-03-31 Thread Michal Hocko
On Tue 31-03-20 11:14:23, Mike Rapoport wrote: > On Mon, Mar 30, 2020 at 08:23:01PM +0200, Michal Hocko wrote: > > On Mon 30-03-20 20:51:00, Mike Rapoport wrote: > > > On Mon, Mar 30, 2020 at 09:42:46AM +0200, Michal Hocko wrote: > > > > On Sat 28-03-20 11:31:17, Hoan

Re: [PATCH v3 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2020-03-30 Thread Michal Hocko
On Mon 30-03-20 20:51:00, Mike Rapoport wrote: > On Mon, Mar 30, 2020 at 09:42:46AM +0200, Michal Hocko wrote: > > On Sat 28-03-20 11:31:17, Hoan Tran wrote: > > > In NUMA layout which nodes have memory ranges that span across other > > > nodes, > > > the mm

Re: [PATCH v3 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2020-03-30 Thread Michal Hocko
On Mon 30-03-20 12:21:27, Mike Rapoport wrote: > On Mon, Mar 30, 2020 at 09:42:46AM +0200, Michal Hocko wrote: > > On Sat 28-03-20 11:31:17, Hoan Tran wrote: > > > In NUMA layout which nodes have memory ranges that span across other > > > nodes, > > > the mm

Re: [PATCH v3 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2020-03-30 Thread Michal Hocko
n from some systems. The > change looks interesting though. Does this make it more clear? physical address range and its node association [0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1] -- Michal Hocko SUSE Labs
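To make the layout above concrete, here is a small sketch that treats each entry as one equally sized chunk of physical address space and computes, per node, the span from its first to its last chunk and how many chunks inside that span belong to the other node. The chunk granularity and the helper names are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Node id per chunk, copied from the layout quoted in the mail. */
static const int chunk_node[] = {
	0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0,
	1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1,
};
#define NR_CHUNKS (sizeof(chunk_node) / sizeof(chunk_node[0]))

/* Find first and last chunk owned by nid; returns 0 if it owns none. */
static int node_span(int nid, size_t *first, size_t *last)
{
	int found = 0;

	assert(first != NULL && last != NULL);
	for (size_t i = 0; i < NR_CHUNKS; i++) {
		if (chunk_node[i] != nid)
			continue;
		if (!found) {
			*first = i;
			found = 1;
		}
		*last = i;
	}
	return found;
}

/* Count chunks inside nid's span that are owned by a different node. */
static size_t foreign_chunks_in_span(int nid)
{
	size_t first, last, n = 0;

	if (!node_span(nid, &first, &last))
		return 0;
	for (size_t i = first; i <= last; i++)
		if (chunk_node[i] != nid)
			n++;
	return n;
}
```

With this layout each node's span covers 20 chunks of which 8 belong to the other node, so any code that walks a node's span as if it were contiguous has to check ownership chunk by chunk.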

Re: [PATCH v3 0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA

2020-03-30 Thread Michal Hocko
fig | 9 - > arch/x86/Kconfig | 9 - > mm/page_alloc.c | 2 +- > 5 files changed, 1 insertion(+), 36 deletions(-) > > -- > 1.8.3.1 > -- Michal Hocko SUSE Labs

Re: [PATCH v2] mm/sparse: Fix kernel crash with pfn_section_valid check

2020-03-26 Thread Michal Hocko
enables this. I believe that the necessity of clearing the section before the tear down is worth a comment in the code, because this is just way too easy to miss or not be aware of at all while looking into the code without git blame. > Fixes: d41e2f3bd546 ("mm/hotplug: fix hot remov

Re: [PATCH] mm/sparse: Fix kernel crash with pfn_section_valid check

2020-03-26 Thread Michal Hocko
On Thu 26-03-20 11:16:33, Michal Hocko wrote: > On Thu 26-03-20 15:26:22, Aneesh Kumar K.V wrote: > > On 3/26/20 3:10 PM, Michal Hocko wrote: > > > On Wed 25-03-20 08:49:14, Aneesh Kumar K.V wrote: > > > > Fixes the below crash > > > > > >

Re: [PATCH] mm/sparse: Fix kernel crash with pfn_section_valid check

2020-03-26 Thread Michal Hocko
On Thu 26-03-20 15:26:22, Aneesh Kumar K.V wrote: > On 3/26/20 3:10 PM, Michal Hocko wrote: > > On Wed 25-03-20 08:49:14, Aneesh Kumar K.V wrote: > > > Fixes the below crash > > > > > > BUG: Kernel NULL pointer dereference on read at 0x000

Re: [PATCH] mm/sparse: Fix kernel crash with pfn_section_valid check

2020-03-26 Thread Michal Hocko
pfn, > unsigned long nr_pages, > ms->usage = NULL; > } > memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr); > + /* Mark the section invalid */ > + ms->section_mem_map &= ~SECTION_H
