r:
Freeing unused kernel image memory: 2480K
Signed-off-by: Oscar Salvador
---
mm/page_alloc.c | 28 ++--
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fee5e9bad0dd..94e16eba162c 100644
--- a/mm/page_alloc.c
+++ b/mm/pa
t; >
/sys/devices/system/node/node2/memory$i/state;done
And we run kmemleak_scan:
# echo "scan" > /sys/kernel/debug/kmemleak
before the patch:
kmemleak: time spend: 41596 us
after the patch:
kmemleak: time spend: 34899 us
Signed-off-by: Oscar Salvador
---
mm/kmemleak.c
I just realized I forgot to add that this was suggested by Michal.
Sorry, I was a bit rushed.
On Thu, 2018-12-06 at 14:19 +0100, Oscar Salvador wrote:
> kmemleak_scan() goes through all online nodes and tries
> to scan all used pages.
> We can do better and use pfn_to_online_page(), so i
page = pfn_to_page(pfn);
> > /* only scan if page is in use */
> > if (page_count(page) == 0)
> > continue;
> > --
> > 2.13.7
>
>
--
Oscar Salvador
SUSE L3
I do not really know the tricks behind Hyper-V/Xen, could you expand on that?
So far I only tested this with qemu simulating large machines, but I plan
to try the ballooning thing on Xen.
At this moment I am working on a second version of this patchset
to address Dave's feedback.
Oscar Salvador
SUSE L3
From: Oscar Salvador
This patch, like the previous one, gets rid of the wrong if statements.
While at it, I realized that the comments are sometimes very confusing,
to say the least, and sometimes plain wrong.
For example:
---
zone_last = ZONE_MOVABLE;
/*
* check whether node_states[N_HIGH_MEMORY
From: Oscar Salvador
Currently, when !CONFIG_HIGHMEM, status_change_nid_high is being set
to status_change_nid_normal, but on such systems N_HIGH_MEMORY falls
back to N_NORMAL_MEMORY.
That means that if status_change_nid_normal is not -1,
we will perform two calls to node_set_state for the same
From: Oscar Salvador
node_states_clear has the following if statements:
if ((N_MEMORY != N_NORMAL_MEMORY) &&
(arg->status_change_nid_high >= 0))
...
if ((N_MEMORY != N_HIGH_MEMORY) &&
(arg->status_change_nid >= 0))
...
N_MEMORY c
From: Oscar Salvador
This patchset refactors/cleans up node_states_check_changes_online/offline
functions together with node_states_set/clear_node.
The main reason behind this patchset is that currently, these
functions are suboptimal and confusing.
For example, they contain wrong statements
From: Oscar Salvador
In node_states_check_changes_online, we check if the node will
have to be set for any of the N_*_MEMORY states after the pages
have been onlined.
Later on, we perform the activation in node_states_set_node.
Currently, in node_states_set_node we set the node to N_MEMORY
From: Oscar Salvador
While looking at node_states_check_changes_online, I stumbled
upon some confusing things.
Right after entering the function, we find this:
if (N_MEMORY == N_NORMAL_MEMORY)
zone_last = ZONE_MOVABLE;
This is wrong.
N_MEMORY cannot really be equal to N_NORMAL_MEMORY
a similar "kvm_nopvspin"
argument to disable paravirtual spinlocks for KVM. This can be useful
for testing as well as allowing administrators to choose unfair lock
for their KVM guests if they want to.
Signed-off-by: Waiman Long
Signed-off-by: Oscar Salvador
---
Documentation/admin-guide/k
On 09/05/2017 08:28 AM, Juergen Gross wrote:
On 05/09/17 00:21, Davidlohr Bueso wrote:
On Mon, 04 Sep 2017, Peter Zijlstra wrote:
For testing its trivial to hack your kernel and I don't feel this is
something an Admin can make reasonable decisions about.
So why? In general less knobs is
e[1]. There really is no way to eliminate
> the race without holding a reference to the page (or hugetlb_lock). That
> check in page_huge_active just shortens the race window.
Yeah, you are right, the race already exists.
Anyway, do_migrate_range should take care of making sure what it
'Active' made more sense to me, but your point is valid.
Sorry for the confusion.
About that alloc_contig_range topic, I would like to take a look unless
someone is already on it or about to be.
Thanks Mike for the time ;-)
--
Oscar Salvador
SUSE L3
EMHP_MERGE_RESOURCE should be renamed if we agree on that.
[1]
https://patchwork.kernel.org/project/linux-mm/cover/20201217130758.11565-1-osalva...@suse.de/
--
Oscar Salvador
SUSE L3
> scan_movable_pages already deals with these races, so removing the check
> is acceptable. Add comment to racy code.
>
> Signed-off-by: Mike Kravetz
Reviewed-by: Oscar Salvador
> -/*
> - * Test to determine whether the hugepage is "active/in-use" (i.e. being
>
ruct mhp_params *params,
> goto err_kasan;
> }
>
> - if (!memhp_range_allowed(range->start, range_len(range), true))
> {
> - error = -ERANGE;
> - mem_hotplug_done();
> -
, but that's a much
larger change.
We tried to remove that flag in the past but for different reasons.
I will have another look to see what can be done.
Thanks
--
Oscar Salvador
SUSE L3
ser come in the future, we can always revisit.
Maybe just add a little comment in vmemmap_pte_range(), explaining why we
do "+= PAGE_SIZE" for address, and I would like to see a comment in
vmemmap_remap_free explaining why the BUG_ON is there and, more importantly, what it is checking.
--
Oscar Salvador
SUSE L3
low number of flags, we could get away with:
hugetlb_{set,test,clear}_page_flag(page, flag)
and call it from the code.
But some of the flags need to be set/tested outside hugetlb code, so
it indeed looks nicer and more consistent to follow page-flags.h convention.
Sorry for the noise.
--
Oscar Salvador
SUSE L3
s in the past.
Oh, I see. I jumped into that patchset late, so I missed some early messages.
Thanks for explaining this again.
--
Oscar Salvador
SUSE L3
On Thu, Dec 17, 2020 at 02:07:55PM +0100, Oscar Salvador wrote:
> Physical memory hotadd has to allocate a memmap (struct page array) for
> the newly added memory section. Currently, alloc_pages_node() is used
> for those allocations.
>
> This has some disadvantages:
> a)
more sense to move the BUILD_BUG_ON from above to
hugetlb_init?
Other than that, it looks good to me, and I think it is a great improvement
towards readability and maintainability.
--
Oscar Salvador
SUSE L3
On Tue, Jan 19, 2021 at 05:30:49PM -0800, Mike Kravetz wrote:
> Use new hugetlb specific HPageFreed flag to replace the
> PageHugeFreed interfaces.
>
> Signed-off-by: Mike Kravetz
Reviewed-by: Oscar Salvador
> ---
> include/linux/hugetlb.h | 3 +++
> mm/huge
Huge() check in PageHugeTemporary.
AFAICS, the paths checking it already know they are dealing with a
hugetlb page, but still it is better to mention it in the changelog
in case someone wonders.
Other than that looks good to me:
Reviewed-by: Oscar Salvador
> ---
> include/linux/hugetlb.h | 6
On Wed, Jan 20, 2021 at 10:59:05AM +0100, Oscar Salvador wrote:
> On Tue, Jan 19, 2021 at 05:30:46PM -0800, Mike Kravetz wrote:
> > Use the new hugetlb page specific flag HPageMigratable to replace the
> > page_huge_active interfaces. By its name, page_huge_active implied
>
pread over the
allocation paths, would it make more sense to place it in
alloc_huge_page before returning the page?
Then we could opencode SetHPageMigratableIfSupported right there.
I might be missing something and this might not be possible, but if it
is, it would look cleaner and more logical to me.
--
Oscar Salvador
SUSE L3
;
> Yeah, as we used to have in v1. Maybe other reviewers (@Oscar?) have a
> different opinion.
No, I think that placing the check in pagemap_range() out of the if-else
makes much more sense.
Actually, unless my memory fails me that is what I suggested in v2.
I plan to have a look at the series later this week as I am fairly busy
atm.
Thanks
--
Oscar Salvador
SUSE L3
; > include/linux/mm.h | 5 +
> > mm/Makefile | 2 +
> > mm/bootmem_info.c | 124 +++
> > mm/hugetlb.c| 218 +--
> > mm/hugetlb_vmemmap.c| 278
> >
> > mm/hugetlb_vmemmap.h| 45
> > mm/memory_hotplug.c | 116 --
> > mm/sparse-vmemmap.c | 273
> > +++
> > mm/sparse.c | 1 +
> > 17 files changed, 1082 insertions(+), 172 deletions(-)
> > create mode 100644 include/linux/bootmem_info.h
> > create mode 100644 mm/bootmem_info.c
> > create mode 100644 mm/hugetlb_vmemmap.c
> > create mode 100644 mm/hugetlb_vmemmap.h
> >
> > --
> > 2.11.0
> >
>
--
Oscar Salvador
SUSE L3
generic memory management and it is not a
> part of neither node 0 nor ZONE_DMA.
So, since it was never added to the memblock.memory structs, it was not
initialized by init_unavailable_mem, right?
--
Oscar Salvador
SUSE L3
so we are checking
a wrong page[1]? Am I making sense?
--
Oscar Salvador
SUSE L3
..", but anyway:
Reviewed-by: Oscar Salvador
--
Oscar Salvador
SUSE L3
eTemporary, as I did in the previous patch,
because all its callers make sure they operate on a hugetlb page.
--
Oscar Salvador
SUSE L3
ble memory hotplug to handle
> hugepage")
> Signed-off-by: Muchun Song
> Reviewed-by: Mike Kravetz
> Cc: sta...@vger.kernel.org
LGTM,
Reviewed-by: Oscar Salvador
--
Oscar Salvador
SUSE L3
; it is already freed to the buddy allocator.
>
> Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle
> hugepage")
> Signed-off-by: Muchun Song
> Reviewed-by: Mike Kravetz
> Acked-by: Michal Hocko
> Cc: sta...@vger.kernel.org
Reviewed-by: Oscar Salvador
--
Oscar Salvador
SUSE L3
etz
> Acked-by: Michal Hocko
> Cc: sta...@vger.kernel.org
Reviewed-by: Oscar Salvador
--
Oscar Salvador
SUSE L3
euse,
> + .vmemmap_pages = _pages,
> + };
> +
> + /*
> + * In order to make remapping routine most efficient for the huge pages,
> + * the routine of vmemmap page table walking has the following rules
> + * (see more details from the vmemmap_pte_range()):
> + *
> + * - The @reuse address is part of the range that we are walking.
> + * - The @reuse address is the first in the complete range.
> + *
> + * So we need to make sure that @start and @reuse meet the above rules.
You say that "reuse" and "start" need to meet some rules, but in the
paragraph above you only seem to spell out the rules for "reuse"?
--
Oscar Salvador
SUSE L3
On Mon, Jan 25, 2021 at 11:39:55AM +0100, Oscar Salvador wrote:
> > Interesting, so we automatically support differing sizeof(struct
> > page). I guess it will be problematic in case of sizeof(struct page) !=
> > 64, because then, we might not have multiples of 2MB for the memm
(1835008),
> only using parts of it.
>
> Ripping out a memory block, along with the PMD in the vmemmap would
> remove parts of the vmemmap of another memory block.
Bleh, yeah, I was confused, you are right.
> You might want to take a look at:
Thanks a lot for the hints, I will hav
c...@bytedance.com/
--
Oscar Salvador
SUSE L3
3bc35193d9 ("mm/hotplug: invalid PFNs from pfn_to_online_page()")
> Cc: Qian Cai
> Cc: Michal Hocko
> Cc: Oscar Salvador
> Reported-by: David Hildenbrand
> Signed-off-by: Dan Williams
Reviewed-by: Oscar Salvador
> ---
> mm/memory_hotplug.c | 24 +
}
Since PageHuge only returns true for hugetlb pages, I think the following is
simpler?
if (PageHuge(page))
is_hugetlb = true;
else if (PageTransHuge(page))
is_thp = true
Besides that, it looks good to me:
Reviewed-by: Oscar Salvador
--
Oscar Salvador
SUSE L3
saving some potential CPU cycles on normal pages.
Ah, I remember now.
I missed that, sorry.
--
Oscar Salvador
SUSE L3
On Wed, Sep 16, 2020 at 09:53:58AM -0400, Aristeu Rozanski wrote:
> Hi Oscar,
Thanks Aristeu,
>
> On Wed, Sep 16, 2020 at 09:27:02AM +0200, Oscar Salvador wrote:
> > Could you please re-run the tests with the below patch applied, and
> > attached then the logs here?
>
flags)
if (ret > 0)
ret = soft_offline_in_use_page(page);
else if (ret == 0)
- ret = soft_offline_free_page(page);
+ if (soft_offline_free_page(page) && try_again) {
+ try_again = false;
+ goto retry;
+ }
return ret;
--
Oscar Salvador
SUSE L3
unsigned long start, unsigned long end)
> {
> - struct zone *zone;
> unsigned long size;
>
> if (!capable(CAP_SYS_ADMIN))
> --
> 2.25.1
>
>
--
Oscar Salvador
SUSE L3
.
This patch tries to close that race window by handling the new type of
page again if the page was allocated under us.
After this patch, Aristeu said the test cases work properly.
Signed-off-by: Oscar Salvador
Reported-by: Aristeu Rozanski
---
mm/memory-failure.c | 7 ++-
1 file
it is unpoisoned, so we fixed the situation.
So unless I am missing something, I strongly think that we should report
MF_RECOVERED.
[1] https://lore.kernel.org/linux-mm/20190826104144.GA7849@linux/T/#u
[2] https://patchwork.kernel.org/patch/11694847/
Signed-off-by: Oscar Salvador
---
mm/memory-failure.c
/cover/11704083/
[2] https://patchwork.kernel.org/comment/23619775/
Thanks
Oscar Salvador (7):
mm,hwpoison: take free pages off the buddy freelists
mm,hwpoison: Do not set hugepage_or_freepage unconditionally
mm,hwpoison: Try to narrow window race for free pages
mm,hwpoison: refactor
this fix, everything works.
[1] https://patchwork.kernel.org/comment/23617301/
[2] https://patchwork.kernel.org/comment/23619535/
Signed-off-by: Oscar Salvador
Reported-by: Aristeu Rozanski
---
mm/memory-failure.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/memory
was
in-use and we migrated it 3) was a clean pagecache.
Because of that, a page can no longer be poisoned and be in a pcplist.
Signed-off-by: Oscar Salvador
---
mm/madvise.c | 4
1 file changed, 4 deletions(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index 4a48f7215195..302f3a84d17c 100644
right before coming here, so we should be on the safe
side.
Signed-off-by: Oscar Salvador
---
mm/memory-failure.c | 12
1 file changed, 12 deletions(-)
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 7fba4ba201d5..f68cb5e3b320 100644
--- a/mm/memory-failure.c
+++ b/mm
and retry
the check again. It might be that pcplists have been spilled into the
buddy allocator and so we can handle it.
Signed-off-by: Oscar Salvador
---
mm/memory-failure.c | 24 ++--
1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/mm/memory-failure.c b/mm/memory
Make a proper if-else condition for {hard,soft}-offline.
[akpm: remove zone variable and refactor comment]
Signed-off-by: Oscar Salvador
---
mm/madvise.c | 32 ++--
1 file changed, 14 insertions(+), 18 deletions(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index
chset diverged so much from mine, but is
not right.
I will go over my patchset and yours and compare/fix things.
--
Oscar Salvador
SUSE L3
On Thu, Sep 17, 2020 at 03:09:52PM +0200, Oscar Salvador wrote:
> static bool page_handle_poison(struct page *page, bool hugepage_or_freepage,
> bool release)
> {
> if (release) {
> put_page(page);
> drain_all_pa
ppreciate a comment in pageset_set_high_and_batch to be
restored and updated, otherwise:
Reviewed-by: Oscar Salvador
Thanks
--
Oscar Salvador
SUSE L3
On Thu, Sep 10, 2020 at 10:31:20AM +0200, Oscar Salvador wrote:
> On Mon, Sep 07, 2020 at 06:36:24PM +0200, Vlastimil Babka wrote:
> > Signed-off-by: Vlastimil Babka
>
> > for_each_possible_cpu(cpu)
> > - setup_pageset(_cpu(boot_pageset, cpu), 0);
> >
to all per-cpu pagesets of the zone.
>
> This also allows removing zone_pageset_init() and __zone_pcp_update()
> wrappers.
>
> No functional change.
>
> Signed-off-by: Vlastimil Babka
I like this, it simplifies things.
Reviewed-by: Oscar Salvador
--
Oscar Salvador
SUSE L3
nit(). Non-boot pagesets then subsequently update them to specific
> values.
>
> Signed-off-by: Vlastimil Babka
Reviewed-by: Oscar Salvador
Just one question below:
> -static void setup_pageset(struct per_cpu_pageset *p)
> -{
> - pageset_init(p);
Baoquan He
> Cc: Pankaj Gupta
> Cc: Oscar Salvador
> Signed-off-by: David Hildenbrand
Reviewed-by: Oscar Salvador
--
Oscar Salvador
SUSE L3
true that we can hot-{remove,add} at sub-section granularity, while
we can only online/offline at section granularity?
>
> Acked-by: Michal Hocko
> Cc: Andrew Morton
> Cc: Michal Hocko
> Cc: Wei Yang
> Cc: Baoquan He
> Cc: Pankaj Gupta
> Cc: Oscar Salvador
> Signe
directly.
>
> offlined_pages always corresponds to nr_pages, so we can simplify that.
>
> Acked-by: Michal Hocko
> Cc: Andrew Morton
> Cc: Michal Hocko
> Cc: Wei Yang
> Cc: Baoquan He
> Cc: Pankaj Gupta
> Cc: Oscar Salvador
> Signed-off-by: David Hildenbrand
Reviewed-by:
> checks.
>
> Update the documentation.
>
> Acked-by: Michal Hocko
> Cc: Andrew Morton
> Cc: Michal Hocko
> Cc: Wei Yang
> Cc: Baoquan He
> Cc: Pankaj Gupta
> Cc: Oscar Salvador
> Signed-off-by: David Hildenbrand
pretty nice
Reviewed-by: Oscar Salvador
--
Oscar Salvador
SUSE L3
> always span full pageblocks.
>
> We can directly calculate the number of isolated pageblocks from nr_pages.
>
> Acked-by: Michal Hocko
> Cc: Andrew Morton
> Cc: Michal Hocko
> Cc: Wei Yang
> Cc: Baoquan He
> Cc: Pankaj Gupta
> Cc: Oscar Salvador
>
On Wed, Aug 19, 2020 at 07:59:53PM +0200, David Hildenbrand wrote:
> Callers no longer need the number of isolated pageblocks. Let's
> simplify.
>
> Acked-by: Michal Hocko
> Cc: Andrew Morton
> Cc: Michal Hocko
> Cc: Wei Yang
> Cc: Baoquan He
> Cc: Pankaj
ing the callback not exposing all
> pages to the buddy.
>
> Acked-by: Michal Hocko
> Cc: Andrew Morton
> Cc: Michal Hocko
> Cc: Wei Yang
> Cc: Baoquan He
> Cc: Pankaj Gupta
> Cc: Oscar Salvador
> Signed-off-by: David Hildenbrand
Reviewed-by: Oscar Salvador
--
Oscar Salvador
SUSE L3
's drop the stale comment and make the pageblock check easier to read.
>
> Acked-by: Michal Hocko
> Cc: Andrew Morton
> Cc: Michal Hocko
> Cc: Wei Yang
> Cc: Baoquan He
> Cc: Pankaj Gupta
> Cc: Oscar Salvador
> Cc: Mel Gorman
> Signed-off-by: David Hildenbrand
Reviewed-by: Oscar Salvador
--
Oscar Salvador
SUSE L3
ndrew Morton
> Cc: Michal Hocko
> Cc: Wei Yang
> Cc: Baoquan He
> Cc: Pankaj Gupta
> Cc: Oscar Salvador
> Cc: Tony Luck
> Cc: Fenghua Yu
> Cc: Logan Gunthorpe
> Cc: Dan Williams
> Cc: Mike Rapoport
> Cc: "Matthew Wilcox (Oracle)"
> Cc: Michel Les
> https://lkml.kernel.org/r/1597150703-19003-1-git-send-email-chara...@codeaurora.org
>
> Acked-by: Michal Hocko
> Cc: Andrew Morton
> Cc: Michal Hocko
> Cc: Wei Yang
> Cc: Baoquan He
> Cc: Pankaj Gupta
> Cc: Oscar Salvador
> Cc: Charan Teja Reddy
> Signed-off-by: David Hildenbrand
Reviewed-by: Oscar Salvador
--
Oscar Salvador
SUSE L3
e it more reliably.
>
> Reported-by: Qian Cai
> Signed-off-by: Naoya Horiguchi
Reviewed-by: Oscar Salvador
--
Oscar Salvador
SUSE L3
be better than passing
> around the context.
I am not sure if I would duplicate the code there.
We could just pass link_mem_sections a pointer to the function we want
to call: either register_mem_sect_under_node_hotplug or
register_mem_sect_under_node_early.
Would not that be clean and clear enough?
--
Oscar Salvador
SUSE L3
On Mon, Sep 14, 2020 at 05:22:16PM +0800, Qi Liu wrote:
> Variable zone is unused in function madvise_inject_error, let's remove it.
>
> Signed-off-by: Qi Liu
Andrew already fixed that up in my patch.
Thanks anyway
--
Oscar Salvador
SUSE L3
and not that important, so if anything,
consider patch#1 for inclusion.
[1] https://patchwork.kernel.org/cover/11704083/
Thanks
Oscar Salvador (5):
mm,hwpoison: take free pages off the buddy freelists
mm,hwpoison: refactor madvise_inject_error
mm,hwpoison: drain pcplists before bailing out
was
in-use and we migrated it 3) was a clean pagecache.
Because of that, a page can no longer be poisoned and be in a pcplist.
Link: https://lkml.kernel.org/r/20200908075626.11976-5-osalva...@suse.de
Signed-off-by: Oscar Salvador
Cc: Michal Hocko
Cc: Naoya Horiguchi
Cc: Oscar Salvador
Cc: Qian
and retry
the check again. It might be that pcplists have been spilled into the
buddy allocator and so we can handle it.
Link: https://lkml.kernel.org/r/20200908075626.11976-4-osalva...@suse.de
Signed-off-by: Oscar Salvador
Cc: Michal Hocko
Cc: Naoya Horiguchi
Cc: Oscar Salvador
Cc: Qian Cai
Cc
right before coming here, so we should be on the safe
side.
Link: https://lkml.kernel.org/r/20200908075626.11976-6-osalva...@suse.de
Signed-off-by: Oscar Salvador
Cc: Michal Hocko
Cc: Naoya Horiguchi
Cc: Oscar Salvador
Cc: Qian Cai
Cc: Tony Luck
Signed-off-by: Andrew Morton
Signed-off
Make a proper if-else condition for {hard,soft}-offline.
[akpm: remove zone variable and refactor comment]
Link: https://lkml.kernel.org/r/20200908075626.11976-3-osalva...@suse.de
Signed-off-by: Oscar Salvador
Cc: Michal Hocko
Cc: Naoya Horiguchi
Cc: Qian Cai
Cc: Tony Luck
Signed-off
ux/T/#u
[2] https://patchwork.kernel.org/patch/11694847/
Link: https://lkml.kernel.org/r/20200908075626.11976-1-osalva...@suse.de
Link: https://lkml.kernel.org/r/20200908075626.11976-2-osalva...@suse.de
Signed-off-by: Oscar Salvador
Cc: Naoya Horiguchi
Cc: Michal Hocko
Cc: Tony Luck
Cc: Qian Cai
On Thu, Sep 10, 2020 at 11:23:07AM +0200, Oscar Salvador wrote:
> On Mon, Sep 07, 2020 at 06:36:26PM +0200, Vlastimil Babka wrote:
> > We initialize boot-time pagesets with setup_pageset(), which sets high and
> > batch values that effectively disable pcplists.
> >
> >
update) {
> + return;
> + }
I am probably missing something obvious, so sorry, but why do we need
force_update here?
AFAICS, we only want to call pageset_update() in case zone->pageset_high/batch
and the newly computed high/batch differ, so if everything is equal, why do we
want to call it anyway?
--
Oscar Salvador
SUSE L3
i_bus_attach+0x60/0x1c0
kernel: [0.760811] acpi_bus_scan+0x33/0x70
kernel: [0.760811] acpi_scan_init+0xea/0x21b
kernel: [0.760811] acpi_init+0x2f1/0x33c
kernel: [0.760811] do_one_initcall+0x46/0x1f4
--
Oscar Salvador
SUSE L3
efault_online_type=[online,online_*], memory will get onlined right
after hot-adding stage:
/* online pages if requested */
if (memhp_default_online_type != MMOP_OFFLINE)
walk_memory_blocks(start, size, NULL, online_memory_block);
If not, systemd-udev will do the magic
e is no functional change introduced by this patch
>
> Suggested-by: David Hildenbrand
> Signed-off-by: Laurent Dufour
Reviewed-by: Oscar Salvador
--
Oscar Salvador
SUSE L3
On Tue, Sep 15, 2020 at 11:41:42AM +0200, Laurent Dufour wrote:
> [1] According to Oscar Salvador, using this qemu command line, ACPI memory
> hotplug operations are raised at SYSTEM_SCHEDULING state:
I would like to stress that this is not the only way we can end up
hotplugging memory while
> Acked-by: Michal Hocko
> Acked-by: David Hildenbrand
> Cc: Greg Kroah-Hartman
Reviewed-by: Oscar Salvador
--
Oscar Salvador
SUSE L3
nc__);
+ dump_page(page, "soft_offline_free_page");
rc = -EBUSY;
+ }
return rc;
}
Thanks
--
Oscar Salvador
SUSE L3
_succeeded);
As I said, instead of alloc_demote_page, use a new_demote_page and make
alloc_migration_target handle the allocations and prep thp pages.
--
Oscar Salvador
SUSE L3
support ... shouldn't be too hard :)
Yeah, I guess so, but first I would like to have everything else settled.
So, gentle ping :-)
--
Oscar Salvador
SUSE L3
with vmemmap
> optimizations for hugetlb pages.
>
> https://lkml.kernel.org/r/20201026145114.59424-1-songmuc...@bytedance.com
I was about to have a look at that series either way, but good that you mentioned it.
--
Oscar Salvador
SUSE L3
And you are using __GFP_HIGH, which will allow us to use more memory (by
cutting into the watermarks), but it might bring the system
to its knees wrt. memory.
And yes, I know that once we allocate the 4088 pages, 1GB gets freed, but
still.
I would like to hear Michal's thoughts on this one, but I wonder if it makes
sense to not let 1GB-HugeTLB pages be freed.
--
Oscar Salvador
SUSE L3
I would put this in an else-if above:
if (free_vmemmap_pages_per_hpage(h)) {
set_page_private(head + 4, page - head);
return;
} else if (page != head) {
SetPageHWPoison(page);
ClearPageHWPoison(head);
}
or will we lose the optimization in case free_vmemmap_pages_per_hpage gets
compiled out?
--
Oscar Salvador
SUSE L3
ing to do with an EIO error.
Let us return -EBUSY instead, as we do in case we failed to isolate
the page.
While at it, let us remove the "ret" print, as its value does not change.
Signed-off-by: Oscar Salvador
---
mm/memory-failure.c | 6 +++---
1 file changed, 3 insertions(+), 3 delet
You can simplify to
>
> return arch_support_memmap_on_memory() &&
> IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) &&
>size == memory_block_size_bytes();
Yeah, thanks ;-)
--
Oscar Salvador
SUSE L3
On Wed, Dec 02, 2020 at 10:37:23AM +0100, David Hildenbrand wrote:
> Please split that patch into two parts, one for each subsystem.
I did not feel the need but if it eases the review, why not :-)
--
Oscar Salvador
SUSE L3
On Wed, Dec 09, 2020 at 10:40:13AM +0100, David Hildenbrand wrote:
> Sorry if I was unclear, s390x will simply not set
> ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE.
Bleh, that makes sense now.
I'm having a Monday..
Thanks David
--
Oscar Salvador
SUSE L3
> obj-$(CONFIG_MEMTEST) += memtest.o
> obj-$(CONFIG_MIGRATION) += migrate.o
> obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += huge_memory.o khugepaged.o
>
>
> Then you can just use module_param/MODULE_PARM_DESC and set the parameter via
>
> "memory_hotplug.memmap_on_memory"
I have to confess that I was not aware of this trick, but it looks cleaner
overall.
Thanks
--
Oscar Salvador
SUSE L3
s;
> > + altmap = _altmap;
> > + }
> > +
>
> If someone would remove_memory() in a different granularity than
> add_memory(), this would no longer work. How can we catch that
> efficiently? Or at least document that this is not supported with
> memmap_on_memory).
Well, we can check whether the size spans more than a single memory block.
And if it does, we can print a warning with pr_warn and refuse to remove
the memory.
We could document that "memory_hotplug.memmap_on_memory" is meant to operate
with ranges that span a single memory block.
And I guess that we could place that documentation under [1].
[1] Documentation/admin-guide/kernel-parameters.txt
Thanks for the review David
--
Oscar Salvador
SUSE L3
On Wed, Dec 09, 2020 at 10:59:04AM +0100, David Hildenbrand wrote:
> On 09.12.20 10:28, Oscar Salvador wrote:
> Do we expect callers to retry immediately? -EAGAIN might make also
> sense. But -EBUSY is an obvious improvement. Do we have callers relying
> on this behavior?
Not re