Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-07 Thread Michal Hocko
) here. If that is the case then we are probably calling free_area_init_node too early. I do not see it yet though. -- Michal Hocko SUSE Labs

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-07 Thread Michal Hocko
On Fri 07-12-18 22:27:13, Pingfan Liu wrote: > On Fri, Dec 7, 2018 at 10:22 PM Michal Hocko wrote: > > > > On Fri 07-12-18 21:20:17, Pingfan Liu wrote: > > [...] > > > Hi Michal, > > > > > > As I mentioned in my previous email, I have manually a

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-07 Thread Michal Hocko
nsole by any chance? -- Michal Hocko SUSE Labs

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-07 Thread Michal Hocko
On Fri 07-12-18 17:40:09, Pingfan Liu wrote: > On Fri, Dec 7, 2018 at 3:53 PM Michal Hocko wrote: > > > > On Fri 07-12-18 10:56:51, Pingfan Liu wrote: > > [...] > > > In a short word, the fix method should consider about the two factors: > > > semantic of

Re: [RFC PATCH 0/3] THP eligibility reporting via proc

2018-12-07 Thread Michal Hocko
On Tue 20-11-18 11:35:12, Michal Hocko wrote: > Hi, > this series of three patches aims at making THP eligibility reporting > much more robust and long term sustainable. The trigger for the change > is a regression report [1] and the long follow up discussion. In short > the speci

Re: [patch for-4.20] Revert "mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask"

2018-12-07 Thread Michal Hocko
= __alloc_pages_node(hpage_node, > + gfp | __GFP_THISNODE, order); > + goto out; > + } > + } > + > nmask = policy_nodemask(gfp, pol); > preferred_nid = policy_node(gfp, pol, node); > page = __alloc_pages_nodemask(gfp, order, preferred_nid, nmask); > diff --git a/mm/shmem.c b/mm/shmem.c > --- a/mm/shmem.c > +++ b/mm/shmem.c > @@ -1439,7 +1439,7 @@ static struct page *shmem_alloc_hugepage(gfp_t gfp, > > shmem_pseudo_vma_init(&pvma, info, hindex); > page = alloc_pages_vma(gfp | __GFP_COMP | __GFP_NORETRY | __GFP_NOWARN, > - HPAGE_PMD_ORDER, &pvma, 0, numa_node_id()); > + HPAGE_PMD_ORDER, &pvma, 0, numa_node_id(), true); > shmem_pseudo_vma_destroy(&pvma); > if (page) > prep_transhuge_page(page); -- Michal Hocko SUSE Labs

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-06 Thread Michal Hocko
should be fixed first in the simplest way, and all the cleanup should be done on top. Do I get it right that the diff worked for you and I can prepare a full patch? -- Michal Hocko SUSE Labs

Re: MADV_HUGEPAGE vs. NUMA semantic (was: Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression)

2018-12-06 Thread Michal Hocko
d just grown too large with back and forth that didn't lead to anywhere. > On Thu, Dec 6, 2018 at 1:14 AM Michal Hocko wrote: > > > > MADV_HUGEPAGE changes the picture because the caller expressed a need > > for THP and is willing to go extra mile to get it. > > Actua

Re: MADV_HUGEPAGE vs. NUMA semantic (was: Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression)

2018-12-06 Thread Michal Hocko
On Thu 06-12-18 15:49:04, David Rientjes wrote: > On Thu, 6 Dec 2018, Michal Hocko wrote: > > > MADV_HUGEPAGE changes the picture because the caller expressed a need > > for THP and is willing to go extra mile to get it. That involves > > allocation latency and as of now

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-06 Thread Michal Hocko
this patch? > -2nd. there are other archs, do they obey the rules? I am afraid that each arch does its own initialization. -- Michal Hocko SUSE Labs

Re: [RFC PATCH] hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined

2018-12-06 Thread Michal Hocko
On Thu 06-12-18 09:15:53, Naoya Horiguchi wrote: > On Thu, Dec 06, 2018 at 09:32:06AM +0100, Michal Hocko wrote: > > On Thu 06-12-18 05:21:38, Naoya Horiguchi wrote: > > > On Wed, Dec 05, 2018 at 05:57:16PM +0100, Michal Hocko wrote: > > > > On Wed 05-12-1

[PATCH] hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined

2018-12-06 Thread Michal Hocko
From: Michal Hocko We have received a bug report that an injected MCE about faulty memory prevents memory offline to succeed on 4.4 base kernel. The underlying reason was that the HWPoison page has an elevated reference count and the migration keeps failing. There are two problems

MADV_HUGEPAGE vs. NUMA semantic (was: Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression)

2018-12-06 Thread Michal Hocko
he global node_reclaim and make it usable again. Does this sound at least remotely sane? -- Michal Hocko SUSE Labs

Re: [RFC PATCH] hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined

2018-12-06 Thread Michal Hocko
On Thu 06-12-18 05:21:38, Naoya Horiguchi wrote: > On Wed, Dec 05, 2018 at 05:57:16PM +0100, Michal Hocko wrote: > > On Wed 05-12-18 13:29:18, Michal Hocko wrote: > > [...] > > > After some more thinking I am not really sure the above reasoning is > > > still true

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-06 Thread Michal Hocko
* areas are initialized. -*/ -} - /* * Setup early cpu_to_node. * @@ -763,9 +763,6 @@ void __init init_cpu_to_node(void) if (node == NUMA_NO_NODE) continue; - if (!node_online(node)) - init_memory_less_node(node); - numa_set_node(cpu, node); } } -- Michal Hocko SUSE Labs

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-05 Thread Michal Hocko
m not sure how much sanitization we can do. We need to fall back anyway, so we had better make sure that all possible nodes are initialized regardless of nr_cpus. I will look into that. -- Michal Hocko SUSE Labs

Re: [patch 0/2 for-4.20] mm, thp: fix remote access and allocation regressions

2018-12-05 Thread Michal Hocko
On Wed 05-12-18 11:49:26, David Rientjes wrote: > On Wed, 5 Dec 2018, Michal Hocko wrote: > > > > The revert is certainly needed to prevent the regression, yes, but I > > > anticipate that Andrea will report back that patch 2 at least improves > > > the

Re: [patch 1/2 for-4.20] mm, thp: restore node-local hugepage allocations

2018-12-05 Thread Michal Hocko
On Wed 05-12-18 11:24:53, David Rientjes wrote: > On Wed, 5 Dec 2018, Michal Hocko wrote: > > > > > At minimum do not remove the cleanup part which consolidates the gfp > > > > handling to a single place. There is no real reason to have the > >

Re: [RFC PATCH] hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined

2018-12-05 Thread Michal Hocko
On Wed 05-12-18 13:29:18, Michal Hocko wrote: [...] > After some more thinking I am not really sure the above reasoning is > still true with the current upstream kernel. Maybe I just managed to > confuse myself so please hold off on this patch for now. Testing by > Oscar has show

Re: [RFC PATCH] hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined

2018-12-05 Thread Michal Hocko
On Mon 03-12-18 11:03:09, Michal Hocko wrote: > From: Michal Hocko > > We have received a bug report that an injected MCE about faulty memory > prevents memory offline to succeed. The underlying reason is that the > HWPoison page has an elevated reference count and the migration

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-05 Thread Michal Hocko
On Wed 05-12-18 10:43:43, Mel Gorman wrote: > On Wed, Dec 05, 2018 at 10:08:56AM +0100, Michal Hocko wrote: > > On Tue 04-12-18 16:47:23, David Rientjes wrote: > > > On Tue, 4 Dec 2018, Mel Gorman wrote: > > > > > > > What should also be kept in

Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

2018-12-05 Thread Michal Hocko
be merged. We don't want slab caches with GFP_DMA32 and > ~GFP_DMA32 to be merged, so it should be in there. > (https://elixir.bootlin.com/linux/v4.19.6/source/mm/slab_common.c#L342). Ohh, my bad, I have misread the change. Sure, we definitely do not want to allow merging here. My bad. -- Michal Hocko SUSE Labs

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-05 Thread Michal Hocko
On Tue 04-12-18 16:07:27, David Rientjes wrote: > On Tue, 4 Dec 2018, Michal Hocko wrote: > > > The thing I am really up to here is that reintroduction of > > __GFP_THISNODE, which you are pushing for, will conflate madvise mode > > resp. defrag=always with a numa

Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

2018-12-05 Thread Michal Hocko
Who is this going to merge with? -- Michal Hocko SUSE Labs

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-05 Thread Michal Hocko
On Wed 05-12-18 17:29:31, Pingfan Liu wrote: > On Wed, Dec 5, 2018 at 5:21 PM Michal Hocko wrote: > > > > On Wed 05-12-18 13:38:17, Pingfan Liu wrote: > > > On Tue, Dec 4, 2018 at 4:56 PM Michal Hocko wrote: > > > > > > > > On Tue 04-12-18 16:2

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-05 Thread Michal Hocko
On Wed 05-12-18 13:38:17, Pingfan Liu wrote: > On Tue, Dec 4, 2018 at 4:56 PM Michal Hocko wrote: > > > > On Tue 04-12-18 16:20:32, Pingfan Liu wrote: > > > On Tue, Dec 4, 2018 at 3:22 PM Michal Hocko wrote: > > > > > > > > On Tue 04-12-18 11:05

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-05 Thread Michal Hocko
an artificial worst case? The utilization issue Mel pointed out before and here again is a real concern IMHO. We definitely need a better picture to make an educated decision. -- Michal Hocko SUSE Labs

Re: [patch 0/2 for-4.20] mm, thp: fix remote access and allocation regressions

2018-12-05 Thread Michal Hocko
vement. Especially when the reported regression hasn't been demonstrated on a real or repeatable workload but rather a very vague presumably worst case behavior where the access penalty is absolutely prevailing. [1] http://lkml.kernel.org/r/20181204104558.gv23...@techsingularity.net -- Michal Hocko SUSE Labs

Re: [patch 0/2 for-4.20] mm, thp: fix remote access and allocation regressions

2018-12-04 Thread Michal Hocko
On Tue 04-12-18 14:25:54, David Rientjes wrote: > On Tue, 4 Dec 2018, Michal Hocko wrote: > > > > This fixes a 13.9% of remote memory access regression and 40% remote > > > memory allocation regression on Haswell when the local node is fragmented > > > for hugepag

Re: [patch 1/2 for-4.20] mm, thp: restore node-local hugepage allocations

2018-12-04 Thread Michal Hocko
On Tue 04-12-18 13:56:30, David Rientjes wrote: > On Tue, 4 Dec 2018, Michal Hocko wrote: > > > > This is a full revert of ac5b2c18911f ("mm: thp: relax __GFP_THISNODE for > > > MADV_HUGEPAGE mappings") and a partial revert of 89c83fb539f9 ("mm,

Re: [PATCH] Revert "exec: make de_thread() freezable (was: Re: Linux 4.20-rc4)

2018-12-04 Thread Michal Hocko
On Tue 04-12-18 09:31:11, Linus Torvalds wrote: > On Tue, Dec 4, 2018 at 1:58 AM Michal Hocko wrote: > > > > AFAIU both suspend and hibernation require the system to enter quiescent > > state with no task potentially interfering with suspended devices. And > > in th

Re: [PATCH] Revert "exec: make de_thread() freezable (was: Re: Linux 4.20-rc4)

2018-12-04 Thread Michal Hocko
On Tue 04-12-18 10:33:10, Ingo Molnar wrote: > > * Michal Hocko wrote: > > > I dunno. I do not use hibernation. I am a heavy user of the suspend > > though. I s2ram all the time. And I have certainly experienced cases > > where suspend has failed and I only found

Re: [RFC PATCH] hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined

2018-12-04 Thread Michal Hocko
On Tue 04-12-18 09:11:05, Naoya Horiguchi wrote: > On Tue, Dec 04, 2018 at 09:48:26AM +0100, Michal Hocko wrote: > > On Tue 04-12-18 07:21:16, Naoya Horiguchi wrote: > > > On Mon, Dec 03, 2018 at 11:03:09AM +0100, Michal Hocko wrote: > > > > From: Michal Hocko >

Re: [PATCH] Revert "exec: make de_thread() freezable (was: Re: Linux 4.20-rc4)

2018-12-04 Thread Michal Hocko
On Tue 04-12-18 10:02:28, Ingo Molnar wrote: > > * Michal Hocko wrote: > > > > Do we actually have reports of this happening for people outside > > > Android? > > > > Not that I am aware of. > > I'd say outside of Android 99% of the use of hiber

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-04 Thread Michal Hocko
On Tue 04-12-18 16:20:32, Pingfan Liu wrote: > On Tue, Dec 4, 2018 at 3:22 PM Michal Hocko wrote: > > > > On Tue 04-12-18 11:05:57, Pingfan Liu wrote: > > > During my test on some AMD machine, with kexec -l nr_cpus=x option, the > > > kernel failed to bootup, be

Re: [RFC PATCH] hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined

2018-12-04 Thread Michal Hocko
On Tue 04-12-18 07:21:16, Naoya Horiguchi wrote: > On Mon, Dec 03, 2018 at 11:03:09AM +0100, Michal Hocko wrote: > > From: Michal Hocko > > > > We have received a bug report that an injected MCE about faulty memory > > prevents memory offline to suc

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-04 Thread Michal Hocko
On Mon 03-12-18 13:53:21, David Rientjes wrote: > On Mon, 3 Dec 2018, Michal Hocko wrote: > > > > I think extending functionality so thp can be allocated remotely if truly > > > desired is worthwhile > > > > This is a complete NUMA policy antipattern that w

Re: [patch 0/2 for-4.20] mm, thp: fix remote access and allocation regressions

2018-12-03 Thread Michal Hocko
ads Mel and Andrea have pointed out during the previous review discussion? In other words what is the impact on the THP success rate and allocation latencies for other usecases? -- Michal Hocko SUSE Labs

Re: [patch 1/2 for-4.20] mm, thp: restore node-local hugepage allocations

2018-12-03 Thread Michal Hocko
GE_PMD_ORDER, vma, address, > + numa_node_id()); > if (!thp) > return NULL; > prep_transhuge_page(thp); > @@ -1662,7 +1663,7 @@ struct mempolicy *__get_vma_policy(struct > vm_area_struct *vma, > * freeing by another task. It is the caller's responsibility to free the > * extra reference for shared policies. > */ > -struct mempolicy *get_vma_policy(struct vm_area_struct *vma, > +static struct mempolicy *get_vma_policy(struct vm_area_struct *vma, > unsigned long addr) > { > struct mempolicy *pol = __get_vma_policy(vma, addr); -- Michal Hocko SUSE Labs

Re: [PATCH 2/3] mm/vmscan: Enable kswapd to reclaim low-protected memory

2018-12-03 Thread Michal Hocko
On Tue 04-12-18 10:40:29, Xunlei Pang wrote: > On 2018/12/4 AM 1:22, Michal Hocko wrote: > > On Mon 03-12-18 23:20:31, Xunlei Pang wrote: > >> On 2018/12/3 PM 7:56, Michal Hocko wrote: > >>> On Mon 03-12-18 16:01:18, Xunlei Pang wrote: > >>>

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-03 Thread Michal Hocko
e is associated with? Your patch is not correct btw, because we want to fall back to a node in distance order rather than to the first online node. -- Michal Hocko SUSE Labs
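
The distinction drawn above, falling back in distance order instead of to the first online node, can be illustrated with a small userspace sketch. This is not kernel code and not the patch under discussion: `MAX_NODES`, `NO_NODE`, the `node_dist` table and `nearest_online_node()` are hypothetical names invented for the illustration.

```c
#include <limits.h>
#include <stdbool.h>

#define MAX_NODES 4     /* hypothetical node count for this sketch */
#define NO_NODE   (-1)  /* hypothetical "no usable node" marker */

/* Hypothetical stand-in for a NUMA distance table (10 means local). */
static const int node_dist[MAX_NODES][MAX_NODES] = {
    { 10, 16, 32, 32 },
    { 16, 10, 32, 32 },
    { 32, 32, 10, 16 },
    { 32, 32, 16, 10 },
};

/*
 * Return the wanted node if it is online, otherwise the online node
 * with the smallest distance from it. Ties go to the lowest node id.
 */
static int nearest_online_node(int wanted, const bool online[MAX_NODES])
{
    int best = NO_NODE;
    int best_dist = INT_MAX;

    if (wanted < 0 || wanted >= MAX_NODES)
        return NO_NODE;
    if (online[wanted])
        return wanted;

    for (int n = 0; n < MAX_NODES; n++) {
        if (!online[n])
            continue;
        if (node_dist[wanted][n] < best_dist) {
            best_dist = node_dist[wanted][n];
            best = n;
        }
    }
    return best;
}
```

With nodes 0 and 3 online and node 2 wanted, this sketch picks node 3 (distance 16), whereas a "first online node" fallback would pick node 0 (distance 32).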

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-03 Thread Michal Hocko
On Mon 03-12-18 12:39:34, David Rientjes wrote: > On Mon, 3 Dec 2018, Michal Hocko wrote: > > > I have merely said that a better THP locality needs more work and during > > the review discussion I have even volunteered to work on that. There > > are other reclaim relate

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-03 Thread Michal Hocko
On Mon 03-12-18 10:45:35, Linus Torvalds wrote: > On Mon, Dec 3, 2018 at 10:30 AM Michal Hocko wrote: > > > > I do not get it. 5265047ac301 which this patch effectively reverts has > > regressed kvm workloads. People started to notice only later because > > they

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-03 Thread Michal Hocko
On Mon 03-12-18 10:19:55, Linus Torvalds wrote: > On Mon, Dec 3, 2018 at 10:15 AM Michal Hocko wrote: > > > > The thing is that there is no universal win here. There are two > > different types of workloads and we cannot satisfy both. > > Ok, if that's the case, then

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-03 Thread Michal Hocko
class of workloads. As we cannot satisfy both I believe we should make the API clear and in favor of more relaxed workloads. Those with special requirements should have a proper API to reflect that (this is our general NUMA policy pattern already). -- Michal Hocko SUSE Labs

Re: [PATCH 2/3] mm/vmscan: Enable kswapd to reclaim low-protected memory

2018-12-03 Thread Michal Hocko
On Mon 03-12-18 23:20:31, Xunlei Pang wrote: > On 2018/12/3 PM 7:56, Michal Hocko wrote: > > On Mon 03-12-18 16:01:18, Xunlei Pang wrote: > >> There may be cgroup memory overcommitment, it will become > >> even common in the future. > >> > >> Let's

Re: [PATCH] Revert "exec: make de_thread() freezable (was: Re: Linux 4.20-rc4)

2018-12-03 Thread Michal Hocko
On Mon 03-12-18 09:06:18, Linus Torvalds wrote: > On Mon, Dec 3, 2018 at 6:17 AM Michal Hocko wrote: > > > > This argument just doesn't make any sense. Rare bugs are maybe even more > > annoying because you do not expect them to happen. > > Absolutely. >

Re: [PATCH] Revert "exec: make de_thread() freezable (was: Re: Linux 4.20-rc4)

2018-12-03 Thread Michal Hocko
On Mon 03-12-18 15:14:59, Pavel Machek wrote: > On Mon 2018-12-03 14:53:51, Michal Hocko wrote: > > On Mon 03-12-18 14:10:06, Pavel Machek wrote: > > > On Mon 2018-12-03 13:38:57, Michal Hocko wrote: > > > > On Mon 03-12-18 13:31:49, Oleg Nesterov wrote: > >

Re: [PATCH] Revert "exec: make de_thread() freezable (was: Re: Linux 4.20-rc4)

2018-12-03 Thread Michal Hocko
On Mon 03-12-18 14:10:06, Pavel Machek wrote: > On Mon 2018-12-03 13:38:57, Michal Hocko wrote: > > On Mon 03-12-18 13:31:49, Oleg Nesterov wrote: > > > On 12/03, Michal Hocko wrote: > > > > > > > > Now, I wouldn't mind to revert this because the code is r

Re: [PATCH] Revert "exec: make de_thread() freezable (was: Re: Linux 4.20-rc4)

2018-12-03 Thread Michal Hocko
On Mon 03-12-18 13:31:49, Oleg Nesterov wrote: > On 12/03, Michal Hocko wrote: > > > > Now, I wouldn't mind to revert this because the code is really old and > > we haven't seen many bug reports about failing suspend yet. But what is > > the actual plan to make this work

Re: [PATCH 3/3] mm/memcg: Avoid reclaiming below hard protection

2018-12-03 Thread Michal Hocko
d here again. Describe the setup and the behavior please? -- Michal Hocko SUSE Labs

Re: [PATCH 2/3] mm/vmscan: Enable kswapd to reclaim low-protected memory

2018-12-03 Thread Michal Hocko
c.memcg_low_skipped) { > + sc.priority = DEF_PRIORITY; > + sc.memcg_low_reclaim = 1; > + sc.memcg_low_skipped = 0; > + goto retry; > + } > + > if (!sc.nr_reclaimed) > pgdat->kswapd_failures++; > > -- > 2.13.5 (Apple Git-94) > -- Michal Hocko SUSE Labs

Re: [PATCH 1/3] mm/memcg: Fix min/low usage in propagate_protected_usage()

2018-12-03 Thread Michal Hocko
mic_long_xchg(&c->low_usage, protected); > delta = protected - old_protected; > if (delta) > -- > 2.13.5 (Apple Git-94) > -- Michal Hocko SUSE Labs

[RFC PATCH] hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined

2018-12-03 Thread Michal Hocko
From: Michal Hocko We have received a bug report that an injected MCE about faulty memory prevents memory offline to succeed. The underlying reason is that the HWPoison page has an elevated reference count and the migration keeps failing. There are two problems with that. First of all

Re: [PATCH v2] memblock: Anonotate memblock_is_reserved() with __init_memblock.

2018-12-03 Thread Michal Hocko
reserved is wrong. > > Use __init_memblock instead of __init. Yes, it really doesn't make much sense to stand this out of all other helpers. > Signed-off-by: liyueyi Acked-by: Michal Hocko > --- > > Changes v2: correct typo in 'warning'. > > mm/memblock.c | 2 +-

Re: [PATCH] Revert "exec: make de_thread() freezable (was: Re: Linux 4.20-rc4)

2018-12-03 Thread Michal Hocko
n to make this work properly? Use freezable_schedule_unsafe instead? Freezer code has some fundamental design issues which are quite hard to get over. -- Michal Hocko SUSE Labs

Re: [PATCH v2] mm: page_mapped: don't assume compound page is huge or THP

2018-11-30 Thread Michal Hocko
On Fri 30-11-18 15:36:51, Kirill A. Shutemov wrote: > On Fri, Nov 30, 2018 at 01:18:51PM +0100, Michal Hocko wrote: > > On Fri 30-11-18 13:06:57, Jan Stancek wrote: > > > LTP proc01 testcase has been observed to rarely trigger crashes > > > on arm64: >

Re: [PATCH v2] mm: page_mapped: don't assume compound page is huge or THP

2018-11-30 Thread Michal Hocko
we know that we can add a Fixes tag and also mark the patch for stable because that sounds like a stable material. > Debugged-by: Laszlo Ersek > Suggested-by: "Kirill A. Shutemov" > Signed-off-by: Jan Stancek The patch looks sensible to me Acked-by: Michal Hocko Thanks!

Re: [PATCH] mm: remove pte_lock_deinit()

2018-11-29 Thread Michal Hocko
e structure has changed since Hugh introduced the pte lock split > Signed-off-by: Yu Zhao Acked-by: Michal Hocko > --- > include/linux/mm.h | 11 ++- > 1 file changed, 2 insertions(+), 9 deletions(-) > > diff --git a/include/linux/mm.h b/include/linux/mm.h > ind

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-11-27 Thread Michal Hocko
long term solution should introduce a MPOL_NODE_RECLAIM kind of policy. It would effectively reclaim local nodes (within NODE_RECLAIM distance) before falling to other nodes. Apart from that we need a less disruptive reclaim driven by compaction and Mel is already working on that AFAIK. -- Michal Hocko SUSE Labs

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-11-27 Thread Michal Hocko
On Tue 27-11-18 19:17:27, Michal Hocko wrote: > On Tue 27-11-18 09:08:50, Linus Torvalds wrote: > > On Mon, Nov 26, 2018 at 10:24 PM kernel test robot > > wrote: > > > > > > FYI, we noticed a -61.3% regression of vm-scalability.throughput due

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-11-27 Thread Michal Hocko
is that we need a numa policy to tell whether an expensive locality is preferred over remote allocation. Also we definitely need a better pro-active defragmentation to allow larger pages on a local node. This is a work in progress and this patch is a stop gap fix. -- Michal Hocko SUSE Labs

Re: [PATCHi v2] mm: put_and_wait_on_page_locked() while page is migrated

2018-11-27 Thread Michal Hocko
strange and it seems unnecessary? Maybe we need a > better explanation? > > A process has no refcount on a page struct and is waiting for it to become > unlocked? Why? Should it not simply ignore that page and continue? It > cannot possibly do anything with the page since it does not hold a > refcount. So do you suggest busy waiting on the page under migration? -- Michal Hocko SUSE Labs

Re: [PATCH] mm: warn only once if page table misaccounting is detected

2018-11-27 Thread Michal Hocko
On Tue 27-11-18 15:36:38, Heiko Carstens wrote: > On Tue, Nov 27, 2018 at 02:19:16PM +0100, Michal Hocko wrote: > > On Tue 27-11-18 09:36:03, Heiko Carstens wrote: > > > Use pr_alert_once() instead of pr_alert() if page table misaccounting > > > has been detected.

Re: [RFC PATCH 3/3] mm, proc: report PR_SET_THP_DISABLE in proc

2018-11-27 Thread Michal Hocko
On Tue 27-11-18 07:50:08, William Kucharski wrote: > > > > On Nov 27, 2018, at 6:17 AM, Michal Hocko wrote: > > > > This is only about the process wide flag to disable THP. I do not see > > how this can be alignment related. I suspect you wanted

Re: [PATCH] mm: warn only once if page table misaccounting is detected

2018-11-27 Thread Michal Hocko
bles_bytes on freeing mm: > %ld\n", > + mm_pgtables_bytes(mm)); > > #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS > VM_BUG_ON_MM(mm->pmd_huge_pte, mm); > -- > 2.16.4 -- Michal Hocko SUSE Labs

Re: [RFC PATCH 3/3] mm, proc: report PR_SET_THP_DISABLE in proc

2018-11-27 Thread Michal Hocko
and how the alignment comes into the game? The only thing I can think of is to not report VMAs smaller than the THP as eligible. Is this what you are looking for? -- Michal Hocko SUSE Labs

Re: [PATCHi v2] mm: put_and_wait_on_page_locked() while page is migrated

2018-11-26 Thread Michal Hocko
ke this. It makes the semantics much clearer. Thanks! -- Michal Hocko SUSE Labs

Re: [RFC PATCH 1/5] mm: print more information about mapping in __dump_page

2018-11-25 Thread Michal Hocko
On Fri 23-11-18 16:04:04, Andrew Morton wrote: > On Wed, 7 Nov 2018 11:18:26 +0100 Michal Hocko wrote: > > > From: Michal Hocko > > > > __dump_page prints the mapping pointer but that is quite unhelpful > > for many reports because the pointer itself only hel

Re: [RFC PATCH 2/3] mm, thp, proc: report THP eligibility for each vma

2018-11-23 Thread Michal Hocko
On Fri 23-11-18 16:07:06, Vlastimil Babka wrote: > On 11/20/18 11:35 AM, Michal Hocko wrote: > > From: Michal Hocko > > > > Userspace falls short when trying to find out whether a specific memory > > range is eligible for THP. There are usecases that would lik

Re: [RFC PATCH 0/4] mm, memory_hotplug: allocate memmap from hotadded memory

2018-11-23 Thread Michal Hocko
On Fri 23-11-18 13:51:57, David Hildenbrand wrote: > On 23.11.18 13:42, Michal Hocko wrote: > > On Fri 23-11-18 12:55:41, Oscar Salvador wrote: [...] > >> It is not memory that the system can use. > > > > same as bootmem ;) > > Fair enough, jus

Re: [RFC PATCH 2/4] mm, memory_hotplug: provide a more generic restrictions for memory hotplug

2018-11-23 Thread Michal Hocko
[Cc Alexander - email thread starts http://lkml.kernel.org/r/20181116101222.16581-1-osalva...@suse.com] On Fri 16-11-18 11:12:20, Oscar Salvador wrote: > From: Michal Hocko > > arch_add_memory, __add_pages take a want_memblock which controls whether > the newly added memor

Re: [RFC PATCH 0/4] mm, memory_hotplug: allocate memmap from hotadded memory

2018-11-23 Thread Michal Hocko
memory_resource(), and there > unset MHP_MEMMAP_FROM_RANGE in case that flag is enabled. I believe we will need to make this opt-in. There are some usecases which hotplug an expensive (per size) memory via hotplug and it would be too wasteful to use it for struct pages. I haven't bothered to

Re: [PATCH v15 2/2] Add oom victim's memcg to the oom context information

2018-11-22 Thread Michal Hocko
=,task_memcg=,task=,pid=,uid= > > Signed-off-by: yuzhoujian I thought I have acked this one already. Acked-by: Michal Hocko -- Michal Hocko SUSE Labs

Re: [PATCH v15 1/2] Reorganize the oom report in dump_header

2018-11-22 Thread Michal Hocko
chosen victim). > oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0-1,task=panic,pid=10737,uid=0 > > An admin can easily get the full oom context at a single line which > makes parsing much easier. > > Signed-off-by: yuzhoujian Looks good, finally Acked-by: Michal Hocko -- Michal Hocko SUSE Labs
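
The remark quoted above, that a single-line oom context is easy to parse, can be sketched in C. This is a minimal userspace illustration, not part of the patch: `oom_field()` is a hypothetical helper, and it assumes values contain no commas, which holds for the sample line in the snippet but may not hold for every nodemask or mems_allowed list.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the value of `key` from an "oom-kill:" line into `out`.
 * Returns 0 on success, -1 if the key is absent or `out` is too small.
 * Assumption: values contain no commas (true for the sample line only).
 */
static int oom_field(const char *line, const char *key,
                     char *out, size_t outlen)
{
    size_t klen = strlen(key);
    const char *p = strchr(line, ':');   /* skip the "oom-kill" prefix */

    while (p != NULL) {
        p++;                             /* step past ':' or ',' */
        if (strncmp(p, key, klen) == 0 && p[klen] == '=') {
            const char *val = p + klen + 1;
            size_t vlen = strcspn(val, ",\n");

            if (vlen + 1 > outlen)
                return -1;
            memcpy(out, val, vlen);
            out[vlen] = '\0';
            return 0;
        }
        p = strchr(p, ',');              /* move on to the next field */
    }
    return -1;
}
```

Requiring `=` directly after the key keeps a lookup for `task` from matching a longer field name such as `task_memcg`.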

Re: [RFC PATCH 3/3] mm, fault_around: do not take a reference to a locked page

2018-11-22 Thread Michal Hocko
On Wed 21-11-18 18:27:11, Hugh Dickins wrote: > On Wed, 21 Nov 2018, Michal Hocko wrote: > > On Tue 20-11-18 17:47:21, Hugh Dickins wrote: > > > On Tue, 20 Nov 2018, Michal Hocko wrote: > > > > > > > From: Michal Hocko

Re: [RFC PATCH 1/3] mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps

2018-11-21 Thread Michal Hocko
On Wed 21-11-18 18:54:28, Mike Rapoport wrote: > On Tue, Nov 20, 2018 at 11:35:13AM +0100, Michal Hocko wrote: [...] > > diff --git a/Documentation/filesystems/proc.txt > > b/Documentation/filesystems/proc.txt > > index 12a5e6e693b6..b1fda309f067 100644 > > ---

Re: Memory hotplug softlock issue

2018-11-21 Thread Michal Hocko
ly for page migration. > > Signed-off-by: Hugh Dickins The patch looks good to me - quite ugly but it doesn't make the existing code much worse. With the problem described Vlastimil fixed, feel free to add Acked-by: Michal Hocko And thanks for a prompt patch. This is something I've been ch

Re: [PATCH 4.4 131/160] mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings

2018-11-21 Thread Michal Hocko
On Tue 20-11-18 15:53:10, David Rientjes wrote: > On Tue, 20 Nov 2018, Michal Hocko wrote: > > > On Mon 19-11-18 14:16:24, David Rientjes wrote: > > > On Mon, 19 Nov 2018, Greg Kroah-Hartman wrote: > > > > > > > 4.4-stable review patch. If anyone has a

Re: [RFC PATCH 3/3] mm, fault_around: do not take a reference to a locked page

2018-11-20 Thread Michal Hocko
On Tue 20-11-18 17:47:21, Hugh Dickins wrote: > On Tue, 20 Nov 2018, Michal Hocko wrote: > > > From: Michal Hocko > > > > filemap_map_pages takes a speculative reference to each page in the > > range before it tries to lock that page. While this is correct

Re: [RFC PATCH 3/3] mm, fault_around: do not take a reference to a locked page

2018-11-20 Thread Michal Hocko
On Tue 20-11-18 21:51:39, William Kucharski wrote: > > > > On Nov 20, 2018, at 7:12 AM, Michal Hocko wrote: > > > > + /* > > +* Check the locked pages before taking a reference to not > > +* go in the way of migration. >

Re: [RFC PATCH 1/3] mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps

2018-11-20 Thread Michal Hocko
On Tue 20-11-18 10:32:07, Dan Williams wrote: > On Tue, Nov 20, 2018 at 2:35 AM Michal Hocko wrote: > > > > From: Michal Hocko > > > > Even though vma flags exported via /proc/<pid>/smaps are explicitly > > documented to be not guaranteed for future compatibility the

Re: [RFC PATCH 1/3] mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps

2018-11-20 Thread Michal Hocko
ing on a semantic of a specific VMA > > > flag. The primary reason why that happened is a lack of a proper > > > interface. While this has been worked on and it will be fixed properly, > > > it seems that our wording could see some refinement and be more vocal

Re: [RFC PATCH 2/3] mm, memory_hotplug: deobfuscate migration part of offlining

2018-11-20 Thread Michal Hocko
On Tue 20-11-18 16:13:35, Oscar Salvador wrote: > > > Signed-off-by: Michal Hocko > [...] > > + do { > > + for (pfn = start_pfn; pfn;) > > + { > > + /* start memory hot removal */ > > Should we change that c

Re: [RFC PATCH 1/3] mm, memory_hotplug: try to migrate full section worth of pages

2018-11-20 Thread Michal Hocko
On Tue 20-11-18 15:51:32, Oscar Salvador wrote: > On Tue, 2018-11-20 at 14:43 +0100, Michal Hocko wrote: > > From: Michal Hocko > > > > do_migrate_range has been limiting the number of pages to migrate to > > 256 > > for some reason which is not documented. >

Re: [RFC PATCH 2/3] mm, memory_hotplug: deobfuscate migration part of offlining

2018-11-20 Thread Michal Hocko
rt memory hot removal */ - ret = -EINTR; if (signal_pending(current)) { + ret = -EINTR; reason = "signal backoff"; goto failed_removal_isolated; } -- Michal Hocko SUSE Labs

Re: [RFC PATCH 3/3] mm, fault_around: do not take a reference to a locked page

2018-11-20 Thread Michal Hocko
On Tue 20-11-18 17:17:00, Kirill A. Shutemov wrote: > On Tue, Nov 20, 2018 at 03:12:07PM +0100, Michal Hocko wrote: > > On Tue 20-11-18 17:07:15, Kirill A. Shutemov wrote: > > > On Tue, Nov 20, 2018 at 02:43:23PM +0100, Michal Hocko wrote: > >

Re: [RFC PATCH 1/3] mm, memory_hotplug: try to migrate full section worth of pages

2018-11-20 Thread Michal Hocko
e part is deeper down in the migration core. We wait for page lock or writeback and that can take a long time. None of that is killable wait which is a larger surgery but something that we should consider should there be any need to address this. > Reviewed-by: David Hildenbrand Thanks! -- Michal Hocko SUSE Labs

Re: [RFC PATCH 3/3] mm, fault_around: do not take a reference to a locked page

2018-11-20 Thread Michal Hocko
On Tue 20-11-18 17:07:15, Kirill A. Shutemov wrote: > On Tue, Nov 20, 2018 at 02:43:23PM +0100, Michal Hocko wrote: > > From: Michal Hocko > > > > filemap_map_pages takes a speculative reference to each page in the > > range before it tries to lock that page. While

Re: Memory hotplug softlock issue

2018-11-20 Thread Michal Hocko
> > > > Hmm... > > > > > + * even if CONFIG_MEMORY_HOTREMOVE is not enabled, > > > + * there is a risk of waiting forever on a page reused > > > + * for something that keeps it locked indefinitely. > > > + * But best check for -EINTR above before breaking. > > > + */ > > > + break; > > > + } > > > } > > > > > > finish_wait(q, wait); > > > > ... the code continues by: > > > > if (thrashing) { > > if (!PageSwapBacked(page)) > > > > So maybe we should not set 'thrashing' true when lock < 0? > > > > Thanks! > > Vlastimil -- Michal Hocko SUSE Labs

[RFC PATCH 1/3] mm, memory_hotplug: try to migrate full section worth of pages

2018-11-20 Thread Michal Hocko
From: Michal Hocko do_migrate_range has been limiting the number of pages to migrate to 256 for some reason which is not documented. Even if the limit made some sense back then when it was introduced it doesn't really serve a good purpose these days. If the range contains huge pages then we

[RFC PATCH 0/3] few memory offlining enhancements

2018-11-20 Thread Michal Hocko
I have been chasing memory offlining not making progress recently. On the way I have noticed a few weird decisions in the code. The migration itself is restricted without a reasonable justification and the retry loop around the migration is quite messy. This is addressed by patch 1 and patch 2.

[RFC PATCH 2/3] mm, memory_hotplug: deobfuscate migration part of offlining

2018-11-20 Thread Michal Hocko
From: Michal Hocko Memory migration might fail during offlining and we keep retrying in that case. This is currently obfuscated by a goto retry loop. The code is hard to follow and as a result it is even suboptimal because each retry round scans the full range from start_pfn even though we have
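To make the cost of rescanning concrete, here is a minimal user-space sketch, not the kernel code. It assumes a toy model in which each page needs a fixed number of failed attempts before migration succeeds; struct fake_page, try_migrate, offline_rescan and offline_resume are hypothetical names invented for this illustration. It compares a loop that restarts from the first pfn on every retry with one that resumes at the pfn that still needs work.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page frame, not a kernel structure.
 * Migration succeeds once failures_left has dropped to zero. */
struct fake_page {
    int failures_left;
    bool migrated;
};

/* Try to migrate one page; returns true on success. */
static bool try_migrate(struct fake_page *p)
{
    if (p->failures_left > 0) {
        p->failures_left--;
        return false;
    }
    p->migrated = true;
    return true;
}

/* Retry style that rescans the whole range on every round.
 * Returns the number of page visits. */
static size_t offline_rescan(struct fake_page *pages, size_t nr)
{
    size_t visited = 0;
    bool again;

    do {
        again = false;
        for (size_t pfn = 0; pfn < nr; pfn++) {
            visited++;
            if (pages[pfn].migrated)
                continue;           /* already done, but still visited */
            if (!try_migrate(&pages[pfn]))
                again = true;       /* forces another full round */
        }
    } while (again);
    return visited;
}

/* Retry style that resumes at the first pfn that still needs work.
 * In this toy model every page eventually migrates, so the loop ends. */
static size_t offline_resume(struct fake_page *pages, size_t nr)
{
    size_t visited = 0;
    size_t pfn = 0;

    while (pfn < nr) {
        visited++;
        if (pages[pfn].migrated || try_migrate(&pages[pfn]))
            pfn++;                  /* done with this page, move on */
        /* otherwise the same pfn is retried on the next iteration */
    }
    return visited;
}
```

With four pages where only the third needs two failed attempts, the rescanning variant visits pages 12 times while the resuming variant visits 6. The real offlining path has further concerns (isolation checks, signals) that this sketch leaves out.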

[RFC PATCH 3/3] mm, fault_around: do not take a reference to a locked page

2018-11-20 Thread Michal Hocko
From: Michal Hocko filemap_map_pages takes a speculative reference to each page in the range before it tries to lock that page. While this is correct it also can influence page migration which will bail out when seeing an elevated reference count. The faultaround code would bail on seeing
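The interaction between a speculative reference and a locked page can be illustrated with a small user-space sketch. This is not the kernel implementation: struct toy_page, map_ref_first and map_check_first are hypothetical names, and C11 atomics stand in for the page reference count and the page lock bit. The sketch assumes that the intended change is to look at the lock state before raising the count, which is one plausible reading of the description above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy page for illustration only, not the kernel's struct page. */
struct toy_page {
    atomic_int refcount;
    atomic_bool locked;
};

/* Reference-first ordering: the count is raised before the lock attempt,
 * so a concurrent observer can see an elevated count even when the lock
 * attempt then fails. *peak records the highest count observed here. */
static bool map_ref_first(struct toy_page *p, int *peak)
{
    int now = atomic_fetch_add(&p->refcount, 1) + 1;
    bool expected = false;

    if (now > *peak)
        *peak = now;
    if (!atomic_compare_exchange_strong(&p->locked, &expected, true)) {
        atomic_fetch_sub(&p->refcount, 1);  /* lock is held, back off */
        return false;
    }
    return true;    /* caller now holds both the lock and a reference */
}

/* Check-first ordering: a page that is already locked is skipped
 * without ever touching its reference count. */
static bool map_check_first(struct toy_page *p, int *peak)
{
    if (atomic_load(&p->locked))
        return false;
    return map_ref_first(p, peak);
}
```

For a page that is already locked and has a count of 1, map_ref_first briefly raises the count to 2 before backing off, whereas map_check_first leaves it at 1 throughout. The check is only an optimization: the page can still become locked between the check and the lock attempt, in which case the back-off path runs as before.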

Re: [RFC PATCH 3/3] mm, proc: report PR_SET_THP_DISABLE in proc

2018-11-20 Thread Michal Hocko
Damn, David somehow didn't make it to the CC list. Sorry about that. On Tue 20-11-18 11:35:15, Michal Hocko wrote: > From: Michal Hocko > > David Rientjes has reported that 1860033237d4 ("mm: make > PR_SET_THP_DISABLE immediately active") has changed the way how >

Re: [RFC PATCH 2/3] mm, thp, proc: report THP eligibility for each vma

2018-11-20 Thread Michal Hocko
Damn, David somehow didn't make it to the CC list. Sorry about that. On Tue 20-11-18 11:35:14, Michal Hocko wrote: > From: Michal Hocko > > Userspace falls short when trying to find out whether a specific memory > range is eligible for THP. There are usecases that would

Re: [RFC PATCH 1/3] mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps

2018-11-20 Thread Michal Hocko
rst place. But, well, this ship has already sailed... > But this is a good clarification regardless. So feel free to > add: > > Acked-by: Jan Kara Thanks! -- Michal Hocko SUSE Labs

[RFC PATCH 3/3] mm, proc: report PR_SET_THP_DISABLE in proc

2018-11-20 Thread Michal Hocko
From: Michal Hocko David Rientjes has reported that 1860033237d4 ("mm: make PR_SET_THP_DISABLE immediately active") has changed how we report THPable VMAs to userspace. Their monitoring tool is triggering false alarms on PR_SET_THP_DISABLE tasks because it considers an in

[RFC PATCH 1/3] mm, proc: be more verbose about unstable VMA flags in /proc/<pid>/smaps

2018-11-20 Thread Michal Hocko
From: Michal Hocko Even though vma flags exported via /proc/<pid>/smaps are explicitly documented to be not guaranteed for future compatibility the warning doesn't go far enough because it doesn't mention semantic changes to those flags. And they are important as well because these flags are a deep

[RFC PATCH 2/3] mm, thp, proc: report THP eligibility for each vma

2018-11-20 Thread Michal Hocko
From: Michal Hocko Userspace falls short when trying to find out whether a specific memory range is eligible for THP. There are usecases that would like to know that http://lkml.kernel.org/r/alpine.deb.2.21.1809251248450.50...@chino.kir.corp.google.com : This is used to identify heap mappings
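As a rough idea of how a monitoring tool might consume per-VMA eligibility, here is a minimal sketch in C. It assumes the information is exposed as a line of the form "THPeligible: <0|1>" in each smaps entry; the snippet above does not name the field, so treat that name and the helper count_thp_eligible as assumptions made for this example.

```c
#include <stdio.h>
#include <string.h>

/* Scan smaps-formatted text and count the VMAs reported as THP eligible.
 * The "THPeligible:" field name is an assumption of this sketch.
 * *total receives the number of VMAs that carry the field at all. */
static int count_thp_eligible(const char *text, int *total)
{
    int eligible = 0;
    const char *line = text;

    *total = 0;
    while (line && *line) {
        int value;

        /* Matches only lines that start with the field name. */
        if (sscanf(line, "THPeligible: %d", &value) == 1) {
            (*total)++;
            if (value)
                eligible++;
        }
        line = strchr(line, '\n');  /* advance to the next line */
        if (line)
            line++;
    }
    return eligible;
}
```

A caller would read /proc/self/smaps (or the file of another process it may inspect) into a buffer and pass it in. Working from the reported field avoids having to infer eligibility from VMA flags, whose semantics are not guaranteed to stay stable.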
