Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-07 Thread Michal Hocko
On Fri 07-12-18 22:27:13, Pingfan Liu wrote: [...] > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c > index 1308f54..4dc497d 100644 > --- a/arch/x86/mm/numa.c > +++ b/arch/x86/mm/numa.c > @@ -754,18 +754,23 @@ void __init init_cpu_to_node(void) > { > int cpu; > u16

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-07 Thread Michal Hocko
On Fri 07-12-18 22:27:13, Pingfan Liu wrote: > On Fri, Dec 7, 2018 at 10:22 PM Michal Hocko wrote: > > > > On Fri 07-12-18 21:20:17, Pingfan Liu wrote: > > [...] > > > Hi Michal, > > > > > > As I mentioned in my previous email, I have manually apply the patch, > > > and the patch can not work for

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-07 Thread Pingfan Liu
On Fri, Dec 7, 2018 at 10:22 PM Michal Hocko wrote: > > On Fri 07-12-18 21:20:17, Pingfan Liu wrote: > [...] > > Hi Michal, > > > > As I mentioned in my previous email, I have manually apply the patch, > > and the patch can not work for normal bootup. > > I am sorry, I have misread your previous

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-07 Thread Michal Hocko
On Fri 07-12-18 21:20:17, Pingfan Liu wrote: [...] > Hi Michal, > > As I mentioned in my previous email, I have manually apply the patch, > and the patch can not work for normal bootup. I am sorry, I have misread your previous response. Is there anything interesting on the serial console by any

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-07 Thread Pingfan Liu
On Fri, Dec 7, 2018 at 7:30 PM Michal Hocko wrote: > [...] > On Fri 07-12-18 17:40:09, Pingfan Liu wrote: > > On Fri, Dec 7, 2018 at 3:53 PM Michal Hocko wrote: > > > > > > On Fri 07-12-18 10:56:51, Pingfan Liu wrote: > > > [...] > > > > In a short word, the fix method should consider about the

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-07 Thread Michal Hocko
On Fri 07-12-18 17:40:09, Pingfan Liu wrote: > On Fri, Dec 7, 2018 at 3:53 PM Michal Hocko wrote: > > > > On Fri 07-12-18 10:56:51, Pingfan Liu wrote: > > [...] > > > In a short word, the fix method should consider about the two factors: > > > semantic of online-node and the effect on all archs >

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-07 Thread Pingfan Liu
On Fri, Dec 7, 2018 at 3:53 PM Michal Hocko wrote: > > On Fri 07-12-18 10:56:51, Pingfan Liu wrote: > [...] > > In a short word, the fix method should consider about the two factors: > > semantic of online-node and the effect on all archs > > I am pretty sure there is a lot of room for

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-06 Thread Michal Hocko
On Fri 07-12-18 10:56:51, Pingfan Liu wrote: [...] > In a short word, the fix method should consider about the two factors: > semantic of online-node and the effect on all archs I am pretty sure there is a lot of room for unification in this area. Nevertheless I strongly believe the bug should be

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-06 Thread Pingfan Liu
On Thu, Dec 6, 2018 at 8:11 PM Michal Hocko wrote: > > On Thu 06-12-18 18:44:03, Pingfan Liu wrote: > > On Thu, Dec 6, 2018 at 6:03 PM Pingfan Liu wrote: > [...] > > > Which commit is this patch applied on? I can not apply it on latest linux > > > tree. > > > > > I applied it by manual, will

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-06 Thread Michal Hocko
On Thu 06-12-18 18:44:03, Pingfan Liu wrote: > On Thu, Dec 6, 2018 at 6:03 PM Pingfan Liu wrote: [...] > > Which commit is this patch applied on? I can not apply it on latest linux > > tree. > > > I applied it by manual, will see the test result. I think it should > work since you instance all

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-06 Thread Pingfan Liu
On Thu, Dec 6, 2018 at 6:03 PM Pingfan Liu wrote: > > [...] > > Thanks for pointing this out. It made my life easier. So I think the > > bug is that we call init_memory_less_node from this path. I suspect > > numa_register_memblks is the right place to do this. So I admit I > > am not 100% sure

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-06 Thread Pingfan Liu
[...] > Thanks for pointing this out. It made my life easier. So I think the > bug is that we call init_memory_less_node from this path. I suspect > numa_register_memblks is the right place to do this. So I admit I > am not 100% sure but could you give this a try please? > Sure. > diff --git

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-06 Thread Michal Hocko
On Thu 06-12-18 11:07:33, Pingfan Liu wrote: > On Wed, Dec 5, 2018 at 5:40 PM Vlastimil Babka wrote: > > > > On 12/5/18 10:29 AM, Pingfan Liu wrote: > > >> [0.007418] Early memory node ranges > > >> [0.007419] node 1: [mem 0x1000-0x0008efff] > > >> [0.007420]

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-05 Thread Michal Hocko
On Thu 06-12-18 11:34:30, Pingfan Liu wrote: [...] > > I suspect we are looking at two issues here. The first one, and a more > > important one is that there is a NUMA affinity configured for the device > > to a non-existing node. The second one is that nr_cpus affects > > initialization of

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-05 Thread Pingfan Liu
On Wed, Dec 5, 2018 at 5:43 PM Michal Hocko wrote: > > On Wed 05-12-18 17:29:31, Pingfan Liu wrote: > > On Wed, Dec 5, 2018 at 5:21 PM Michal Hocko wrote: > > > > > > On Wed 05-12-18 13:38:17, Pingfan Liu wrote: > > > > On Tue, Dec 4, 2018 at 4:56 PM Michal Hocko wrote: > > > > > > > > > > On

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-05 Thread Pingfan Liu
On Wed, Dec 5, 2018 at 5:40 PM Vlastimil Babka wrote: > > On 12/5/18 10:29 AM, Pingfan Liu wrote: > >> [0.007418] Early memory node ranges > >> [0.007419] node 1: [mem 0x1000-0x0008efff] > >> [0.007420] node 1: [mem 0x0009-0x0009] >

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-05 Thread David Rientjes
On Wed, 5 Dec 2018, Pingfan Liu wrote: > > > And rather than using first_online_node, would next_online_node() work? > > > > > What is the gain? Is it for memory pressure on node0? > > > Maybe I got your point now. Do you try to give a cheap assumption on > nearest neigh of this node? > It's

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-05 Thread Michal Hocko
On Wed 05-12-18 17:29:31, Pingfan Liu wrote: > On Wed, Dec 5, 2018 at 5:21 PM Michal Hocko wrote: > > > > On Wed 05-12-18 13:38:17, Pingfan Liu wrote: > > > On Tue, Dec 4, 2018 at 4:56 PM Michal Hocko wrote: > > > > > > > > On Tue 04-12-18 16:20:32, Pingfan Liu wrote: > > > > > On Tue, Dec 4,

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-05 Thread Vlastimil Babka
On 12/5/18 10:29 AM, Pingfan Liu wrote: >> [0.007418] Early memory node ranges >> [0.007419] node 1: [mem 0x1000-0x0008efff] >> [0.007420] node 1: [mem 0x0009-0x0009] >> [0.007422] node 1: [mem

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-05 Thread Pingfan Liu
On Wed, Dec 5, 2018 at 5:21 PM Michal Hocko wrote: > > On Wed 05-12-18 13:38:17, Pingfan Liu wrote: > > On Tue, Dec 4, 2018 at 4:56 PM Michal Hocko wrote: > > > > > > On Tue 04-12-18 16:20:32, Pingfan Liu wrote: > > > > On Tue, Dec 4, 2018 at 3:22 PM Michal Hocko wrote: > > > > > > > > > > On

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-05 Thread Michal Hocko
On Wed 05-12-18 13:38:17, Pingfan Liu wrote: > On Tue, Dec 4, 2018 at 4:56 PM Michal Hocko wrote: > > > > On Tue 04-12-18 16:20:32, Pingfan Liu wrote: > > > On Tue, Dec 4, 2018 at 3:22 PM Michal Hocko wrote: > > > > > > > > On Tue 04-12-18 11:05:57, Pingfan Liu wrote: > > > > > During my test on

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-04 Thread Pingfan Liu
On Tue, Dec 4, 2018 at 5:09 PM Wei Yang wrote: > > On Tue, Dec 04, 2018 at 04:52:52PM +0800, Pingfan Liu wrote: > >On Tue, Dec 4, 2018 at 4:34 PM Wei Yang wrote: > >> > >> On Tue, Dec 04, 2018 at 03:20:13PM +0800, Pingfan Liu wrote: > >> >On Tue, Dec 4, 2018 at 2:54 PM Wei Yang wrote: > >> >> >

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-04 Thread Pingfan Liu
On Tue, Dec 4, 2018 at 3:16 PM Pingfan Liu wrote: > > On Tue, Dec 4, 2018 at 11:53 AM David Rientjes wrote: > > > > On Tue, 4 Dec 2018, Pingfan Liu wrote: > > > > > diff --git a/include/linux/gfp.h b/include/linux/gfp.h > > > index 76f8db0..8324953 100644 > > > --- a/include/linux/gfp.h > > >

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-04 Thread Pingfan Liu
On Tue, Dec 4, 2018 at 4:56 PM Michal Hocko wrote: > > On Tue 04-12-18 16:20:32, Pingfan Liu wrote: > > On Tue, Dec 4, 2018 at 3:22 PM Michal Hocko wrote: > > > > > > On Tue 04-12-18 11:05:57, Pingfan Liu wrote: > > > > During my test on some AMD machine, with kexec -l nr_cpus=x option, the > >

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-04 Thread Vlastimil Babka
On 12/4/18 9:56 AM, Michal Hocko wrote: >> The device's node num is 2. And in my case, I used nr_cpus param. Due >> to init_cpu_to_node() initialize all the possible node. It is hard >> for me to figure out without this param, how zonelists is accessed >> before page allocator works. > I believe

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-04 Thread Wei Yang
On Tue, Dec 04, 2018 at 04:52:52PM +0800, Pingfan Liu wrote: >On Tue, Dec 4, 2018 at 4:34 PM Wei Yang wrote: >> >> On Tue, Dec 04, 2018 at 03:20:13PM +0800, Pingfan Liu wrote: >> >On Tue, Dec 4, 2018 at 2:54 PM Wei Yang wrote: >> >> >> >> On Tue, Dec 04, 2018 at 11:05:57AM +0800, Pingfan Liu

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-04 Thread Pingfan Liu
On Tue, Dec 4, 2018 at 4:40 PM Wei Yang wrote: > > On Tue, Dec 04, 2018 at 04:20:32PM +0800, Pingfan Liu wrote: > >On Tue, Dec 4, 2018 at 3:22 PM Michal Hocko wrote: > >> > >> On Tue 04-12-18 11:05:57, Pingfan Liu wrote: > >> > During my test on some AMD machine, with kexec -l nr_cpus=x option,

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-04 Thread Michal Hocko
On Tue 04-12-18 16:20:32, Pingfan Liu wrote: > On Tue, Dec 4, 2018 at 3:22 PM Michal Hocko wrote: > > > > On Tue 04-12-18 11:05:57, Pingfan Liu wrote: > > > During my test on some AMD machine, with kexec -l nr_cpus=x option, the > > > kernel failed to bootup, because some node's data struct can

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-04 Thread Pingfan Liu
On Tue, Dec 4, 2018 at 4:34 PM Wei Yang wrote: > > On Tue, Dec 04, 2018 at 03:20:13PM +0800, Pingfan Liu wrote: > >On Tue, Dec 4, 2018 at 2:54 PM Wei Yang wrote: > >> > >> On Tue, Dec 04, 2018 at 11:05:57AM +0800, Pingfan Liu wrote: > >> >During my test on some AMD machine, with kexec -l

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-04 Thread Wei Yang
On Tue, Dec 04, 2018 at 04:20:32PM +0800, Pingfan Liu wrote: >On Tue, Dec 4, 2018 at 3:22 PM Michal Hocko wrote: >> >> On Tue 04-12-18 11:05:57, Pingfan Liu wrote: >> > During my test on some AMD machine, with kexec -l nr_cpus=x option, the >> > kernel failed to bootup, because some node's data

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-04 Thread Wei Yang
On Tue, Dec 04, 2018 at 03:20:13PM +0800, Pingfan Liu wrote: >On Tue, Dec 4, 2018 at 2:54 PM Wei Yang wrote: >> >> On Tue, Dec 04, 2018 at 11:05:57AM +0800, Pingfan Liu wrote: >> >During my test on some AMD machine, with kexec -l nr_cpus=x option, the >> >kernel failed to bootup, because some

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-04 Thread Pingfan Liu
On Tue, Dec 4, 2018 at 3:22 PM Michal Hocko wrote: > > On Tue 04-12-18 11:05:57, Pingfan Liu wrote: > > During my test on some AMD machine, with kexec -l nr_cpus=x option, the > > kernel failed to bootup, because some node's data struct can not be > > allocated, > > e.g, on x86, initialized by

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-03 Thread Michal Hocko
On Tue 04-12-18 11:05:57, Pingfan Liu wrote: > During my test on some AMD machine, with kexec -l nr_cpus=x option, the > kernel failed to bootup, because some node's data struct can not be allocated, > e.g, on x86, initialized by init_cpu_to_node()->init_memory_less_node(). But > device->numa_node

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-03 Thread Pingfan Liu
On Tue, Dec 4, 2018 at 2:54 PM Wei Yang wrote: > > On Tue, Dec 04, 2018 at 11:05:57AM +0800, Pingfan Liu wrote: > >During my test on some AMD machine, with kexec -l nr_cpus=x option, the > >kernel failed to bootup, because some node's data struct can not be > >allocated, > >e.g, on x86,

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-03 Thread Pingfan Liu
On Tue, Dec 4, 2018 at 11:53 AM David Rientjes wrote: > > On Tue, 4 Dec 2018, Pingfan Liu wrote: > > > diff --git a/include/linux/gfp.h b/include/linux/gfp.h > > index 76f8db0..8324953 100644 > > --- a/include/linux/gfp.h > > +++ b/include/linux/gfp.h > > @@ -453,6 +453,8 @@ static inline int

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-03 Thread Wei Yang
On Tue, Dec 04, 2018 at 11:05:57AM +0800, Pingfan Liu wrote: >During my test on some AMD machine, with kexec -l nr_cpus=x option, the >kernel failed to bootup, because some node's data struct can not be allocated, >e.g, on x86, initialized by init_cpu_to_node()->init_memory_less_node(). But

Re: [PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-03 Thread David Rientjes
On Tue, 4 Dec 2018, Pingfan Liu wrote: > diff --git a/include/linux/gfp.h b/include/linux/gfp.h > index 76f8db0..8324953 100644 > --- a/include/linux/gfp.h > +++ b/include/linux/gfp.h > @@ -453,6 +453,8 @@ static inline int gfp_zonelist(gfp_t flags) > */ > static inline struct zonelist

[PATCH] mm/alloc: fallback to first node if the wanted node offline

2018-12-03 Thread Pingfan Liu
During my test on some AMD machine, with kexec -l nr_cpus=x option, the kernel failed to bootup, because some node's data struct can not be allocated, e.g, on x86, initialized by init_cpu_to_node()->init_memory_less_node(). But device->numa_node info is used as preferred_nid param for
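
For readers skimming the thread, the idea in the subject line (use the preferred node if it is online, otherwise fall back to the first online node) can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the patch itself: the bitmask representation and every sketch_* name below are hypothetical and do not exist in the kernel, which uses its own nodemask helpers.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_NODES 64   /* hypothetical limit, not the kernel's MAX_NUMNODES */
#define SKETCH_NO_NODE   (-1) /* stands in for "no node"; hypothetical name */

/* Return nonzero if node nid is marked online in the bitmask (bit nid set). */
static int sketch_node_online(uint64_t online_mask, int nid)
{
    if (nid < 0 || nid >= SKETCH_MAX_NODES)
        return 0;
    return (int)((online_mask >> nid) & 1u);
}

/* Return the lowest-numbered online node. At least one node must be online. */
static int sketch_first_online_node(uint64_t online_mask)
{
    assert(online_mask != 0);
    for (int nid = 0; nid < SKETCH_MAX_NODES; nid++) {
        if (sketch_node_online(online_mask, nid))
            return nid;
    }
    return SKETCH_NO_NODE; /* only reachable if the assert is compiled out */
}

/*
 * Pick the node to allocate from: the preferred node when it is online,
 * otherwise the first online node. This mirrors the fallback named in the
 * subject line, not the exact code of the patch.
 */
static int sketch_pick_node(uint64_t online_mask, int preferred_nid)
{
    if (sketch_node_online(online_mask, preferred_nid))
        return preferred_nid;
    return sketch_first_online_node(online_mask);
}
```

With nodes 0 and 1 online (mask 0x3), a device whose preferred node is 2 would be redirected to node 0. Later replies in the thread ask whether a next-online-node choice would be preferable and whether the real fix belongs in node initialization instead; this sketch takes no position on either question.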