Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment

2018-03-13 Thread Daniel Vacek
On Tue, Mar 13, 2018 at 7:34 AM, Naresh Kamboju wrote:
> On 12 March 2018 at 22:21, Daniel Vacek  wrote:
>> On Mon, Mar 12, 2018 at 3:49 PM, Naresh Kamboju
>>  wrote:
>>> On 12 March 2018 at 17:56, Sudeep Holla  wrote:
>>>> Hi,
>>>>
>>>> I couldn't find the exact mail corresponding to the patch merged in v4.16-rc5
>>>> but commit 864b75f9d6b01 "mm/page_alloc: fix memmap_init_zone
>>>> pageblock alignment"
>>>> cause boot hang on my ARM64 platform.
>>>
>>> I have also noticed this problem on hi6220 Hikey - arm64.
>>>
>>> LKFT: linux-next: Hikey boot failed linux-next-20180308
>>> https://bugs.linaro.org/show_bug.cgi?id=3676
>>>
>>> - Naresh
>>>

>>>> Log:
>>>> [0.00] NUMA: No NUMA configuration found
>>>> [0.00] NUMA: Faking a node at [mem
>>>> 0x-0x0009]
>>>> [0.00] NUMA: NODE_DATA [mem 0x9fffcb480-0x9fffccf7f]
>>>> [0.00] Zone ranges:
>>>> [0.00]   DMA32[mem 0x8000-0x]
>>>> [0.00]   Normal   [mem 0x0001-0x0009]
>>>> [0.00] Movable zone start for each node
>>>> [0.00] Early memory node ranges
>>>> [0.00]   node   0: [mem 0x8000-0xf8f9afff]
>>>> [0.00]   node   0: [mem 0xf8f9b000-0xf908]
>>>> [0.00]   node   0: [mem 0xf909-0xf914]
>>>> [0.00]   node   0: [mem 0xf915-0xf920]
>>>> [0.00]   node   0: [mem 0xf921-0xf922]
>>>> [0.00]   node   0: [mem 0xf923-0xf95b]
>>>> [0.00]   node   0: [mem 0xf95c-0xfe58]
>>>> [0.00]   node   0: [mem 0xfe59-0xfe5c]
>>>> [0.00]   node   0: [mem 0xfe5d-0xfe5d]
>>>> [0.00]   node   0: [mem 0xfe5e-0xfe62]
>>>> [0.00]   node   0: [mem 0xfe63-0xfeff]
>>>> [0.00]   node   0: [mem 0x00088000-0x0009]
>>>> [0.00]  Initmem setup node 0 [mem
>>>> 0x8000-0x0009]
>>>>
>>>> On Sat, Mar 3, 2018 at 1:08 AM, Daniel Vacek wrote:
>>>>> On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton wrote:
>>>>>>
>>>>>> This makes me wonder whether a -stable backport is really needed...
>>>>>
>>>>> For some machines it definitely is. Won't hurt either, IMHO.
>>>>>
>>>>> --nX
>>
>> Hmm, does it step back perhaps?
>>
>> Can you check if below cures the boot hang?
>>
>> --nX
>>
>> 
>> neelx@metal:~/nX/src/linux$ git diff
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 3d974cb2a1a1..415571120bbd 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -5365,8 +5365,10 @@ void __meminit memmap_init_zone(unsigned long
>> size, int nid, unsigned long zone,
>>  * the valid region but still depends on correct page
>>  * metadata.
>>  */
>> -   pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
>> +   unsigned long next_pfn;
>> +   next_pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
>> ~(pageblock_nr_pages-1)) - 1;
>> +   pfn = max(next_pfn, pfn);
>>  #endif
>> continue;
>> }
>
> After applying this patch on linux-next the boot hang problem resolved.
> Now the hi6220-hikey is booting successfully.
> Thank you.

Thank you and Sudeep for testing. I've just sent Andrew a formal patch.

>
> - Naresh
>
>> 


Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment

2018-03-13 Thread Naresh Kamboju
On 12 March 2018 at 22:21, Daniel Vacek  wrote:
> On Mon, Mar 12, 2018 at 3:49 PM, Naresh Kamboju
>  wrote:
>> On 12 March 2018 at 17:56, Sudeep Holla  wrote:
>>> Hi,
>>>
>>> I couldn't find the exact mail corresponding to the patch merged in 
>>> v4.16-rc5
>>> but commit 864b75f9d6b01 "mm/page_alloc: fix memmap_init_zone
>>> pageblock alignment"
>>> cause boot hang on my ARM64 platform.
>>
>> I have also noticed this problem on hi6220 Hikey - arm64.
>>
>> LKFT: linux-next: Hikey boot failed linux-next-20180308
>> https://bugs.linaro.org/show_bug.cgi?id=3676
>>
>> - Naresh
>>
>>>
>>> Log:
>>> [0.00] NUMA: No NUMA configuration found
>>> [0.00] NUMA: Faking a node at [mem
>>> 0x-0x0009]
>>> [0.00] NUMA: NODE_DATA [mem 0x9fffcb480-0x9fffccf7f]
>>> [0.00] Zone ranges:
>>> [0.00]   DMA32[mem 0x8000-0x]
>>> [0.00]   Normal   [mem 0x0001-0x0009]
>>> [0.00] Movable zone start for each node
>>> [0.00] Early memory node ranges
>>> [0.00]   node   0: [mem 0x8000-0xf8f9afff]
>>> [0.00]   node   0: [mem 0xf8f9b000-0xf908]
>>> [0.00]   node   0: [mem 0xf909-0xf914]
>>> [0.00]   node   0: [mem 0xf915-0xf920]
>>> [0.00]   node   0: [mem 0xf921-0xf922]
>>> [0.00]   node   0: [mem 0xf923-0xf95b]
>>> [0.00]   node   0: [mem 0xf95c-0xfe58]
>>> [0.00]   node   0: [mem 0xfe59-0xfe5c]
>>> [0.00]   node   0: [mem 0xfe5d-0xfe5d]
>>> [0.00]   node   0: [mem 0xfe5e-0xfe62]
>>> [0.00]   node   0: [mem 0xfe63-0xfeff]
>>> [0.00]   node   0: [mem 0x00088000-0x0009]
>>> [0.00]  Initmem setup node 0 [mem 
>>> 0x8000-0x0009]
>>>
>>> On Sat, Mar 3, 2018 at 1:08 AM, Daniel Vacek  wrote:
>>>> On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton wrote:
>>>>>
>>>>> This makes me wonder whether a -stable backport is really needed...
>>>>
>>>> For some machines it definitely is. Won't hurt either, IMHO.
>>>>
>>>> --nX
>
> Hmm, does it step back perhaps?
>
> Can you check if below cures the boot hang?
>
> --nX
>
> 
> neelx@metal:~/nX/src/linux$ git diff
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3d974cb2a1a1..415571120bbd 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5365,8 +5365,10 @@ void __meminit memmap_init_zone(unsigned long
> size, int nid, unsigned long zone,
>  * the valid region but still depends on correct page
>  * metadata.
>  */
> -   pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
> +   unsigned long next_pfn;
> +   next_pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
> ~(pageblock_nr_pages-1)) - 1;
> +   pfn = max(next_pfn, pfn);
>  #endif
> continue;
> }

After applying this patch on linux-next the boot hang problem resolved.
Now the hi6220-hikey is booting successfully.
Thank you.

- Naresh

> 


Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment

2018-03-12 Thread Sudeep Holla


On 12/03/18 16:51, Daniel Vacek wrote:
[...]

> 
> Hmm, does it step back perhaps?
> 
> Can you check if below cures the boot hang?
> 

Yes it does fix the boot hang.

> --nX
> 
> 
> neelx@metal:~/nX/src/linux$ git diff
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3d974cb2a1a1..415571120bbd 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5365,8 +5365,10 @@ void __meminit memmap_init_zone(unsigned long
> size, int nid, unsigned long zone,
>  * the valid region but still depends on correct page
>  * metadata.
>  */
> -   pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
> +   unsigned long next_pfn;
> +   next_pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
> ~(pageblock_nr_pages-1)) - 1;
> +   pfn = max(next_pfn, pfn);
>  #endif
> continue;
> }
> 
> 

-- 
Regards,
Sudeep


Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment

2018-03-12 Thread Daniel Vacek
On Mon, Mar 12, 2018 at 3:49 PM, Naresh Kamboju wrote:
> On 12 March 2018 at 17:56, Sudeep Holla  wrote:
>> Hi,
>>
>> I couldn't find the exact mail corresponding to the patch merged in v4.16-rc5
>> but commit 864b75f9d6b01 "mm/page_alloc: fix memmap_init_zone
>> pageblock alignment"
>> cause boot hang on my ARM64 platform.
>
> I have also noticed this problem on hi6220 Hikey - arm64.
>
> LKFT: linux-next: Hikey boot failed linux-next-20180308
> https://bugs.linaro.org/show_bug.cgi?id=3676
>
> - Naresh
>
>>
>> Log:
>> [0.00] NUMA: No NUMA configuration found
>> [0.00] NUMA: Faking a node at [mem
>> 0x-0x0009]
>> [0.00] NUMA: NODE_DATA [mem 0x9fffcb480-0x9fffccf7f]
>> [0.00] Zone ranges:
>> [0.00]   DMA32[mem 0x8000-0x]
>> [0.00]   Normal   [mem 0x0001-0x0009]
>> [0.00] Movable zone start for each node
>> [0.00] Early memory node ranges
>> [0.00]   node   0: [mem 0x8000-0xf8f9afff]
>> [0.00]   node   0: [mem 0xf8f9b000-0xf908]
>> [0.00]   node   0: [mem 0xf909-0xf914]
>> [0.00]   node   0: [mem 0xf915-0xf920]
>> [0.00]   node   0: [mem 0xf921-0xf922]
>> [0.00]   node   0: [mem 0xf923-0xf95b]
>> [0.00]   node   0: [mem 0xf95c-0xfe58]
>> [0.00]   node   0: [mem 0xfe59-0xfe5c]
>> [0.00]   node   0: [mem 0xfe5d-0xfe5d]
>> [0.00]   node   0: [mem 0xfe5e-0xfe62]
>> [0.00]   node   0: [mem 0xfe63-0xfeff]
>> [0.00]   node   0: [mem 0x00088000-0x0009]
>> [0.00]  Initmem setup node 0 [mem 
>> 0x8000-0x0009]
>>
>> On Sat, Mar 3, 2018 at 1:08 AM, Daniel Vacek  wrote:
>>> On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton  
>>> wrote:

>>>> This makes me wonder whether a -stable backport is really needed...
>>>
>>> For some machines it definitely is. Won't hurt either, IMHO.
>>>
>>> --nX

Hmm, does it step back perhaps?

Can you check if below cures the boot hang?

--nX


neelx@metal:~/nX/src/linux$ git diff
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3d974cb2a1a1..415571120bbd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5365,8 +5365,10 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
                         * the valid region but still depends on correct page
                         * metadata.
                         */
-                       pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
+                       unsigned long next_pfn;
+                       next_pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
                                        ~(pageblock_nr_pages-1)) - 1;
+                       pfn = max(next_pfn, pfn);
 #endif
                        continue;
                }
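
To illustrate the "step back" suspicion, here is a minimal user-space sketch, not kernel code. It assumes a hypothetical pageblock size of 512 pages and a single hole of invalid pfns; the names skip_unclamped(), skip_clamped() and walk() are made up for the illustration, and the next valid pfn is simply taken to be the end of the hole. Rounding that pfn down to a pageblock boundary can land below the current pfn, so the loop revisits the same hole forever unless the result is clamped with max().

```c
#include <assert.h>

/* Hypothetical stand-in for pageblock_nr_pages; the real value depends on the config. */
#define PAGEBLOCK_NR_PAGES 512UL

/* Expression before the fix: round the next valid pfn down to a pageblock
 * boundary, minus one because the loop's pfn++ adds it back. */
static unsigned long skip_unclamped(unsigned long next_valid)
{
	assert((PAGEBLOCK_NR_PAGES & (PAGEBLOCK_NR_PAGES - 1)) == 0);
	return (next_valid & ~(PAGEBLOCK_NR_PAGES - 1)) - 1;
}

/* Expression with the proposed fix: never move below the current pfn. */
static unsigned long skip_clamped(unsigned long pfn, unsigned long next_valid)
{
	unsigned long next_pfn = skip_unclamped(next_valid);

	return next_pfn > pfn ? next_pfn : pfn;
}

/* Walk [start, end) with one invalid hole [hole_start, hole_end).
 * Returns the number of iterations, or limit if the walk never finished. */
static unsigned long walk(unsigned long start, unsigned long end,
			  unsigned long hole_start, unsigned long hole_end,
			  int clamp, unsigned long limit)
{
	unsigned long iters = 0;

	for (unsigned long pfn = start; pfn < end && iters < limit; pfn++, iters++) {
		if (pfn >= hole_start && pfn < hole_end)
			pfn = clamp ? skip_clamped(pfn, hole_end)
				    : skip_unclamped(hole_end);
	}
	return iters;
}
```

With a hole at pfns 1030..1039 inside the pageblock starting at 1024, the unclamped variant jumps from 1030 back to 1023 and never gets past the hole, while the clamped variant finishes the walk.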



Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment

2018-03-12 Thread Naresh Kamboju
On 12 March 2018 at 17:56, Sudeep Holla  wrote:
> Hi,
>
> I couldn't find the exact mail corresponding to the patch merged in v4.16-rc5
> but commit 864b75f9d6b01 "mm/page_alloc: fix memmap_init_zone
> pageblock alignment"
> cause boot hang on my ARM64 platform.

I have also noticed this problem on hi6220 Hikey - arm64.

LKFT: linux-next: Hikey boot failed linux-next-20180308
https://bugs.linaro.org/show_bug.cgi?id=3676

- Naresh

>
> Log:
> [0.00] NUMA: No NUMA configuration found
> [0.00] NUMA: Faking a node at [mem
> 0x-0x0009]
> [0.00] NUMA: NODE_DATA [mem 0x9fffcb480-0x9fffccf7f]
> [0.00] Zone ranges:
> [0.00]   DMA32[mem 0x8000-0x]
> [0.00]   Normal   [mem 0x0001-0x0009]
> [0.00] Movable zone start for each node
> [0.00] Early memory node ranges
> [0.00]   node   0: [mem 0x8000-0xf8f9afff]
> [0.00]   node   0: [mem 0xf8f9b000-0xf908]
> [0.00]   node   0: [mem 0xf909-0xf914]
> [0.00]   node   0: [mem 0xf915-0xf920]
> [0.00]   node   0: [mem 0xf921-0xf922]
> [0.00]   node   0: [mem 0xf923-0xf95b]
> [0.00]   node   0: [mem 0xf95c-0xfe58]
> [0.00]   node   0: [mem 0xfe59-0xfe5c]
> [0.00]   node   0: [mem 0xfe5d-0xfe5d]
> [0.00]   node   0: [mem 0xfe5e-0xfe62]
> [0.00]   node   0: [mem 0xfe63-0xfeff]
> [0.00]   node   0: [mem 0x00088000-0x0009]
> [0.00]  Initmem setup node 0 [mem 
> 0x8000-0x0009]
>
> On Sat, Mar 3, 2018 at 1:08 AM, Daniel Vacek  wrote:
>> On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton  
>> wrote:
>>> On Sat,  3 Mar 2018 01:12:26 +0100 Daniel Vacek  wrote:
>>>
>>>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>>>> where possible") introduced a bug where move_freepages() triggers a
>>>> VM_BUG_ON() on uninitialized page structure due to pageblock alignment.
>>>
>>> b92df1de5d28 was merged a year ago.  Can you suggest why this hasn't
>>> been reported before now?
>>
>> Yeah. I was surprised myself I couldn't find a fix to backport to
>> RHEL. But actually customers started to report this as soon as 7.4
>> (where b92df1de5d28 was merged in RHEL) was released. I remember
>> reports from September/October-ish times. It's not easily reproduced
>> and happens on a handful of machines only. I guess that's why. But
>> that does not make it less serious, I think.
>>
>> Though there actually is a report here:
>> https://bugzilla.kernel.org/show_bug.cgi?id=196443
>>
>> And there are reports for Fedora from July:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1473242
>> and CentOS: https://bugs.centos.org/view.php?id=13964
>> and we internally track several dozens reports for RHEL bug
>> https://bugzilla.redhat.com/show_bug.cgi?id=1525121
>>
>> Enough? ;-)
>>
>>> This makes me wonder whether a -stable backport is really needed...
>>
>> For some machines it definitely is. Won't hurt either, IMHO.
>>
>> --nX


Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment

2018-03-12 Thread Sudeep Holla
Hi,

I couldn't find the exact mail corresponding to the patch merged in v4.16-rc5
but commit 864b75f9d6b01 "mm/page_alloc: fix memmap_init_zone
pageblock alignment"
cause boot hang on my ARM64 platform.

Log:
[0.00] NUMA: No NUMA configuration found
[0.00] NUMA: Faking a node at [mem
0x-0x0009]
[0.00] NUMA: NODE_DATA [mem 0x9fffcb480-0x9fffccf7f]
[0.00] Zone ranges:
[0.00]   DMA32[mem 0x8000-0x]
[0.00]   Normal   [mem 0x0001-0x0009]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x8000-0xf8f9afff]
[0.00]   node   0: [mem 0xf8f9b000-0xf908]
[0.00]   node   0: [mem 0xf909-0xf914]
[0.00]   node   0: [mem 0xf915-0xf920]
[0.00]   node   0: [mem 0xf921-0xf922]
[0.00]   node   0: [mem 0xf923-0xf95b]
[0.00]   node   0: [mem 0xf95c-0xfe58]
[0.00]   node   0: [mem 0xfe59-0xfe5c]
[0.00]   node   0: [mem 0xfe5d-0xfe5d]
[0.00]   node   0: [mem 0xfe5e-0xfe62]
[0.00]   node   0: [mem 0xfe63-0xfeff]
[0.00]   node   0: [mem 0x00088000-0x0009]
[0.00]  Initmem setup node 0 [mem 0x8000-0x0009]

On Sat, Mar 3, 2018 at 1:08 AM, Daniel Vacek  wrote:
> On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton  
> wrote:
>> On Sat,  3 Mar 2018 01:12:26 +0100 Daniel Vacek  wrote:
>>
>>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>>> where possible") introduced a bug where move_freepages() triggers a
>>> VM_BUG_ON() on uninitialized page structure due to pageblock alignment.
>>
>> b92df1de5d28 was merged a year ago.  Can you suggest why this hasn't
>> been reported before now?
>
> Yeah. I was surprised myself I couldn't find a fix to backport to
> RHEL. But actually customers started to report this as soon as 7.4
> (where b92df1de5d28 was merged in RHEL) was released. I remember
> reports from September/October-ish times. It's not easily reproduced
> and happens on a handful of machines only. I guess that's why. But
> that does not make it less serious, I think.
>
> Though there actually is a report here:
> https://bugzilla.kernel.org/show_bug.cgi?id=196443
>
> And there are reports for Fedora from July:
> https://bugzilla.redhat.com/show_bug.cgi?id=1473242
> and CentOS: https://bugs.centos.org/view.php?id=13964
> and we internally track several dozens reports for RHEL bug
> https://bugzilla.redhat.com/show_bug.cgi?id=1525121
>
> Enough? ;-)
>
>> This makes me wonder whether a -stable backport is really needed...
>
> For some machines it definitely is. Won't hurt either, IMHO.
>
> --nX


Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment

2018-03-02 Thread Daniel Vacek
On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton  wrote:
> On Sat,  3 Mar 2018 01:12:26 +0100 Daniel Vacek  wrote:
>
>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>> where possible") introduced a bug where move_freepages() triggers a
>> VM_BUG_ON() on uninitialized page structure due to pageblock alignment.
>
> b92df1de5d28 was merged a year ago.  Can you suggest why this hasn't
> been reported before now?

Yeah. I was surprised myself I couldn't find a fix to backport to
RHEL. But actually customers started to report this as soon as 7.4
(where b92df1de5d28 was merged in RHEL) was released. I remember
reports from September/October-ish times. It's not easily reproduced
and happens on a handful of machines only. I guess that's why. But
that does not make it less serious, I think.

Though there actually is a report here:
https://bugzilla.kernel.org/show_bug.cgi?id=196443

And there are reports for Fedora from July:
https://bugzilla.redhat.com/show_bug.cgi?id=1473242
and CentOS: https://bugs.centos.org/view.php?id=13964
and we internally track several dozens reports for RHEL bug
https://bugzilla.redhat.com/show_bug.cgi?id=1525121

Enough? ;-)

> This makes me wonder whether a -stable backport is really needed...

For some machines it definitely is. Won't hurt either, IMHO.

--nX


Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment

2018-03-02 Thread Andrew Morton
On Sat,  3 Mar 2018 01:12:26 +0100 Daniel Vacek  wrote:

> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> where possible") introduced a bug where move_freepages() triggers a
> VM_BUG_ON() on uninitialized page structure due to pageblock alignment.

b92df1de5d28 was merged a year ago.  Can you suggest why this hasn't
been reported before now?

This makes me wonder whether a -stable backport is really needed...

