Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
On Tue, Mar 13, 2018 at 7:34 AM, Naresh Kamboju wrote:
> On 12 March 2018 at 22:21, Daniel Vacek wrote:
>> On Mon, Mar 12, 2018 at 3:49 PM, Naresh Kamboju wrote:
>>> On 12 March 2018 at 17:56, Sudeep Holla wrote:
>>>> Hi,
>>>>
>>>> I couldn't find the exact mail corresponding to the patch merged in
>>>> v4.16-rc5, but commit 864b75f9d6b01 "mm/page_alloc: fix memmap_init_zone
>>>> pageblock alignment" causes a boot hang on my ARM64 platform.
>>>
>>> I have also noticed this problem on hi6220 Hikey - arm64.
>>>
>>> LKFT: linux-next: Hikey boot failed linux-next-20180308
>>> https://bugs.linaro.org/show_bug.cgi?id=3676
>>>
>>> - Naresh
>>>
>>>> Log:
>>>> [0.00] NUMA: No NUMA configuration found
>>>> [0.00] NUMA: Faking a node at [mem 0x-0x0009]
>>>> [0.00] NUMA: NODE_DATA [mem 0x9fffcb480-0x9fffccf7f]
>>>> [0.00] Zone ranges:
>>>> [0.00] DMA32 [mem 0x8000-0x]
>>>> [0.00] Normal [mem 0x0001-0x0009]
>>>> [0.00] Movable zone start for each node
>>>> [0.00] Early memory node ranges
>>>> [0.00] node 0: [mem 0x8000-0xf8f9afff]
>>>> [0.00] node 0: [mem 0xf8f9b000-0xf908]
>>>> [0.00] node 0: [mem 0xf909-0xf914]
>>>> [0.00] node 0: [mem 0xf915-0xf920]
>>>> [0.00] node 0: [mem 0xf921-0xf922]
>>>> [0.00] node 0: [mem 0xf923-0xf95b]
>>>> [0.00] node 0: [mem 0xf95c-0xfe58]
>>>> [0.00] node 0: [mem 0xfe59-0xfe5c]
>>>> [0.00] node 0: [mem 0xfe5d-0xfe5d]
>>>> [0.00] node 0: [mem 0xfe5e-0xfe62]
>>>> [0.00] node 0: [mem 0xfe63-0xfeff]
>>>> [0.00] node 0: [mem 0x00088000-0x0009]
>>>> [0.00] Initmem setup node 0 [mem 0x8000-0x0009]
>>>>
>>>> On Sat, Mar 3, 2018 at 1:08 AM, Daniel Vacek wrote:
>>>>> On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton wrote:
>>>>>> This makes me wonder whether a -stable backport is really needed...
>>>>>
>>>>> For some machines it definitely is. Won't hurt either, IMHO.
>>>>>
>>>>> --nX
>>
>> Hmm, does it step back perhaps?
>>
>> Can you check if below cures the boot hang?
>>
>> --nX
>>
>>
>> neelx@metal:~/nX/src/linux$ git diff
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 3d974cb2a1a1..415571120bbd 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -5365,8 +5365,10 @@ void __meminit memmap_init_zone(unsigned long
>> size, int nid, unsigned long zone,
>>                          * the valid region but still depends on correct page
>>                          * metadata.
>>                          */
>> -                       pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
>> +                       unsigned long next_pfn;
>> +                       next_pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
>>                                         ~(pageblock_nr_pages-1)) - 1;
>> +                       pfn = max(next_pfn, pfn);
>>  #endif
>>                         continue;
>>                 }
>
> After applying this patch on linux-next the boot hang problem is resolved.
> Now the hi6220-hikey is booting successfully.
> Thank you.

Thank you and Sudeep for testing. I've just sent Andrew a formal patch.

> - Naresh
Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
On 12 March 2018 at 22:21, Daniel Vacek wrote:
> On Mon, Mar 12, 2018 at 3:49 PM, Naresh Kamboju wrote:
>> On 12 March 2018 at 17:56, Sudeep Holla wrote:
>>> Hi,
>>>
>>> I couldn't find the exact mail corresponding to the patch merged in
>>> v4.16-rc5, but commit 864b75f9d6b01 "mm/page_alloc: fix memmap_init_zone
>>> pageblock alignment" causes a boot hang on my ARM64 platform.
>>
>> I have also noticed this problem on hi6220 Hikey - arm64.
>>
>> LKFT: linux-next: Hikey boot failed linux-next-20180308
>> https://bugs.linaro.org/show_bug.cgi?id=3676
>>
>> - Naresh
>>
>>> Log:
>>> [0.00] NUMA: No NUMA configuration found
>>> [0.00] NUMA: Faking a node at [mem 0x-0x0009]
>>> [0.00] NUMA: NODE_DATA [mem 0x9fffcb480-0x9fffccf7f]
>>> [0.00] Zone ranges:
>>> [0.00] DMA32 [mem 0x8000-0x]
>>> [0.00] Normal [mem 0x0001-0x0009]
>>> [0.00] Movable zone start for each node
>>> [0.00] Early memory node ranges
>>> [0.00] node 0: [mem 0x8000-0xf8f9afff]
>>> [0.00] node 0: [mem 0xf8f9b000-0xf908]
>>> [0.00] node 0: [mem 0xf909-0xf914]
>>> [0.00] node 0: [mem 0xf915-0xf920]
>>> [0.00] node 0: [mem 0xf921-0xf922]
>>> [0.00] node 0: [mem 0xf923-0xf95b]
>>> [0.00] node 0: [mem 0xf95c-0xfe58]
>>> [0.00] node 0: [mem 0xfe59-0xfe5c]
>>> [0.00] node 0: [mem 0xfe5d-0xfe5d]
>>> [0.00] node 0: [mem 0xfe5e-0xfe62]
>>> [0.00] node 0: [mem 0xfe63-0xfeff]
>>> [0.00] node 0: [mem 0x00088000-0x0009]
>>> [0.00] Initmem setup node 0 [mem 0x8000-0x0009]
>>>
>>> On Sat, Mar 3, 2018 at 1:08 AM, Daniel Vacek wrote:
>>>> On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton wrote:
>>>>> This makes me wonder whether a -stable backport is really needed...
>>>>
>>>> For some machines it definitely is. Won't hurt either, IMHO.
>>>>
>>>> --nX
>
> Hmm, does it step back perhaps?
>
> Can you check if below cures the boot hang?
>
> --nX
>
>
> neelx@metal:~/nX/src/linux$ git diff
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3d974cb2a1a1..415571120bbd 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5365,8 +5365,10 @@ void __meminit memmap_init_zone(unsigned long
> size, int nid, unsigned long zone,
>                          * the valid region but still depends on correct page
>                          * metadata.
>                          */
> -                       pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
> +                       unsigned long next_pfn;
> +                       next_pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
>                                         ~(pageblock_nr_pages-1)) - 1;
> +                       pfn = max(next_pfn, pfn);
>  #endif
>                         continue;
>                 }

After applying this patch on linux-next the boot hang problem is resolved.
Now the hi6220-hikey is booting successfully.
Thank you.

- Naresh
Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
On 12/03/18 16:51, Daniel Vacek wrote:

[...]

> Hmm, does it step back perhaps?
>
> Can you check if below cures the boot hang?
>

Yes it does fix the boot hang.

> --nX
>
>
> neelx@metal:~/nX/src/linux$ git diff
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3d974cb2a1a1..415571120bbd 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5365,8 +5365,10 @@ void __meminit memmap_init_zone(unsigned long
> size, int nid, unsigned long zone,
>                          * the valid region but still depends on correct page
>                          * metadata.
>                          */
> -                       pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
> +                       unsigned long next_pfn;
> +                       next_pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
>                                         ~(pageblock_nr_pages-1)) - 1;
> +                       pfn = max(next_pfn, pfn);
>  #endif
>                         continue;
>                 }

--
Regards,
Sudeep
Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
On Mon, Mar 12, 2018 at 3:49 PM, Naresh Kamboju wrote:
> On 12 March 2018 at 17:56, Sudeep Holla wrote:
>> Hi,
>>
>> I couldn't find the exact mail corresponding to the patch merged in
>> v4.16-rc5, but commit 864b75f9d6b01 "mm/page_alloc: fix memmap_init_zone
>> pageblock alignment" causes a boot hang on my ARM64 platform.
>
> I have also noticed this problem on hi6220 Hikey - arm64.
>
> LKFT: linux-next: Hikey boot failed linux-next-20180308
> https://bugs.linaro.org/show_bug.cgi?id=3676
>
> - Naresh
>
>> Log:
>> [0.00] NUMA: No NUMA configuration found
>> [0.00] NUMA: Faking a node at [mem 0x-0x0009]
>> [0.00] NUMA: NODE_DATA [mem 0x9fffcb480-0x9fffccf7f]
>> [0.00] Zone ranges:
>> [0.00] DMA32 [mem 0x8000-0x]
>> [0.00] Normal [mem 0x0001-0x0009]
>> [0.00] Movable zone start for each node
>> [0.00] Early memory node ranges
>> [0.00] node 0: [mem 0x8000-0xf8f9afff]
>> [0.00] node 0: [mem 0xf8f9b000-0xf908]
>> [0.00] node 0: [mem 0xf909-0xf914]
>> [0.00] node 0: [mem 0xf915-0xf920]
>> [0.00] node 0: [mem 0xf921-0xf922]
>> [0.00] node 0: [mem 0xf923-0xf95b]
>> [0.00] node 0: [mem 0xf95c-0xfe58]
>> [0.00] node 0: [mem 0xfe59-0xfe5c]
>> [0.00] node 0: [mem 0xfe5d-0xfe5d]
>> [0.00] node 0: [mem 0xfe5e-0xfe62]
>> [0.00] node 0: [mem 0xfe63-0xfeff]
>> [0.00] node 0: [mem 0x00088000-0x0009]
>> [0.00] Initmem setup node 0 [mem 0x8000-0x0009]
>>
>> On Sat, Mar 3, 2018 at 1:08 AM, Daniel Vacek wrote:
>>> On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton wrote:
>>>> This makes me wonder whether a -stable backport is really needed...
>>>
>>> For some machines it definitely is. Won't hurt either, IMHO.
>>>
>>> --nX

Hmm, does it step back perhaps?

Can you check if below cures the boot hang?

--nX


neelx@metal:~/nX/src/linux$ git diff
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3d974cb2a1a1..415571120bbd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5365,8 +5365,10 @@ void __meminit memmap_init_zone(unsigned long
size, int nid, unsigned long zone,
                         * the valid region but still depends on correct page
                         * metadata.
                         */
-                       pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
+                       unsigned long next_pfn;
+                       next_pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
                                        ~(pageblock_nr_pages-1)) - 1;
+                       pfn = max(next_pfn, pfn);
 #endif
                        continue;
                }
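
The "step back" suspected in the message above can be illustrated with a small user-space sketch of the arithmetic in the diff. This is not kernel code: PAGEBLOCK_NR_PAGES is a hypothetical value chosen for illustration, the function names are made up, and the pfn numbers are invented rather than taken from the boot log. It assumes the next valid pfn is at or above one pageblock, so the subtraction does not wrap.

```c
#include <assert.h>

/* Hypothetical pageblock size for illustration only (must be a power of two). */
#define PAGEBLOCK_NR_PAGES 512UL

/* Round the next valid pfn down to the start of its pageblock, then subtract
 * one because the enclosing loop increments pfn before the next iteration. */
static unsigned long skip_target(unsigned long next_valid_pfn)
{
	return (next_valid_pfn & ~(PAGEBLOCK_NR_PAGES - 1)) - 1;
}

/* Arithmetic of the removed line: the result may be lower than the current pfn. */
unsigned long advance_unclamped(unsigned long pfn, unsigned long next_valid_pfn)
{
	(void)pfn; /* the current position is ignored, which is the problem */
	return skip_target(next_valid_pfn);
}

/* Arithmetic of the added lines: never move below the current pfn. */
unsigned long advance_clamped(unsigned long pfn, unsigned long next_valid_pfn)
{
	unsigned long next_pfn = skip_target(next_valid_pfn);

	return next_pfn > pfn ? next_pfn : pfn; /* same effect as max(next_pfn, pfn) */
}

/* Self-check with made-up numbers; call it from main() or a test harness. */
void check_examples(void)
{
	/* Current pfn 1000 and next valid pfn 1010 share pageblock [512, 1023]:
	 * rounding down yields 511, which is behind the current position. */
	assert(advance_unclamped(1000, 1010) == 511);
	assert(advance_clamped(1000, 1010) == 1000);

	/* Next valid pfn 2100 lies in a later pageblock starting at 2048:
	 * both variants skip forward to 2047. */
	assert(advance_unclamped(1000, 2100) == 2047);
	assert(advance_clamped(1000, 2100) == 2047);
}
```

In the first case the unclamped variant would restart the loop at pfn 512 and reach the same invalid pfn again, which is one plausible reading of the hang reported in this thread; the clamp keeps the loop moving forward.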
Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
On 12 March 2018 at 17:56, Sudeep Holla wrote:
> Hi,
>
> I couldn't find the exact mail corresponding to the patch merged in
> v4.16-rc5, but commit 864b75f9d6b01 "mm/page_alloc: fix memmap_init_zone
> pageblock alignment" causes a boot hang on my ARM64 platform.

I have also noticed this problem on hi6220 Hikey - arm64.

LKFT: linux-next: Hikey boot failed linux-next-20180308
https://bugs.linaro.org/show_bug.cgi?id=3676

- Naresh

> Log:
> [0.00] NUMA: No NUMA configuration found
> [0.00] NUMA: Faking a node at [mem 0x-0x0009]
> [0.00] NUMA: NODE_DATA [mem 0x9fffcb480-0x9fffccf7f]
> [0.00] Zone ranges:
> [0.00] DMA32 [mem 0x8000-0x]
> [0.00] Normal [mem 0x0001-0x0009]
> [0.00] Movable zone start for each node
> [0.00] Early memory node ranges
> [0.00] node 0: [mem 0x8000-0xf8f9afff]
> [0.00] node 0: [mem 0xf8f9b000-0xf908]
> [0.00] node 0: [mem 0xf909-0xf914]
> [0.00] node 0: [mem 0xf915-0xf920]
> [0.00] node 0: [mem 0xf921-0xf922]
> [0.00] node 0: [mem 0xf923-0xf95b]
> [0.00] node 0: [mem 0xf95c-0xfe58]
> [0.00] node 0: [mem 0xfe59-0xfe5c]
> [0.00] node 0: [mem 0xfe5d-0xfe5d]
> [0.00] node 0: [mem 0xfe5e-0xfe62]
> [0.00] node 0: [mem 0xfe63-0xfeff]
> [0.00] node 0: [mem 0x00088000-0x0009]
> [0.00] Initmem setup node 0 [mem 0x8000-0x0009]
>
> On Sat, Mar 3, 2018 at 1:08 AM, Daniel Vacek wrote:
>> On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton wrote:
>>> On Sat, 3 Mar 2018 01:12:26 +0100 Daniel Vacek wrote:
>>>
>>>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>>>> where possible") introduced a bug where move_freepages() triggers a
>>>> VM_BUG_ON() on uninitialized page structure due to pageblock alignment.
>>>
>>> b92df1de5d28 was merged a year ago. Can you suggest why this hasn't
>>> been reported before now?
>>
>> Yeah. I was surprised myself I couldn't find a fix to backport to
>> RHEL. But actually customers started to report this as soon as 7.4
>> (where b92df1de5d28 was merged in RHEL) was released. I remember
>> reports from September/October-ish times. It's not easily reproduced
>> and happens on a handful of machines only. I guess that's why. But
>> that does not make it less serious, I think.
>>
>> Though there actually is a report here:
>> https://bugzilla.kernel.org/show_bug.cgi?id=196443
>>
>> And there are reports for Fedora from July:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1473242
>> and CentOS: https://bugs.centos.org/view.php?id=13964
>> and we internally track several dozen reports for RHEL bug
>> https://bugzilla.redhat.com/show_bug.cgi?id=1525121
>>
>> Enough? ;-)
>>
>>> This makes me wonder whether a -stable backport is really needed...
>>
>> For some machines it definitely is. Won't hurt either, IMHO.
>>
>> --nX
Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
Hi,

I couldn't find the exact mail corresponding to the patch merged in
v4.16-rc5, but commit 864b75f9d6b01 "mm/page_alloc: fix memmap_init_zone
pageblock alignment" causes a boot hang on my ARM64 platform.

Log:
[0.00] NUMA: No NUMA configuration found
[0.00] NUMA: Faking a node at [mem 0x-0x0009]
[0.00] NUMA: NODE_DATA [mem 0x9fffcb480-0x9fffccf7f]
[0.00] Zone ranges:
[0.00] DMA32 [mem 0x8000-0x]
[0.00] Normal [mem 0x0001-0x0009]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00] node 0: [mem 0x8000-0xf8f9afff]
[0.00] node 0: [mem 0xf8f9b000-0xf908]
[0.00] node 0: [mem 0xf909-0xf914]
[0.00] node 0: [mem 0xf915-0xf920]
[0.00] node 0: [mem 0xf921-0xf922]
[0.00] node 0: [mem 0xf923-0xf95b]
[0.00] node 0: [mem 0xf95c-0xfe58]
[0.00] node 0: [mem 0xfe59-0xfe5c]
[0.00] node 0: [mem 0xfe5d-0xfe5d]
[0.00] node 0: [mem 0xfe5e-0xfe62]
[0.00] node 0: [mem 0xfe63-0xfeff]
[0.00] node 0: [mem 0x00088000-0x0009]
[0.00] Initmem setup node 0 [mem 0x8000-0x0009]

On Sat, Mar 3, 2018 at 1:08 AM, Daniel Vacek wrote:
> On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton wrote:
>> On Sat, 3 Mar 2018 01:12:26 +0100 Daniel Vacek wrote:
>>
>>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>>> where possible") introduced a bug where move_freepages() triggers a
>>> VM_BUG_ON() on uninitialized page structure due to pageblock alignment.
>>
>> b92df1de5d28 was merged a year ago. Can you suggest why this hasn't
>> been reported before now?
>
> Yeah. I was surprised myself I couldn't find a fix to backport to
> RHEL. But actually customers started to report this as soon as 7.4
> (where b92df1de5d28 was merged in RHEL) was released. I remember
> reports from September/October-ish times. It's not easily reproduced
> and happens on a handful of machines only. I guess that's why. But
> that does not make it less serious, I think.
>
> Though there actually is a report here:
> https://bugzilla.kernel.org/show_bug.cgi?id=196443
>
> And there are reports for Fedora from July:
> https://bugzilla.redhat.com/show_bug.cgi?id=1473242
> and CentOS: https://bugs.centos.org/view.php?id=13964
> and we internally track several dozen reports for RHEL bug
> https://bugzilla.redhat.com/show_bug.cgi?id=1525121
>
> Enough? ;-)
>
>> This makes me wonder whether a -stable backport is really needed...
>
> For some machines it definitely is. Won't hurt either, IMHO.
>
> --nX
Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
On Sat, Mar 3, 2018 at 1:40 AM, Andrew Morton wrote:
> On Sat, 3 Mar 2018 01:12:26 +0100 Daniel Vacek wrote:
>
>> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
>> where possible") introduced a bug where move_freepages() triggers a
>> VM_BUG_ON() on uninitialized page structure due to pageblock alignment.
>
> b92df1de5d28 was merged a year ago. Can you suggest why this hasn't
> been reported before now?

Yeah. I was surprised myself I couldn't find a fix to backport to
RHEL. But actually customers started to report this as soon as 7.4
(where b92df1de5d28 was merged in RHEL) was released. I remember
reports from September/October-ish times. It's not easily reproduced
and happens on a handful of machines only. I guess that's why. But
that does not make it less serious, I think.

Though there actually is a report here:
https://bugzilla.kernel.org/show_bug.cgi?id=196443

And there are reports for Fedora from July:
https://bugzilla.redhat.com/show_bug.cgi?id=1473242
and CentOS: https://bugs.centos.org/view.php?id=13964
and we internally track several dozen reports for RHEL bug
https://bugzilla.redhat.com/show_bug.cgi?id=1525121

Enough? ;-)

> This makes me wonder whether a -stable backport is really needed...

For some machines it definitely is. Won't hurt either, IMHO.

--nX
Re: [PATCH v3 2/2] mm/page_alloc: fix memmap_init_zone pageblock alignment
On Sat, 3 Mar 2018 01:12:26 +0100 Daniel Vacek wrote:

> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> where possible") introduced a bug where move_freepages() triggers a
> VM_BUG_ON() on uninitialized page structure due to pageblock alignment.

b92df1de5d28 was merged a year ago. Can you suggest why this hasn't
been reported before now?

This makes me wonder whether a -stable backport is really needed...