Re: [5.9.0-rc5-20200914] Kernel crash while running LTP(mlock201)

2020-09-15 Thread Sachin Sant


> On 15-Sep-2020, at 6:39 PM, Matthew Wilcox  wrote:
> 
> On Tue, Sep 15, 2020 at 09:24:38PM +1000, Michael Ellerman wrote:
>> Sachin Sant  writes:
>>> While running LTP tests (specifically mlock201) against the next-20200914
>>> tree on a POWER9 LPAR, the kernel crashed with the following oops.
>> 
>> Looks the same as:
>> 
>> https://lore.kernel.org/linux-mm/20200914085545.GB28738@shao2-debian/
> 
> https://lore.kernel.org/linux-mm/20200914112738.gm6...@casper.infradead.org/

Thanks. The patch fixes the problem for me.

Tested-by: Sachin Sant 

thanks
-Sachin



Re: [5.9.0-rc5-20200914] Kernel crash while running LTP(mlock201)

2020-09-15 Thread Matthew Wilcox
On Tue, Sep 15, 2020 at 09:24:38PM +1000, Michael Ellerman wrote:
> Sachin Sant  writes:
> > While running LTP tests (specifically mlock201) against the next-20200914
> > tree on a POWER9 LPAR, the kernel crashed with the following oops.
> 
> Looks the same as:
> 
> https://lore.kernel.org/linux-mm/20200914085545.GB28738@shao2-debian/

https://lore.kernel.org/linux-mm/20200914112738.gm6...@casper.infradead.org/


Re: [5.9.0-rc5-20200914] Kernel crash while running LTP(mlock201)

2020-09-15 Thread Michael Ellerman
Sachin Sant  writes:
> While running LTP tests (specifically mlock201) against the next-20200914
> tree on a POWER9 LPAR, the kernel crashed with the following oops.

Looks the same as:

https://lore.kernel.org/linux-mm/20200914085545.GB28738@shao2-debian/

cheers

> BUG: Kernel NULL pointer dereference on read at 0x
> Faulting instruction address: 0xc0454248
> Oops: Kernel access of bad area, sig: 11 [#1]
> LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> Modules linked in: af_packet(E) nft_ct(E) nf_conntrack(E) nf_defrag_ipv6(E) 
> nf_defrag_ipv4(E) libcrc32c(E) ip6_tables(E) nft_compat(E) ip_set(E) 
> rfkill(E) nf_tables(E) nfnetlink(E) vmx_crypto(E) uio_pdrv_genirq(E) 
> gf128mul(E) uio(E) rtc_generic(E) crct10dif_vpmsum(E) sch_fq_codel(E) 
> ip_tables(E) x_tables(E) ext4(E) crc16(E) mbcache(E) jbd2(E) sd_mod(E) 
> t10_pi(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) scsi_mod(E) ibmveth(E) 
> crc32c_vpmsum(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) autofs4(E)
> CPU: 11 PID: 26435 Comm: mlock201 Tainted: GE 
> 5.9.0-rc5-next-20200914-281.gf529200-default #1
> NIP:  c0454248 LR: c0445a74 CTR: c0413150
> REGS: c008e645b770 TRAP: 0300   Tainted: GE  
> (5.9.0-rc5-next-20200914-281.gf529200-default)
> MSR:  80009033   CR: 28002482  XER: 2004
> CFAR: c000fbb0 DAR:  DSISR: 4000 IRQMASK: 0 
> GPR00: c0445a74 c008e645ba00 c17c4500  
> GPR04: 0001 c008ea109e98 c008f0c4  
> GPR08:    0003 
> GPR12: c0413150 c0001ec70200  c1502038 
> GPR16: 7fff9c61 7fff9c61 7fff9c61 c0cb02f8 
> GPR20: 7fff9c5c 7fff9c62 c008e645bcd8 c008f0c4 
> GPR24: c00c023c0d00 fe7f  c008f0c4 
> GPR28: c008ea109e98 0001 c008ea9288a8  
> NIP [c0454248] PageHuge+0x8/0x60
> LR [c0445a74] find_get_incore_page+0x114/0x160
> Call Trace:
> [c008e645ba00] [c0445994] find_get_incore_page+0x34/0x160 
> (unreliable)
> [c008e645ba40] [c0412e54] mincore_page+0x24/0x160
> [c008e645ba70] [c0413020] __mincore_unmapped_range+0x90/0x160
> [c008e645bac0] [c0413680] mincore_pte_range+0x530/0x5d0
> [c008e645bb40] [c0422a38] walk_pgd_range+0x4e8/0xae0
> [c008e645bc30] [c04230c4] __walk_page_range+0x94/0x250
> [c008e645bcb0] [c04233d8] walk_page_range+0x158/0x1e0
> [c008e645bd40] [c041386c] sys_mincore+0x14c/0x370
> [c008e645bdc0] [c0033eb8] system_call_exception+0xf8/0x200
> [c008e645be20] [c000d140] system_call_common+0xf0/0x27c
> Instruction dump:
> e8410018 38210020 e8010010 7c0803a6 4e800020 6000 3d41 7d435378 
> 4e800020 6000 7c0802a6 6000  75290001 40820010 e9230008 
> ---[ end trace 357eb14a3b22eab2 ]---
>
>
> The function find_get_incore_page() was introduced by commit
> 3fcbe4eb49a0406e6202e8c8c3560f30965a8e79
> ("mm: factor find_get_incore_page out of mincore_page").
>
>
> Thanks
> -Sachin


[5.9.0-rc5-20200914] Kernel crash while running LTP(mlock201)

2020-09-15 Thread Sachin Sant
While running LTP tests (specifically mlock201) against the next-20200914
tree on a POWER9 LPAR, the kernel crashed with the following oops.

BUG: Kernel NULL pointer dereference on read at 0x
Faulting instruction address: 0xc0454248
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: af_packet(E) nft_ct(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) libcrc32c(E) ip6_tables(E) nft_compat(E) ip_set(E) rfkill(E) nf_tables(E) nfnetlink(E) vmx_crypto(E) uio_pdrv_genirq(E) gf128mul(E) uio(E) rtc_generic(E) crct10dif_vpmsum(E) sch_fq_codel(E) ip_tables(E) x_tables(E) ext4(E) crc16(E) mbcache(E) jbd2(E) sd_mod(E) t10_pi(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) scsi_mod(E) ibmveth(E) crc32c_vpmsum(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) autofs4(E)
CPU: 11 PID: 26435 Comm: mlock201 Tainted: GE 5.9.0-rc5-next-20200914-281.gf529200-default #1
NIP:  c0454248 LR: c0445a74 CTR: c0413150
REGS: c008e645b770 TRAP: 0300   Tainted: GE  (5.9.0-rc5-next-20200914-281.gf529200-default)
MSR:  80009033   CR: 28002482  XER: 2004
CFAR: c000fbb0 DAR:  DSISR: 4000 IRQMASK: 0 
GPR00: c0445a74 c008e645ba00 c17c4500  
GPR04: 0001 c008ea109e98 c008f0c4  
GPR08:    0003 
GPR12: c0413150 c0001ec70200  c1502038 
GPR16: 7fff9c61 7fff9c61 7fff9c61 c0cb02f8 
GPR20: 7fff9c5c 7fff9c62 c008e645bcd8 c008f0c4 
GPR24: c00c023c0d00 fe7f  c008f0c4 
GPR28: c008ea109e98 0001 c008ea9288a8  
NIP [c0454248] PageHuge+0x8/0x60
LR [c0445a74] find_get_incore_page+0x114/0x160
Call Trace:
[c008e645ba00] [c0445994] find_get_incore_page+0x34/0x160 (unreliable)
[c008e645ba40] [c0412e54] mincore_page+0x24/0x160
[c008e645ba70] [c0413020] __mincore_unmapped_range+0x90/0x160
[c008e645bac0] [c0413680] mincore_pte_range+0x530/0x5d0
[c008e645bb40] [c0422a38] walk_pgd_range+0x4e8/0xae0
[c008e645bc30] [c04230c4] __walk_page_range+0x94/0x250
[c008e645bcb0] [c04233d8] walk_page_range+0x158/0x1e0
[c008e645bd40] [c041386c] sys_mincore+0x14c/0x370
[c008e645bdc0] [c0033eb8] system_call_exception+0xf8/0x200
[c008e645be20] [c000d140] system_call_common+0xf0/0x27c
Instruction dump:
e8410018 38210020 e8010010 7c0803a6 4e800020 6000 3d41 7d435378 
4e800020 6000 7c0802a6 6000  75290001 40820010 e9230008 
---[ end trace 357eb14a3b22eab2 ]---
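
Each frame in the trace above ends in symbol+0xOFFSET/0xSIZE: the offset of
the address within the function and the function's total size, both in hex.
As a minimal sketch (not part of the original report), the Python below pulls
those fields out of frame lines so the call chain can be read off quickly.
The names FRAME_RE, parse_frame and call_chain are hypothetical helpers
introduced only for illustration.

```python
import re

# Matches the "symbol+0xOFFSET/0xSIZE" part of an oops frame line.
# Hypothetical helper pattern; not taken from any kernel tooling.
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frame(line):
    """Return (symbol, offset, size) for a frame line, or None if no match."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),  # bytes into the function
        int(match.group("size"), 16),    # total size of the function
    )


def call_chain(trace_lines):
    """Return the symbols of all parseable frames, in the order given."""
    frames = (parse_frame(line) for line in trace_lines)
    return [frame[0] for frame in frames if frame is not None]


if __name__ == "__main__":
    sample = [
        "NIP [c0454248] PageHuge+0x8/0x60",
        "LR [c0445a74] find_get_incore_page+0x114/0x160",
        "[c008e645ba40] [c0412e54] mincore_page+0x24/0x160",
    ]
    for line in sample:
        print(parse_frame(line))
```

Applied to the NIP line, parse_frame() returns ("PageHuge", 8, 96): the fault
is 8 bytes into a 96-byte function. That places it near the start of
PageHuge(), reached from find_get_incore_page() on the mincore() path.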


The function find_get_incore_page() was introduced by commit
3fcbe4eb49a0406e6202e8c8c3560f30965a8e79
("mm: factor find_get_incore_page out of mincore_page").


Thanks
-Sachin