Thank you for your help. I've created an issue in the community JIRA for this: 
LU-13168.

Kind Regards,
Christopher.

On Mon, Jan 20, 2020 at 05:22:58PM +0000, Peter Jones wrote:
> Christopher
> 
> Apologies for the confusing message about requesting an account for JIRA - 
> I'll see if we can remove that message but I think that it might be 
> system-generated. We've had to disable self-registration because of repeated 
> hacking attempts via that mechanism. The message on the left "For questions 
> or login request, send email to Jira administrators" works - the link there 
> sends an email to [email protected] and several requests come through per 
> week via that channel - but I can see why the message on the right would draw 
> your eye...
> 
> Peter
> 
> On 2020-01-20, 8:15 AM, "lustre-discuss on behalf of Christopher Mountford" 
> <[email protected] on behalf of [email protected]> 
> wrote:
> 
>     We've seen 3 Lustre client panics in the last few hours when using the 
> b2_12 branch (we're using it on client nodes because it patches a data on MDT 
> bug in 2.12.3; we are still using 2.12.3 on the MDS/OSS). This looks similar to 
> LU-12581, which we had seen on our system before but which was fixed in 2.12.3. 
> Could this have been re-introduced in the b2_12 branch?
>     
>     I've included the dmesg from one of the panics below. Unfortunately we 
> have not yet found a way to reproduce the problem. Has anyone seen anything 
> similar to this?
>     
>     Is this mailing list a suitable place to ask for help on this sort of 
> bug? I've been looking at the Whamcloud Community Jira, but the link to 
> request an account returns "Your Jira administrator has not yet configured 
> this contact form."
>     
>     dmesg from failed client:
>     
>     [542909.741793] 
> =============================================================================
>     [542909.741800] BUG kmalloc-8 (Tainted: G           OE  ------------  ): 
> Freechain corrupt
>     [542909.741802] 
> -----------------------------------------------------------------------------
>     
>     [542909.741805] Disabling lock debugging due to kernel taint
>     [542909.741809] INFO: Slab 0xffffe0933440b3c0 objects=102 used=75 
> fp=0xffff9bb6902cf558 flags=0x6fffff00000081
>     [542909.741812] INFO: Object 0xffff9bb6902cfad0 @offset=2768 
> fp=0x7fff9bb6902cfdf0
>     
>     [542909.741816] Redzone ffff9bb6902cfac8: bb 3b 3b 3b 3b bb bb bb         
>                  .;;;;...
>     [542909.741818] Object ffff9bb6902cfad0: 6b 6b 6b 6b 6b 6b 6b a5          
>                 kkkkkkk.
>     [542909.741821] Redzone ffff9bb6902cfad8: bb bb bb 3b bb bb bb bb         
>                  ...;....
>     [542909.741823] Padding ffff9bb6902cfae8: 5a 5a 5a 5a 5a 5a 5a 5a         
>                  ZZZZZZZZ
>     [542909.741828] CPU: 25 PID: 50461 Comm: pool Kdump: loaded Tainted: G    
> B      OE  ------------   3.10.0-1062.9.1.el7.x86_64 #1
>     [542909.741830] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 
> 10/21/2019
>     [542909.741832] Call Trace:
>     [542909.741846]  [<ffffffffa277ac23>] dump_stack+0x19/0x1b
>     [542909.741852]  [<ffffffffa2221561>] print_trailer+0x161/0x280
>     [542909.741856]  [<ffffffffa2221ebf>] on_freelist+0xff/0x270
>     [542909.741860]  [<ffffffffa27774cc>] free_debug_processing+0x18d/0x270
>     [542909.741867]  [<ffffffffa21ddcb5>] ? kvfree+0x35/0x40
>     [542909.741870]  [<ffffffffa2223bee>] __slab_free+0x1ce/0x290
>     [542909.741878]  [<ffffffffa2272e58>] ? generic_setxattr+0x68/0x80
>     [542909.741883]  [<ffffffffa2273635>] ? __vfs_setxattr_noperm+0x65/0x1b0
>     [542909.741889]  [<ffffffffa232b7ae>] ? evm_inode_setxattr+0xe/0x10
>     [542909.741892]  [<ffffffffa21ddcb5>] ? kvfree+0x35/0x40
>     [542909.741895]  [<ffffffffa2223db6>] kfree+0x106/0x140
>     [542909.741899]  [<ffffffffa21ddcb5>] kvfree+0x35/0x40
>     [542909.741902]  [<ffffffffa227399b>] setxattr+0x15b/0x1e0
>     [542909.741909]  [<ffffffffa225c3ed>] ? putname+0x3d/0x60
>     [542909.741914]  [<ffffffffa225d602>] ? user_path_at_empty+0x72/0xc0
>     [542909.741920]  [<ffffffffa224d828>] ? __sb_start_write+0x58/0x120
>     [542909.741926]  [<ffffffffa22802f1>] ? do_utimes+0xf1/0x180
>     [542909.741930]  [<ffffffffa2273c87>] SyS_setxattr+0xb7/0x100
>     [542909.741937]  [<ffffffffa278dede>] system_call_fastpath+0x25/0x2a
>     [542909.741940] 
> =============================================================================
>     [542909.741942] BUG kmalloc-8 (Tainted: G    B      OE  ------------  ): 
> Wrong object count. Counter is 75 but counted were 95
>     [542909.741944] 
> -----------------------------------------------------------------------------
>     
>     [542909.741947] INFO: Slab 0xffffe0933440b3c0 objects=102 used=75 
> fp=0xffff9bb6902cf558 flags=0x6fffff00000081
>     [542909.741951] CPU: 25 PID: 50461 Comm: pool Kdump: loaded Tainted: G    
> B      OE  ------------   3.10.0-1062.9.1.el7.x86_64 #1
>     [542909.741953] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 
> 10/21/2019
>     [542909.741954] Call Trace:
>     [542909.741958]  [<ffffffffa277ac23>] dump_stack+0x19/0x1b
>     [542909.741961]  [<ffffffffa2221b54>] slab_err+0xb4/0xe0
>     [542909.741969]  [<ffffffffa2030a1e>] ? show_stack+0x4e/0x60
>     [542909.741972]  [<ffffffffa2221561>] ? print_trailer+0x161/0x280
>     [542909.741975]  [<ffffffffa2221f85>] on_freelist+0x1c5/0x270
>     [542909.742227]  [<ffffffffa27774cc>] free_debug_processing+0x18d/0x270
>     [542909.742479]  [<ffffffffa21ddcb5>] ? kvfree+0x35/0x40
>     [542909.742483]  [<ffffffffa2223bee>] __slab_free+0x1ce/0x290
>     [542909.742488]  [<ffffffffa2272e58>] ? generic_setxattr+0x68/0x80
>     [542909.742491]  [<ffffffffa2273635>] ? __vfs_setxattr_noperm+0x65/0x1b0
>     [542909.742495]  [<ffffffffa232b7ae>] ? evm_inode_setxattr+0xe/0x10
>     [542909.742498]  [<ffffffffa21ddcb5>] ? kvfree+0x35/0x40
>     [542909.742501]  [<ffffffffa2223db6>] kfree+0x106/0x140
>     [542909.742504]  [<ffffffffa21ddcb5>] kvfree+0x35/0x40
>     [542909.742508]  [<ffffffffa227399b>] setxattr+0x15b/0x1e0
>     [542909.742511]  [<ffffffffa225c3ed>] ? putname+0x3d/0x60
>     [542909.742515]  [<ffffffffa225d602>] ? user_path_at_empty+0x72/0xc0
>     [542909.742519]  [<ffffffffa224d828>] ? __sb_start_write+0x58/0x120
>     [542909.742523]  [<ffffffffa22802f1>] ? do_utimes+0xf1/0x180
>     [542909.742527]  [<ffffffffa2273c87>] SyS_setxattr+0xb7/0x100
>     [542909.742530]  [<ffffffffa278dede>] system_call_fastpath+0x25/0x2a
>     [542909.742533] FIX kmalloc-8: Object count adjusted.
>     [542909.742536] 
> =============================================================================
>     [542909.742538] BUG kmalloc-8 (Tainted: G    B      OE  ------------  ): 
> Redzone overwritten
>     [542909.742539] 
> -----------------------------------------------------------------------------
>     
>     [542909.742543] INFO: 0xffff9bb6902cf858-0xffff9bb6902cf85f. First byte 
> 0x4c instead of 0xcc
>     [542909.742545] INFO: Slab 0xffffe0933440b3c0 objects=102 used=95 
> fp=0xffff9bb6902cf558 flags=0x6fffff00000081
>     [542909.742547] INFO: Object 0xffff9bb6902cf850 @offset=2128 
> fp=0x7f7f1b36102c7c10
>     
>     [542909.742550] Redzone ffff9bb6902cf848: cc cc cc cc cc cc cc cc         
>                  ........
>     [542909.742552] Object ffff9bb6902cf850: d0 0b d6 0b 88 01 00 25          
>                 .......%
>     [542909.742555] Redzone ffff9bb6902cf858: 4c 4c 4c 4c 4c 4c 4c 4c         
>                  LLLLLLLL
>     [542909.742557] Padding ffff9bb6902cf868: 5a 5a 5a 5a 5a 5a 5a 5a         
>                  ZZZZZZZZ
>     [542909.742560] CPU: 25 PID: 50461 Comm: pool Kdump: loaded Tainted: G    
> B      OE  ------------   3.10.0-1062.9.1.el7.x86_64 #1
>     [542909.742562] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 
> 10/21/2019
>     [542909.742563] Call Trace:
>     [542909.742567]  [<ffffffffa277ac23>] dump_stack+0x19/0x1b
>     [542909.742570]  [<ffffffffa2221561>] print_trailer+0x161/0x280
>     [542909.742573]  [<ffffffffa22217ef>] check_bytes_and_report+0xcf/0x110
>     [542909.742576]  [<ffffffffa222237d>] check_object+0x1dd/0x2a0
>     [542909.742580]  [<ffffffffa27773cc>] free_debug_processing+0x8d/0x270
>     [542909.742583]  [<ffffffffa21ddcb5>] ? kvfree+0x35/0x40
>     [542909.742586]  [<ffffffffa2223bee>] __slab_free+0x1ce/0x290
>     [542909.742590]  [<ffffffffa2272e58>] ? generic_setxattr+0x68/0x80
>     [542909.742593]  [<ffffffffa2273635>] ? __vfs_setxattr_noperm+0x65/0x1b0
>     [542909.742596]  [<ffffffffa232b7ae>] ? evm_inode_setxattr+0xe/0x10
>     [542909.742599]  [<ffffffffa21ddcb5>] ? kvfree+0x35/0x40
>     [542909.742602]  [<ffffffffa2223db6>] kfree+0x106/0x140
>     [542909.742606]  [<ffffffffa21ddcb5>] kvfree+0x35/0x40
>     [542909.742609]  [<ffffffffa227399b>] setxattr+0x15b/0x1e0
>     [542909.742613]  [<ffffffffa225c3ed>] ? putname+0x3d/0x60
>     [542909.742617]  [<ffffffffa225d602>] ? user_path_at_empty+0x72/0xc0
>     [542909.742621]  [<ffffffffa224d828>] ? __sb_start_write+0x58/0x120
>     [542909.742624]  [<ffffffffa22802f1>] ? do_utimes+0xf1/0x180
>     [542909.742628]  [<ffffffffa2273c87>] SyS_setxattr+0xb7/0x100
>     [542909.742631]  [<ffffffffa278dede>] system_call_fastpath+0x25/0x2a
>     [542909.742635] FIX kmalloc-8: Restoring 
> 0xffff9bb6902cf858-0xffff9bb6902cf85f=0xcc
>     
>     [542909.742648] FIX kmalloc-8: Object at 0xffff9bb6902cf850 not freed
>     [542909.763926] general protection fault: 0000 [#1] SMP 
>     [542909.792826] Modules linked in: tcp_diag inet_diag fuse nfsd mgc(OE) 
> lustre(OE) lmv(OE) mdc(OE) fid(OE) osc(OE) lov(OE) fld(OE) ko2iblnd(OE) 
> ptlrpc(OE) obdclass(OE) cts lnet(OE) rpcsec_gss_krb5 nfsv4 dns_resolver 
> libcfs(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) 
> ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE) mlx5_core(OE) mlxfw(OE) 
> mlx4_en(OE) ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG 
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_recent xt_conntrack 
> nf_conntrack iptable_filter mlx4_ib(OE) dm_mirror dm_region_hash dm_log 
> dm_mod ib_uverbs(OE) ib_core(OE) sb_edac intel_powerclamp coretemp intel_rapl 
> iosf_mbi kvm_intel mgag200 mlx4_core(OE) iTCO_wdt iTCO_vendor_support ttm kvm 
> drm_kms_helper irqbypass syscopyarea sysfillrect crc32_pclmul sysimgblt 
> crc32c_intel
>     [542910.218156]  fb_sys_fops mlx_compat(OE) ghash_clmulni_intel drm 
> aesni_intel lrw gf128mul glue_helper ses ablk_helper devlink enclosure cryptd 
> drm_panel_orientation_quirks hpwdt i2c_i801 pcspkr pcc_cpufreq wmi ioatdma 
> ipmi_si acpi_power_meter ipmi_devintf ipmi_msghandler lpc_ich knem(OE) 
> binfmt_misc auth_rpcgss ip_tables smartpqi bridge stp llc xfs isci libsas 
> qla3xxx e1000e igb i2c_algo_bit megaraid_sas aacraid aic79xx ata_piix mpt2sas 
> raid_class mptspi scsi_transport_spi mptsas mptscsih mptbase arcmsr ahci 
> libahci sata_nv sata_svw bnx2x libcrc32c bnx2 ext4 mbcache jbd2 sata_sil 
> libata tg3 e1000 nfsv3 nfs_acl nfs lockd grace sunrpc fscache tun sd_mod 
> crc_t10dif crct10dif_generic sg ixgbe crct10dif_pclmul crct10dif_common hpsa 
> dca mdio hpilo ptp scsi_transport_sas pps_core [last unloaded: 
> ipmi_msghandler]
>     [542910.624054] 
>     [542910.625230] CPU: 27 PID: 25861 Comm: gdbus Kdump: loaded Tainted: G   
>  B      OE  ------------   3.10.0-1062.9.1.el7.x86_64 #1
>     [542910.685731] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 
> 10/21/2019
>     [542910.724144] task: ffff9ba5b5bc1070 ti: ffff9ba6067c0000 task.ti: 
> ffff9ba6067c0000
>     [542910.768155] RIP: 0010:[<ffffffffa21f711b>]  [<ffffffffa21f711b>] 
> find_vma+0x3b/0x60
>     [542910.810986] RSP: 0000:ffff9ba6067c3ea8  EFLAGS: 00010202
>     [542910.840760] RAX: ffff9bb72066f1b8 RBX: 0000000000000004 RCX: 
> ffff9ba6067c3fd8
>     [542910.880983] RDX: 7fff9bb7c2fec608 RSI: 0000000000682888 RDI: 
> ffff9ba002a34b00
>     [542910.919946] RBP: ffff9ba6067c3ea8 R08: 0000000000000001 R09: 
> 0000000000000000
>     [542910.958846] R10: 000000000000001c R11: 00002aaaae480b40 R12: 
> 00000000000000a8
>     [542910.998593] R13: 0000000000682888 R14: ffff9ba6067c3f58 R15: 
> ffff9ba002a34b00
>     [542911.038992] FS:  00002aaabc395700(0000) GS:ffff9bb97f140000(0000) 
> knlGS:0000000000000000
>     [542911.095715] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>     [542911.155694] CR2: 0000000000682888 CR3: 0000003214b00000 CR4: 
> 00000000003607e0
>     [542911.202949] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
> 0000000000000000
>     [542911.265589] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
> 0000000000000400
>     [542911.315387] Call Trace:
>     [542911.355844]  [<ffffffffa278857d>] __do_page_fault+0x13d/0x500
>     [542911.413348]  [<ffffffffa2788975>] do_page_fault+0x35/0x90
>     [542911.455443]  [<ffffffffa2784778>] page_fault+0x28/0x30
>     [542911.495307] Code: 74 06 48 39 70 08 77 40 48 8b 57 08 31 c0 48 85 d2 
> 75 18 eb 2e 0f 1f 00 48 3b 72 e0 48 8d 42 e0 73 1d 48 8b 52 10 48 85 d2 74 0f 
> <48> 3b 72 e8 72 e7 48 8b 52 08 48 85 d2 75 f1 48 85 c0 74 04 48 
>     [542911.665436] RIP  [<ffffffffa21f711b>] find_vma+0x3b/0x60
>     [542911.695917]  RSP <ffff9ba6067c3ea8>
>     
>     -- 
>     # Dr. Christopher Mountford
>     # System specialist - Research Computing/HPC
>     # 
>     # IT services,
>     #     University of Leicester, University Road, 
>     #     Leicester, LE1 7RH, UK 
>     #
>     # t: 0116 252 3471
>     # e: [email protected]
>     
>     _______________________________________________
>     lustre-discuss mailing list
>     [email protected]
>     http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>     
> 

-- 
# Dr. Christopher Mountford
# System specialist - Research Computing/HPC
# 
# IT services,
#     University of Leicester, University Road, 
#     Leicester, LE1 7RH, UK 
#
# t: 0116 252 3471
# e: [email protected]

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
