Hi,

Since yesterday we have been seeing frequent MDS crashes, all of them showing 
an LBUG in ldlm_flock_deadlock().
Servers are running Lustre 2.15.4 on AlmaLinux 8.9; the MDT and MGT are on 
LDISKFS and the OSTs are on ZFS.
Clients are mostly CentOS 7.9 with Lustre client 2.15.4.

For one of these crashes we captured a complete coredump, in case anyone 
wants to take a look.

Thanks,

Lixin.

[15817.464501] LustreError: 22687:0:(ldlm_flock.c:230:ldlm_flock_deadlock()) 
ASSERTION( req != lock ) failed:
[15817.474247] LustreError: 22687:0:(ldlm_flock.c:230:ldlm_flock_deadlock()) 
LBUG
[15817.481497] Pid: 22687, comm: mdt01_003 4.18.0-513.9.1.el8_lustre.x86_64 #1 
SMP Sat Dec 23 05:23:32 UTC 2023
[15817.491318] Call Trace TBD:
[15817.494137] [<0>] libcfs_call_trace+0x6f/0xa0 [libcfs]
[15817.499297] [<0>] lbug_with_loc+0x3f/0x70 [libcfs]
[15817.504097] [<0>] ldlm_flock_deadlock.isra.10+0x1fb/0x240 [ptlrpc]
[15817.510398] [<0>] ldlm_process_flock_lock+0x289/0x1f90 [ptlrpc]
[15817.516402] [<0>] ldlm_lock_enqueue+0x2a5/0xaa0 [ptlrpc]
[15817.521813] [<0>] ldlm_handle_enqueue0+0x634/0x1520 [ptlrpc]
[15817.527562] [<0>] tgt_enqueue+0xa4/0x220 [ptlrpc]
[15817.532368] [<0>] tgt_request_handle+0xccd/0x1a20 [ptlrpc]
[15817.537949] [<0>] ptlrpc_server_handle_request+0x323/0xbe0 [ptlrpc]
[15817.544311] [<0>] ptlrpc_main+0xbec/0x1530 [ptlrpc]
[15817.549294] [<0>] kthread+0x134/0x150
[15817.552966] [<0>] ret_from_fork+0x1f/0x40
[15817.556980] Kernel panic - not syncing: LBUG
[15817.561248] CPU: 23 PID: 22687 Comm: mdt01_003 Kdump: loaded Tainted: G      
     OE    --------- -  - 4.18.0-513.9.1.el8_lustre.x86_64 #1
[15817.573669] Hardware name: Dell Inc. PowerEdge R640/0CRT1G, BIOS 2.19.1 
06/04/2023
[15817.581235] Call Trace:
[15817.583687]  dump_stack+0x41/0x60
[15817.587007]  panic+0xe7/0x2ac
[15817.589979]  ? ret_from_fork+0x1f/0x40
[15817.593733]  lbug_with_loc.cold.8+0x18/0x18 [libcfs]
[15817.598714]  ldlm_flock_deadlock.isra.10+0x1fb/0x240 [ptlrpc]
[15817.604557]  ldlm_process_flock_lock+0x289/0x1f90 [ptlrpc]
[15817.610121]  ? lustre_msg_get_flags+0x2a/0x90 [ptlrpc]
[15817.615346]  ? lustre_msg_add_version+0x21/0xa0 [ptlrpc]
[15817.620745]  ldlm_lock_enqueue+0x2a5/0xaa0 [ptlrpc]
[15817.625702]  ldlm_handle_enqueue0+0x634/0x1520 [ptlrpc]
[15817.631007]  tgt_enqueue+0xa4/0x220 [ptlrpc]
[15817.635365]  tgt_request_handle+0xccd/0x1a20 [ptlrpc]
[15817.640503]  ? ptlrpc_nrs_req_get_nolock0+0xff/0x1f0 [ptlrpc]
[15817.646337]  ptlrpc_server_handle_request+0x323/0xbe0 [ptlrpc]
[15817.652256]  ptlrpc_main+0xbec/0x1530 [ptlrpc]
[15817.656791]  ? ptlrpc_wait_event+0x590/0x590 [ptlrpc]
[15817.661928]  kthread+0x134/0x150
[15817.665161]  ? set_kthread_struct+0x50/0x50
[15817.669346]  ret_from_fork+0x1f/0x40
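
For context: the LBUG is the LASSERT(req != lock) check in 
ldlm_flock_deadlock(), which walks the chain of blocking flock owners looking 
for a wait cycle before granting a new flock request. Below is a minimal 
userspace sketch of that walk; this is my simplification, not the actual 
Lustre source (the struct layout and the single "blocker" pointer are 
assumptions for illustration; the real server follows the chain through each 
export's flock hash keyed by owner). The assert() marks the invariant that 
LBUGs on our MDS, i.e. the request lock turning up in its own blocking chain.

    /* Simplified model of the owner wait-for walk in ldlm_flock_deadlock().
     * NOT the Lustre implementation; types and fields are hypothetical. */
    #include <assert.h>
    #include <stdio.h>

    struct flock_lock {
            unsigned long owner;         /* lock owner id */
            struct flock_lock *blocker;  /* lock this one waits behind, or NULL */
    };

    /* Return 1 if enqueueing 'req' behind 'bl' would close a wait cycle. */
    static int flock_deadlock(struct flock_lock *req, struct flock_lock *bl)
    {
            struct flock_lock *lock = bl;

            while (lock != NULL) {
                    /* the invariant that LBUGs in our trace: the request
                     * lock must never appear in the chain it is walking */
                    assert(req != lock);
                    if (lock->owner == req->owner)
                            return 1;  /* chain leads back to requester */
                    lock = lock->blocker;
            }
            return 0;  /* chain ended without a cycle */
    }

    int main(void)
    {
            struct flock_lock a   = { .owner = 1, .blocker = NULL };
            struct flock_lock b   = { .owner = 2, .blocker = &a };
            struct flock_lock req = { .owner = 1, .blocker = NULL };

            /* owner 1 requests behind b (owner 2), which waits on owner 1 */
            printf("deadlock: %d\n", flock_deadlock(&req, &b));
            return 0;
    }

If that reading is right, hitting req during the walk would presumably mean 
the request was already visible in the blocking structures while still being 
enqueued, but I will leave the real diagnosis to people who know this code.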

