What version of Spectrum Scale is running there? Do these errors appear
since your last version update?
Am 14.02.23 um 14:09 schrieb Walter Sklenka:
Dear Collegues!
May I ask if anyone has a hint what could be the reason for Critical
Thread Watchdog warnings for Disk Leases Threads?
Is this a “local node” Problem or a network problem ?
I see these messages sometimes arriving when NSD Servers which also
serve as NFS servers when they get under heavy NFS load
Following is an excerpt from mmfs.log.latest
2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040
seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical
Thread Watchdog]------------------
2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread)
is overloaded for more than 8 seconds
2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0
mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8)
2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294):
2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521
BaseMutexClass::release() + 0x12 at ??:0
2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext +
0xB154F7E646041C0E at ??:0
2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster
xxx-cluster.
2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680
seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster
xxx-cluster.
2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool
/dev/fs4vm all -L -Y
2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool
/dev/fs4vm all -L -Y
2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool
/dev/fs4vm all -L -Y
2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local)
lease renewal is overdue. Pinging to check if it is alive
2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address
10.20.30.2 ogpfs2-hs.local <c0n1>:[1] (socket 106) state: state=1
ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0
backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121
rttvar=69 sacked=0 retrans=0 reordering=3 lost=0
2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010
seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster
xxx-cluster.
2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool
/dev/fs4vm all -L -Y
Mit freundlichen Grüßen
*/Walter Sklenka/*
*/Technical Consultant/*
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org