- **status**: review --> fixed
- **Comment**:
commit 6a9481a0c76ebfd8ab433d924fab17eaae724af1 (HEAD -> develop,
origin/develop, ticket-3127)
Author: thuan.tran <[email protected]>
Date: Wed Dec 4 10:57:49 2019 +0700
mds: Using timer to continue sending queued messages [#3127]
- In overflow, the ChunkAck receive handler may get stuck retrying to send
pending messages, so ChunkAcks that arrive later cannot be processed.
- Instead of retrying to send pending messages, reuse the chunk ack send timer
to trigger sending of any pending messages. This way, even if no more Nack
or ChunkAck events arrive, pending messages will still be sent by the timer.
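
The idea in the commit message can be illustrated with a small, hypothetical Python model. It is not the OpenSAF implementation: the class `TimerDrivenSender`, the `transmit` callable and the handler names are assumptions made for this sketch. It also assumes the ChunkAck handler still makes one non-blocking send attempt; the only point shown is that a failed send returns immediately and a later timer tick drains the queue.

```python
from collections import deque


class TimerDrivenSender:
    """Toy model: pending messages are drained without a blocking retry loop."""

    def __init__(self, transmit):
        # transmit(seq, payload) is a hypothetical callable that returns
        # False while the transport is in overflow and True on success.
        self.transmit = transmit
        self.pending = deque()
        self.next_seq = 1

    def enqueue(self, payload):
        self.pending.append(payload)

    def flush_pending(self):
        """Send queued messages in order; stop at the first failure, no retry."""
        sent = 0
        while self.pending:
            if not self.transmit(self.next_seq, self.pending[0]):
                break  # give up for now, the next timer tick tries again
            self.pending.popleft()
            self.next_seq += 1
            sent += 1
        return sent

    def on_chunk_ack(self, fseq):
        # fseq is unused in this toy model. One attempt only, so the handler
        # returns quickly and later ChunkAck events can still be processed.
        return self.flush_pending()

    def on_timer(self):
        # Timer tick: sends pending messages even if no further
        # Nack or ChunkAck event ever arrives.
        return self.flush_pending()
```

In this model, a sender whose transport is busy when the ChunkAck arrives simply keeps its queue, and the next call to `on_timer()` sends whatever has become sendable.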
---
**[tickets:#3127] mds: ckpttest 20 11 fail with mds fctrl enable**
**Status:** fixed
**Milestone:** 5.20.01
**Created:** Wed Dec 04, 2019 08:42 AM UTC by Thuan Tran
**Last Updated:** Thu Dec 05, 2019 03:51 AM UTC
**Owner:** Thuan Tran
ckpttest 20 11 fails.
ckptnd is stuck in the Receive ChunkAck handler, retrying to send the first
unsent message in a loop of 100 attempts (5 s in total).
No messages are sent during this period, which causes the reassemble timer to
expire.
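
The cost of such a retry loop can be sketched with a small, hypothetical Python model. The function name and parameters are assumptions for illustration only; the 50 ms delay per attempt is derived from the figures above (5 s over 100 attempts), not taken from the MDS source.

```python
import time


def blocking_retry_send(transmit, max_retries=100, retry_delay_ms=50, sleep=None):
    """Retry one send until it succeeds or the retries are exhausted.

    Returns (ok, blocked_ms). While this loop runs, the caller processes
    no other event, which is the stall described in the ticket.
    """
    sleep = sleep or time.sleep
    blocked_ms = 0
    for _ in range(max_retries):
        if transmit():
            return True, blocked_ms
        sleep(retry_delay_ms / 1000)  # wait before the next attempt
        blocked_ms += retry_delay_ms
    return False, blocked_ms
```

With a transport that never accepts the message, the defaults give 100 attempts and 5000 ms of blocked time. Passing a stub as `sleep` lets the model run without actually waiting.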
~~~
<143>1 2019-12-04T03:39:40.052954+01:00 PL-3 osafckptnd 241 mds.log [meta
sequenceId="24184"] FCTRL: process flow event start [evt:6]
<143>1 2019-12-04T03:39:40.052966+01:00 PL-3 osafckptnd 241 mds.log [meta
sequenceId="24185"] FCTRL: [me] <-- [node:1001001, ref:3127698189],
RcvChkAck[fseq:649, chunk:3], sndwnd[acked:646, send:789, nacked:9223353],
queue[size:142]
<141>1 2019-12-04T03:39:40.053025+01:00 PL-3 osafckptnd 241 mds.log [meta
sequenceId="24186"] FCTRL: [me] --> [node:1001001, ref:3127698189],
SndQData[fseq:680, len:65262], sndwnd[acked:649, send:789, nacked:9027567]
<141>1 2019-12-04T03:39:40.053102+01:00 PL-3 osafckptnd 241 mds.log [meta
sequenceId="24188"] FCTRL: [me] --> [node:1001001, ref:3127698189],
SndQData[fseq:681, len:65262], sndwnd[acked:649, send:789, nacked:9027567]
<143>1 2019-12-04T03:39:40.053151+01:00 PL-3 osafckptnd 241 mds.log [meta
sequenceId="24190"] FCTRL: Receive ChunkAck
<141>1 2019-12-04T03:39:40.053407+01:00 PL-3 osafckptnd 241 mds.log [meta
sequenceId="24192"] FCTRL: [me] --> [node:1001001, ref:3127698189],
SndQData[fseq:682, len:65262], sndwnd[acked:649, send:789, nacked:9027567]
<143>1 2019-12-04T03:39:40.05344+01:00 PL-3 osafckptnd 241 mds.log [meta
sequenceId="24193"] FCTRL: process flow event end
<143>1 2019-12-04T03:39:40.053456+01:00 PL-3 osafckptnd 241 mds.log [meta
sequenceId="24194"] FCTRL: process flow event start [evt:6]
<143>1 2019-12-04T03:39:40.053471+01:00 PL-3 osafckptnd 241 mds.log [meta
sequenceId="24195"] FCTRL: [me] <-- [node:1001001, ref:3127698189],
RcvChkAck[fseq:652, chunk:3], sndwnd[acked:649, send:789, nacked:9027567],
queue[size:139]
<141>1 2019-12-04T03:39:40.053528+01:00 PL-3 osafckptnd 241 mds.log [meta
sequenceId="24196"] FCTRL: [me] --> [node:1001001, ref:3127698189],
SndQData[fseq:683, len:65262], sndwnd[acked:652, send:789, nacked:8831781]
<143>1 2019-12-04T03:39:40.054212+01:00 PL-3 osafckptnd 241 mds.log [meta
sequenceId="24199"] FCTRL: Receive ChunkAck
<143>1 2019-12-04T03:39:40.054928+01:00 PL-3 osafckptnd 241 mds.log [meta
sequenceId="24203"] FCTRL: Receive ChunkAck
<143>1 2019-12-04T03:39:40.795791+01:00 PL-3 osafckptnd 241 mds.log [meta
sequenceId="24207"] FCTRL: Receive ChunkAck
<143>1 2019-12-04T03:39:41.216893+01:00 PL-3 osafckptnd 241 mds.log [meta
sequenceId="24211"] FCTRL: Receive ChunkAck
~~~
Then ckpttest times out on the sndrsp message.
~~~
<139>1 2019-12-04T03:39:44.915533+01:00 SC-1 osafckptnd 383 mds.log [meta
sequenceId="21632"] MDTM: Tmr Mailbox Processing:Reassemble Tmr Hdl=0xfee00004
<139>1 2019-12-04T03:39:44.915561+01:00 SC-2 osafckptnd 340 mds.log [meta
sequenceId="18332"] MDTM: Tmr Mailbox Processing:Reassemble Tmr Hdl=0xfee00003
<139>1 2019-12-04T03:41:35.741697+01:00 PL-3 ckpttest 353 mds.log [meta
sequenceId="3386"] MDS_SND_RCV: Timeout occured
<139>1 2019-12-04T03:41:35.7418+01:00 PL-3 ckpttest 353 mds.log [meta
sequenceId="3387"] MDS_SND_RCV: Timeout or error occured on sndrsp message
<139>1 2019-12-04T03:41:35.741823+01:00 PL-3 ckpttest 353 mds.log [meta
sequenceId="3388"] MDS_SND_RCV: Adest=<0x0002030f,1253863792>
~~~