oracleloyall commented on code in PR #1435:
URL: https://github.com/apache/cloudberry/pull/1435#discussion_r2525646177
##########
contrib/interconnect/udp/ic_udpifc.c:
##########
@@ -6251,17 +6265,31 @@ checkDeadlock(ChunkTransportStateEntry *pChunkEntry, MotionConn *mConn)
ic_control_info.lastDeadlockCheckTime = now;
ic_statistics.statusQueryMsgNum++;
+ if (Gp_interconnect_fc_method == INTERCONNECT_FC_METHOD_LOSS_ADVANCE && pollAcks(transportStates, pEntry->txfd, 50))
+ {
+ handleAcks(transportStates, pChunkEntry, false);
+ conn->deadlockCheckBeginTime = now;
+ }
+
/* check network error. */
- if ((now - conn->deadlockCheckBeginTime) > ((uint64) Gp_interconnect_transmit_timeout * 1000 * 1000))
+ if ((now - conn->deadlockCheckBeginTime) > ((uint64) 600 * 1000 * 1000))
Review Comment:
If capacity is expanded only once the threshold is reached
(Gp_interconnect_transmit_timeout, one hour by default), an error is raised at
that point, so there is no time left for the expansion to take effect. Instead,
after ten minutes of a suspected deadlock we can apply the expansion logic:
conn->capacity += 1; If it really is a deadlock, data can then be sent. If it
is a network anomaly and the data packet still cannot be sent, the one-way flow
of data lets us identify the problem in time. Without this, diagnosis is
difficult because only heartbeat packets are exchanged, whereas data packets
can have retry logic.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]