OneSizeFitsQuorum commented on code in PR #11372:
URL: https://github.com/apache/iotdb/pull/11372#discussion_r1369940282


##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/scheduler/AsyncPlanNodeSender.java:
##########
@@ -135,26 +141,55 @@ public List<TSStatus> getFailureStatusList() {
     return failureStatusList;
   }
 
-  public Future<FragInstanceDispatchResult> getResult() {
-    for (Map.Entry<Integer, TSendSinglePlanNodeResp> entry : 
instanceId2RespMap.entrySet()) {
-      if (!entry.getValue().accepted) {
-        logger.warn(
-            "dispatch write failed. status: {}, code: {}, message: {}, node 
{}",
-            entry.getValue().status,
-            TSStatusCode.representOf(entry.getValue().status.code),
-            entry.getValue().message,
-            
instances.get(entry.getKey()).getHostDataNode().getInternalEndPoint());
-        if (entry.getValue().getStatus() == null) {
-          return immediateFuture(
-              new FragInstanceDispatchResult(
-                  RpcUtils.getStatus(
-                      TSStatusCode.WRITE_PROCESS_ERROR, 
entry.getValue().getMessage())));
-        } else {
-          return immediateFuture(new 
FragInstanceDispatchResult(entry.getValue().getStatus()));
-        }
-      }
+  public boolean needRetry() {
+    // retried FI list is not empty and data region replica number is greater 
than 1
+    return !needRetryInstanceIndex.isEmpty()

Review Comment:
   Do we need to retry for single replica? Perhaps a proper retry can be useful 
for some restart scenarios without data loss



##########
iotdb-core/consensus/src/main/java/org/apache/iotdb/consensus/iot/client/DispatchLogHandler.java:
##########
@@ -69,20 +70,22 @@ public void onComplete(TSyncLogEntriesRes response) {
     logDispatcherThreadMetrics.recordSyncLogTimePerRequest(System.nanoTime() - 
createTime);
   }
 
-  private boolean needRetry(int statusCode) {
+  public static boolean needRetry(int statusCode) {
     return statusCode == TSStatusCode.INTERNAL_SERVER_ERROR.getStatusCode()
         || statusCode == TSStatusCode.SYSTEM_READ_ONLY.getStatusCode()
         || statusCode == TSStatusCode.WRITE_PROCESS_REJECT.getStatusCode();
   }
 
   @Override
   public void onError(Exception exception) {
-    logger.warn(
-        "Can not send {} to peer for {} times {} because {}",
-        batch,
-        thread.getPeer(),
-        ++retryCount,
-        exception);
+    if (logger.isWarnEnabled()) {
+      logger.warn(
+          "Can not send {} to peer for {} times {} because {}",
+          batch,
+          thread.getPeer(),
+          ++retryCount,

Review Comment:
   We should move the increment logic here outside of the if



##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/scheduler/AsyncSendPlanNodeHandler.java:
##########
@@ -80,4 +94,15 @@ public void onError(Exception e) {
       }
     }
   }
+
+  private boolean needRetry(Exception e) {
+    Throwable rootCause = ExceptionUtils.getRootCause(e);
+    // if the exception is SocketException and its error message is Broken 
pipe, it means that the
+    // remote node may go offline
+    return isConnectionBroken(rootCause);
+  }
+
+  private boolean needRetry(TSendSinglePlanNodeResp resp) {
+    return !resp.accepted && DispatchLogHandler.needRetry(resp.status.code);

Review Comment:
   maybe we can move DispatchLogHandler.needRetry to common module?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to