Copilot commented on code in PR #59760:
URL: https://github.com/apache/doris/pull/59760#discussion_r2681117255


##########
fe/fe-core/src/main/java/org/apache/doris/job/extensions/insert/streaming/StreamingInsertJob.java:
##########
@@ -516,10 +516,18 @@ protected void fetchMeta() {
                 offsetProvider.fetchRemoteMeta(new HashMap<>());
             }
         } catch (Exception ex) {
-            //todo: The job status = MANUAL_PAUSE_ERR, No need to set 
failureReason again
             log.warn("fetch remote meta failed, job id: {}", getJobId(), ex);
-            failureReason = new 
FailureReason(InternalErrorCode.GET_REMOTE_DATA_ERROR,
-                    "Failed to fetch meta, " + ex.getMessage());
+            if (this.getFailureReason() != null
+                    && 
!InternalErrorCode.MANUAL_PAUSE_ERR.equals(this.getFailureReason().getCode())) {

Review Comment:
   The condition logic is inverted. This checks if failureReason is NOT null 
and is NOT MANUAL_PAUSE_ERR, which means it will skip setting the failure 
reason when there's already an error. The logic should be checking if 
failureReason is null OR if it's not MANUAL_PAUSE_ERR. The correct condition 
should be: `if (this.getFailureReason() == null || 
!InternalErrorCode.MANUAL_PAUSE_ERR.equals(this.getFailureReason().getCode()))`.
 Otherwise, legitimate failures after an initial error will be silently ignored.
   ```suggestion
               if (this.getFailureReason() == null
                       || 
!InternalErrorCode.MANUAL_PAUSE_ERR.equals(this.getFailureReason().getCode())) {
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/job/extensions/insert/streaming/StreamingInsertJob.java:
##########
@@ -505,7 +505,7 @@ public List<AbstractStreamingTask> queryAllStreamTasks() {
         return tasks;
     }
 
-    protected void fetchMeta() {
+    protected void fetchMeta() throws JobException {

Review Comment:
   The method signature declares that it throws JobException, but all 
exceptions are caught internally and handled. The method never actually throws 
JobException. Either remove the throws clause from the method signature, or 
propagate the exception instead of catching it. This misleading signature can 
confuse callers about the method's behavior.
   ```suggestion
       protected void fetchMeta() {
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/job/extensions/insert/streaming/StreamingInsertJob.java:
##########
@@ -516,10 +516,18 @@ protected void fetchMeta() {
                 offsetProvider.fetchRemoteMeta(new HashMap<>());
             }
         } catch (Exception ex) {
-            //todo: The job status = MANUAL_PAUSE_ERR, No need to set 
failureReason again
             log.warn("fetch remote meta failed, job id: {}", getJobId(), ex);
-            failureReason = new 
FailureReason(InternalErrorCode.GET_REMOTE_DATA_ERROR,
-                    "Failed to fetch meta, " + ex.getMessage());
+            if (this.getFailureReason() != null
+                    && 
!InternalErrorCode.MANUAL_PAUSE_ERR.equals(this.getFailureReason().getCode())) {
+                // When a job is manually paused, it does not need to be set 
again,
+                // otherwise, it may be woken up by auto resume.
+                this.setFailureReason(
+                        new 
FailureReason(InternalErrorCode.GET_REMOTE_DATA_ERROR,
+                                "Failed to fetch meta, " + ex.getMessage()));
+                // If fetching meta fails, the job is paused
+                // and auto resume will automatically wake it up.
+                this.updateJobStatus(JobStatus.PAUSED);
+            }

Review Comment:
   The new error handling logic for fetch meta failures lacks test coverage. 
This code path involves complex conditional logic around failure reasons and 
auto-resume behavior that should be verified with unit tests. Consider adding 
tests that cover scenarios like: 1) fetch meta fails when no prior failure 
exists, 2) fetch meta fails when MANUAL_PAUSE_ERR is already set, 3) fetch meta 
fails when a different error code is already set.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to