Will-Lo commented on code in PR #3520:
URL: https://github.com/apache/gobblin/pull/3520#discussion_r896983202


##########
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java:
##########
@@ -963,7 +963,8 @@ private void submitJob(DagNode<JobExecutionPlan> dagNode) {
         // By this point the quota is allocated, so it's imperative to increment as missing would introduce the potential to decrement below zero upon quota release.
         // Quota release is guaranteed, despite failure, because exception handling within would mark the job FAILED.
         // When the ensuing kafka message spurs DagManager processing, the quota is released and the counts decremented
-        if (this.metricContext != null) {
+        // Ensure that we do not double increment for flows that are retried
+        if (this.metricContext != null && dagNode.getValue().getCurrentAttempts() == 1) {

Review Comment:
   No. The decrement does not need to check the attempt number, because a job
   only hits an end state, and therefore only decrements, on its final attempt.
   If a job is retried automatically, its job status shows up as PENDING_RETRY
   rather than as failed, and it is resubmitted through `submitJob()` instead of
   going through `onJobFinish()`, so the decrement path is never reached on a
   retry.
   
   
https://github.com/apache/gobblin/blob/b726a606cea3deb567b1fdeeba9acbcc220e6d30/gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/KafkaJobStatusMonitor.java#L269
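
The accounting described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual `DagManager` code: the class `RetryCounterSketch`, the `Status` enum, and the `onStatus()` and `running()` methods are hypothetical names, and a plain `AtomicInteger` stands in for the real metric.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RetryCounterSketch {
    // Hypothetical stand-ins for the job statuses discussed in the comment.
    enum Status { RUNNING, PENDING_RETRY, FAILED, COMPLETE }

    // Stand-in for the running-jobs metric (assumption: a simple counter).
    private final AtomicInteger runningJobs = new AtomicInteger();

    // Mirrors the guarded increment in the diff: count a job only on its first attempt.
    void submitJob(int currentAttempts) {
        if (currentAttempts == 1) {
            runningJobs.incrementAndGet();
        }
    }

    // Decrement only when the job reaches an end state; PENDING_RETRY is not one,
    // so an automatic retry never passes through the decrement.
    void onStatus(Status status) {
        if (status == Status.FAILED || status == Status.COMPLETE) {
            runningJobs.decrementAndGet();
        }
    }

    int running() {
        return runningJobs.get();
    }

    public static void main(String[] args) {
        RetryCounterSketch counter = new RetryCounterSketch();
        counter.submitJob(1);                    // first attempt: incremented
        counter.onStatus(Status.PENDING_RETRY);  // retry scheduled: no decrement
        counter.submitJob(2);                    // resubmission: not incremented again
        counter.onStatus(Status.COMPLETE);       // final attempt ends: decremented
        System.out.println(counter.running());
    }
}
```

In this sketch each job contributes exactly one increment and one decrement regardless of how many times it is retried, which is why the attempt check is needed only on the increment side.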
   



