Chen Guo created GOBBLIN-998:
--------------------------------

             Summary: ExecutionStatus should be reset to PENDING before a job 
retries
                 Key: GOBBLIN-998
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-998
             Project: Apache Gobblin
          Issue Type: Bug
            Reporter: Chen Guo


In KafkaJobStatusMonitor.modifyStateIfRetryRequired, when the state is FAILED 
and currentAttempts < maxAttempts, the ExecutionStatus is set to RUNNING.

However, because of the change checked in for 
GOBBLIN-974 ([https://github.com/apache/incubator-gobblin/blob/9f50a2563cc257039da44018663b6b9e119fb499/gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/KafkaJobStatusMonitor.java#L159]),
a currentAttempts update carried by a lower-order event (such as Orchestrated) 
can no longer be consumed to update the jobState file once the status is 
RUNNING. As a result, DagManagerThread retries failed jobs indefinitely when it 
runs pollAndAdvanceDag.
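
To illustrate why the Orchestrated event is dropped, here is a minimal sketch 
of an ordering guard of the kind GOBBLIN-974 introduced. The enum ordering, the 
class name StatusOrderSketch and the method accepts are assumptions made for 
illustration, not the actual KafkaJobStatusMonitor code.

```java
final class StatusOrderSketch {
    // Assumed ordering, lowest to highest. The real ordering is defined in
    // KafkaJobStatusMonitor and may differ from this simplified list.
    enum ExecutionStatus { COMPILED, PENDING, ORCHESTRATED, RUNNING, COMPLETE, FAILED, CANCELLED }

    /** Returns true if an incoming event may overwrite the stored status. */
    static boolean accepts(ExecutionStatus stored, ExecutionStatus incoming) {
        // Events that rank below the stored status are treated as out of order.
        return incoming.ordinal() >= stored.ordinal();
    }

    public static void main(String[] args) {
        // Stored status RUNNING: the Orchestrated event for the retry is rejected.
        System.out.println(accepts(ExecutionStatus.RUNNING, ExecutionStatus.ORCHESTRATED)); // false
        // Stored status PENDING: the same event is accepted.
        System.out.println(accepts(ExecutionStatus.PENDING, ExecutionStatus.ORCHESTRATED)); // true
    }
}
```

Under this assumed ordering, a job whose status was reset to RUNNING never 
sees its Orchestrated event applied, so its attempt count stays stale.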

 

The fix is to set ExecutionStatus to PENDING instead of RUNNING.
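
A rough sketch of the proposed behaviour follows, assuming a simplified job 
state. The JobState class, its shouldRetry flag, the applyOrchestrated helper 
and the enum ordering are hypothetical stand-ins; only the method name 
modifyStateIfRetryRequired and the PENDING reset come from this issue.

```java
final class RetryResetSketch {
    // Assumed ordering, lowest to highest (simplified for illustration).
    enum ExecutionStatus { COMPILED, PENDING, ORCHESTRATED, RUNNING, COMPLETE, FAILED, CANCELLED }

    /** Hypothetical stand-in for the persisted job state. */
    static final class JobState {
        ExecutionStatus status;
        int currentAttempts;
        final int maxAttempts;
        boolean shouldRetry;

        JobState(ExecutionStatus status, int currentAttempts, int maxAttempts) {
            this.status = status;
            this.currentAttempts = currentAttempts;
            this.maxAttempts = maxAttempts;
        }
    }

    /** Marks a failed job for retry and resets its status to PENDING. */
    static void modifyStateIfRetryRequired(JobState state) {
        if (state.status == ExecutionStatus.FAILED && state.currentAttempts < state.maxAttempts) {
            state.shouldRetry = true;
            // Proposed fix: PENDING rather than RUNNING, so that the later
            // Orchestrated event is not rejected as out of order.
            state.status = ExecutionStatus.PENDING;
        }
    }

    /** Applies an Orchestrated event with a new attempt count, subject to the ordering guard. */
    static void applyOrchestrated(JobState state, int attempts) {
        if (ExecutionStatus.ORCHESTRATED.ordinal() < state.status.ordinal()) {
            return; // lower-order event: dropped
        }
        state.status = ExecutionStatus.ORCHESTRATED;
        state.currentAttempts = attempts;
    }

    public static void main(String[] args) {
        JobState state = new JobState(ExecutionStatus.FAILED, 1, 3);
        modifyStateIfRetryRequired(state);
        applyOrchestrated(state, 2);
        System.out.println(state.status + " attempts=" + state.currentAttempts);
    }
}
```

In this sketch the attempt count advances on each retry, so the 
currentAttempts < maxAttempts check eventually fails and the retries stop.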



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
