William Lo created GOBBLIN-1784:
-----------------------------------

             Summary: Race condition where on service restart DagManager will 
lose track of dags
                 Key: GOBBLIN-1784
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1784
             Project: Apache Gobblin
          Issue Type: Bug
          Components: gobblin-service
            Reporter: William Lo
            Assignee: Abhishek Tiwari


Gobblin-as-a-Service has a bug where, on restart, the DagManager cleans up 
dags even if the corresponding flow event is never sent.
If the underlying notification system never delivers that event, the dag has 
already been cleaned up, so the job status stays permanently stuck in a 
running state.

The DagManager should therefore clean up its own reference to a dag only after 
it reads that the job status monitor has properly saved the final flow status. 
If no status has been received within some timeout (e.g. 5 minutes), the 
DagManager should re-emit the event in case it was lost.
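
A minimal sketch of the proposed behavior follows, as an illustration only. 
None of the names below (DagCleanupSketch, FlowStatus, statusStore, 
eventEmitter, resendTimeout) are taken from the Gobblin code base; they are 
hypothetical stand-ins for the DagManager, the status store written by the job 
status monitor, and the notification system. The sketch keeps a finished dag 
tracked until a final status is visible in the store, and re-emits the event 
once the timeout has elapsed.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Optional;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch: these types are NOT Gobblin APIs.
class DagCleanupSketch {
  enum FlowStatus { RUNNING, COMPLETE, FAILED, CANCELLED }

  // dagId -> time the final flow event was last emitted
  private final Map<String, Instant> pendingCleanup = new HashMap<>();
  // Assumed lookup into the store populated by the job status monitor.
  private final Function<String, Optional<FlowStatus>> statusStore;
  // Assumed hook that (re-)emits the final flow event for a dag.
  private final Consumer<String> eventEmitter;
  private final Duration resendTimeout;

  DagCleanupSketch(Function<String, Optional<FlowStatus>> statusStore,
                   Consumer<String> eventEmitter,
                   Duration resendTimeout) {
    this.statusStore = statusStore;
    this.eventEmitter = eventEmitter;
    this.resendTimeout = resendTimeout;
  }

  // Called when a dag finishes: emit the event but keep tracking the dag.
  void onDagFinished(String dagId, Instant now) {
    eventEmitter.accept(dagId);
    pendingCleanup.put(dagId, now);
  }

  // Called periodically: clean up confirmed dags, re-emit for overdue ones.
  void poll(Instant now) {
    Iterator<Map.Entry<String, Instant>> it = pendingCleanup.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Instant> entry = it.next();
      Optional<FlowStatus> status = statusStore.apply(entry.getKey());
      if (status.isPresent() && status.get() != FlowStatus.RUNNING) {
        // Final status has been saved, so dropping the reference is safe.
        it.remove();
      } else if (Duration.between(entry.getValue(), now).compareTo(resendTimeout) >= 0) {
        // No final status within the timeout: assume the event was lost.
        eventEmitter.accept(entry.getKey());
        entry.setValue(now);
      }
    }
  }

  boolean isTracked(String dagId) {
    return pendingCleanup.containsKey(dagId);
  }
}
```

In this sketch a restart would only need to rebuild pendingCleanup from 
persisted dag state before polling resumes; how that state is persisted is 
outside the scope of the example.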



--
This message was sent by Atlassian Jira
(v8.20.10#820010)