[ 
https://issues.apache.org/jira/browse/GOBBLIN-1672?focusedWorklogId=797416&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-797416
 ]

ASF GitHub Bot logged work on GOBBLIN-1672:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/Aug/22 22:00
            Start Date: 02/Aug/22 22:00
    Worklog Time Spent: 10m 
      Work Description: arjun4084346 commented on code in PR #3532:
URL: https://github.com/apache/gobblin/pull/3532#discussion_r936061609


##########
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java:
##########
@@ -1132,24 +1035,39 @@ private void cleanUp() {
           DagNode<JobExecutionPlan> dagNode = dagNodeList.poll();
           deleteJobState(dagId, dagNode);
         }
-        log.info("Dag {} has finished with status FAILED; Cleaning up dag from the state store.", dagId);
-        onFlowFailure(dagId);
+        Dag<JobExecutionPlan> dag = this.dags.get(dagId);
+        String status = TimingEvent.FlowTimings.FLOW_FAILED;
+        if (TimingEvent.FlowTimings.FLOW_RUN_DEADLINE_EXCEEDED.equals(dag.getFlowEvent())) {
+          this.dagManagerMetrics.emitFlowSlaExceededMetrics(DagManagerUtils.getFlowId(dag));
+        } else if (!TimingEvent.FlowTimings.FLOW_START_DEADLINE_EXCEEDED.equals(dag.getFlowEvent())) {
+          dagManagerMetrics.emitFlowFailedMetrics(DagManagerUtils.getFlowId(this.dags.get(dagId)));
+        }
+        addFailedDag(dagId);
+        log.info("Dag {} has finished with status {}; Cleaning up dag from the state store.", dagId, status);
         // send an event before cleaning up dag
-        DagManagerUtils.emitFlowEvent(this.eventSubmitter, this.dags.get(dagId), TimingEvent.FlowTimings.FLOW_FAILED);
+        DagManagerUtils.emitFlowEvent(this.eventSubmitter, this.dags.get(dagId), status);
         dagIdstoClean.add(dagId);
       }
 
-      //Clean up completed dags
-      for (String dagId : this.dags.keySet()) {
+      // Remove dags that are finished and emit their appropriate metrics
+      for (Map.Entry<String, Dag<JobExecutionPlan>> dagIdKeyPair : this.dags.entrySet()) {
+        String dagId = dagIdKeyPair.getKey();
+        Dag<JobExecutionPlan> dag = dagIdKeyPair.getValue();
         if (!hasRunningJobs(dagId) && !this.failedDagIdsFinishRunning.contains(dagId)) {
           String status = TimingEvent.FlowTimings.FLOW_SUCCEEDED;
           if (this.failedDagIdsFinishAllPossible.contains(dagId)) {
-            onFlowFailure(dagId);
+            if (TimingEvent.FlowTimings.FLOW_RUN_DEADLINE_EXCEEDED.equals(dag.getFlowEvent())) {

Review Comment:
   Maybe we can move this if block inside `addFailedDag`?
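
   A rough, stand-alone sketch of what that could look like, so both call sites share one branch. This is not the real `DagManager`: the class name `AddFailedDagSketch`, the fields, the string constant values, and the body of `addFailedDag` are all assumptions made for illustration. Only the branch logic is taken from the diff above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for DagManager; every type and field here is simplified.
class AddFailedDagSketch {
  // Stand-ins for the TimingEvent.FlowTimings constants (string values are assumed).
  static final String FLOW_RUN_DEADLINE_EXCEEDED = "FlowRunDeadlineExceeded";
  static final String FLOW_START_DEADLINE_EXCEEDED = "FlowStartDeadlineExceeded";

  // dagId -> last flow event; stands in for this.dags plus dag.getFlowEvent().
  final Map<String, String> flowEventByDagId = new HashMap<>();
  // Stands in for whatever bookkeeping the real addFailedDag performs.
  final Set<String> failedDagIds = new HashSet<>();
  // Records which metric would have been emitted; stands in for dagManagerMetrics.
  final List<String> emittedMetrics = new ArrayList<>();

  // The if/else-if block from the diff, moved inside addFailedDag.
  void addFailedDag(String dagId) {
    String flowEvent = flowEventByDagId.get(dagId);
    if (FLOW_RUN_DEADLINE_EXCEEDED.equals(flowEvent)) {
      // Run deadline exceeded: count as an SLA-exceeded flow.
      emittedMetrics.add("slaExceeded:" + dagId);
    } else if (!FLOW_START_DEADLINE_EXCEEDED.equals(flowEvent)) {
      // Any other failure except a start-deadline kill: count as a failed flow.
      emittedMetrics.add("failed:" + dagId);
    }
    failedDagIds.add(dagId);
  }
}
```

   With this shape, each cleanup loop would only call `addFailedDag(dagId)` and would not repeat the deadline checks.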





Issue Time Tracking
-------------------

    Worklog Id:     (was: 797416)
    Time Spent: 1h 40m  (was: 1.5h)

> Refactor metrics in dagmanager and add per spec executor metrics
> ----------------------------------------------------------------
>
>                 Key: GOBBLIN-1672
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1672
>             Project: Apache Gobblin
>          Issue Type: Improvement
>          Components: gobblin-service
>            Reporter: William Lo
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Add the following metrics:
> 1. Success per executor
> 2. Fail per executor
> 3. SLA killed per executor
> 4. SLA killed per flowgroup
> 5. SLA killed per user
> 6. SLA killed overall



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
