[ 
https://issues.apache.org/jira/browse/GOBBLIN-798?focusedWorklogId=256106&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-256106
 ]

ASF GitHub Bot logged work on GOBBLIN-798:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 07/Jun/19 17:46
            Start Date: 07/Jun/19 17:46
    Worklog Time Spent: 10m 
      Work Description: sv2000 commented on pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665#discussion_r291687913
 
 

 ##########
 File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixMultiManager.java
 ##########
 @@ -363,12 +359,26 @@ void handleLeadershipChange(NotificationContext changeContext) {
     }
   }
 
+  /**
+   * Delete jobs from the helix cluster
+   */
+  @VisibleForTesting
+  public void cleanUpJobs() {
+    cleanUpJobs(this.jobClusterHelixManager);
+
+    if (this.taskDriverHelixManager.isPresent()) {
+      cleanUpJobs(this.taskDriverHelixManager.get());
+    }
+  }
+
   private void cleanUpJobs(HelixManager helixManager) {
     // Clean up existing jobs
     TaskDriver taskDriver = new TaskDriver(helixManager);
 
     Map<String, WorkflowConfig> workflows = taskDriver.getWorkflows();
 
+    log.debug("cleanUpJobs workflow count {} workflows {}", workflows.size(), workflows);
 
 Review comment:
   Maybe just dump workflows.keySet() instead of the entire map?
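
   A minimal sketch of what that suggestion could look like, assuming the
   logger is SLF4J-style as the {} placeholders in the diff suggest. The class
   WorkflowLogSketch and the helper describe() are hypothetical names used
   only so the example runs on its own, and Object stands in for
   WorkflowConfig. In the pull request itself the change would presumably be
   just the last argument of the log.debug call.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WorkflowLogSketch {
  // Hypothetical helper: builds the debug message from the workflow names only,
  // so the full config objects are never rendered into the log line.
  static String describe(Map<String, ?> workflows) {
    return "cleanUpJobs workflow count " + workflows.size()
        + " workflows " + workflows.keySet();
  }

  public static void main(String[] args) {
    // Object stands in for WorkflowConfig here (assumption, to avoid a Helix dependency).
    Map<String, Object> workflows = new LinkedHashMap<>();
    workflows.put("jobA", new Object());
    workflows.put("jobB", new Object());

    // Equivalent in the reviewed method, under the SLF4J assumption:
    // log.debug("cleanUpJobs workflow count {} workflows {}",
    //     workflows.size(), workflows.keySet());
    System.out.println(describe(workflows));
  }
}
```

   Logging only the key set keeps the line short when many workflows exist,
   while still showing which workflows are about to be cleaned up.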
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 256106)
    Time Spent: 20m  (was: 10m)

> Clean up workflows from Helix when the Gobblin application master starts
> ------------------------------------------------------------------------
>
>                 Key: GOBBLIN-798
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-798
>             Project: Apache Gobblin
>          Issue Type: Task
>            Reporter: Hung Tran
>            Assignee: Hung Tran
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> If the application master aborts, a new one may be spawned by YARN. The second 
> application master will resubmit the jobs. This results in duplicate jobs in 
> Helix and multiple instances of the job may run, resulting in duplicate data.
> The Gobblin application master should clean up all workflows on startup to 
> avoid executing multiple instances of a job.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
