[jira] [Work logged] (GOBBLIN-798) Clean up workflows from Helix when the Gobblin application master starts
[ https://issues.apache.org/jira/browse/GOBBLIN-798?focusedWorklogId=257000=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-257000 ]

ASF GitHub Bot logged work on GOBBLIN-798:
--
Author: ASF GitHub Bot
Created on: 10/Jun/19 17:05
Start Date: 10/Jun/19 17:05
Worklog Time Spent: 10m
Work Description: asfgit commented on pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 257000)
Time Spent: 2h (was: 1h 50m)

> Clean up workflows from Helix when the Gobblin application master starts
>
> Key: GOBBLIN-798
> URL: https://issues.apache.org/jira/browse/GOBBLIN-798
> Project: Apache Gobblin
> Issue Type: Task
> Reporter: Hung Tran
> Assignee: Hung Tran
> Priority: Major
> Fix For: 0.15.0
>
> Time Spent: 2h
> Remaining Estimate: 0h
>
> If the application master aborts a new one may be spawned by YARN. The second
> application master will resubmit the jobs. This results in duplicate jobs in
> Helix and multiple instances of the job may run, resulting in duplicate data.
> The Gobblin application master should clean up all workflows on startup to
> avoid executing multiple instances of a job.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-798) Clean up workflows from Helix when the Gobblin application master starts
[ https://issues.apache.org/jira/browse/GOBBLIN-798?focusedWorklogId=256244=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-256244 ]

ASF GitHub Bot logged work on GOBBLIN-798:
--
Author: ASF GitHub Bot
Created on: 07/Jun/19 21:50
Start Date: 07/Jun/19 21:50
Worklog Time Spent: 10m
Work Description: htran1 commented on pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665#discussion_r291765182

## File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterConfigurationKeys.java ##
@@ -161,4 +161,8 @@
   public static final String CANCEL_RUNNING_JOB_ON_DELETE = GOBBLIN_CLUSTER_PREFIX + "job.cancelRunningJobOnDelete";
   public static final String DEFAULT_CANCEL_RUNNING_JOB_ON_DELETE = "false";
+
+  // for cleaning up jobs on cluster manager startup
+  public static final String CLEAN_UP_JOBS_ON_MANAGER_START = GOBBLIN_CLUSTER_PREFIX + "cleanUpJobsOnManagerStart";
+  public static final boolean DEFAULT_CLEAN_UP_JOBS_ON_MANAGER_START = false;

Review comment:
  Yes, this is to keep behavior the same.

Issue Time Tracking
---
Worklog Id: (was: 256244)
Time Spent: 1h 40m (was: 1.5h)
[jira] [Work logged] (GOBBLIN-798) Clean up workflows from Helix when the Gobblin application master starts
[ https://issues.apache.org/jira/browse/GOBBLIN-798?focusedWorklogId=256240=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-256240 ]

ASF GitHub Bot logged work on GOBBLIN-798:
--
Author: ASF GitHub Bot
Created on: 07/Jun/19 21:42
Start Date: 07/Jun/19 21:42
Worklog Time Spent: 10m
Work Description: htran1 commented on pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665#discussion_r291763549

## File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixMultiManager.java ##
@@ -363,12 +359,26 @@ void handleLeadershipChange(NotificationContext changeContext) {
     }
   }
 
+  /**
+   * Delete jobs from the helix cluster
+   */
+  @VisibleForTesting
+  public void cleanUpJobs() {
+    cleanUpJobs(this.jobClusterHelixManager);
+
+    if (this.taskDriverHelixManager.isPresent()) {
+      cleanUpJobs(this.taskDriverHelixManager.get());
+    }
+  }
+
   private void cleanUpJobs(HelixManager helixManager) {
     // Clean up existing jobs
     TaskDriver taskDriver = new TaskDriver(helixManager);
     Map workflows = taskDriver.getWorkflows();
+    log.debug("cleanUpJobs workflow count {} workflows {}", workflows.size(), workflows);

Review comment:
  Sure.

Issue Time Tracking
---
Worklog Id: (was: 256240)
Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (GOBBLIN-798) Clean up workflows from Helix when the Gobblin application master starts
[ https://issues.apache.org/jira/browse/GOBBLIN-798?focusedWorklogId=256239=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-256239 ]

ASF GitHub Bot logged work on GOBBLIN-798:
--
Author: ASF GitHub Bot
Created on: 07/Jun/19 21:39
Start Date: 07/Jun/19 21:39
Worklog Time Spent: 10m
Work Description: htran1 commented on pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665#discussion_r291762902

## File path: gobblin-yarn/src/main/java/org/apache/gobblin/yarn/GobblinApplicationMaster.java ##
@@ -81,7 +83,10 @@ public GobblinApplicationMaster(String applicationName, ContainerId containerId,
       Config config, YarnConfiguration yarnConfiguration) throws Exception {
     super(applicationName, containerId.getApplicationAttemptId().getApplicationId().toString(),
-        GobblinClusterUtils.addDynamicConfig(config), Optional.absent());
+        GobblinClusterUtils.addDynamicConfig(config)
+            .withFallback(ConfigFactory.parseMap(
+                ImmutableMap.of(GobblinClusterConfigurationKeys.CLEAN_UP_JOBS_ON_MANAGER_START, "true"))),

Review comment:
  No, this is explicitly setting it to "true" for the Gobblin app master. All other gobblin cluster modes use the default of "false".

Issue Time Tracking
---
Worklog Id: (was: 256239)
Time Spent: 1h 10m (was: 1h)
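
The fallback discussed in the review comment above can be modeled with plain maps. This is only a sketch under stated assumptions: `java.util.Map` stands in for the Typesafe Config object, the class and method names (`FallbackSketch`, `withFallback`, `cleanUpOnStartForYarn`, `cleanUpOnStartForOtherModes`) are hypothetical, and the full key string is assumed, because the thread does not show what `GOBBLIN_CLUSTER_PREFIX` expands to.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of config fallback semantics; not Gobblin or Typesafe Config code.
class FallbackSketch {
  // Assumed key string: the thread does not show the value of GOBBLIN_CLUSTER_PREFIX.
  static final String CLEAN_UP_JOBS_ON_MANAGER_START = "gobblin.cluster.cleanUpJobsOnManagerStart";
  static final boolean DEFAULT_CLEAN_UP_JOBS_ON_MANAGER_START = false;

  // Values in 'primary' win; 'fallback' only fills in keys that 'primary' lacks.
  static Map<String, String> withFallback(Map<String, String> primary, Map<String, String> fallback) {
    Map<String, String> merged = new HashMap<>(fallback);
    merged.putAll(primary);
    return merged;
  }

  // YARN application master: "true" is supplied as a fallback, so it applies
  // only when the incoming config does not set the key.
  static boolean cleanUpOnStartForYarn(Map<String, String> userConfig) {
    Map<String, String> effective =
        withFallback(userConfig, Collections.singletonMap(CLEAN_UP_JOBS_ON_MANAGER_START, "true"));
    return Boolean.parseBoolean(effective.get(CLEAN_UP_JOBS_ON_MANAGER_START));
  }

  // Every other cluster mode: the declared default applies unless the key is set.
  static boolean cleanUpOnStartForOtherModes(Map<String, String> userConfig) {
    String value = userConfig.get(CLEAN_UP_JOBS_ON_MANAGER_START);
    return value == null ? DEFAULT_CLEAN_UP_JOBS_ON_MANAGER_START : Boolean.parseBoolean(value);
  }
}
```

In this model an explicit "false" in the incoming config still wins for the YARN case, because a fallback is consulted only for keys the primary config lacks.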
[jira] [Work logged] (GOBBLIN-798) Clean up workflows from Helix when the Gobblin application master starts
[ https://issues.apache.org/jira/browse/GOBBLIN-798?focusedWorklogId=256236=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-256236 ]

ASF GitHub Bot logged work on GOBBLIN-798:
--
Author: ASF GitHub Bot
Created on: 07/Jun/19 21:36
Start Date: 07/Jun/19 21:36
Worklog Time Spent: 10m
Work Description: htran1 commented on pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665#discussion_r291762313

## File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterConfigurationKeys.java ##
@@ -161,4 +161,8 @@
   public static final String CANCEL_RUNNING_JOB_ON_DELETE = GOBBLIN_CLUSTER_PREFIX + "job.cancelRunningJobOnDelete";
   public static final String DEFAULT_CANCEL_RUNNING_JOB_ON_DELETE = "false";
+
+  // for cleaning up jobs on cluster manager startup
+  public static final String CLEAN_UP_JOBS_ON_MANAGER_START = GOBBLIN_CLUSTER_PREFIX + "cleanUpJobsOnManagerStart";

Review comment:
  I have this default to true for YARN mode. For standalone mode we clean up on leadership change. For all other modes the existing behavior is maintained. I wanted to avoid changing behavior as much as possible. Initial startup already blows away the Helix cluster. This is only for a YARN restart of the Gobblin application master without a restart of the application launcher.

Issue Time Tracking
---
Worklog Id: (was: 256236)
Time Spent: 1h (was: 50m)
[jira] [Work logged] (GOBBLIN-798) Clean up workflows from Helix when the Gobblin application master starts
[ https://issues.apache.org/jira/browse/GOBBLIN-798?focusedWorklogId=256109=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-256109 ]

ASF GitHub Bot logged work on GOBBLIN-798:
--
Author: ASF GitHub Bot
Created on: 07/Jun/19 17:46
Start Date: 07/Jun/19 17:46
Worklog Time Spent: 10m
Work Description: sv2000 commented on pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665#discussion_r291687392

## File path: gobblin-yarn/src/main/java/org/apache/gobblin/yarn/GobblinApplicationMaster.java ##
@@ -81,7 +83,10 @@ public GobblinApplicationMaster(String applicationName, ContainerId containerId,
       Config config, YarnConfiguration yarnConfiguration) throws Exception {
     super(applicationName, containerId.getApplicationAttemptId().getApplicationId().toString(),
-        GobblinClusterUtils.addDynamicConfig(config), Optional.absent());
+        GobblinClusterUtils.addDynamicConfig(config)
+            .withFallback(ConfigFactory.parseMap(
+                ImmutableMap.of(GobblinClusterConfigurationKeys.CLEAN_UP_JOBS_ON_MANAGER_START, "true"))),

Review comment:
  Should it be ImmutableMap.of(GobblinClusterConfigurationKeys.CLEAN_UP_JOBS_ON_MANAGER_START, DEFAULT_CLEAN_UP_JOBS_ON_MANAGER_START)?

Issue Time Tracking
---
Worklog Id: (was: 256109)
Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (GOBBLIN-798) Clean up workflows from Helix when the Gobblin application master starts
[ https://issues.apache.org/jira/browse/GOBBLIN-798?focusedWorklogId=256107=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-256107 ]

ASF GitHub Bot logged work on GOBBLIN-798:
--
Author: ASF GitHub Bot
Created on: 07/Jun/19 17:46
Start Date: 07/Jun/19 17:46
Worklog Time Spent: 10m
Work Description: sv2000 commented on pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665#discussion_r291685805

## File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterConfigurationKeys.java ##
@@ -161,4 +161,8 @@
   public static final String CANCEL_RUNNING_JOB_ON_DELETE = GOBBLIN_CLUSTER_PREFIX + "job.cancelRunningJobOnDelete";
   public static final String DEFAULT_CANCEL_RUNNING_JOB_ON_DELETE = "false";
+
+  // for cleaning up jobs on cluster manager startup
+  public static final String CLEAN_UP_JOBS_ON_MANAGER_START = GOBBLIN_CLUSTER_PREFIX + "cleanUpJobsOnManagerStart";

Review comment:
  Shouldn't we always clean up on restart?

Issue Time Tracking
---
Worklog Id: (was: 256107)
Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (GOBBLIN-798) Clean up workflows from Helix when the Gobblin application master starts
[ https://issues.apache.org/jira/browse/GOBBLIN-798?focusedWorklogId=256110=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-256110 ]

ASF GitHub Bot logged work on GOBBLIN-798:
--
Author: ASF GitHub Bot
Created on: 07/Jun/19 17:46
Start Date: 07/Jun/19 17:46
Worklog Time Spent: 10m
Work Description: sv2000 commented on pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665#discussion_r291685487

## File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterConfigurationKeys.java ##
@@ -161,4 +161,8 @@
   public static final String CANCEL_RUNNING_JOB_ON_DELETE = GOBBLIN_CLUSTER_PREFIX + "job.cancelRunningJobOnDelete";
   public static final String DEFAULT_CANCEL_RUNNING_JOB_ON_DELETE = "false";
+
+  // for cleaning up jobs on cluster manager startup
+  public static final String CLEAN_UP_JOBS_ON_MANAGER_START = GOBBLIN_CLUSTER_PREFIX + "cleanUpJobsOnManagerStart";
+  public static final boolean DEFAULT_CLEAN_UP_JOBS_ON_MANAGER_START = false;

Review comment:
  Is the default "false" to preserve the current behavior with Gobblin cluster in non-Yarn mode?

Issue Time Tracking
---
Worklog Id: (was: 256110)
Time Spent: 50m (was: 40m)
[jira] [Work logged] (GOBBLIN-798) Clean up workflows from Helix when the Gobblin application master starts
[ https://issues.apache.org/jira/browse/GOBBLIN-798?focusedWorklogId=256106=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-256106 ]

ASF GitHub Bot logged work on GOBBLIN-798:
--
Author: ASF GitHub Bot
Created on: 07/Jun/19 17:46
Start Date: 07/Jun/19 17:46
Worklog Time Spent: 10m
Work Description: sv2000 commented on pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665#discussion_r291687913

## File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixMultiManager.java ##
@@ -363,12 +359,26 @@ void handleLeadershipChange(NotificationContext changeContext) {
     }
   }
 
+  /**
+   * Delete jobs from the helix cluster
+   */
+  @VisibleForTesting
+  public void cleanUpJobs() {
+    cleanUpJobs(this.jobClusterHelixManager);
+
+    if (this.taskDriverHelixManager.isPresent()) {
+      cleanUpJobs(this.taskDriverHelixManager.get());
+    }
+  }
+
   private void cleanUpJobs(HelixManager helixManager) {
     // Clean up existing jobs
     TaskDriver taskDriver = new TaskDriver(helixManager);
     Map workflows = taskDriver.getWorkflows();
+    log.debug("cleanUpJobs workflow count {} workflows {}", workflows.size(), workflows);

Review comment:
  Maybe just dump workflows.keySet() instead of the entire map?

Issue Time Tracking
---
Worklog Id: (was: 256106)
Time Spent: 20m (was: 10m)
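
The two-manager cleanup and the reviewer's suggestion to log only the workflow names can be sketched as follows. This is a hedged, self-contained model rather than the PR's code: plain maps stand in for what `TaskDriver.getWorkflows()` returns, `java.util.Optional` stands in for the optional task-driver manager, and `MultiManagerCleanupSketch` and its methods are hypothetical names. The actual deletion call made against Helix is not shown in the quoted diff, so it is modeled here as `clear()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeSet;

// Hypothetical model; maps of workflow name -> config stand in for Helix state.
class MultiManagerCleanupSketch {

  // Removes every workflow from one cluster and returns the line that would be logged.
  static String cleanUpJobs(Map<String, String> workflows) {
    // Log the count and the names only; the values may be large config objects.
    String logLine = "cleanUpJobs workflow count " + workflows.size()
        + " workflows " + new TreeSet<>(workflows.keySet());
    workflows.clear();  // stand-in for deleting each workflow from the cluster
    return logLine;
  }

  // Cleans the job cluster first, then the task-driver cluster when one is configured.
  static List<String> cleanUpJobs(Map<String, String> jobCluster,
      Optional<Map<String, String>> taskDriverCluster) {
    List<String> logLines = new ArrayList<>();
    logLines.add(cleanUpJobs(jobCluster));
    taskDriverCluster.ifPresent(cluster -> logLines.add(cleanUpJobs(cluster)));
    return logLines;
  }
}
```

Sorting the names through a `TreeSet` is a choice made for this sketch so that the log line is stable across runs; the quoted diff passes the map itself to the logger.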
[jira] [Work logged] (GOBBLIN-798) Clean up workflows from Helix when the Gobblin application master starts
[ https://issues.apache.org/jira/browse/GOBBLIN-798?focusedWorklogId=256108=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-256108 ]

ASF GitHub Bot logged work on GOBBLIN-798:
--
Author: ASF GitHub Bot
Created on: 07/Jun/19 17:46
Start Date: 07/Jun/19 17:46
Worklog Time Spent: 10m
Work Description: sv2000 commented on pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665#discussion_r291690862

## File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterManager.java ##
@@ -264,6 +268,12 @@ public synchronized void start() {
     this.eventBus.register(this);
     this.multiManager.connect();
 
+    // Standalone mode registers a handler to clean up on leadership change, so don't do the cleanup
+    // now even if the option to clean up on startup is set.
+    if (this.cleanUpJobsOnStartup && !this.isStandaloneMode) {

Review comment:
  Is this check needed for correctness or to avoid duplicate clean up calls? If it is the latter, shouldn't the 2nd call be handled as a No-op?

Issue Time Tracking
---
Worklog Id: (was: 256108)
Time Spent: 40m (was: 0.5h)
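
The guard quoted in the review comment above, and the reviewer's question about a second cleanup call, can be illustrated with a small model. This is a sketch under assumptions, not Gobblin code: `StartupCleanupSketch` is a hypothetical class, an in-memory set stands in for the workflows held in Helix, and the field names are borrowed from the quoted diff only for readability. It does not claim that the real cleanup is a no-op on a second call; it only shows what that property would look like.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical model of the startup guard; an in-memory set replaces Helix state.
class StartupCleanupSketch {
  private final Set<String> workflows;
  private final boolean cleanUpJobsOnStartup;
  private final boolean isStandaloneMode;

  StartupCleanupSketch(boolean cleanUpJobsOnStartup, boolean isStandaloneMode, Set<String> existing) {
    this.cleanUpJobsOnStartup = cleanUpJobsOnStartup;
    this.isStandaloneMode = isStandaloneMode;
    this.workflows = new LinkedHashSet<>(existing);
  }

  // Deletes every known workflow and reports how many were removed.
  int cleanUpJobs() {
    int removed = workflows.size();
    workflows.clear();
    return removed;
  }

  // Mirrors the quoted guard: standalone mode leaves cleanup to its
  // leadership-change handler, so nothing is removed here in that mode.
  int start() {
    if (cleanUpJobsOnStartup && !isStandaloneMode) {
      return cleanUpJobs();
    }
    return 0;
  }

  int remainingWorkflows() {
    return workflows.size();
  }
}
```

In this model a repeated `cleanUpJobs()` call finds nothing left to delete, which is the no-op behavior the reviewer asks about.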
[jira] [Work logged] (GOBBLIN-798) Clean up workflows from Helix when the Gobblin application master starts
[ https://issues.apache.org/jira/browse/GOBBLIN-798?focusedWorklogId=255547=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-255547 ]

ASF GitHub Bot logged work on GOBBLIN-798:
--
Author: ASF GitHub Bot
Created on: 06/Jun/19 23:43
Start Date: 06/Jun/19 23:43
Worklog Time Spent: 10m
Work Description: htran1 commented on pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665

…ion master starts

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

### JIRA
- [X] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
    - https://issues.apache.org/jira/browse/GOBBLIN-798

### Description
- [X] Here are some details about my PR, including screenshots (if applicable):
If the application master aborts a new one may be spawned by YARN. The second application master will resubmit the jobs. This results in duplicate jobs in Helix and multiple instances of the job may run, resulting in duplicate data. The Gobblin application master should clean up all workflows on startup to avoid executing multiple instances of a job.

### Tests
- [X] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
GobblinYarnAppLauncherTest.testJobCleanup()

### Commits
- [X] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Issue Time Tracking
---
Worklog Id: (was: 255547)
Time Spent: 10m
Remaining Estimate: 0h