[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
sv2000 commented on a change in pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665#discussion_r291687913

 ## File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinHelixMultiManager.java ##
 @@ -363,12 +359,26 @@ void handleLeadershipChange(NotificationContext changeContext) {
      }
    }
 
 +  /**
 +   * Delete jobs from the helix cluster
 +   */
 +  @VisibleForTesting
 +  public void cleanUpJobs() {
 +    cleanUpJobs(this.jobClusterHelixManager);
 +
 +    if (this.taskDriverHelixManager.isPresent()) {
 +      cleanUpJobs(this.taskDriverHelixManager.get());
 +    }
 +  }
 +
    private void cleanUpJobs(HelixManager helixManager) {
      // Clean up existing jobs
      TaskDriver taskDriver = new TaskDriver(helixManager);
      Map workflows = taskDriver.getWorkflows();
 +    log.debug("cleanUpJobs workflow count {} workflows {}", workflows.size(), workflows);

 Review comment:
   Maybe just dump workflows.keySet() instead of the entire map?

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
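The suggestion in the review comment above (log only the workflow names rather than the whole map) can be illustrated with a minimal, standard-library-only sketch. `WorkflowConfigStub` and `WorkflowLogSketch` are hypothetical names invented for this example, and `String.format` stands in for the logger call; this is not the code from the pull request.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a workflow configuration object; not the real Helix class.
final class WorkflowConfigStub {
  private final String details;

  WorkflowConfigStub(String details) {
    this.details = details;
  }

  @Override
  public String toString() {
    return "WorkflowConfigStub{" + details + "}";
  }
}

final class WorkflowLogSketch {
  // Builds the debug message from the workflow names (the map keys) only,
  // so the potentially large config values never reach the log line.
  static String describe(Map<String, ?> workflows) {
    return String.format("cleanUpJobs workflow count %d workflows %s",
        workflows.size(), workflows.keySet());
  }

  public static void main(String[] args) {
    Map<String, WorkflowConfigStub> workflows = new LinkedHashMap<>();
    workflows.put("jobA", new WorkflowConfigStub("large config payload"));
    workflows.put("jobB", new WorkflowConfigStub("large config payload"));
    System.out.println(describe(workflows));
  }
}
```

With a parameterized logger the same idea would presumably amount to passing `workflows.keySet()` as the second argument instead of `workflows`.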
[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
sv2000 commented on a change in pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665#discussion_r291690862

 ## File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterManager.java ##
 @@ -264,6 +268,12 @@ public synchronized void start() {
      this.eventBus.register(this);
      this.multiManager.connect();
 
 +    // Standalone mode registers a handler to clean up on leadership change, so don't do the cleanup
 +    // now even if the option to clean up on startup is set.
 +    if (this.cleanUpJobsOnStartup && !this.isStandaloneMode) {

 Review comment:
   Is this check needed for correctness or to avoid duplicate clean up calls? If it is the latter, shouldn't the 2nd call be handled as a No-op?
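To make the "second call as a no-op" idea in the comment above concrete, here is a minimal sketch of an idempotent cleanup. It assumes the workflows can be modeled as an in-memory map; `IdempotentCleanupSketch`, `addWorkflow`, and the returned count are hypothetical and do not reflect the Helix `TaskDriver` API or the code in the pull request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the workflows tracked by the cluster.
final class IdempotentCleanupSketch {
  private final Map<String, String> workflows = new ConcurrentHashMap<>();

  void addWorkflow(String name, String config) {
    workflows.put(name, config);
  }

  // Deletes every known workflow and returns how many were removed.
  // A repeated call finds nothing left to delete and returns 0,
  // so calling it twice is harmless.
  int cleanUpJobs() {
    List<String> names = new ArrayList<>(workflows.keySet());
    int deleted = 0;
    for (String name : names) {
      if (workflows.remove(name) != null) {
        deleted++;
      }
    }
    return deleted;
  }

  public static void main(String[] args) {
    IdempotentCleanupSketch sketch = new IdempotentCleanupSketch();
    sketch.addWorkflow("jobA", "configA");
    sketch.addWorkflow("jobB", "configB");
    System.out.println("first call deleted " + sketch.cleanUpJobs());
    System.out.println("second call deleted " + sketch.cleanUpJobs());
  }
}
```

Whether the real cleanup behaves this way depends on how deletion against Helix handles already-removed workflows, which this sketch does not model.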
[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
sv2000 commented on a change in pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665#discussion_r291685487

 ## File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterConfigurationKeys.java ##
 @@ -161,4 +161,8 @@
    public static final String CANCEL_RUNNING_JOB_ON_DELETE = GOBBLIN_CLUSTER_PREFIX + "job.cancelRunningJobOnDelete";
    public static final String DEFAULT_CANCEL_RUNNING_JOB_ON_DELETE = "false";
 +
 +  // for cleaning up jobs on cluster manager startup
 +  public static final String CLEAN_UP_JOBS_ON_MANAGER_START = GOBBLIN_CLUSTER_PREFIX + "cleanUpJobsOnManagerStart";
 +  public static final boolean DEFAULT_CLEAN_UP_JOBS_ON_MANAGER_START = false;

 Review comment:
   Is the default "false" to preserve the current behavior with Gobblin cluster in non-Yarn mode?
[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
sv2000 commented on a change in pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665#discussion_r291685805

 ## File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinClusterConfigurationKeys.java ##
 @@ -161,4 +161,8 @@
    public static final String CANCEL_RUNNING_JOB_ON_DELETE = GOBBLIN_CLUSTER_PREFIX + "job.cancelRunningJobOnDelete";
    public static final String DEFAULT_CANCEL_RUNNING_JOB_ON_DELETE = "false";
 +
 +  // for cleaning up jobs on cluster manager startup
 +  public static final String CLEAN_UP_JOBS_ON_MANAGER_START = GOBBLIN_CLUSTER_PREFIX + "cleanUpJobsOnManagerStart";

 Review comment:
   Shouldn't we always clean up on restart?
[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
sv2000 commented on a change in pull request #2665: [GOBBLIN-798] Clean up workflows from Helix when the Gobblin applicat…
URL: https://github.com/apache/incubator-gobblin/pull/2665#discussion_r291687392

 ## File path: gobblin-yarn/src/main/java/org/apache/gobblin/yarn/GobblinApplicationMaster.java ##
 @@ -81,7 +83,10 @@ public GobblinApplicationMaster(String applicationName, ContainerId containerId,
        Config config, YarnConfiguration yarnConfiguration) throws Exception {
      super(applicationName, containerId.getApplicationAttemptId().getApplicationId().toString(),
 -        GobblinClusterUtils.addDynamicConfig(config), Optional.absent());
 +        GobblinClusterUtils.addDynamicConfig(config)
 +            .withFallback(ConfigFactory.parseMap(
 +                ImmutableMap.of(GobblinClusterConfigurationKeys.CLEAN_UP_JOBS_ON_MANAGER_START, "true"))),

 Review comment:
   Should it be ImmutableMap.of(GobblinClusterConfigurationKeys.CLEAN_UP_JOBS_ON_MANAGER_START, DEFAULT_CLEAN_UP_JOBS_ON_MANAGER_START) ?
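For readers weighing the question above, the fallback semantics at play can be sketched with plain maps. This assumes the usual behavior of a config fallback (values already present in the primary config win, and the fallback only fills in missing keys). `FallbackSketch` and its map-based `withFallback` are a hypothetical stand-in, not the Typesafe Config API, and the key string is shortened for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class FallbackSketch {
  // Hypothetical shortened key; the real constant is prefixed with GOBBLIN_CLUSTER_PREFIX.
  static final String KEY = "cleanUpJobsOnManagerStart";

  // Map-based imitation of fallback merging: start from the fallback values,
  // then let every entry in the primary config override them.
  static Map<String, String> withFallback(Map<String, String> primary, Map<String, String> fallback) {
    Map<String, String> merged = new HashMap<>(fallback);
    merged.putAll(primary);
    return merged;
  }

  public static void main(String[] args) {
    Map<String, String> yarnFallback = new HashMap<>();
    yarnFallback.put(KEY, "true");

    // Key not set by the user: the fallback value applies.
    Map<String, String> unset = new HashMap<>();
    System.out.println(withFallback(unset, yarnFallback).get(KEY));

    // Key explicitly set by the user: the user's value is kept.
    Map<String, String> explicit = new HashMap<>();
    explicit.put(KEY, "false");
    System.out.println(withFallback(explicit, yarnFallback).get(KEY));
  }
}
```

Under this assumption, the value placed in the fallback map only matters when the key is absent from the primary config, so which value goes there is what determines the effective default in Yarn mode.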